The Matlab Software Package and Logistic Regression
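In regression analysis, the dependent variable $Y$ can be of two kinds:
(1) $Y$ is a quantitative variable. In this case the usual regress function is used to carry out the regression on $Y$.
(2) $Y$ is a qualitative variable, for example $Y$ = 0 or 1. In this case the usual regress function cannot be applied to $Y$, and so-called Logistic regression is used instead.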
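The basic idea of Logistic regression is not to regress $Y$ directly, but first to define a probability function $\pi$ by setting
$$\pi = P(Y = 1 \mid X_1 = x_1, \dots, X_n = x_n),$$
which is required to satisfy $0 \le \pi \le 1$.
If $\pi$ itself is regressed directly, the resulting regression equation may not satisfy this condition.
In practice one generally has $0 < \pi < 1$.
Finding an expression for $\pi$ directly is difficult, so one considers instead the ratio
$$k = \frac{1-\pi}{\pi}.$$
In general, $0 < k < +\infty$.
Experience has shown that setting
$$k = e^{a_0 + a_1 x_1 + \cdots + a_n x_n},$$
that is,
$$\pi = \frac{1}{1 + e^{a_0 + a_1 x_1 + \cdots + a_n x_n}},$$
which is a Logistic-type function, works well.
Rearranging gives
$$\ln\frac{1-\pi}{\pi} = a_0 + a_1 x_1 + \cdots + a_n x_n.$$
An ordinary linear regression is then carried out on $\ln\frac{1-\pi}{\pi}$.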
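The rearrangement step can be written out line by line. The following is only a derivation sketch: it assumes the Logistic-type form $\pi = 1/(1+e^{z})$, a sign convention chosen to agree with the transform Y=log((1-y0)./y0) used in the program later in this document, and the symbols $z$ and $a_i$ are notation introduced for the sketch.

```latex
\begin{aligned}
\pi &= \frac{1}{1+e^{z}}, \qquad z = a_0 + a_1 x_1 + \cdots + a_n x_n \\
1-\pi &= \frac{e^{z}}{1+e^{z}} \\
\frac{1-\pi}{\pi} &= e^{z} \\
\ln\frac{1-\pi}{\pi} &= z = a_0 + a_1 x_1 + \cdots + a_n x_n
\end{aligned}
```

Because $e^{z} > 0$ for every real $z$, a $\pi$ of this form always lies strictly between 0 and 1, whatever values the linear part takes.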
For example, the graph of the Logistic-type probability function $\pi = \dfrac{1}{1 + 300e^{-2x}}$ is drawn by:
ezplot('1/(1+300*exp(-2*x))',[0,10])
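As a quick numerical check of this curve, the sketch below evaluates the same function at its left endpoint and at the point where it crosses 0.5. The handle name f and the use of fplot (a newer alternative to ezplot, assumed to be available in your Matlab release) are illustrative choices and not part of the original program.

```matlab
% Logistic-type function from the example, written as a vectorized handle
f = @(x) 1 ./ (1 + 300 * exp(-2 * x));

% Value at the left end of the plotted interval: 1/(1+300) = 1/301
f_at_zero = f(0);

% The curve crosses 0.5 where 300*exp(-2*x) = 1, i.e. x = log(300)/2
x_half = log(300) / 2;
f_at_half = f(x_half);

% Plot on the same interval as the ezplot call
fplot(f, [0, 10]);
```

The crossing point log(300)/2 is roughly 2.85, which is where the S-shaped curve rises most steeply.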
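Example 1. When a firm applies for a loan from a financial institution, the institution needs to assess the firm.
For example, Moody's, based in New York, is a company that specializes in rating the creditworthiness of firms.
Let $Y = 0$ if the firm is bankrupt after 2 years, and $Y = 1$ if the firm is still able to repay its loans after 2 years.
The data for 66 American firms are listed below: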
Y   X1      X2      X3
0   -62.8   -89.5   1.7
0   3.3     -3.5    1.1
0   -120.8  -103.2  2.5
0   -18.1   -28.8   1.1
0   -3.8    -50.6   0.9
0   -61.2   -56.2   1.7
0   -20.3   -17.4   1.0
0   -194.5  -25.8   0.5
0   20.8    -4.3    1.0
0   -106.1  -22.9   1.5
0   -39.4   -35.7   1.2
0   -164.1  -17.7   1.3
0   -308.9  -65.8   0.8
0   7.2     -22.6   2.0
0   -118.3  -34.2   1.5
0   -185.9  -280.0  6.7
0   -34.6   -19.4   3.4
0   -27.9   6.3     1.3
0   -48.2   6.8     1.6
0   -49.2   -17.2   0.3
0   -19.2   -36.7   0.8
0   -18.1   -6.5    0.9
0   -98.0   -20.8   1.7
0   -129.0  -14.2   1.3
0   -4.0    -15.8   2.1
0   -8.7    -36.3   2.8
0   -59.2   -12.8   2.1
0   -13.1   -17.6   0.9
0   -38.0   1.6     1.2
0   -57.9   0.7     0.8
0   -8.8    -9.1    0.9
0   -64.7   -4.0    0.1
0   -11.4   4.8     0.9
1   43.0    16.4    1.3
1   47.0    16.0    1.9
1   -3.3    4.0     2.7
1   35.0    20.8    1.9
1   46.7    12.6    0.9
1   20.8    12.5    2.4
1   33.0    23.6    1.5
1   26.1    10.4    2.1
1   68.6    13.8    1.6
1   37.3    33.4    3.5
1   59.0    23.1    5.5
1   49.6    23.8    1.9
1   12.5    7.0     1.8
1   37.3    34.1    1.5
1   35.3    4.2     0.9
1   49.5    25.1    2.6
1   18.1    13.5    4.0
1   31.4    15.7    1.9
1   21.5    -14.4   1.0
1   8.5     5.8     1.5
1   40.6    5.8     1.8
1   34.6    26.4    1.8
1   19.9    26.7    2.3
1   17.4    12.6    1.3
1   54.7    14.6    1.7
1   53.5    20.6    1.1
1   35.9    26.4    2.0
1   39.4    30.5    1.9
1   53.1    7.1     1.9
1   39.8    13.8    1.2
1   59.5    7.0     2.0
1   16.3    20.4    1.0
1   21.7    -7.8    1.6
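Here $X_1$, $X_2$, $X_3$ are the explanatory variables recorded for each firm.
Build a regression equation for the bankruptcy indicator $Y$.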
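Solution:
In this bankruptcy problem $Y$ takes only the values 0 and 1, so we work with a probability $\pi$, where $0 \le \pi \le 1$.
Let $\pi$ = the probability that the firm is able to repay its loans after 2 years, that is, $\pi$ = the probability that the firm does not go bankrupt.
Since 33 of the 66 observations are 0 and 33 are 1, we take 0.5 as the cutoff and set
$$0 \le \pi \le 0.5 \ \text{when } Y = 0, \qquad 0.5 < \pi \le 1 \ \text{when } Y = 1.$$
We do not know the actual value of $\pi$ for a firm before it goes bankrupt, and this value cannot be computed from the data on $Y$ either. To make the regression convenient, we therefore take the midpoint of each interval:
$$\pi = 0.25 \ \text{when } Y = 0, \qquad \pi = 0.75 \ \text{when } Y = 1.$$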
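It may help to see what these two midpoints turn into under the transform ln((1-p)/p) that the program below applies. The short sketch here does that arithmetic; the variable names are chosen for illustration only.

```matlab
% Midpoint probabilities taken for the two groups
pi_bankrupt = 0.25;   % firms with Y = 0
pi_solvent  = 0.75;   % firms with Y = 1

% Transformed responses ln((1 - pi)/pi)
Y_bankrupt = log((1 - pi_bankrupt) / pi_bankrupt);   % log(3)
Y_solvent  = log((1 - pi_solvent)  / pi_solvent);    % -log(3)

fprintf('%.4f %.4f\n', Y_bankrupt, Y_solvent);
```

The transformed response therefore takes only two values, log(3) (about 1.0986) for the bankrupt firms and -log(3) for the others, and the linear regression is fitted to these two levels.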
The data table becomes:
pi    X1      X2      X3
0.25  -62.8   -89.5   1.7
0.25  3.3     -3.5    1.1
0.25  -120.8  -103.2  2.5
0.25  -18.1   -28.8   1.1
0.25  -3.8    -50.6   0.9
0.25  -61.2   -56.2   1.7
0.25  -20.3   -17.4   1.0
0.25  -194.5  -25.8   0.5
0.25  20.8    -4.3    1.0
0.25  -106.1  -22.9   1.5
0.25  -39.4   -35.7   1.2
0.25  -164.1  -17.7   1.3
0.25  -308.9  -65.8   0.8
0.25  7.2     -22.6   2.0
0.25  -118.3  -34.2   1.5
0.25  -185.9  -280.0  6.7
0.25  -34.6   -19.4   3.4
0.25  -27.9   6.3     1.3
0.25  -48.2   6.8     1.6
0.25  -49.2   -17.2   0.3
0.25  -19.2   -36.7   0.8
0.25  -18.1   -6.5    0.9
0.25  -98.0   -20.8   1.7
0.25  -129.0  -14.2   1.3
0.25  -4.0    -15.8   2.1
0.25  -8.7    -36.3   2.8
0.25  -59.2   -12.8   2.1
0.25  -13.1   -17.6   0.9
0.25  -38.0   1.6     1.2
0.25  -57.9   0.7     0.8
0.25  -8.8    -9.1    0.9
0.25  -64.7   -4.0    0.1
0.25  -11.4   4.8     0.9
0.75  43.0    16.4    1.3
0.75  47.0    16.0    1.9
0.75  -3.3    4.0     2.7
0.75  35.0    20.8    1.9
0.75  46.7    12.6    0.9
0.75  20.8    12.5    2.4
0.75  33.0    23.6    1.5
0.75  26.1    10.4    2.1
0.75  68.6    13.8    1.6
0.75  37.3    33.4    3.5
0.75  59.0    23.1    5.5
0.75  49.6    23.8    1.9
0.75  12.5    7.0     1.8
0.75  37.3    34.1    1.5
0.75  35.3    4.2     0.9
0.75  49.5    25.1    2.6
0.75  18.1    13.5    4.0
0.75  31.4    15.7    1.9
0.75  21.5    -14.4   1.0
0.75  8.5     5.8     1.5
0.75  40.6    5.8     1.8
0.75  34.6    26.4    1.8
0.75  19.9    26.7    2.3
0.75  17.4    12.6    1.3
0.75  54.7    14.6    1.7
0.75  53.5    20.6    1.1
0.75  35.9    26.4    2.0
0.75  39.4    30.5    1.9
0.75  53.1    7.1     1.9
0.75  39.8    13.8    1.2
0.75  59.5    7.0     2.0
0.75  16.3    20.4    1.0
0.75  21.7    -7.8    1.6
Accordingly, the following Matlab program carries out an ordinary linear regression on $\ln\frac{1-\pi}{\pi}$:
X=[1,-62.8,-89.5,1.7;
1,3.3,-3.5,1.1;
1,-120.8,-103.2,2.5;
1,-18.1,-28.8,1.1;
1,-3.8,-50.6,0.9;
1,-61.2,-56.2,1.7;
1,-20.3,-17.4,1;
1,-194.5,-25.8,0.5;
1,20.8,-4.3,1;
1,-106.1,-22.9,1.5;
1,-39.4,-35.7,1.2;
1,-164.1,-17.7,1.3;
1,-308.9,-65.8,0.8;
1,7.2,-22.6,2.0;
1,-118.3,-34.2,1.5;
1,-185.9,-280,6.7;
1,-34.6,-19.4,3.4;
1,-27.9,6.3,1.3;
1,-48.2,6.8,1.6;
1,-49.2,-17.2,0.3;
1,-19.2,-36.7,0.8;
1,-18.1,-6.5,0.9;
1,-98,-20.8,1.7;
1,-129,-14.2,1.3;
1,-4,-15.8,2.1;
1,-8.7,-36.3,2.8;
1,-59.2,-12.8,2.1;
1,-13.1,-17.6,0.9;
1,-38,1.6,1.2;
1,-57.9,0.7,0.8;
1,-8.8,-9.1,0.9;
1,-64.7,-4,0.1;
1,-11.4,4.8,0.9;
1,43,16.4,1.3;
1,47,16,1.9;
1,-3.3,4,2.7;
1,35,20.8,1.9;
1,46.7,12.6,0.9;
1,20.8,12.5,2.4;
1,33,23.6,1.5;
1,26.1,10.4,2.1;
1,68.6,13.8,1.6;
1,37.3,33.4,3.5;
1,59,23.1,5.5;
1,49.6,23.8,1.9;
1,12.5,7,1.8;
1,37.3,34.1,1.5;
1,35.3,4.2,0.9;
1,49.5,25.1,2.6;
1,18.1,13.5,4;
1,31.4,15.7,1.9;
1,21.5,-14.4,1;
1,8.5,5.8,1.5;
1,40.6,5.8,1.8;
1,34.6,26.4,1.8;
1,19.9,26.7,2.3;
1,17.4,12.6,1.3;
1,54.7,14.6,1.7;
1,53.5,20.6,1.1;
1,35.9,26.4,2;
1,39.4,30.5,1.9;
1,53.1,7.1,1.9;
1,39.8,13.8,1.2;
1,59.5,7,2;
1,16.3,20.4,1;
1,21.7,-7.8,1.6];
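% Midpoint probabilities: 0.25 for the 33 firms with Y=0, 0.75 for the 33 firms with Y=1
a0=0.25*ones(33,1);a1=0.75*ones(33,1);
y0=[a0;a1];
% Transformed response ln((1-pi)/pi)
Y=log((1-y0)./y0);
% Least-squares fit; the first column of X (all ones) is the constant term
[b,bint,r,rint,stats]=regress(Y,X)
% Plot the residuals together with their confidence intervals
rcoplot(r,rint)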
Running the program gives:
b=
0.3914
-0.0069
-0.0093
-0.3263
bint=
0.0073   0.7755
-0.0105  -0.0032
-0.0156  -0.0030
-0.5253  -0.1273
r=
-0.0037
1.0561
-0.2683
0.6733
0.5028
0.3179
0.7320
-0.7044
1.1361
0.2553
0.4955
-0.1593
-1.7643
1.1984
0.0662
-0.9937
1.3983
0.9988
0.9621
0.3072
0.4942
0.8161
0.3957
0.1141
1.2176
1.2225
0.8670
0.7468
0.8531
0.5777
0.8556
0.2588
0.9675
-0.6179
-0.3984
-0.5943
-0.4360
-0.7585
-0.4476
-0.5541
-0.5288
-0.3687
0.2194
0.9248
-0.3078
-0.7516
-0.4266
-0.9150
-0.0680
0.0653
-0.5082
-1.1506
-0.8882
-0.5701
-0.4191
-0.3540
-0.8289
-0.4239
-0.5720
-0.3449
-0.3153
-0.4396
-0.6967
-0.3640
-0.8616
-0.8919
rint=
-1.4320  1.4245
-0.3990  2.5113
-1.6975  1.1608
-0.7882  2.1349
-0.9222  1.9277
-1.1498  1.7856
-0.7332  2.1971
-2.0696  0.6609
-0.3070  2.5791
-1.2048  1.7154
-0.9730  1.9640
-1.5626  1.2441
-2.9063  -0.6223
-0.2499  2.6466
-1.3925  1.5249
-1.7217  -0.2657
-0.0051  2.8018
-0.4609  2.4585
-0.4909  2.4152
-1.1505  1.7649
-0.9556  1.9439
-0.6477  2.2799
-1.0648  1.8562
-1.3238  1.5521
-0.2340  2.6692
-0.2162  2.6613
-0.5911  2.3250
-0.7136  2.2073
-0.6117  2.3178
-0.8868  2.0421
-0.6044  2.3156
-1.1944  1.7120
-0.4914  2.4264
-2.0862  0.8504
-1.8729  1.0760
-2.0558  0.8671
-1.9108  1.0389
-2.2125  0.6955
-1.9186  1.0234
-2.0271  0.9190
-2.0034  0.9459
-1.8340  1.0967
-1.1951  1.6340
-0.3186  2.1681
-1.7819  1.1662
-2.2238  0.7205
-1.8981  1.0449
-2.3643  0.5342
-1.5319  1.3959
-1.3378  1.4683
-1.9834  0.9669
-2.5850  0.2839
-2.3556  0.5793
-2.0422  0.9020
-1.8929  1.0547
-1.8195  1.1116
-2.2961  0.6383
-1.8955  1.0476
-2.0355  0.8916
-1.8178  1.1280
-1.7876  1.1571
-1.9105  1.0313
-2.1620  0.7686
-1.8335  1.1055
-2.3237  0.6005
-2.3544  0.5707
stats=
0.5699  27.3841  0.0000  0.5526
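The four numbers in stats are easier to read once they are given names. In the regress function, stats holds, in order, the R-squared statistic, the F statistic, the p-value of the F test, and an estimate of the error variance. The sketch below simply unpacks the printed row; the variable names and the significance level 0.05 are assumptions made for illustration.

```matlab
% Values copied from the printed stats row
stats = [0.5699, 27.3841, 0.0000, 0.5526];

R2     = stats(1);   % coefficient of determination
F      = stats(2);   % F statistic for the regression as a whole
pvalue = stats(3);   % p-value of the F test
s2     = stats(4);   % estimate of the error variance

alpha = 0.05;                      % significance level (assumed here)
is_significant = pvalue < alpha;   % true when the regression is significant
fprintf('R2=%.4f F=%.4f p=%.4f s2=%.4f\n', R2, F, pvalue, s2);
```

Read this way, the p-value is the third entry (0.0000) and the fourth entry (0.5526) is a variance estimate.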
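That is, we obtain $R^2$ = 0.5699 (the regression equation does not describe the original problem very well) and an F statistic of 27.3841.
The entries of stats are, in order, $R^2$, the F statistic, the p-value of the F test, and the estimate of the error variance. The p-value is therefore 0.0000, which is below the significance level $\alpha$, so the regression as a whole is significant.
The last entry, 0.5526, is the estimated error variance and not a p-value, so by itself it says nothing about linear correlation among $X_1$, $X_2$, $X_3$. That question is examined separately with corrcoef.
The regression equation is:
$$\ln\frac{1-\pi}{\pi} = 0.3914 - 0.0069X_1 - 0.0093X_2 - 0.3263X_3.$$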
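To illustrate how a fitted equation of this kind can be used, the sketch below plugs the first firm of the data table into the printed coefficients b and converts the result back to a probability. It assumes the fitted quantity is ln((1-p)/p), as in the program (Y=log((1-y0)./y0)), so that p = 1/(1+exp(z)); the names z, pi_hat and Y_hat are introduced here for illustration.

```matlab
% Coefficients copied from the printed vector b (constant term first)
b = [0.3914; -0.0069; -0.0093; -0.3263];

% First firm in the data table, with a leading 1 for the constant term
x = [1, -62.8, -89.5, 1.7];

% Fitted value of ln((1 - pi)/pi)
z = x * b;

% Invert the transform: pi = 1/(1 + exp(z))
pi_hat = 1 / (1 + exp(z));

% Classify with the 0.5 cutoff used in the text (1 = not bankrupt)
Y_hat = double(pi_hat >= 0.5);
fprintf('z=%.4f pi_hat=%.4f Y_hat=%d\n', z, pi_hat, Y_hat);
```

For this firm z is about 1.10, close to the transformed response log(3) for the bankrupt group, which agrees with the small first residual in r. The estimated probability is below 0.5, so the firm is classified as bankrupt, matching its observed Y = 0.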
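together with the residual plot (figure omitted).
The residual plot shows residuals lying consecutively above 0 and then consecutively below 0 (the observations are ordered with the 33 firms having $Y = 0$ first). This also hints that the variables $X_1$, $X_2$, $X_3$ may be linearly correlated.
The following program computes their correlation coefficients: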
X=[1,-62.8,-89.5,1.7;
1,3.3,-3.5,1.1;
1,-120.8,-103.2,2.5;
1,-18.1,-28.8,1.1;
1,-3.8,-50.6,0.9;
1,-61.2,-56.2,1.7;
1,-20.3,-17.4,1;
1,-194.5,-25.8,0.5;
1,20.8,-4.3,1;
1,-106.1,-22.9,1.5;
1,-39.4,-35.7,1.2;
1,-164.1,-17.7,1.3;
1,-308.9,-65.8,0.8;
1,7.2,-22.6,2.0;
1,-118.3,-34.2,1.5;
1,-185.9,-280,6.7;
1,-34.6,-19.4,3.4;
1,-27.9,6.3,1.3;
1,-48.2,6.8,1.6;
1,-49.2,-17.2,0.3;
1,-19.2,-36.7,0.8;
1,-18.1,-6.5,0.9;
1,-98,-20.8,1.7;
1,-129,-14.2,1.3;
1,-4,-15.8,2.1;
1,-8.7,-36.3,2.8;
1,-59.2,-12.8,2.1;
1,-13.1,-17.6,0.9;
1,-38,1.6,1.2;
1,-57.9,0.7,0.8;
1,-8.8,-9.1,0.9;
1,-64.7,-4,0.1;
1,-11.4,4.8,0.9;
1,43,16.4,1.3;
1,47,16,1.9;
1,-3.3,4,2.7;
1,35,20.8,1.9;
1,46.7,12.6,0.9;
1,20.8,12.5,2.4;
1,33,23.6,1.5;
1,26.1,10.4,2.1;
1,68.6,13.8,1.6;
1,37.3,33.4,3.5;
1,59,23.1,5.5;
1,49.6,23.8,1.9;
1,12.5,7,1.8;
1,37.3,34.1,1.5;
1,35.3,4.2,0.9;
1,49.5,25.1,2.6;
1,18.1,13.5,4;
1,31.4,15.7,1.9;
1,21.5,-14.4,1;
1,8.5,5.8,1.5;
1,40.6,5.8,1.8;
1,34.6,26.4,1.8;
1,19.9,26.7,2.3;
1,17.4,12.6,1.3;
1,54.7,14.6,1.7;
1,53.5,20.6,1.1;
1,35.9,26.4,2;
1,39.4,30.5,1.9;
1,53.1,7.1,1.9;
1,39.8,13.8,1.2;
1,59.5,7,2;
1,16.3,20.4,1;
1,21.7,-7.8,1.6];
X1=X(:,2);X2=X(:,3);X3=X(:,4);
corrcoef(X1,X2)
corrcoef(X1,X3)
corrcoef(X2,X3)
Running the program gives:
ans=
1.0000  0.6409
0.6409  1.0000
ans=
1.0000  0.0467
0.0467  1.0000
ans=
1.0000  -0.3501
-0.3501  1.0000
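The three pairwise calls can also be collapsed into one, because corrcoef applied to a matrix treats each column as a variable and returns the full correlation matrix. The sketch below shows this on the first five firms of the data table only, so its numbers will differ from the full-sample output; the names D, R, r12, r13 and r23 are illustrative.

```matlab
% First five firms from the data table (columns: X1, X2, X3), for illustration only
D = [ -62.8,  -89.5, 1.7;
        3.3,   -3.5, 1.1;
     -120.8, -103.2, 2.5;
      -18.1,  -28.8, 1.1;
       -3.8,  -50.6, 0.9];

% corrcoef treats each column as a variable and returns a 3-by-3 matrix
R = corrcoef(D);

% Off-diagonal entries correspond to the three separate pairwise calls
r12 = R(1,2);
r13 = R(1,3);
r23 = R(2,3);
disp(R);
```

With the full 66-row matrix X from the program, corrcoef(X(:,2:4)) would skip the constant column and give all three coefficients at once.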
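The correlation coefficient between X1 and X2 is 0.6409, clearly larger in magnitude than the other two (0.0467 and -0.3501). This indicates that X1 and X2 carry overlapping information, so one of these two columns can be dropped from the regression.
We choose which of the two columns to drop according to its economic meaning, and then run the regression again.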
X=[1,-62.8,-89.5,1.7;
1,3.3,-3.5,1.1;
1,-120.8,-103.2,2.5;
1,-18.1,-28.8,1.1;
1,-3.8,-50.6,0.9;
1,-61.2,-56.2,1.7;
1,-20.3,-17.4,1;
1,-194.5,-25.8,0.5;
1,20.8,-4.3,1;
1,-106.1,-22.9,1.5;
1,-39.4,-35.7,1.2;
1,-164.1,-17.7,1.3;
1,-308.9,-65.8,0.8;
1,7.2,-22.6,2.0;
1,-118.3,-34.2,1.5;
1,-185.9,-280,6.7;
1,-34.6,-19.4,3.4;
1,-27.9,6.3,1.3;
1,-48.2,6.8,1.6;
1,-49.2,-17.2,0.3;
1,-19.2,-36.7,0.8;
1,-18.1,-6.5,0.9;
1,-98,-20.8,1.7;
1,-129,-14.2,1.3;
1,-4,-15.8,2.1;
1,-8.7,-36.3,2.8;
1,-59.2,-12.8,2.1;
1,-13.1,-17.6,0.9;
1,-38,1.6,1.2;
1,-57.9,0.7,0.8;
1,-8.8,-9.1,0.9;
1,-64.7,-4,0.1;
1,-11.4,4.8,0.9;
1,43,16.4,1.3;
1,47,16,1.9;
1,-3.3,4,2.7;
1,35,20.8,1.9;
1,46.7,12.6,0.9;
1,20.8,12.5,2.4;
1,33,23.6,1.5;
1,26.1,10.4,2.1;
1,68.6,13.8,1.6;
1,37.3,33.4,3.5;
1,59,23.1,5.5;
1,49.6,23.8,1.9;
1,12.5,7,1.8;
1,37.3,34.1,1.5;
1,35.3,4.2,0.9;
1,49.5,25.1,2.6;
1,18.1,13.5,4;
1,31.4,15.7,1.9;
1,21.5,-14.4,1;
1,8.5,5.8,1.5;
1,40.6,5.8,1.8;
1,34.6,26.4,1.8;
1,19.9,26.7,2.3;
1,17.4,12.6,1.3;
1,54.7,14.6,1.7