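Collected Examples of Discriminant Analysis

Example: The Human Development Index (HDI) was introduced by the United Nations Development Programme in its first Human Development Report, published in May 1990. The report recommends that measures of human development focus on three essential elements of human life. These are measured by the real GDP per capita index, the life expectancy at birth index, and the education index (a weighted combination of the adult literacy index and the combined gross enrollment index, with weights 2/3 and 1/3). Combining these three indices into a single index gives the HDI. From the 2007 ranking of countries by HDI (2005 data), six countries each were selected from the high, medium and low development groups to form three groups of samples, and four further countries were selected as samples to be classified. The data are given in the table below. Use the discriminant analysis procedure to analyze these data,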
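and classify the four candidate countries accordingly.

Columns: GDP = GDP per capita (USD); Life = life expectancy at birth (years); Literacy = adult literacy rate (%); Enrollment = combined primary, secondary and tertiary enrollment ratio (%).

Country        GDP     Life   Literacy  Enrollment
Group 1: high development countries
United States  41890   77.9   99.5      93.3
Germany        29461   79.1   99.2      88
Greece         23381   78.9   96        99
Singapore      29663   79.4   92.5      87.3
Italy          28529   80.3   98.4      90.6
South Korea    22029   77.9   99        96
Group 2: medium development countries
Cuba           6000    77.7   99.8      87.6
Romania        9060    71.9   97.3      76.8
Brazil         8402    71.7   88.6      87.5
Thailand       8677    69.6   92.6      71.2
Philippines    5137    71     92.6      81.1
Turkey         8407    71.4   87.4      68.7
Group 3: low development countries
Nepal          1550    62.6   48.6      58.1
Nigeria        1128    46.5   69.1      56.2
Cameroon       2299    49.8   67.9      62.3
Pakistan       2370    64.6   49.9      40
Vietnam        3071    73.7   90.3      63.9
Indonesia      3843    69.7   90.4      68.2
Samples to be classified
Japan          31267   82.3   99        85.9
India          3452    63.7   61        63.8
China          6757    72.5   90.9      69.1
South Africa   11110   50.8   82.4      77

    data develop;
      input type gdp life rate zhrate;
      cards;
    1 41890 77.9 99.5 93.3
    1 29461 79.1 99.2 88
    1 23381 78.9 96 99
    1 29663 79.4 92.5 87.3
    1 28529 80.3 98.4 90.6
    1 22029 77.9 99 96
    2 6000 77.7 99.8 87.6
    2 9060 71.9 97.3 76.8
    2 8402 71.7 88.6 87.5
    2 8677 69.6 92.6 71.2
    2 5137 71 92.6 81.1
    2 8407 71.4 87.4 68.7
    3 1550 62.6 48.6 58.1
    3 1128 46.5 69.1 56.2
    3 2299 49.8 67.9 62.3
    3 2370 64.6 49.9 40
    3 3071 73.7 90.3 63.9
    3 3843 69.7 90.4 68.2
    . 31267 82.3 99 85.9
    . 3452 63.7 61 63.8
    . 6757 72.5 90.9 69.1
    . 11110 50.8 82.4 77
    ;
    /* SIMPLE requests simple descriptive statistics for each class;
       WCOV requests the within-class covariance matrices;
       DISTANCE requests the Mahalanobis distances between classes;
       LIST prints the resubstitution classification results.
       No METHOD= option is given, so parameters are estimated and
       observations classified under the default normal-theory method. */
    proc discrim simple wcov distance list;
      class type;
      var gdp life rate zhrate;
    run;
    /* POOL=TEST tests homogeneity of the within-class covariance
       matrices at the SLPOOL=0.05 level; unequal priors are supplied. */
    proc discrim pool=test slpool=0.05 list;
      class type;
      priors 1=0.3 2=0.4 3=0.3;
    run;
    /* Nonparametric analysis (METHOD=NPAR) with K=2. */
    proc discrim method=npar k=2 list;
      class type;
    run;
    /* Canonical discriminant analysis; the first two canonical
       variables are written to the data set RESULT. */
    proc candisc out=result ncan=2;
      class type;
      var gdp life rate zhrate;
    run;
    proc gplot data=result;
      plot can1*can2=type;
    run;
    /* Discriminant analysis on the two canonical variables. */
    proc discrim data=result distance list;
      class type;
      var can1 can2;
    run;

Table 1. Class level information for the known samples

The DISCRIM Procedure
Observations  18    DF Total            17
Variables     4     DF Within Classes   15
Classes       3     DF Between Classes  2

Class Level Information
type  Variable Name  Frequency  Weight  Proportion  Prior Probability
1     _1             6          6.0000  0.333333    0.333333
2     _2             6          6.0000  0.333333    0.333333
3     _3             6          6.0000  0.333333    0.333333

Table 2.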
Sample statistics

Within-Class Covariance Matrices

type = 1, DF = 5
Variable  gdp           life      rate     zhrate
gdp       49408532.97   -1234.12  4172.07  -11022.03
life      -1234.12      0.85      -0.88    -2.09
rate      4172.07       -0.88     7.43     2.74
zhrate    -11022.03     -2.09     2.74     21.19

type = 2, DF = 5
Variable  gdp           life       rate       zhrate
gdp       2642240.567   -2026.117  -2419.950  -6404.957
life      -2026.117     7.886      8.861      13.946
rate      -2419.950     8.861      23.151     14.327
zhrate    -6404.957     13.946     14.327     64.438

type = 3, DF = 5
Variable  gdp           life       rate        zhrate
gdp       976170.9667   7840.7700  12624.0733  4200.8033
life      7840.7700     117.6110   73.1660     15.3730
rate      12624.0733    73.1660    338.6067    136.1087
zhrate    4200.8033     15.3730    136.1087    96.9017

Simple Statistics

Total-Sample (the rows of this block are missing from the source)

type = 1
Variable  N  Sum        Mean      Variance  Standard Deviation
gdp       6  174953     29159     49408533  7029
life      6  473.50000  78.91667  0.84967   0.9218
rate      6  584.60000  97.43333  7.43467   2.7267
zhrate    6  554.20000  92.36667  21.18667  4.6029

type = 2
Variable  N  Sum        Mean      Variance  Standard Deviation
gdp       6  45683      7614      2642241   1625
life      6  433.30000  72.21667  7.88567   2.8081
rate      6  558.30000  93.05000  23.15100  4.8116
zhrate    6  472.90000  78.81667  64.43767  8.0273

type = 3
Variable  N  Sum        Mean      Variance   Standard Deviation
gdp       6  14261      2377      976171     988.0196
life      6  366.90000  61.15000  117.61100  10.8449
rate      6  416.20000  69.36667  338.60667  18.4013
zhrate    6  348.70000  58.11667  96.90167   9.8439

Pooled Covariance Matrix Information
Covariance Matrix Rank: 4
Natural Log of the Determinant of the Covariance Matrix: illegible in the source ("28 剧 28")

Table 3. Between-class distances and significance tests for the differences among the three class means

Pairwise Squared Distances Between Groups

$$D^2(i|j) = (\bar{X}_i - \bar{X}_j)' \, \mathrm{COV}^{-1} \, (\bar{X}_i - \bar{X}_j)$$

Squared Distance to type
From type  1         2         3
1          0         37.58288  75.97603
2          37.58288  0         10.91428
3          75.97603  10.91428  0

F Statistics, NDF=4,
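DDF=12 for Squared Distance to type
From type  1         2         3
1          0         22.54973  45.58562
2          22.54973  0         6.54857
3          45.58562  6.54857   0

Prob > Mahalanobis Distance for Squared Distance to type
From type  1       2       3
1          1.0000  <.0001  <.0001
2          <.0001  1.0000  0.0049
3          <.0001  0.0049  1.0000

Pairwise Generalized Squared Distances Between Groups

$$D^2(i|j) = (\bar{X}_i - \bar{X}_j)' \, \mathrm{COV}^{-1} \, (\bar{X}_i - \bar{X}_j)$$

Table 3 shows that the squared Mahalanobis distance is 37.58288 between class 1 and class 2, 75.97603 between class 1 and class 3, and 10.91428 between class 2 and class 3. The F statistics for testing the differences between the class means are 22.54973, 45.58562 and 6.54857 respectively, and the corresponding p-values are 0.0001, 0.0001, …

Chi-Square  DF  Pr > ChiSq
46.068898   20  0.0008

Since the Chi-Square value is significant at the 0.05 level, the within covariance matrices will be used in the discriminant function. Reference: Morrison, D.F. (1976) Multivariate Statistical Methods p252.

Table 7 shows that the prior probabilities of the three classes are 0.3, 0.4 and 0.3. The natural logarithms of the determinants of the within-class covariance matrices differ, which indicates that the within-class covariance matrices are unequal. The chi-square statistic is 46.068898 with a p-value of 0.0008, which is significant at the 0.05 level, so the within-class covariance matrices differ significantly. Because they are unequal, the discriminant function should be quadratic.

Table 8. Pairwise generalized Mahalanobis distances between classes

The DISCRIM Procedure
Pairwise Generalized Squared Distances Between Groups

$$D^2(i|j) = (\bar{X}_i - \bar{X}_j)' \, \mathrm{COV}_j^{-1} \, (\bar{X}_i - \bar{X}_j) + \ln|\mathrm{COV}_j| - 2\ln \mathrm{PRIOR}_j$$

Generalized Squared Distance to type (the cell values are garbled in the source and are reproduced here as extracted)
From type 1 2 3: 124/2114316.0447626192230.6759324.629531Q2.516S73135466.3746128.97226

Table 8 shows that the generalized squared distance from a class to itself is no longer 0, and that the between-class generalized distances are no longer symmetric. The within-class covariance matrices and the prior probabilities therefore both affect the computation of the posterior probabilities.

Table 9. Partial results of the Bayes discriminant analysis

Resubstitution Results using Quadratic Discriminant Function

Generalized Squared Distance Function

$$D_j^2(X) = (X - \bar{X}_j)' \, \mathrm{COV}_j^{-1} \, (X - \bar{X}_j) + \ln|\mathrm{COV}_j| - 2\ln \mathrm{PRIOR}_j$$

Posterior Probability of Membership in Each type

$$\Pr(j|X) = \frac{\exp\left(-0.5\, D_j^2(X)\right)}{\sum_k \exp\left(-0.5\, D_k^2(X)\right)}$$

Table 9 shows that the Bayes discriminant method classifies the samples to be classified in the same way as the distance discriminant method. The third procedure in the program requests a nonparametric analysis, that is, the class density functions are subjected to non…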
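The posterior-probability formula printed with Table 9 can be illustrated numerically. The following is a minimal Python sketch used only as a cross-check outside SAS; the function names `posterior_from_distances` and `classify` are illustrative, and the three distances in the usage lines are hypothetical values, not figures from the SAS output.

```python
import math


def posterior_from_distances(d2):
    """Return Pr(j|x) = exp(-0.5*D_j^2) / sum_k exp(-0.5*D_k^2).

    d2 is a list of generalized squared distances D_j^2(x), one per class.
    """
    # Subtracting the smallest distance leaves every ratio unchanged
    # but keeps exp() from underflowing when the distances are large.
    shift = min(d2)
    weights = [math.exp(-0.5 * (d - shift)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def classify(d2, labels=(1, 2, 3)):
    """Assign the class with the largest posterior probability."""
    post = posterior_from_distances(d2)
    best = max(range(len(post)), key=post.__getitem__)
    return labels[best], post


# Hypothetical distances for a single observation (assumed for illustration).
example_d2 = [30.0, 26.0, 40.0]
label, post = classify(example_d2)
print(label, [round(p, 4) for p in post])
```

Because the prior term is already inside the generalized squared distance, the class with the smallest distance always receives the largest posterior probability.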
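As a check on the Table 3 figures, the F statistics can be recovered from the squared Mahalanobis distances. The sketch below assumes the usual Hotelling-type conversion for a covariance matrix pooled over g classes; the symbols N (total sample size), g (number of classes), p (number of variables) and n_i, n_j (class sizes) are notation introduced here, not taken from the SAS output.

```latex
F_{ij} = \frac{N - g - p + 1}{p\,(N - g)} \cdot \frac{n_i n_j}{n_i + n_j} \cdot D^2(i|j)
       \sim F(p,\; N - g - p + 1)

% With N = 18, g = 3, p = 4 and n_i = n_j = 6:
%   (N - g - p + 1) / (p (N - g)) = 12 / 60 = 0.2
%   n_i n_j / (n_i + n_j)         = 36 / 12 = 3
% so F_{ij} = 0.6 \, D^2(i|j), with 4 and 12 degrees of freedom.

F_{12} = 0.6 \times 37.58288 \approx 22.5497, \qquad
F_{13} = 0.6 \times 75.97603 \approx 45.5856, \qquad
F_{23} = 0.6 \times 10.91428 \approx 6.5486
```

These values agree with the F statistics reported for NDF=4 and DDF=12.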