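Probability Theory (Upgraded Edition)

1. SAS programs for computing numerical characteristics

(I) SAS program that does not take frequencies as input (see textbook p. 138)

    data ex;
      input x @@;
    cards;
    1 2 2 3 3 3 4 5 6 7 8
    ;
    proc univariate vardef=n;
    run;

(vardef=n makes the procedure divide by n, which gives the sample variance and sample standard deviation. If this option is removed, the default divisor n-1 is used and the output is the corrected variance and corrected standard deviation.)

    The UNIVARIATE Procedure
    Variable: x

    Moments
    N (sample size)                     11            Sum Weights (sum of weights)          11
    Mean                                4             Sum Observations (sample total)       44
    Std Deviation (sample std. dev.)    2.13200716    Variance (sample variance)            4.54545455
    Skewness                            0.5065649     Kurtosis                              -0.932
    Uncorrected SS                      226           Corrected SS (sum of squared dev.)    50
    Coeff Variation                     53.3001791    Std Error Mean                        .

Note: the coefficient of variation (Coeff Variation) shown here is computed from the sample standard deviation (divisor n), whereas the textbook computes it from the corrected sample standard deviation (divisor n-1).

    Basic Statistical Measures
    Location                Variability
    Mean      4.000000      Std Deviation          2.13201
    Median    3.000000      Variance               4.54545
    Mode      3.000000      Range                  7.00000
                            Interquartile Range    4.00000

    Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value-----
    Sign           M    5.5       Pr >= |M|  0.0010
    Signed Rank    S    33        Pr >= |S|  0.0010

    Quantiles (Definition 5)
    Quantile      Estimate
    100% Max      8
    99%           8
    95%           8
    90%           7
    75% Q3        6
    50% Median    3
    25% Q1        2
    10%           2
    5%            1
    1%            1
    0% Min        1

    Extreme Observations
    ----Lowest----    ----Highest---
    Value    Obs      Value    Obs
    1        1        4        7
    2        3        5        8
    2        2        6        9
    3        6        7        10
    3        5        8        11

(II) SAS program that takes frequencies as input (see textbook p. 138)

    data ex;
      input x f @@;
    cards;
    5.5 4 7.5 11 9.5 17 11.5 23
    13.5 18 15.5 14 17.5 10 19.5 3
    ;
    proc univariate vardef=n;
      var x;
      freq f;
    run;

(vardef=n gives the sample variance and sample standard deviation; without it the procedure reports the corrected variance and corrected standard deviation. var x means that the numerical characteristics are computed for x, and freq f declares f as the frequency of each x value.)

    The UNIVARIATE Procedure
    Variable: x
    Freq: f

    Moments
    N                   100           Sum Weights         100
    Mean                12.24         Sum Observations    1224
    Std Deviation       3.43691722    Variance            11.8124
    Skewness            0.09092138    Kurtosis            -0.6788022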
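    Uncorrected SS      16163         Corrected SS        1181.24
    Coeff Variation     28.0793891    Std Error Mean      .

    Basic Statistical Measures
    Location                Variability
    Mean      12.24000      Std Deviation          3.43692
    Median    11.50000      Variance               11.81240
    Mode      11.50000      Range                  14.00000
                            Interquartile Range    6.00000

    Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value-----
    Sign           M    50        Pr >= |M|  ...

[...]

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               1    0.00150000        0.00150000     0.03       0.8661
    Error               8    0.39595000        0.04949375
    Corrected Total     9    0.39745000

    R-Square    Coeff Var    Root MSE    x Mean
    0.003774    9.327963     0.222472    2.385000

    Source    DF    Anova SS      Mean Square    F Value    Pr > F
    a          1    0.00150000    0.00150000     0.03       0.8661

    t Tests (LSD) for x
    NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
    Alpha                        0.05
    Error Degrees of Freedom     8
    Error Mean Square            0.049494
    Critical Value of t          2.30600
    Comparisons significant at the 0.05 level are indicated by *.

    a Comparison        Difference Between Means    95% Confidence Limits
    2 - 1 (x2 - x1)      0.02500                    -0.30615    0.35615
    1 - 2 (x1 - x2)     -0.02500                    -0.35615    0.30615

(Difference Between Means is the difference of the two sample means; 95% Confidence Limits is the 95% confidence interval for that difference.)

3. Using SAS for hypothesis tests on the parameters of a population distribution

(1) SAS program for a hypothesis test on the mean of one normal population (textbook p. 186)

    data ex;
      input x @@;
      y = x - 14;
    cards;
    10.4 12 13.2 13.7 14.6 15.1 15.5 15.9
    ;
    proc means mean std t prt;
      var y;
    run;

The program output is:

    Mean          Std Dev       t Value    Pr > |t|
    -0.2000000    1.8822479     -0.30      0.7725

(Std Dev is the standard deviation of the observations.)

The observed t value is -0.30. With n = 8 observations the test has 7 degrees of freedom, and the two-sided critical value at the 0.05 level is t0.025(7) = 2.365. Since |t| = 0.30 is smaller than the critical value (equivalently, the p-value 0.7725 is greater than 0.05), the statistic falls in the acceptance region, so the null hypothesis that the mean of x equals 14 is not rejected.

(2) SAS program for a hypothesis test on the means of two normal populations (textbook p. 179, Example 1.5, formula 6, F distribution)

    data xzh;
      do a = 1 to 2;
        do i = 1 to 8;
          input x @@;
          output;
        end;
      end;
    cards;
    8.6 8.7 5.6 9.3 8.4 9.3 7.5 7.9
    8 7.9 5.8 9.1 7.7 8.2 7.4 6.6
    ;
    proc ttest;
      class a;
      var x;
    run;

The program output is:

    Statistics
                               Lower CL             Upper CL    Lower CL               Upper CL
    Variable    a          N   Mean       Mean      Mean        Std Dev     Std Dev    Std Dev     Std Err
    x           1          8   7.1534     8.1625    9.1716      0.7981      1.207      2.4567      0.4268
    x           2          8   6.7426     7.5875    8.4324      0.6682      1.0106     2.0568      0.3573
    x           Diff (1-2)     -0.619     0.575     1.7687      0.815       1.1132     1.7556      0.5566

    T-Tests
    Variable    Method           Variances    DF      t Value    Pr > |t|
    x           Pooled           Equal        14      1.03       0.3191
    x           Satterthwaite    Unequal      13.6    1.03       0.3196

(Equal is the row to use when the variances are equal, Unequal when they are not. If Pr > |t| is greater than the significance level 0.05, the null hypothesis of equal means is not rejected.)

    Equality of Variances
    Variable    Method      Num DF    Den DF    F Value    Pr > F
    x           Folded F    7         7         1.43       0.6509

(If Pr > F is greater than the significance level 0.05, the null hypothesis of equal variances is accepted.)

First carry out the F test to check whether the variances are equal, because the t test output offers two methods for comparing the means, one for equal variances and one for unequal variances. Then use the matching t test to check whether the means are equal. In this example the observed F value is 1.43, while the critical value at the 0.05 significance level is F(7, 7) = 3.79, so F falls in the acceptance region and equal variances are accepted. Turning to the pooled t test, the observed t value is 1.03, while the tabulated critical value at the 0.05 level with 14 degrees of freedom is 1.761, so the null hypothesis is accepted and the means are considered equal.

4. Using SAS for a normality test (see textbook p. 193)

The SAS program is:

    data ex;
      input x @@;
    cards;
    10.4 12 13.1 13.8 13.8 14.6 15.1 15.5 15.9
    ;
    proc univariate normal;
    run;

4. SAS program for a test of independence

    data ex;
      do a = 1 to 3;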
        do b = 1 to 3;
          input f @@;
          output;
        end;
      end;
    cards;
    32 38 58 45 44 28 14 18 23
    ;
    proc freq;
      weight f;
      tables a*b / chisq;
    run;

Output:

    Table of a by b
    Frequency
    Percent
    Row Pct
    Col Pct      b = 1     b = 2     b = 3     Total
    a = 1           32        38        58       128
                 10.67     12.67     19.33     42.67
                 25.00     29.69     45.31
                 35.16     38.00     53.21
    a = 2           45        44        28       117
                 15.00     14.67      9.33     39.00
                 38.46     37.61     23.93
                 49.45     44.00     25.69
    a = 3           14        18        23        55
                  4.67      6.00      7.67     18.33
                 25.45     32.73     41.82
                 15.38     18.00     21.10
    Total           91       100       109       300
                 30.33     33.33     36.33    100.00

    Statistics for Table of a by b
    Statistic                       DF    Value      Prob
    Chi-Square                       4    13.5862    0.0087
    Likelihood Ratio Chi-Square      4    13.9366    0.0075
    Mantel-Haenszel Chi-Square       1    1.4488     0.2287
    Phi Coefficient                       0.2128
    Contingency Coefficient               0.2081
    Cramer's V                            0.1505
    Sample Size = 300

(Chi-Square is the chi-square test statistic. Its probability, 0.0087, is below the significance level 0.05, so the null hypothesis is rejected.)

5. Using SAS for one-factor analysis of variance

(1) Unequal replication (textbook p. 200, Example 1.1)

    data ex;
      do a = 1 to 3;
        input n @@;
        do i = 1 to n;
          input x @@;
          output;
        end;
      end;
    cards;
    8 21 29 24 22 25 30 27 26
    10 20 25 25 23 29 31 24 26 20 21
    6 24 22 28 25 21 26
    ;
    proc anova;
      class a;
      model x = a;
    run;

(To run multiple comparisons and obtain confidence intervals for the differences of means, add means a / lsd cldiff; run;)

    Class Level Information
    Class    Levels    Values
    a        3         1 2 3
    Number of observations    24

    The ANOVA Procedure
    Dependent Variable: x
    Source                DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model (factor A)       2    6.7666667         3.3833333      0.32       0.7314
    Error                 21    223.7333333       10.6539683
    Corrected Total       23    230.5000000

(DF is the degrees of freedom; Corrected Total is the total sum of squares.)

    R-Square    Coeff Var    Root MSE    x Mean
    0.029356    13.18805     3.264042    24.75000

    Source    DF    Anova SS      Mean Square    F Value    Pr > F
    a          2    6.76666667    3.38333333     0.32       0.7314

    The ANOVA Procedure
    t Tests (LSD) for x
    NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
    Alpha                        0.05
    Error Degrees of Freedom     21
    Error Mean Square            10.65397
    Critical Value of t          2.07961
    Comparisons significant at the 0.05 level are indicated by *.

    a Comparison    Difference Between Means    95% Confidence Limits
    1 - 2            1.100                      -2.120     4.320
    1 - 3            1.167                      -2.499     4.833
    2 - 1           -1.100                      -4.320     2.120
    2 - 3            0.067                      -3.439     3.572
    3 - 1           -1.167                      -4.833     2.499
    3 - 2           -0.067                      -3.572     3.439

(2) Equal replication (textbook p. 201, Example 1.2)

    data ex;
      do a = 1 to 4;
        do i = 1 to 4;
          input x @@;
          output;
        end;
      end;
    cards;
    19 23 21 13 21 24 27 20
    20 18 19 15 22 25 27 22
    ;
    proc anova;
      class a;
      model x = a;
      means a / lsd cldiff;
    run;

    Class Level Information
    Class    Levels    Values
    a        4         1 2 3 4
    Number of observations    16

    The ANOVA Procedure
    Dependent Variable: x
    Source                DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model (factor A)       3    104.0000000       34.6666667     3.53       0.0487
    Error                 12    118.0000000       9.8333333
    Corrected Total       15    222.0000000

    R-Square    Coeff Var    Root MSE    x Mean
    0.468468    14.93245     3.135815    21.00000

    Source    DF    Anova SS       Mean Square    F Value    Pr > F
    a          3    104.0000000    34.6666667     3.53       0.0487

    The ANOVA Procedure
    t Tests (LSD) for x
    NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
    Alpha                           0.05
    Error Degrees of Freedom        12
    Error Mean Square               9.833333
    Critical Value of t             2.17881
    Least Significant Difference    4.8312
    Comparisons significant at the 0.05 level are indicated by *.

    a Comparison    Difference Between Means    95% Confidence Limits
    4 - 2            1.000                       -3.831      5.831
    4 - 1            5.000                        0.169      9.831    *
    4 - 3            6.000                        1.169     10.831    *
    2 - 4           -1.000                       -5.831      3.831
    2 - 1            4.000                       -0.831      8.831
    2 - 3            5.000                        0.169      9.831    *
    1 - 4           -5.000                       -9.831     -0.169    *
    1 - 2           -4.000                       -8.831      0.831
    1 - 3            1.000                       -3.831      5.831
    3 - 4           -6.000                      -10.831     -1.169    *
    3 - 2           -5.000                       -9.831     -0.169    *
    3 - 1           -1.000                       -5.831      3.831

Here the Pr > F value (0.0487) is below the significance level 0.05, so the statistic falls in the rejection region. The null hypothesis is therefore rejected in favor of the alternative, and the seedling heights are considered to differ significantly.

6. Two-factor analysis of variance (without interaction) (textbook p. 212, Example 2.1)

    data anova01;
      do a = 1 to 4;
        do b = 1 to 5;
          input x @@;
          output;
        end;
      end;
    cards;
    53 56 45 52 49 47 50 47 47 53
    57 63 54 57 58 45 52 42 41 48
    ;
    proc anova;
      class a b;
      model x = a b;
      means a b / lsd duncan cldiff;
    run;

Output:

    Class Level Information
    Class    Levels    Values
    a        4         1 2 3 4
    b        5         1 2 3 4 5
    Number of observations    20

    Dependent Variable: x
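As a hand check on what proc anova computes for this two-factor example, the sums of squares for the model without interaction can be worked out directly from the 20 data values. This is only a sketch: it assumes the values are read row by row as in the data step (a indexes the 4 rows, b the 5 columns), and the symbols T, C, T_{i.} and T_{.j} are notation introduced here for illustration, not necessarily the textbook's.

```latex
% Row totals  T_{i.}: 255, 244, 289, 228      (levels of a)
% Col totals  T_{.j}: 202, 221, 188, 197, 208 (levels of b)
\begin{aligned}
T &= \sum_{i=1}^{4}\sum_{j=1}^{5} x_{ij} = 1016, \qquad
\sum_{i=1}^{4}\sum_{j=1}^{5} x_{ij}^{2} = 52236, \qquad
C = \frac{T^{2}}{4 \cdot 5} = 51612.8 \\
SS_T &= 52236 - C = 623.2 \\
SS_A &= \frac{1}{5}\left(255^{2} + 244^{2} + 289^{2} + 228^{2}\right) - C
      = 52013.2 - 51612.8 = 400.4 \\
SS_B &= \frac{1}{4}\left(202^{2} + 221^{2} + 188^{2} + 197^{2} + 208^{2}\right) - C
      = 51765.5 - 51612.8 = 152.7 \\
SS_E &= SS_T - SS_A - SS_B = 70.1 \\
F_A &= \frac{SS_A / 3}{SS_E / 12} \approx 22.85, \qquad
F_B = \frac{SS_B / 4}{SS_E / 12} \approx 6.53
\end{aligned}
```

The degrees of freedom used above are 4 - 1 = 3 for factor a, 5 - 1 = 4 for factor b, and (4 - 1)(5 - 1) = 12 for error, which together account for the 19 total degrees of freedom of the 20 observations.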

