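Shanghai Jiao Tong University School of Medicine: Key Points for the Statistics Computer Lab

Hypothesis template. H0: the sex composition ratios of the groups are equal, π1 = π2; H1: ... . Statistical conclusion: P > 0.05, so H0 is not rejected at the α = 0.05 level; the difference is not statistically significant, and ... can be considered equal.

Program notes: the statistic keywords most often used with PROC MEANS are N (sample size), SUM (sum), MEAN (mean), RANGE (range), MIN (minimum), MAX (maximum), STD (standard deviation), CV (coefficient of variation), VAR (variance), STDERR (standard error), LCLM (lower confidence limit for the population mean), UCLM (upper confidence limit), T (t value for testing μ = 0), and PRT (two-sided probability for that t value).

data student;
input sex $ age height weight birth yymmdd10.;
index=weight/height**2;   /* weight divided by height squared */
cards;
male 18 1.74 71.3 1981-3-21
female 19 . 54.2 1982-12-4
female 18 1.62 58.9 1981-5-6
male 18 1.78 75.2 1980-1-4
female 18 1.62 61.8 1981-7-12
male 19 1.76 72.6 1981-9-23
;
proc print data=student;
var sex age height weight index birth;
format birth mmddyy10.;
run;
proc means data=student;
var age height weight index;
run;

Creating a new SAS data set from an existing SAS data set

libname course 'd:\data';
data course.student;
set student;
run;
data a;
set course.student;
proc print;
run;
data b;
set a;
run;

Splitting a data set

data male;
set student;
if sex='male' then output;   /* character values must be quoted */
run;
data female;
set student;
if sex='male' then delete;
run;
data male female;
set student;
if sex='male' then output male;
else output female;
run;
data height;
set student;
keep sex age height index;
run;
proc print;
run;
data weight;
set student;
drop height birth;
run;
proc print;
run;

Concatenating several SAS data sets (vertical merge)

data one;
input name $ pid group age;
cards;
Liming 111 1 54
Wangli 112 2 49
Xiaoli 113 1 34
;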
data two;
input name $ pid drug $ sex;
cards;
Yaohong 211 A 1
Zhaohong 212 B 2
Mixue 213 A 2
;
data total;
set one two;
proc print data=one;
proc print data=two;
proc print data=total;
run;

Merging several SAS data sets side by side (horizontal merge)

data one;
input pid sex age;
cards;
101 1 54
102 2 45
103 2 42
105 1 34
;
data two;
input pid weight height;
cards;
104 45 162
102 64 171
103 54 165
101 51 160
;
proc sort data=one;
by pid;
proc sort data=two;   /* both data sets must be sorted by pid before merging */
by pid;
data total;
merge one two;
by pid;
proc print data=total;
run;

Computing statistics with PROC MEANS (STD is the standard deviation)

data shg;
input x @@;   /* @@ reads several observations from one data line */
cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc means data=shg n mean std cv min max;
var x;
run;

Computing statistics by group (results kept to three decimal places)

data a;
input group VA VB1 @@;
cards;
1 1.8 1.4 2 1.7 1.1 1 2.2 1.5
3 1.9 1.2 2 2.5 1.0 1 2.7 1.6
2 2.3 1.3 2 2.8 0.9 3 3.0 1.1
1 2.6 1.4 1 2.4 1.2 2 1.9 1.3
3 2.9 0.8 1 3.2 1.7 3 3.1 1.5
2 2.6 1.9 3 3.5 1.6 3 3.3 1.5
;
proc sort data=a;
by group;
proc means mean std max min maxdec=3;
by group;
var VA VB1;
run;

Computing the geometric mean (frequency table)

data a;
input f x @@;
y=log10(x);
cards;
1 4 3 8 8 16 13 32
21 64 9 128 4 256 1 512
;
proc means noprint;
var y;
freq f;
output out=b mean=meany;
run;
data c;
set b;
meanx=10**meany;   /* antilog of the mean of log10(x) */
run;
proc print;
run;

Program notes:
FREQ: names the variable whose values are the frequencies of the analysis variable.
OUTPUT: names the output data set that receives the statistics produced by PROC MEANS.
statistic-keyword=: specifies which statistics to store in the output data set and the new variable names that hold them.

The UNIVARIATE procedure: produces three plots (stem-and-leaf plot, box plot, normal probability plot), a frequency table (Value, Count, Cell percent, Cum percent), and normality test results

data shg;
input x @@;
cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc univariate data=shg plot freq normal;
var x;
run;

Program notes:
1. Tests for Normality is the normality test; when P > 0.05 the data can be considered normally distributed.
2. Uncorrected SS is the sum of squares; Corrected SS is the sum of squared deviations from the mean; Interquartile Range is the distance between the quartiles.

Interval estimation of the population mean (confidence interval for the population mean; here the 99% interval)

data shg;
input x @@;
cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc means data=shg n mean std clm alpha=0.01;   /* alpha=0.01 gives the 99% interval */
var x;
run;

t tests

(1) One-sample t test comparing a sample mean with a population mean (population mean 72; T is the t value for testing μ = 0; PRT is the two-sided probability for that t value)

data mb;
input x @@;
d=x-72;   /* testing mean(d) = 0 is the same as testing mean(x) = 72 */
cards;
74 73 68 75 75 82 80 69 72 74
83 72 71 74 76 79 67 73 81 70
67 70 78 69 70 72 67 74 80 66
;
proc means data=mb mean std stderr t prt;
var x d;
run;

Read mainly the T and PRT values for d. The UNIVARIATE procedure can be used as well:

proc univariate data=mb normal;
var x d;
run;

(2) t test for a paired design

data ch4_7;
input after before @@;
d=after-before;
cards;
70.55 64.29
88.60 64.07
68.44 45.88
61.64 45.23
64.73 50.40
74.68 61.59
69.15 51.85
60.51 60.13
65.59 64.29
69.04 51.93
;
proc means data=ch4_7 mean std stderr t prt;
var d;
run;

(3) t test comparing the means of two independent groups (PROC TTEST performs the two-sample t test; class group declares group as the grouping variable; x is the increase in hemoglobin)

(3)-1

data hb;
input group x @@;
cards;
1 26 1 32 1 25 1 22 1 20 1 28 1 24 1 19 1 29 1 17 1 34 1 21 1 20 1 23 1 27
2 21 2 23 2 18 2 24 2 23 2 19 2 16 2 22 2 20 2 25 2 23 2 17 2 15 2 26 2 22
;
proc ttest data=hb;
class group;
var x;
run;

Interpretation: first check the test for Equality of Variances; P > 0.05 means the variances are homogeneous. Then read the t Value and its P value; when the variances are homogeneous (P > 0.05), use the Pooled row.

4. Multiple comparisons among means:

proc glm data=dat5_1;
class group;
model x=group;
means group/hovtest;
means group/snk bon dunnett('1');
means group/snk alpha=0.01;
contrast '1 2 vs 3' group -0.5 -0.5 1;
contrast '1 vs 2' group 1 -1 0;
run;

Interpretation: in SAS, pairwise comparisons are requested through options after the slash in the MEANS statement of PROC GLM or PROC ANOVA; the program shows several different comparison methods.

ANOVA for a randomized block design: 4 strains (blocks); 3 treatments

data dat2;
do block=1 to 4;
do treat=1 to 3;
input x @@;
output;
end;
end;
cards;
76 86 115
12 38 85
40 81 103
12 33 57
;
proc glm data=dat2;
class treat block;
model x=treat block/p;
output out=r R=RES;
means treat block / snk;
run;
proc univariate data=r normal;
var res;
run;

Interpretation:
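"/p" requests the predicted values and residuals in the printed output; output out=r R=RES writes the residuals to data set r under the variable name RES; normal requests a normality test of the residuals.

ANOVA for a Latin square design:

data dat3;
do person=1 to 5;
do stress=1 to 5;
input cloth $ x @@;
output;
end;
end;
cards;
B 103 A 121 C 100 D 92 E 95
C 102 B 129 D 98 E 124 A 115
D 118 C 133 E 103 A 109 B 90
E 99 D 122 A 99 B 84 C 100
A 102 E 139 B 103 C 104 D 95
;
proc anova;
class person stress cloth;
model x=person stress cloth;
run;
quit;

Correlation analysis: half-lives of drug A in blood and in urine

1. Create the data set

data dat1;
input x1 x2;
cards;
9.9 7.9
11.2 8.9
9.4 8.5
8.4 9.4
14.8 12
12.4 11.5
13.1 14.5
13.4 12.3
11.2 9.2
9.5 11
10.7 8.3
9.2 8.5
;
run;

2. Draw a scatter plot

proc plot data=dat1;
plot x1*x2='*'/haxis=by 3 vaxis=by 3;
run;

Procedure notes: PROC PLOT draws the scatter plot. ='*' sets the plotting symbol to *; haxis and vaxis set the tick spacing; the variable before * goes on the vertical axis and the one after it on the horizontal axis.

3. Check that the two variables follow a bivariate normal distribution

proc reg data=dat1;
model x2=x1/p;
output out=r R=RES;
run;
proc univariate data=dat1 normal;
var x1;
run;
proc univariate data=r normal;
var res;
run;

Procedure notes: PROC REG fits the regression equation (regression analysis); the MODEL statement has the form dependent variable = independent variable / p. Normality tests are run on the residuals and on x1; if both P values are greater than 0.05, the data are consistent with a bivariate normal distribution.

4. Run the correlation analysis

proc corr data=dat1;
var x1;
with x2;
run;

Procedure notes: PROC CORR performs the correlation analysis; var x1; with x2; name the variables to correlate. In the output, 0.72048 is the correlation coefficient and 0.0082 is the P value of its t test.

Regression analysis:

data dat2;
input x y;
cards;
1 8.03
3 14.97
5 19.23
7 27.83
9 36.23
;
proc plot data=dat2;
plot y*x='*';
run;
proc reg data=dat2;
model y=x/p;
plot y*x;
run;

In the output, Intercept is a (3.94300) and the coefficient of x, 3.46300, is b.

Chi-square test for a 2×2 (fourfold) table:

Problem: 79 patients received Western medicine, of whom 63 responded; 54 received Chinese medicine, of whom 47 responded. Do the effective rates of the two treatments differ?

data dat1;
do r=1 to 2;
do c=1 to 2;
input freq @@;
output;
end;
end;
cards;
63 16
47 7
;
run;
proc freq data=dat1;
tables r*c/chisq;
weight freq;
run;

Procedure notes: frequency analysis uses PROC FREQ. The TABLES statement defines the table layout as row variable * column variable; options follow the slash, and chisq requests the chi-square test. The WEIGHT statement names the frequency variable.
Interpretation: the four rows in each cell are the frequency, the overall percent, the row percent, and the column percent.

proc freq data=dat1;
tables r*c/chisq nopercent nocol expected;
weight freq;
run;

Procedure notes: nopercent and nocol suppress the overall and column percentages; expected prints the expected frequency of each cell.

Chi-square test for a K×2 table:

Problem: add a combined Chinese and Western medicine group of 68 patients, of whom 65 responded. Do the three treatments differ?

data dat2;
do r=1 to 3;
do c=1 to 2;
inp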
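As a hand check on what the chisq option reports for the 2×2 problem (63 and 16 in the Western medicine row, 47 and 7 in the Chinese medicine row), the Pearson statistic can be worked out with the standard fourfold-table formula. The cell labels a, b, c, d and the total n are notation introduced here for illustration; they do not appear in the notes.

```latex
% a = 63, b = 16, c = 47, d = 7, n = a + b + c + d = 133
\[
\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}
       = \frac{133\,(63 \cdot 7 - 16 \cdot 47)^2}{79 \cdot 54 \cdot 110 \cdot 23}
       = \frac{133 \cdot 311^2}{10\,792\,980}
       \approx 1.19
\]
```

Because 1.19 is below 3.84, the critical value of the chi-square distribution with 1 degree of freedom at α = 0.05, H0 of equal effective rates would not be rejected at that level. The smallest expected count is 54 × 23 / 133 ≈ 9.34, which exceeds 5, so under the usual textbook rule the uncorrected Pearson row of the PROC FREQ output is the one to read.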