Shanghai Jiao Tong University School of Medicine: Key Points for the Statistics Computer Lab

Uploaded by: b****5 | Document ID: 30291143 | Uploaded: 2023-08-13 | Format: DOCX | Pages: 36 | Size: 1.36 MB

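H0: the male/female composition is the same in the different groups, π1 = π2; H1: π1 ≠ π2.

Statistical conclusion: P > 0.05, so H0 is not rejected at the α = 0.05 level; the difference is not statistically significant, and it can be concluded that ... are equal.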

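Program notes:

The main statistic keywords of the MEANS procedure are:

N (sample size), SUM (sum), MEAN (mean), RANGE (range), MIN (minimum), MAX (maximum), STD (standard deviation), CV (coefficient of variation), VAR (variance), STDERR (standard error), LCLM (lower confidence limit of the population mean), UCLM (upper confidence limit), T (t value for testing μ = 0), PRT (two-sided probability for that t value)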

data student;
  input sex $ age height weight birth yymmdd10.;
  index = weight / height**2;   /* weight divided by height squared */
  cards;
male 18 1.74 71.3 1981-3-21
female 19 . 54.2 1982-12-4
female 18 1.62 58.9 1981-5-6
male 18 1.78 75.2 1980-1-4
female 18 1.62 61.8 1981-7-12
male 19 1.76 72.6 1981-9-23
;
proc print data=student;
  var sex age height weight index birth;
  format birth mmddyy10.;   /* display the date as mm/dd/yyyy */
run;
proc means data=student;
  var age height weight index;
run;

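Creating a new SAS data set from an existing SAS data set

libname course 'd:\data';   /* permanent library */
data course.student;        /* save the temporary data set permanently */
  set student;
run;
data a;
  set course.student;
proc print;
run;
data b;
  set a;
run;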

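Splitting a data set

/* keep only the males */
data male;
  set student;
  if sex='male' then output;
run;
/* drop the males, leaving the females */
data female;
  set student;
  if sex='male' then delete;
run;
/* write both subsets in a single DATA step */
data male female;
  set student;
  if sex='male' then output male;
  else output female;
run;
/* keep selected variables */
data height;
  set student;
  keep sex age height index;
run;
proc print; run;
/* drop selected variables */
data weight;
  set student;
  drop height birth;
run;
proc print; run;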

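Concatenating several SAS data sets vertically

data one;
  input name $ pid group age;
  cards;
Liming 111 1 54
Wangli 112 2 49
Xiaoli 113 1 34
;
data two;
  input name $ pid drug $ sex;
  cards;
Yaohong 211 A 1
Zhaohong 212 B 2
Mixue 213 A 2
;
data total;
  set one two;   /* stack the observations; variables absent from one data set are missing */
proc print data=one;
proc print data=two;
proc print data=total;
run;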

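Merging several SAS data sets horizontally

data one;
  input pid sex age;
  cards;
101 1 54
102 2 45
103 2 42
105 1 34
;
data two;
  input pid weight height;
  cards;
104 45 162
102 64 171
103 54 165
101 51 160
;
/* both data sets must be sorted by the BY variable before merging */
proc sort data=one;
  by pid;
proc sort data=two;
  by pid;
data total;
  merge one two;
  by pid;
proc print data=total;
run;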
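To make the result of the BY-merge concrete, here is a minimal Python sketch (not SAS) that imitates the match-merge on pid for the two small data sets above. The helper name `match_merge` is hypothetical, `None` stands in for a SAS missing value, and the sketch assumes each pid occurs at most once per data set.

```python
# Illustrative only: imitate "merge one two; by pid;" for one-to-one keys.
one = {101: {"sex": 1, "age": 54}, 102: {"sex": 2, "age": 45},
       103: {"sex": 2, "age": 42}, 105: {"sex": 1, "age": 34}}
two = {104: {"weight": 45, "height": 162}, 102: {"weight": 64, "height": 171},
       103: {"weight": 54, "height": 165}, 101: {"weight": 51, "height": 160}}


def match_merge(left, right, left_vars, right_vars):
    """Return one row per key found in either input, ordered by key."""
    rows = []
    for pid in sorted(set(left) | set(right)):
        row = {"pid": pid}
        for name in left_vars:
            row[name] = left.get(pid, {}).get(name)    # None if pid is absent
        for name in right_vars:
            row[name] = right.get(pid, {}).get(name)
        rows.append(row)
    return rows


total = match_merge(one, two, ["sex", "age"], ["weight", "height"])
for row in total:
    print(row)
```

Every pid from either data set appears once in the result: pid 104 has no sex or age, and pid 105 has no weight or height.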

Computing statistics with the MEANS procedure (STD = standard deviation)

data shg;
  input x @@;   /* @@ holds the line so several values are read per line */
  cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc means data=shg n mean std cv min max;
  var x;
run;

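Computing statistics by group ("keep three decimal places in the results")

data a;
  input group VA VB1 @@;
  cards;
1 1.8 1.4 2 1.7 1.1 1 2.2 1.5 3 1.9 1.2 2 2.5 1.0 1 2.7 1.6
2 2.3 1.3 2 2.8 0.9 3 3.0 1.1 1 2.6 1.4 1 2.4 1.2 2 1.9 1.3
3 2.9 0.8 1 3.2 1.7 3 3.1 1.5 2 2.6 1.9 3 3.5 1.6 3 3.3 1.5
;
proc sort data=a; by group;
proc means mean std max min maxdec=3;   /* maxdec=3: three decimal places */
  by group;
  var VA VB1;
run;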

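Computing the geometric mean (frequency table)

data a;
  input f x @@;      /* f = frequency, x = value */
  y = log10(x);      /* work on the log scale */
  cards;
1 4 3 8 8 16 13 32
21 64 9 128 4 256 1 512
;
proc means noprint;
  var y;
  freq f;
  output out=b mean=meany;
run;
data c;
  set b;
  meanx = 10**(meany);   /* back-transform the mean of the logs */
run;
proc print;
run;

Program notes:

FREQ <variable>: specifies that the values of this variable are the frequencies of the analysis variable.

OUTPUT: names the output data set that holds the statistics produced by the MEANS procedure.

statistic-keyword=<new variable names> ...: selects the statistics wanted in the output data set and names the new variables that hold them.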
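As a cross-check outside SAS, the same frequency-weighted geometric mean can be sketched in a few lines of Python. The variable names are illustrative; the (frequency, value) pairs are the ones read by the DATA step above.

```python
import math

# (frequency, value) pairs from the frequency table
freq_table = [(1, 4), (3, 8), (8, 16), (13, 32),
              (21, 64), (9, 128), (4, 256), (1, 512)]

n = sum(f for f, _ in freq_table)                              # total frequency
mean_log = sum(f * math.log10(x) for f, x in freq_table) / n   # weighted mean of log10(x)
geo_mean = 10 ** mean_log                                      # back-transform

print(n, round(geo_mean, 2))
```

The steps mirror the SAS program: take logs, average them with the frequencies as weights, then raise 10 to that average.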

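The UNIVARIATE procedure outputs three plots (stem-and-leaf plot, box plot, normal probability plot), a frequency table (Value, Count, Cell percent, Cum percent), and the normality test results

data shg;
  input x @@;
  cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc univariate data=shg plot freq normal;   /* plots, frequency table, normality tests */
  var x;
run;

Program notes:

1. Tests for Normality reports the normality tests; if P > 0.05, the data can be regarded as normally distributed.

2. Uncorrected SS is the sum of squares; Corrected SS is the sum of squared deviations from the mean; Interquartile Range is the interquartile range.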

Interval estimation of the population mean (confidence interval for the population mean; here the 99% confidence interval)

data shg;
  input x @@;
  cards;
108.0 97.6 103.4 101.6 104.4 98.5 110.5 103.8 109.7 109.8
104.5 99.5 104.0 103.9 97.2 106.3 106.2 107.6 108.3 97.6
102.7 103.7 107.6 103.2 103.6 103.3 102.8 102.3 102.2 103.3
101.2 107.5 106.3 109.7 99.5 107.4 103.4 106.6 105.7 107.4
103.0 109.6 106.4 107.3 100.6 112.3 100.5 101.9 98.8 99.7
104.3 110.2 105.3 95.2 105.8 105.2 106.1 103.6 106.6 105.1
105.5 113.5 107.7 106.8 106.2 109.8 99.7 107.9 104.8 103.9
106.8 106.4 108.3 106.5 103.3 107.7 106.2 100.4 102.6 102.1
110.6 112.2 110.2 103.7 102.3 112.1 105.4 104.2 105.7 104.4
102.8 107.8 102.5 102.3 105.8 103.7 103.1 101.6 106.5 100.0
103.2 109.3 105.8 106.1 104.9 105.9 105.3 103.7 99.6 106.2
102.5 108.1 106.1 108.3 99.8 108.3 104.0 100.6 112.6 103.7
;
proc means data=shg n mean std clm alpha=0.01;   /* clm with alpha=0.01 gives the 99% limits */
  var x;
run;
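For reference, the limits reported by CLM are the usual t-based confidence limits for a mean. Written out in standard textbook notation (the symbols are not taken from the source), with the 120 observations of this example:

```latex
\bar{x} \;\pm\; t_{\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}},
\qquad \alpha = 0.01,\quad n = 120
```

Here \bar{x} is the sample mean (MEAN), s the sample standard deviation (STD), and s/\sqrt{n} the standard error (STDERR).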

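t tests:

(1) One-sample t test comparing a sample mean with a population mean (population mean 72; T is the t value for testing μ = 0; PRT is the two-sided probability for that t value)

data mb;
  input x @@;
  d = x - 72;   /* testing mean(d) = 0 is the same as testing mean(x) = 72 */
  cards;
74 73 68 75 75 82 80 69
72 74 83 72 71 74 76 79
67 73 81 70 67 70 78 69
70 72 67 74 80 66
;
proc means data=mb mean std stderr t prt;
  var x d;
run;

Look mainly at T and PRT for d.

The UNIVARIATE procedure can also be used:

proc univariate data=mb normal;
  var x d;
run;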
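The reason for reading the t value of d rather than x can be seen by computing it by hand. The following Python sketch (standard library only, variable names illustrative) reproduces the statistic: MEANS always tests against 0, so subtracting 72 first turns that into a test against 72.

```python
import math
import statistics

x = [74, 73, 68, 75, 75, 82, 80, 69,
     72, 74, 83, 72, 71, 74, 76, 79,
     67, 73, 81, 70, 67, 70, 78, 69,
     70, 72, 67, 74, 80, 66]
mu0 = 72                              # hypothesised population mean

d = [value - mu0 for value in x]      # shifted data, as in the DATA step
n = len(d)
mean_d = statistics.mean(d)
sd_d = statistics.stdev(d)            # sample SD (n - 1 denominator), same as STD
se_d = sd_d / math.sqrt(n)            # standard error, same as STDERR
t_stat = mean_d / se_d                # t for H0: mean(d) = 0, with n - 1 df

print(n, round(mean_d, 4), round(sd_d, 4), round(t_stat, 4))
```

Shifting by 72 changes the mean but not the standard deviation, which is why only the mean and t value differ between the x and d rows of the output.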

(2) t test for a paired design

data ch4_7;
  input after before @@;
  d = after - before;   /* within-pair difference */
  cards;
70.55 64.29
88.60 64.07
68.44 45.88
61.64 45.23
64.73 50.40
74.68 61.59
69.15 51.85
60.51 60.13
65.59 64.29
69.04 51.93
;
proc means data=ch4_7 mean std stderr t prt;
  var d;
run;

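(3) t test comparing the means of two independent groups (the TTEST procedure performs the two-sample t test; class group names group as the grouping variable; x is the increase in hemoglobin)

(3)-1

data hb;
  input group x @@;
  cards;
1 26 1 32 1 25 1 22 1 20 1 28 1 24 1 19 1 29 1 17 1 34 1 21 1 20 1 23 1 27
2 21 2 23 2 18 2 24 2 23 2 19 2 16 2 22 2 20 2 25 2 23 2 17 2 15 2 26 2 22
;
proc ttest data=hb;
  class group;
  var x;
run;

Interpreting the results:

First look at the test for homogeneity of variance (Equality of Variances): P > 0.05, so the variances are homogeneous. Then read the t Value and its P value (the Pooled row, since the variances are equal): P < 0.05, so H0 is rejected at the α = 0.05 level. The increase in hemoglobin therefore differs between the two groups of anemic children, and the mean increase is larger in the new-drug group than in the conventional-drug group.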
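As a reminder of what the equal-variance (pooled) t value is, the standard textbook formula is given below; the notation is not from the source, and in this example n_1 = n_2 = 15.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}
         {\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
\nu = n_1 + n_2 - 2
```

When the variances are not homogeneous, the unequal-variance row of the TTEST output should be read instead.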

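(3)-2: Two-sample t test after transforming the variable (antibody titer, log-transformed)

data ktdd;
  input group x @@;
  y = log10(x);   /* analyse the titers on the log scale */
  cards;
1 50 1 30 1 40 1 60 1 60 1 35 1 70 1 20 1 70 1 35 1 40 1 50 1 25
2 40 2 30 2 25 2 10 2 25 2 30 2 35 2 15 2 20 2 40 2 15 2 30 2 20
;
proc ttest;
  class group;
  var y;
run;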

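One-way analysis of variance with three levels

1. Building the SAS data set with DO loops (the @@ is essential):

data dat5_1;
  do group=1 to 3;
    input n;          /* number of observations in this group */
    do i=1 to n;
      input x @@;     /* hold the line so all n values are read from it */
      output;
    end;
  end;
  cards;
15
40 10 35 25 20 15 35 15 -5 30 25 70 65 45 50
15
50 20 45 55 20 15 80 -10 105 75 10 60 45 60 30
10
60 30 100 85 20 55 45 30 77 105
;
run;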

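2. Normality test:

proc sort data=dat5_1;
  by group;
run;
proc univariate data=dat5_1 normal;
  var x;
  by group;   /* test normality within each group */
run;

3. Homogeneity of variance test:

Use Levene's test, which is available within the GLM and ANOVA procedures.

4. Analysis of variance:

proc glm data=dat5_1;
  class group;
  model x=group;
  means group / hovtest;
  means group;
run;

Interpreting the results:

1. The GLM procedure performs the analysis of variance.

2. The CLASS statement first names the grouping variable, here group.

3. The MODEL statement then specifies the model: the dependent variable is on the left of the equals sign and the grouping variable on the right.

4. The MEANS keyword is followed by the grouping variable name, then a slash, then the options.

HOVTEST requests the homogeneity of variance test (P > 0.05 means the variances are homogeneous).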
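To show what the F test of the one-way model computes, here is a minimal Python sketch of the sum-of-squares decomposition. It uses the three groups as they are read by the DO-loop DATA step (15, 15 and 10 values); the helper `mean` and all variable names are illustrative, and the sketch does not reproduce the GLM output format or the P value.

```python
# One-way ANOVA by hand: SS_total = SS_between + SS_within
groups = {
    1: [40, 10, 35, 25, 20, 15, 35, 15, -5, 30, 25, 70, 65, 45, 50],
    2: [50, 20, 45, 55, 20, 15, 80, -10, 105, 75, 10, 60, 45, 60, 30],
    3: [60, 30, 100, 85, 20, 55, 45, 30, 77, 105],
}


def mean(values):
    return sum(values) / len(values)


all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

df_between = len(groups) - 1                 # k - 1
df_within = len(all_values) - len(groups)    # N - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(df_between, df_within, round(f_stat, 3))
```

The F value is the ratio of the between-group mean square to the within-group mean square, with k - 1 and N - k degrees of freedom.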

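4'. Multiple comparisons among means:

proc glm data=dat5_1;
  class group;
  model x=group;
  means group / hovtest;
  means group / snk bon dunnett('1');        /* SNK, Bonferroni, Dunnett with group 1 as control */
  means group / snk alpha=0.01;              /* SNK at the 0.01 level */
  contrast '12 vs 3' group -0.5 -0.5 1;      /* groups 1 and 2 combined versus group 3 */
  contrast '1 vs 2' group 1 -1 0;            /* group 1 versus group 2 */
run;

Interpreting the results:

In SAS, pairwise comparisons are requested through options on the MEANS statement of the GLM or ANOVA procedure; the program above shows several different comparison methods.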
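The two CONTRAST statements can be read as linear combinations of the three group means. In standard notation (the symbols are not from the source), each contrast tests whether the combination equals zero, and its coefficients sum to zero:

```latex
L_{12\ \mathrm{vs}\ 3} = -\tfrac{1}{2}\mu_1 - \tfrac{1}{2}\mu_2 + \mu_3,
\qquad
L_{1\ \mathrm{vs}\ 2} = \mu_1 - \mu_2 + 0\cdot\mu_3,
\qquad
H_0 : L = 0
```

The first contrast compares the average of groups 1 and 2 with group 3; the second compares group 1 with group 2.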

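Randomized block design analysis of variance:

(4 strains as blocks; 3 treatments)

data dat2;
  do block=1 to 4;
    do treat=1 to 3;
      input x @@;
      output;
    end;
  end;
  cards;
76 86 115
12 38 85
40 81 103
12 33 57
;
proc glm data=dat2;
  class treat block;
  model x=treat block / p;
  output out=r r=res;   /* save the residuals as variable res in data set r */
  means treat block / snk;
run;
proc univariate data=r normal; var res; run;

Interpreting the results:

"/ p" requests printed predicted values and residuals; OUTPUT OUT=r R=res writes the residuals to data set r under the variable name res (adding P=name would save the predicted values as well); NORMAL requests the normality test on the residuals.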

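Latin square design analysis of variance:

data dat3;
  do person=1 to 5;
    do stress=1 to 5;
      input cloth $ x @@;   /* treatment letter followed by the response */
      output;
    end;
  end;
  cards;
B 103 A 121 C 100 D 92 E 95
C 102 B 129 D 98 E 124 A 115
D 118 C 133 E 103 A 109 B 90
E 99 D 122 A 99 B 84 C 100
A 102 E 139 B 103 C 104 D 95
;
proc anova;
  class person stress cloth;
  model x=person stress cloth;
run;
quit;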

Correlation analysis:

Half-lives of drug A in blood and in urine:

1. Build the data set

data dat1;
  input x1 x2;
  cards;
9.9 7.9
11.2 8.9
9.4 8.5
8.4 9.4
14.8 12
12.4 11.5
13.1 14.5
13.4 12.3
11.2 9.2
9.5 11
10.7 8.3
9.2 8.5
;
run;

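2. Draw the scatter plot

proc plot data=dat1;
  plot x1*x2='*' / haxis=by 3 vaxis=by 3;
run;

Procedure notes:

The PLOT procedure draws the scatter plot.

'*' sets the plotting symbol to *.

HAXIS= and VAXIS= set the tick spacing on the horizontal and vertical axes (here, by 3).

The variable before the * is plotted on the vertical axis, the one after it on the horizontal axis.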

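3. Check that the two variables follow a bivariate normal distribution

proc reg data=dat1;
  model x2=x1 / p;
  output out=r r=res;   /* save the residuals as res in data set r */
run;
proc univariate data=dat1 normal; var x1; run;
proc univariate data=r normal; var res; run;

Procedure notes:

REG: estimates the regression equation (regression analysis).

MODEL dependent variable = independent variable / p

Normality tests are run on the residuals and on x1; if both P values are > 0.05, the data are taken to be consistent with a bivariate normal distribution.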

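4. Run the correlation analysis

proc corr data=dat1;
  var x1;
  with x2;
run;

Procedure notes:

The CORR procedure performs the correlation analysis.

var x1; with x2; name the variables to be correlated.

In the output, 0.72048 is the correlation coefficient and 0.0082 is the P value of the t test.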
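The correlation coefficient quoted above can be checked by hand. The Python sketch below (standard library only, variable names illustrative) computes Pearson's r from the twelve pairs and the t statistic on n - 2 degrees of freedom that underlies the reported P value; it does not compute the P value itself.

```python
import math

x1 = [9.9, 11.2, 9.4, 8.4, 14.8, 12.4, 13.1, 13.4, 11.2, 9.5, 10.7, 9.2]
x2 = [7.9, 8.9, 8.5, 9.4, 12, 11.5, 14.5, 12.3, 9.2, 11, 8.3, 8.5]

n = len(x1)
mean1 = sum(x1) / n
mean2 = sum(x2) / n

sxx = sum((a - mean1) ** 2 for a in x1)                     # sum of squares of x1
syy = sum((b - mean2) ** 2 for b in x2)                     # sum of squares of x2
sxy = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))

r = sxy / math.sqrt(sxx * syy)                              # Pearson correlation
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)       # t with n - 2 df

print(round(r, 5), round(t_stat, 3))
```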

 

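Regression analysis:

data dat2;
  input x y;
  cards;
1 8.03
3 14.97
5 19.23
7 27.83
9 36.23
;
proc plot data=dat2;
  plot y*x='*';
run;
proc reg data=dat2;
  model y=x / p;   /* p prints predicted values and residuals */
  plot y*x;
run;

In the output, Intercept is the intercept a (3.94300) and the coefficient of x is the slope b (3.46300), so the fitted line is ŷ = 3.943 + 3.463x.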
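The intercept and slope can be verified with the least-squares formulas. This is a minimal Python sketch with illustrative variable names, not a replacement for the REG output.

```python
x = [1, 3, 5, 7, 9]
y = [8.03, 14.97, 19.23, 27.83, 36.23]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

b = sxy / sxx               # slope
a = mean_y - b * mean_x     # intercept

print(round(a, 3), round(b, 3))
```

The slope is the ratio of the cross-product sum to the sum of squares of x, and the intercept follows from the fact that the fitted line passes through the point of means.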

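Chi-square test for a fourfold (2×2) table:

Problem:

79 patients were treated with Western medicine, of whom 63 responded; 54 were treated with Chinese medicine, of whom 47 responded. Do the response rates of the two treatments differ?

data dat1;
  do r=1 to 2;        /* row: treatment */
    do c=1 to 2;      /* column: responded / did not respond */
      input freq @@;
      output;
    end;
  end;
  cards;
63 16
47 7
;
run;
proc freq data=dat1;
  tables r*c / chisq;
  weight freq;
run;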

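Procedure notes:

freq is the cell frequency.

The procedure used is FREQ; the TABLES statement defines the layout of the table:

row variable * column variable; options follow the slash, and chisq requests the chi-square test.

The WEIGHT statement names the frequency variable.

Interpreting the results:

The four lines in each cell are the frequency, the overall percentage, the row percentage, and the column percentage.

proc freq data=dat1;
  tables r*c / chisq nopercent nocol expected;
  weight freq;
run;

Procedure notes:

nopercent and nocol suppress the overall and column percentages; expected prints the expected (theoretical) frequency of each cell.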
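To see where the expected frequencies and the chi-square value come from, here is a minimal Python sketch of the Pearson chi-square statistic for the 2×2 table above (without continuity correction). Variable names are illustrative, and the P value is not computed.

```python
# Observed counts: rows = treatment, columns = responded / did not respond
observed = [[63, 16],
            [47, 7]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row total * column total / grand total
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

for row in expected:
    print([round(e, 2) for e in row])
print(round(chi_square, 4))
```

The expected counts are what the expected option prints; checking that they are all reasonably large is the usual condition for relying on the uncorrected chi-square test.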

 

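K×2 table chi-square test:

Problem:

A combined Chinese and Western medicine group of 68 patients is added, of whom 65 responded. Do the three treatments differ?

data dat2;
  do r=1 to 3;
    do c=1 to 2;
      inp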
