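R: Factorial Experimental Design and Interpretation, a Case Analysis Report (with Code and Data)

Example 1: two-group comparison
Example 2: multiple groups
Example 3: two conditions, two genotypes, with an interaction term
  o The effect of treatment in wild-type (the main effect)
  o The effect of treatment in mutant
  o What is the difference between mutant and wild-type without treatment?
  o With treatment, what is the difference between mutant and wild-type?
  o The different response in genotypes (interaction term)
Example 4: two conditions, three genotypes
  o The condition effect for genotype I (the main effect)
  o The condition effect for genotype III
  o The condition effect for genotype II
  o The effect of III vs. II under condition A
  o The interaction term for the condition effect in genotype III vs. genotype I
  o The interaction term for the condition effect in genotype III vs. genotype II

To enable complex models in iDEP (http://ge-lab.org/idep/), I tried to understand how to build factorial models and extract the desired results from DESeq2. The following is based on the help document for the results() function in DESeq2, as well as Mike Love's answers to users' questions.

One main point I want to make is that when the study design involves multiple factors (see the figure for the genotype + treatment example above), interpreting the results is tricky.

As in regression analysis in R, the reference levels of categorical factors form the basis of our comparisons. By default, however, they are determined by alphabetical order. Choosing the reference level for each factor is essential; otherwise your coefficients may differ depending on how you enter the experimental design into DESeq2. This can be done with the relevel() function in R. The reference level is the baseline level of a factor, against which meaningful comparisons are made. In a wild-type vs. mutant experiment, "wild-type" is the reference level. In treated vs. untreated, the reference level is obviously the untreated. More details are given in Example 3.

Example 1: two-group comparison

First, make some example data.

    library(DESeq2)
    dds <- makeExampleDESeqDataSet(n = 10000, m = 6)
    assay(dds)[1:10, ]
    #        sample1 sample2 sample3 sample4 sample5 sample6
    # gene1        6       4      11       1       2      13
    # gene2        9      12      23      13      14      28
    # gene3       58     121     173     178     118      97
    # gene4        0       4       0       3       8       3
    # gene5       27       3       6       9       8      12
    # gene6       48       8      35      38      21      13
    # gene7       36      50      61      52      44      22
    # gene8        6       8      16      14      18      19
    # gene9      214     266     419     198     157     166
    # gene10      20      12      16      12      16       2

This is a very simple experimental design with two conditions.

    colData(dds)
    # DataFrame with 6 rows and 1 column
    #         condition
    #
    # sample1         A
    # sample2         A
    # sample3         A
    # sample4         B
    # sample5         B
    # sample6         B

    dds <- DESeq(dds)
    resultsNames(dds)
    # [1] "Intercept"        "condition_B_vs_A"

This shows the available results. Note that by default, R chooses a reference level for each factor based on alphabetical order. Here A is the reference level, and the fold change is defined as B compared with A. To change the reference level, use the relevel() function.

    res <- results(dds, contrast = c("condition", "B", "A"))
    res <- res[order(res$padj), ]
    library(knitr)
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene9056360.168909                     -2.045379        0.0000000   0.0001366
    gene308743.897516                      -2.203303        0.0000173   0.0858143
    gene376372.409877                      -1.834787        0.0000434   0.1434712
    gene2054322.494963                      1.537408        0.0000681   0.1689463
    gene46176.227415                        6.125238        0.0002019   0.4008408

If we want to use B as the control and define the fold change with B as the baseline, we can do this:

    res <- results(dds, contrast = c("condition", "A", "B"))
    ix = which.min(res$padj)
    res <- res[order(res$padj), ]
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene9056360.168909                      2.045379        0.0000000   0.0001366
    gene308743.897516                       2.203303        0.0000173   0.0858143
    gene376372.409877                       1.834787        0.0000434   0.1434712
    gene2054322.494963                     -1.537408        0.0000681   0.1689463
    gene46176.227415                       -6.125238        0.0002019   0.4008408

As you can see, the direction of the fold change is exactly reversed. Here we show the most significant gene.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])

Example 2: multiple groups

Suppose we have three groups, A, B, and C.

    dds <- makeExampleDESeqDataSet(n = 100, m = 6)
    dds$condition <- factor(c("A", "A", "B", "B", "C", "C"))
    dds <- DESeq(dds)
    res = results(dds, contrast = c("condition", "C", "A"))
    res <- res[order(res$padj), ]
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene23.634986                          -5.101773        0.0348679   0.5515088
    gene204.678176                         -4.490982        0.0445664   0.5515088
    gene3456.068672                        -1.462155        0.0167820   0.5515088
    gene35537.847175                       -1.177240        0.0087913   0.5515088
    gene4193.967810                         1.064734        0.0412034   0.5515088

Example 3: two conditions, two genotypes, with an interaction term

Here we have two genotypes, wild-type (WT) and mutant (MU), and two conditions, control (Ctrl) and treated (Trt). We are interested in the responses of both wild-type and mutant to treatment. We are also interested in the difference in response between genotypes, which is captured by the interaction term in linear models.

First, we construct example data.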
Note that we changed the sample names from "sample1" to "WT_Ctrl_1", according to the two factors.

    dds <- makeExampleDESeqDataSet(n = 10000, m = 12)
    dds$condition <- factor(c(rep("Ctrl", 6), rep("Trt", 6)))
    dds$genotype <- factor(rep(rep(c("WT", "MU"), each = 3), 2))
    colnames(dds) <- paste(as.character(dds$genotype), as.character(dds$condition),
                           rownames(colData(dds)), sep = "_")
    colnames(dds) = gsub("sample", "", colnames(dds))
    kable(assay(dds)[1:5, ])

    Columns: WT_Ctrl_1, WT_Ctrl_2, WT_Ctrl_3, MU_Ctrl_4, MU_Ctrl_5, MU_Ctrl_6,
             WT_Trt_7, WT_Trt_8, WT_Trt_9, MU_Trt_10, MU_Trt_11, MU_Trt_12
    Rows (counts run together in the source):
    gene1   8147135728180765752496457
    gene2   707794545436516657514767
    gene3   163782843614107114015
    gene4   2513921500148
    gene5   001050000000

    kable(colData(dds))

                condition   genotype
    WT_Ctrl_1   Ctrl        WT
    WT_Ctrl_2   Ctrl        WT
    WT_Ctrl_3   Ctrl        WT
    MU_Ctrl_4   Ctrl        MU
    MU_Ctrl_5   Ctrl        MU
    MU_Ctrl_6   Ctrl        MU
    WT_Trt_7    Trt         WT
    WT_Trt_8    Trt         WT
    WT_Trt_9    Trt         WT
    MU_Trt_10   Trt         MU
    MU_Trt_11   Trt         MU
    MU_Trt_12   Trt         MU

Check reference levels:

    dds$condition
    # [1] Ctrl Ctrl Ctrl Ctrl Ctrl Ctrl Trt  Trt  Trt  Trt  Trt  Trt
    # Levels: Ctrl Trt

As you can see, "Ctrl" appears first in the Levels line, indicating that it is the reference level for the factor condition, as expected from alphabetical order. This is what we want, so nothing needs to be changed.

    dds$genotype
    # [1] WT WT WT MU MU MU WT WT WT MU MU MU
    # Levels: MU WT

But "MU" is the reference level for genotype, which would give us results that are difficult to interpret. We need to change it.

    dds$genotype = relevel(dds$genotype, "WT")
    dds$genotype
    # [1] WT WT WT MU MU MU WT WT WT MU MU MU
    # Levels: WT MU

Set up the model and run DESeq2:

    design(dds) <- ~ genotype + condition + genotype:condition
    dds <- DESeq(dds)
    resultsNames(dds)
    # [1] "Intercept"             "genotype_MU_vs_WT"
    # [3] "condition_Trt_vs_Ctrl" "genotypeMU.conditionTrt"

Below, we combine the different results ("genotype_MU_vs_WT", "condition_Trt_vs_Ctrl", "genotypeMU.conditionTrt") to derive biologically meaningful comparisons.

The effect of treatment in wild-type (the main effect)

    res = results(dds, contrast = c("condition", "Trt", "Ctrl"))
    ix = which.min(res$padj)        # most significant
    res <- res[order(res$padj), ]   # sort
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene27525.38836                         2.607896        0.0000588   0.2684851
    gene2744101.02954                      -1.670837        0.0001099   0.2684851
    gene544158.67469                        1.758921        0.0001261   0.2684851
    gene702174.29359                        1.610853        0.0001345   0.2684851
    gene7795326.43308                      -1.624323        0.0000407   0.2684851

This is for WT, treated compared with untreated. Note that WT is not mentioned, because it is the reference level. In other words, this is the difference between samples No. 7-9 and samples No. 1-3.

Here we show the most significant gene.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])

The effect of treatment in mutant

This is, by definition, the main effect plus the interaction term (the extra condition effect in genotype MU compared with genotype WT).

    res <- results(dds, list(c("condition_Trt_vs_Ctrl", "genotypeMU.conditionTrt")))
    ix = which.min(res$padj)        # most significant
    res <- res[order(res$padj), ]   # sort
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene510218.69017                        4.757057        1.60e-06    0.0156910
    gene7367170.69259                       1.481016        1.19e-05    0.0396834
    gene835127.77227                        3.033622        9.80e-06    0.0396834
    gene803453.34342                       -1.841023        1.62e-05    0.0403724
    gene727292.82351                       -1.414967        6.70e-05    0.1125808

This measures the effect of treatment in mutant. In other words, samples No. 10-12 compared with samples No. 4-6.

Here we show the most significant gene. Its log2 fold change in the first table row is positive (4.757057), so it is more highly expressed in samples 10-12 than in samples 4-6.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])

What is the difference between mutant and wild-type without treatment?

As Ctrl is the reference level, we can just retrieve "genotype_MU_vs_WT".

    res = results(dds, contrast = c("genotype", "MU", "WT"))
    ix = which.min(res$padj)        # most significant
    res <- res[order(res$padj), ]   # sort
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene510218.69017                       -4.714519        0.0000020   0.0195594
    gene228945.04711                        1.763615        0.0000333   0.1218120
    gene3388122.85582                      -1.529323        0.0000366   0.1218120
    gene91539.03250                         2.104003        0.0001133   0.2374845
    gene835127.77227                       -2.653182        0.0001190   0.2374845

In other words, this is samples No. 4-6 compared with samples No. 1-3. Here we show the most significant gene.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])

With treatment, what is the difference between mutant and wild-type?

    res = results(dds, list(c("genotype_MU_vs_WT", "genotypeMU.conditionTrt")))
    ix = which.min(res$padj)        # most significant
    res <- res[order(res$padj), ]   # sort
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene412750.03677                        1.925094        0.0000091   0.0910878
    gene387934.67704                       -2.133203        0.0000283   0.0940387
    gene479924.11059                       -2.459345        0.0000197   0.0940387
    gene544158.67469                       -1.744769        0.0001418   0.2357640
    gene5926161.16737                       1.524163        0.0001339   0.2357640

This gives us the difference between genotypes MU and WT under condition Trt. In other words, this is samples No. 10-12 compared with samples No. 7-9.

Here we show the most significant gene.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])

The different response in genotypes (interaction term)

Is the effect of treatment different across genotypes? This is the interaction term.

    res = results(dds, name = "genotypeMU.conditionTrt")
    ix = which.min(res$padj)        # most significant
    res <- res[order(res$padj), ]   # sort
    kable(res[1:5, -(3:4)])

    gene ID + baseMean (fused in source)   log2FoldChange   pvalue      padj
    gene510218.69017                        7.395771        0.0000000   0.0000341
    gene544158.67469                       -3.297673        0.0000004   0.0019104
    gene803453.34342                       -2.564030        0.0000205   0.0682558
    gene885814.68650                        4.703936        0.0000297   0.0742066
    gene479924.11059                       -2.990101        0.0001234   0.2463639

Here we show the most significant gene.

    barplot(assay(dds)[ix, ], las = 2, main = rownames(dds)[ix])
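The comparisons in Example 3 are all sums of model coefficients, and this can be checked without running DESeq2. The following is a minimal base-R sketch, assuming the same 12-sample layout and reference levels as Example 3. The object names (X, cells, comparisons) are illustrative and do not come from the original. Note that model.matrix() labels its columns genotypeMU, conditionTrt and genotypeMU:conditionTrt, whereas DESeq2 reports the corresponding terms as genotype_MU_vs_WT, condition_Trt_vs_Ctrl and genotypeMU.conditionTrt.

```r
# Factors laid out as in Example 3: samples 1-3 WT/Ctrl, 4-6 MU/Ctrl,
# 7-9 WT/Trt, 10-12 MU/Trt. WT and Ctrl are the reference levels.
genotype  <- relevel(factor(rep(rep(c("WT", "MU"), each = 3), 2)), ref = "WT")
condition <- factor(rep(c("Ctrl", "Trt"), each = 6))

# Design matrix for the model with an interaction term.
X <- model.matrix(~ genotype + condition + genotype:condition)

# One row per group: which coefficients contribute to each group mean.
cells <- unique(X)
rownames(cells) <- c("WT_Ctrl", "MU_Ctrl", "WT_Trt", "MU_Trt")

# Each comparison is a difference between group rows, which leaves the
# coefficients that have to be added up to obtain it.
comparisons <- rbind(
  trt_in_WT        = cells["WT_Trt", ]  - cells["WT_Ctrl", ],
  trt_in_MU        = cells["MU_Trt", ]  - cells["MU_Ctrl", ],
  MU_vs_WT_in_Ctrl = cells["MU_Ctrl", ] - cells["WT_Ctrl", ],
  MU_vs_WT_in_Trt  = cells["MU_Trt", ]  - cells["WT_Trt", ],
  interaction      = (cells["MU_Trt", ] - cells["MU_Ctrl", ]) -
                     (cells["WT_Trt", ] - cells["WT_Ctrl", ])
)
print(comparisons)
```

A 1 in a row marks a coefficient that enters that comparison. The treatment effect in the mutant uses both the condition coefficient and the interaction coefficient, and the genotype difference under treatment uses the genotype coefficient plus the interaction coefficient, which matches the pairs of names passed to results() as a list in Example 3. The help page for results() in DESeq2 also describes a numeric form of the contrast argument; whether rows like these can be passed to it directly should be checked against that documentation.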