Supplementary Methods

DNA extraction, PCR and 454 pyrosequencing

The genomic DNA was extracted from each tailings subsample with a modified indirect DNA extraction protocol as described previously (Tan et al., 2008). Briefly, cells were recovered from about 20 g of tailings by centrifugation at 900 × g at 4 °C for 10 min, using 20 mL of sodium pyrophosphate (pH 3.0 or pH 7.0) as the dispersal reagent (Duarte et al., 1998), and the supernatant was then collected. This recovery step was repeated twice. The collected supernatant was centrifuged at 10,000 × g at 4 °C for 15 min to pellet the cells, and the supernatant was removed. The cell pellets obtained were treated with 20 mL of 0.3 M ammonium oxalate (pH 3.0 or pH 7.0) for 20 min to dissolve most of the iron precipitate (McKeague and Day, 1966), followed by centrifugation at 10,000 × g at 4 °C to pellet the cells; the supernatant was removed, and this step was repeated until the supernatant turned colorless. DNA from the cell pellets was extracted with a FastDNA Kit for soil (Qbiogene Inc., Carlsbad, CA) following the manufacturer's instructions. The universal primer set 515F/806R (Bates et al., 2010) was used to amplify the bacterial and archaeal 16S rRNA genes simultaneously, with an 8-bp barcode specific to each tailings subsample on the primer 806R. The primer sequences were as follows:

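(i) CGTATCGCCTCCCTCGCGCCATCAGCAGTGCCAGCMGCCGCGGTAA, where the leading sequence CGTATCGCCTCCCTCGCGCCATCAG (underlined in the original) is the Link Primer Sequence, the following 'CA' (blue in the original) is the two-base protecting sequence on the forward primer sequence, and the remaining sequence GTGCCAGCMGCCGCGGTAA (green in the original) is the primer 515F;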

(ii) CTATGCGCCTTGCCAGCCCGCTCAGAACGAACGTCGGACTACVSGGGTATCTAAT, where the leading sequence CTATGCGCCTTGCCAGCCCGCTCAG (underlined in the original) is the Link Primer Sequence, the following 8-bp sequence AACGAACG (red in the original) is the barcode sequence specific to each tailings subsample (see Table S2 for all the barcodes), the following 'TC' (blue in the original) is the two-base protecting sequence on the reverse primer sequence, and the remaining sequence GGACTACVSGGGTATCTAAT (green in the original) is the primer 806R.

PCR reactions (30 µL) contained 0.75 units of Ex Taq DNA polymerase (TaKaRa, Dalian, China), 1× Ex Taq loading buffer (TaKaRa, Dalian, China), 0.2 mM dNTP mix (TaKaRa, Dalian, China), 0.2 µM of each primer and about 100 ng of template DNA. PCR amplification was conducted as follows: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 94 °C for 30 s, primer annealing at 50 °C for 1 min and extension at 72 °C for 1 min; and a final extension of 10 min at 72 °C. For each tailings subsample, the PCR reaction was conducted in triplicate and the products were pooled to mitigate PCR amplification biases. The composite sample for pyrosequencing was created by combining equimolar ratios of amplification products from individual subsamples as described by Fierer et al. (2008), followed by gel purification using a QIAquick Gel Extraction Kit (Qiagen, Chatsworth, CA). The purified composite DNA sample was sent to Macrogen Inc. (Seoul, Korea) for pyrosequencing on a 454 GS FLX Titanium pyrosequencer (Roche 454 Life Sciences, Branford, CT, USA).
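The layout of the barcoded reverse primer (Link Primer Sequence, 8-bp barcode, two-base protecting sequence, primer 806R) can be illustrated with a minimal Python sketch. The segment boundaries are inferred here from the stated lengths and from the 806R sequence; the function name and the returned dictionary keys are hypothetical and not part of the original protocol.

```python
# Sketch: split the barcoded reverse fusion primer into its stated parts.
# Assumption: the segments occur in the order link / barcode / protect / 806R,
# with the lengths given in the text (8-bp barcode, 2-base protecting sequence).
PRIMER_806R = "GGACTACVSGGGTATCTAAT"
BARCODE_LEN = 8
PROTECT_LEN = 2


def split_reverse_primer(fusion: str) -> dict:
    """Return the four segments of a reverse fusion primer (hypothetical helper)."""
    if not fusion.endswith(PRIMER_806R):
        raise ValueError("fusion primer does not end with the 806R sequence")
    body = fusion[: -len(PRIMER_806R)]            # everything before 806R
    protect = body[-PROTECT_LEN:]                 # two-base protecting sequence
    barcode = body[-(PROTECT_LEN + BARCODE_LEN):-PROTECT_LEN]
    link = body[: -(PROTECT_LEN + BARCODE_LEN)]   # remaining leading sequence
    return {"link": link, "barcode": barcode,
            "protect": protect, "primer": PRIMER_806R}


parts = split_reverse_primer(
    "CTATGCGCCTTGCCAGCCCGCTCAGAACGAACGTCGGACTACVSGGGTATCTAAT"
)
print(parts)
```

Applied to the reverse primer listed in (ii), the sketch recovers the barcode of that particular subsample; the other barcodes are listed in Table S2.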

Processing of 454 pyrosequencing data

Pyrosequencing data analysis was performed with version 1.26 of the mothur software package (Schloss et al., 2009) as described by Schloss et al. (2011). Because sequences from 454 pyrosequencing inflate biodiversity estimates (Kunin et al., 2010), the sequences were denoised using the commands 'shhh.flows' (a translation of the PyroNoise algorithm; Quince et al., 2009) and 'pre.cluster' (Huse et al., 2010). Additionally, chimeric sequences were identified and removed using UCHIME (Edgar et al., 2011). We also removed sequences with: (i) a sequence length < 280 bp; and/or (ii) homopolymers of eight or more bases; and/or (iii) one or more ambiguous bases. The OTUs were identified at the sequence identity level of 97% using the 'cluster' command with the average clustering algorithm (Huse et al., 2010). Subsequently, a representative sequence was selected from each OTU and taxonomic assignment was performed using the Ribosomal Database Project (RDP) Classifier (Wang et al., 2007) with a minimum confidence of 80%. The alpha diversity of the 18 tailings subsamples was estimated with the abundance-based indices Chao1, Shannon and Simpson. For each of the 18 tailings subsamples, 5,000 quality sequences were randomly sampled (10 iterations), and the value for each tailings sample was calculated as the average over its three corresponding subsamples.
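The subsampling and alpha-diversity step can be sketched in Python as follows. This is an illustration rather than mothur's own code: it assumes the bias-corrected form of Chao1, the natural-log Shannon index and the Simpson index D = Σnᵢ(nᵢ−1)/(N(N−1)), and the function names and toy data are hypothetical.

```python
import math
import random
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1))."""
    values = list(counts.values())
    f1 = values.count(1)  # singleton OTUs
    f2 = values.count(2)  # doubleton OTUs
    return len(values) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i)."""
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def simpson(counts):
    """Simpson index D = sum(n_i*(n_i-1)) / (N*(N-1))."""
    n = sum(counts.values())
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def rarefied_mean(otu_labels, index_fn, depth=5000, iterations=10, seed=0):
    """Average an index over random subsamples drawn without replacement."""
    rng = random.Random(seed)
    values = []
    for _ in range(iterations):
        subsample = Counter(rng.sample(otu_labels, depth))
        values.append(index_fn(subsample))
    return sum(values) / len(values)


# Hypothetical toy data: one OTU label per quality sequence.
toy = ["otu1"] * 2 + ["otu2", "otu3"]
full = Counter(toy)
print(chao1(full), shannon(full), simpson(full))
```

For a real subsample, `rarefied_mean(labels, chao1)` would use the defaults of 5,000 sequences and 10 iterations stated in the text; the per-sample value is then the mean over the three subsamples.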

Metagenomics sequencing and analysis

Library construction and random shotgun sequencing. For the T2 and T6 tailings samples, genomic DNA extracted from the three subsamples of each sample was pooled and purified by gel electrophoresis. The purified DNA samples were then sent to BGI Inc. (Shenzhen, China) for shotgun library construction and Illumina sequencing. For both samples, whole-genome shotgun sequencing libraries with an insert size of 180 bp were generated and then paired-end sequenced (90 bp × 2) on Illumina's HiSeq 2000 platform.

Artifact filtering and quality control. The raw Illumina sequence data (2 GB for each metagenome) were passed through several filtering and control steps to obtain clean sequence data, as follows: (i) reads with adapter contamination were identified and removed; (ii) duplicates were identified and removed; (iii) among the non-duplicate reads, reads containing more than 18 N bases were identified and removed; and (iv) the retained reads were trimmed at the 3' end to remove bases with a quality score of < 20, and reads with over 20% low-quality (quality score < 20) bases were also removed. The clean reads obtained were used for further analysis.
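Steps (ii) to (iv) can be sketched in Python as below. This is a simplified illustration, not the pipeline actually used: reads are treated as single (unpaired) sequences with integer quality scores, duplicates are taken to be exact sequence matches, and the low-quality fraction is computed after 3' trimming. Adapter removal (step i) is omitted because the adapter sequences are not given. All names are hypothetical.

```python
MAX_N = 18           # reads with more than this many N bases are dropped
MIN_Q = 20           # quality threshold from the text
MAX_LOW_FRAC = 0.20  # maximum allowed fraction of low-quality bases


def trim_3prime(seq, quals, min_q=MIN_Q):
    """Remove trailing bases whose quality score is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def clean_reads(reads):
    """Apply duplicate, N-content, trimming and low-quality filters in order."""
    seen = set()
    kept = []
    for seq, quals in reads:
        if seq in seen:                    # step (ii): exact duplicates
            continue
        seen.add(seq)
        if seq.count("N") > MAX_N:         # step (iii): too many N bases
            continue
        seq, quals = trim_3prime(seq, quals)   # step (iv): 3' trimming
        if not seq:
            continue
        low = sum(q < MIN_Q for q in quals)
        if low / len(quals) > MAX_LOW_FRAC:    # step (iv): low-quality reads
            continue
        kept.append((seq, quals))
    return kept


# Hypothetical toy reads: (sequence, per-base quality scores).
reads = [
    ("ACGTACGTAC", [30] * 8 + [10, 5]),   # trimmed to 8 bases, kept
    ("ACGTACGTAC", [30] * 8 + [10, 5]),   # duplicate, dropped
    ("N" * 19 + "A", [30] * 20),          # more than 18 N, dropped
    ("AAAAAAAAAT", [10] * 3 + [30] * 7),  # 30% low-quality bases, dropped
    ("CCCCCCCCCC", [10] + [30] * 9),      # 10% low-quality bases, kept
]
cleaned = clean_reads(reads)
print(cleaned)
```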

Whole metagenome assembly. The clean reads were de novo assembled using Velvet (version 1.1.04) (Zerbino and Birney, 2008) with the options ins_length=180 and exp_cov=auto. We assembled both metagenomes with k-mer values (option k) from 21 to 55, and the best assembly results were selected based on the N50 contig length and the longest contig length. As a result, the best k-mer value for the T2 metagenome was 45 (N50 contig: 522 bp; longest contig: 60,233 bp), and that for the T6 metagenome was 51 (N50 contig: 955 bp; longest contig: 40,620 bp).
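The selection of the best k-mer value can be sketched in Python as follows. The text does not say how the N50 and longest-contig criteria were combined, so this sketch assumes ranking by N50 first and by longest contig as a tie-breaker; the contig lengths and function names are hypothetical.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


def best_k(assemblies):
    """Pick the k whose assembly has the largest (N50, longest contig) pair."""
    return max(assemblies, key=lambda k: (n50(assemblies[k]), max(assemblies[k])))


# Hypothetical contig lengths (bp) for two k-mer values.
assemblies = {
    21: [100, 200, 300],
    31: [400, 100, 100],
}
print(n50(assemblies[21]), n50(assemblies[31]), best_k(assemblies))
```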

Microbial community composition analysis. Two strategies were employed to reveal the microbial composition of the T2 and T6 metagenomes: (i) the 16S rRNA genes were identified from all the contigs using BLASTn against the RDP database (release 10) (Cole et al., 2009) (e-value threshold = 10⁻⁵), and the taxonomic assignment of the identified 16S rRNA genes with anchors ≥ 100 bp was performed using the RDP Classifier with a minimum confidence of 80%; and (ii) the contigs (≥ 300 bp) were compared against the National Center for Biotechnology Information (NCBI) non-redundant (nr) database (e-value threshold = 10⁻⁵), and the contigs were then classified into taxonomic groups with the lowest common ancestor algorithm in MEGAN (Huson et al., 2011) with default parameters (minimum score, 35; minimum support, 1; top percent, 10%).
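The idea behind the lowest common ancestor assignment in strategy (ii) can be sketched in Python as below. This is a simplified illustration, not MEGAN's implementation: hits are given as (bit score, lineage) pairs, hits below the minimum score or outside the top-percent window of the best score are discarded, and the contig is placed at the deepest rank shared by the remaining lineages. The minimum-support filter is left out because a value of 1 keeps every assigned taxon. Function name and example hits are hypothetical.

```python
def lca_assign(hits, min_score=35.0, top_percent=10.0):
    """Return the shared lineage prefix of the best-scoring hits.

    hits: list of (bit_score, lineage), lineage being a tuple from root to leaf.
    """
    scored = [(score, lineage) for score, lineage in hits if score >= min_score]
    if not scored:
        return ()                                  # unassigned
    best = max(score for score, _ in scored)
    cutoff = best * (1 - top_percent / 100)        # keep hits near the best score
    lineages = [lineage for score, lineage in scored if score >= cutoff]
    shared = []
    for ranks in zip(*lineages):                   # walk down rank by rank
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return tuple(shared)


# Hypothetical hits for one contig.
hits = [
    (100.0, ("Bacteria", "Proteobacteria", "Gammaproteobacteria")),
    (95.0, ("Bacteria", "Proteobacteria", "Betaproteobacteria")),
    (50.0, ("Archaea", "Euryarchaeota")),          # outside the top 10% window
]
assignment = lca_assign(hits)
print(assignment)
```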

Gene prediction and functional annotation. The contigs that had reliable NCBI-nr hits, as indicated by MEGAN, were extracted for further analysis. These contigs were subjected to gene prediction using GeneMark with default parameters (Zhu et al., 2010), which yielded 51,981 and 49,538 putative protein-coding genes for the T2 and T6 metagenomes, respectively (Table S5). We then compared these putative protein-coding genes against the NCBI-nr database, and those with NCBI-nr hits were further compared against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the Clusters of Orthologous Groups of proteins (COG) database using BLASTx (e-value threshold = 10⁻⁵).

Genome binning. Based on the contig BLAST results and MEGAN analysis (minimum score, 35; minimum support, 1; top percent, 10%), the dominant genera in the T2 and T6 metagenomes were binned. Information on the largest bins is shown in Table S6.

Contig coverage estimate. To estimate the coverage of the contigs, we first aligned the clean reads used for assembly to the contigs using SOAPaligner (Li et al., 2009), in three steps: (i) the index was built from all the contigs in the assembly results (2bwt-builder); (ii) the clean reads were aligned against the contig-based index (soap); and (iii) SOAP.coverage (Li et al., 2009) was used to parse the output file of SOAPaligner. The coverage estimate of the contigs is shown in Fig. S7.
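As a rough illustration of what a per-contig coverage estimate measures, the Python sketch below computes mean depth as the total number of aligned bases divided by the contig length. It does not parse real SOAPaligner output and is not how SOAP.coverage is implemented; the input layout, names and numbers are hypothetical.

```python
from collections import defaultdict


def mean_coverage(contig_lengths, alignments):
    """Mean depth per contig = aligned bases / contig length.

    contig_lengths: dict mapping contig id to length in bp.
    alignments: iterable of (contig id, aligned read length in bp).
    """
    aligned_bases = defaultdict(int)
    for contig_id, read_length in alignments:
        aligned_bases[contig_id] += read_length
    return {
        contig_id: aligned_bases[contig_id] / length
        for contig_id, length in contig_lengths.items()
    }


# Hypothetical example with 90-bp reads, matching the read length in the text.
contig_lengths = {"contig_1": 900, "contig_2": 450, "contig_3": 300}
alignments = [("contig_1", 90)] * 20 + [("contig_2", 90)] * 5
coverage = mean_coverage(contig_lengths, alignments)
print(coverage)
```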

The functional abundance profile analysis of COG catalogues and COG categories

Based on the COG BLAST results, the predicted genes with reliable COG BLAST hits were assigned to COG catalogues and COG categories (if available). To determine whether a specific COG catalogue or COG category was enriched in our metagenomes, the odds ratio for a specific COG catalogue or COG category against that in all sequenced bacteria and archaea was calculated as follows.

Odds ratio = (A / B) / (C / D)

Where:

A = No. of genes assigned to a specific COG catalogue (or COG category) in metagenome T2 (or T6)

B = No. of genes assigned to all other COG catalogues (or COG categories) in metagenome T2 (or T6)

C = No. of genes assigned to a specific COG catalogue (or COG category) in all sequenced bacteria and archaea

D = No. of genes assigned to all other COG catalogues (or COG categories) in all sequenced bacteria and archaea
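With A, B, C and D as defined above, an odds ratio is conventionally computed as (A/B)/(C/D). The Python sketch below assumes that form; the gene counts are hypothetical and are not taken from the metagenomes or from IMG.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (A/B)/(C/D) for one COG catalogue or category.

    a, b: genes inside / outside the COG group in the metagenome.
    c, d: genes inside / outside the COG group in the reference genomes.
    """
    if b == 0 or c == 0 or d == 0:
        raise ValueError("B, C and D must be non-zero")
    return (a / b) / (c / d)


# Hypothetical counts: 30 of 1,000 metagenome genes versus 10 of 1,000 reference genes.
ratio = odds_ratio(30, 970, 10, 990)
print(round(ratio, 3))
```

A ratio above 1 indicates that the COG group makes up a larger share of genes in the metagenome than in the reference set, and a ratio below 1 indicates the opposite.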

The values for 'C' and 'D' were obtained from the Integrated Microbial Genomes (IMG) system (http://img.jgi.doe.gov/cgi-bin/w/main.cgi; Markowitz et al., 2012). The P-value w
