MAQ Parameter Reference
MAQ Key Commands
fasta2bfa
maq fasta2bfa in.ref.fasta out.ref.bfa
Convert sequences in FASTA format to Maq’s BFA (binary FASTA) format.
fastq2bfq
maq fastq2bfq [-n nreads] in.read.fastq out.read.bfq|out.prefix
Convert reads in FASTQ format to Maq’s BFQ (binary FASTQ) format.
OPTIONS:
-n INT
number of reads per file [not specified]
map
maq map [-n nmis] [-a maxins] [-c] [-1 len1] [-2 len2] [-d adap3] [-m mutrate] [-u unmapped] [-e maxerr] [-M c|g] [-N] [-H allhits] [-C maxhits] out.aln.map in.ref.bfa in.read1.bfq [in.read2.bfq] 2> out.map.log
Map reads to the reference sequences.
OPTIONS:
-n INT
Maximum number of mismatches that can always be found [2]
-a INT
Maximum outer distance for a correct read pair [250]
-A INT
Maximum outer distance of two RF paired reads (0 to disable) [0]
-c
Map reads in the colour space (for SOLiD only)
-1 INT
Read length for the first read, 0 for auto [0]
-2 INT
Read length for the second read, 0 for auto [0]
-m FLOAT
Mutation rate between the reference sequences and the reads [0.001]
-d FILE
Specify a file containing a single line of the 3’-adapter sequence [null]
-u FILE
Dump unmapped reads and reads containing more than nmis mismatches to a separate file [null]
-e INT
Threshold on the sum of mismatching base qualities [70]
-H FILE
Dump multiple/all 01-mismatch hits to FILE [null]
-C INT
Maximum number of hits to output. Unlimited if larger than 512. [250]
-M c|g
Methylation alignment mode. All C (or G) on the forward strand will be changed to T (or A). This option is for testing only.
-N
Store the mismatch position in the output file out.aln.map. When this option is in use, the maximum allowed read length is 55bp.
NOTE:
* Paired end reads should be prepared in two files, one for each end, with reads sorted in the same order. This means the k-th read in the first file is mated with the k-th read in the second file. The corresponding read names must be identical up to the trailing ‘/1’ or ‘/2’. For example, such a pair of read names is allowed: ‘EAS1_1_5_100_200/1’ and ‘EAS1_1_5_100_200/2’. The trailing ‘/[12]’ is usually generated by the GAPipeline to distinguish the two ends in a pair.
* The output is a compressed binary file. It is affected by the endianness.
* The best way to run this command is to provide about 1 to 3 million reads as input. More reads consume more memory.
* Option -n controls the sensitivity of the alignment. By default, a hit with up to 2 mismatches can always be found. A higher -n finds more hits and also improves the accuracy of mapping qualities. However, this is done at the cost of speed.
* Alignments with many high-quality mismatches should be discarded as false alignments or possible contaminations. This behaviour is controlled by option -e. The -e threshold is only calculated approximately because base qualities are divided by 10 at a certain stage of the alignment. The -Q option in the assemble command sets the threshold precisely.
* A pair of reads is said to be correctly paired if and only if the orientation is FR and the outer distance of the pair is no larger than maxins. There is no limit on the minimum insert size. This setting is determined by the paired end alignment algorithm used in Maq. Requiring a minimum insert size will lead to some wrong alignments with highly overestimated mapping qualities.
* Currently, read pairs from Illumina/Solexa long-insert libraries have RF read orientation. The maximum insert size is set by option -A. However, a long-insert library is also mixed with a small fraction of short-insert read pairs, so -a should also be set correctly.
* Sometimes the 5’-end or even the entire 3’-adapter sequence may be sequenced. Providing -d makes Maq eliminate the adapter contamination.
* Given 2 million reads as input, maq usually takes 800 MB of memory.
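Two of the notes above, the read-name convention for mates and the definition of a correct pair, can be expressed as small checks. The following is a minimal Python sketch, not Maq code: the function names are hypothetical, and the default of 250 simply mirrors the documented default of -a.

```python
def mate_names_match(name1: str, name2: str) -> bool:
    """True if both read names are identical once a trailing /1 or /2 is dropped."""
    def strip_end(name: str) -> str:
        return name[:-2] if name.endswith(("/1", "/2")) else name
    return strip_end(name1) == strip_end(name2)


def is_correct_pair(orientation: str, outer_distance: int, maxins: int = 250) -> bool:
    """FR orientation and outer distance no larger than maxins.

    There is deliberately no lower bound on the insert size.
    """
    return orientation == "FR" and outer_distance <= maxins
```

For instance, `mate_names_match("EAS1_1_5_100_200/1", "EAS1_1_5_100_200/2")` holds, while an RF pair is never reported as correct by `is_correct_pair`, whatever its distance.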
mapmerge
maq mapmerge out.aln.map in.aln1.map in.aln2.map [...]
Merge a batch of read alignments together.
NOTE:
* In theory, this command can merge an unlimited number of alignments. However, as mapmerge reads all the inputs at the same time, it may hit the limit on the maximum number of open files set by the OS. At present, this has to be solved manually by end users.
* Command mapmerge can be used to merge alignment files with different read lengths. The subsequent analyses no longer assume a fixed read length.
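One manual workaround for the open-file limit is to merge in rounds, so that no single mapmerge call reads more than a chosen number of inputs. The sketch below only plans the command lines and runs nothing. It is an illustration in Python: the function name and the intermediate file names (tmp.roundR.batchB.map) are hypothetical, and `max_inputs` should be chosen safely below the limit reported by, for example, `ulimit -n`.

```python
from typing import List


def plan_mapmerge(inputs: List[str], final_out: str, max_inputs: int) -> List[List[str]]:
    """Plan 'maq mapmerge' calls so each call reads at most max_inputs files."""
    if max_inputs < 2 or len(inputs) < 2:
        raise ValueError("need at least two inputs and max_inputs >= 2")
    commands = []
    current = list(inputs)
    round_no = 0
    # Keep merging in batches until the remaining files fit into one call.
    while len(current) > max_inputs:
        next_round = []
        for start in range(0, len(current), max_inputs):
            batch = current[start:start + max_inputs]
            if len(batch) == 1:
                # A lone leftover file is carried over unchanged.
                next_round.append(batch[0])
                continue
            out = f"tmp.round{round_no}.batch{start // max_inputs}.map"
            commands.append(["maq", "mapmerge", out, *batch])
            next_round.append(out)
        current = next_round
        round_no += 1
    commands.append(["maq", "mapmerge", final_out, *current])
    return commands
```

Each returned list can be passed to `subprocess.run` in order; the intermediate files can be deleted once the final merge has finished.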
rmdup
maq rmdup out.rmdup.map in.ori.map
Remove pairs with identical outer coordinates. In principle, pairs with identical outer coordinates should happen rarely. However, due to the amplification in sample preparation, this occurs much more frequently than by chance. Practical analyses show that removing duplicates helps to improve the overall accuracy of SNP calling.
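The idea of duplicate removal can be illustrated with a short Python sketch. It is not Maq's implementation: the `Pair` record is hypothetical, and the text above does not say which pair of a duplicate set is kept, so keeping the one with the highest mapping quality is an assumption made here.

```python
from typing import Iterable, List, NamedTuple


class Pair(NamedTuple):
    name: str
    chrom: str
    outer_start: int
    outer_end: int
    mapq: int


def remove_duplicate_pairs(pairs: Iterable[Pair]) -> List[Pair]:
    """Keep one pair per (chromosome, outer start, outer end)."""
    best = {}
    for pair in pairs:
        key = (pair.chrom, pair.outer_start, pair.outer_end)
        # Assumption: prefer the pair with the highest mapping quality.
        if key not in best or pair.mapq > best[key].mapq:
            best[key] = pair
    return list(best.values())
```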
assemble
maq assemble [-sp] [-m maxmis] [-Q maxerr] [-r hetrate] [-t coef] [-q minQ] [-N nHap] out.cns in.ref.bfa in.aln.map 2> out.cns.log
Call the consensus sequences from read mapping.
OPTIONS:
-t FLOAT
Error dependency coefficient [0.93]
-r FLOAT
Fraction of heterozygotes among all sites [0.001]
-s
Take single end mapping quality as the final mapping quality; otherwise paired end mapping quality will be used
-p
Discard paired end reads that are not mapped in correct pairs
-m INT
Maximum number of mismatches allowed for a read to be used in consensus calling [7]
-Q INT
Maximum allowed sum of quality values of mismatched bases [60]
-q INT
Minimum mapping quality allowed for a read to be used in consensus calling [0]
-N INT
Number of haplotypes in the pool (>=2) [2]
NOTE:
* Option -Q sets a limit on the maximum sum of mismatching base qualities. Reads containing many high-quality mismatches should be discarded.
* Option -N sets the number of haplotypes in a pool. It is designed for resequencing of samples by pooling multiple strains/individuals together. For diploid genome resequencing, this option equals 2.
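The read-level thresholds -m, -Q and -q can be read together as one predicate. The Python sketch below is only an interpretation of the option descriptions above: the function name is hypothetical, the defaults mirror the documented ones, and treating all three limits as inclusive is an assumption.

```python
def usable_for_consensus(n_mismatches: int, mismatch_qual_sum: int, mapping_quality: int,
                         maxmis: int = 7, maxerr: int = 60, min_q: int = 0) -> bool:
    """True if a read passes the -m, -Q and -q thresholds (all assumed inclusive)."""
    return (n_mismatches <= maxmis
            and mismatch_qual_sum <= maxerr
            and mapping_quality >= min_q)
```

With the defaults, a read with 2 mismatches whose qualities sum to 75 is rejected by the -Q limit even though it is well within the -m limit.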
indelpe
maq indelpe in.ref.bfa in.aln.map > out.indelpe
Call consistent indels from paired end reads. The output is TAB delimited with each line consisting of chromosome, start position, type of the indel, number of reads across the indel, size of the indel and inserted/deleted nucleotides (separated by colon), number of indels on the reverse strand, number of indels on the forward strand, 5’ sequence ahead of the indel, 3’ sequence following the indel, number of reads aligned without indels and three additional columns for filters.
In the 3rd column, type of the indel, a star indicates the indel is confirmed by reads from both strands, a plus means the indel is hit by at least two reads but from the same strand, a minus shows the indel is only found on one read, and a dot means the indel is too close to another indel and is filtered out.
Users are recommended to run through ‘maq.pl indelpe’ to correct the number of reads mapped without indels. For more details, see the ‘maq.pl indelpe’ section.
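The column layout described above can be turned into a small parser. This is a Python sketch under stated assumptions: the record and field names are hypothetical, the numeric columns are assumed to be integers, and the indel size is assumed to parse as a (possibly signed) integer before the colon. The sample line in the usage note is made up for illustration.

```python
from typing import NamedTuple, Tuple

# Meaning of the symbols in the 3rd column, as described in the text.
INDEL_TYPE_MEANING = {
    "*": "confirmed by reads from both strands",
    "+": "hit by at least two reads, all from the same strand",
    "-": "found on one read only",
    ".": "too close to another indel; filtered out",
}


class IndelPE(NamedTuple):
    chrom: str
    start: int
    indel_type: str
    reads_across: int
    size: int
    bases: str
    n_reverse: int
    n_forward: int
    seq5: str
    seq3: str
    reads_without_indel: int
    filters: Tuple[str, ...]


def parse_indelpe_line(line: str) -> IndelPE:
    """Split one TAB-delimited indelpe line into named fields."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 13:
        raise ValueError(f"expected 13 columns, got {len(f)}")
    size_text, bases = f[4].split(":", 1)  # "size:nucleotides"
    return IndelPE(f[0], int(f[1]), f[2], int(f[3]), int(size_text), bases,
                   int(f[5]), int(f[6]), f[7], f[8], int(f[9]), tuple(f[10:13]))
```

For example, parsing the made-up line `"chr1\t1000\t*\t12\t-2:AG\t3\t4\tACGT\tTTGA\t5\t0\t0\t0"` gives a record whose `indel_type` is `"*"`, which `INDEL_TYPE_MEANING` maps to confirmation from both strands.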
indelsoa
maq indelsoa in.ref.bfa in.aln.map > out.indelsoa
Call potential homozygous indels and break points by detecting the abnormal alignment pattern around indels and break points. The output is also TAB delimited with each line consisting of chromosome, approximate coordinate, length of the abnormal region, number of reads mapped across the position, number of reads on the left-hand side of the position and number of reads on the right-hand side. The last column can be ignored.
The output contains many false positives. A recommended filter could be:
awk '$5+$6-$4 >= 3 && $4 <= 1' in.indelsoa
Note that this command does not aim to be an accurate indel detector, but mainly helps to avoid some false positives in substitution calling. In addition, it only works well given deep depth (~40X for example); otherwise the false negative rate would be very high.
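For pipelines that post-process indelsoa output outside the shell, the recommended awk filter can be mirrored in Python. The sketch below assumes the count columns are integers; the function name and the sample lines in the usage note are hypothetical.

```python
def passes_indelsoa_filter(line: str) -> bool:
    """Mirror of the awk test '$5+$6-$4 >= 3 && $4 <= 1' on one output line."""
    f = line.split()  # like awk, split on runs of whitespace
    across, left, right = int(f[3]), int(f[4]), int(f[5])
    return left + right - across >= 3 and across <= 1
```

A line such as `"chr1\t500\t3\t0\t4\t2\t0"` passes (no read spans the position, six reads flank it), whereas a line with two or more reads mapped across the position is dropped.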
Format Converting
sol2sanger
maq sol2sanger in.sol.fastq out.sanger.fastq
Convert Solexa FASTQ to standard/Sanger FASTQ format.
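The two FASTQ variants differ in how quality characters are encoded. As background, the Python sketch below applies the standard definitions: Solexa qualities are stored with an ASCII offset of 64, Sanger (Phred) qualities with an offset of 33, and the two scales are related by Q_phred = 10 log10(10^(Q_solexa/10) + 1). The function names are hypothetical, and rounding to the nearest integer is an assumption that may not match Maq's own rounding in every case.

```python
import math


def solexa_to_phred(q_solexa: int) -> int:
    """Convert one Solexa-scaled quality to the Phred scale (nearest integer)."""
    return round(10 * math.log10(10 ** (q_solexa / 10) + 1))


def solexa_qual_to_sanger(qual: str) -> str:
    """Re-encode a Solexa FASTQ quality string (offset 64) as Sanger (offset 33)."""
    return "".join(chr(solexa_to_phred(ord(c) - 64) + 33) for c in qual)
```

The two scales agree closely at high quality and diverge only for low values, where Solexa qualities can be negative.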
bfq2fastq
maq bfq2fastq in.read.bfq out.read.fastq
Convert Maq’s BFQ format to standard FASTQ format.
mapass2maq
maq mapass2maq in.mapass2.map out.maq.map
Convert obsolete mapass2’s map format to Maq’s map format. The old format does not contain read names.
Information Extracting
mapview
maq mapview [-bN] in.aln.map > out.aln.txt
Display the read alignment in plain text. For reads aligned before the Smith-Waterman alignment, each line consists of read name, chromosome, position, strand, insert size from the outer coordinates of a pair, paired flag, mapping quality, single-end mapping quality, alternative mapping quality, number of mismatches of the best hit, sum of qualities of mismatched bases of the best hit, number of 0-mismatch hits of the first 24bp, number of 1-mismatch hits of the first 24bp on the reference, length of the read, read sequence and its quality. Alternative mapping quality always equals mapping quality if the reads are not paired. If reads are paired, it equals the smaller mapping quality of the two ends. This alternative mapping quality is actually the mapping quality of an abnormal pair.
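The sixteen columns listed above map naturally onto a named record. The following Python sketch assumes the fields are TAB separated and that the strand is written as a single character such as + or -, neither of which is stated above; the record and field names are hypothetical.

```python
from typing import NamedTuple


class MapviewRecord(NamedTuple):
    read_name: str
    chrom: str
    position: int
    strand: str
    insert_size: int
    paired_flag: int
    mapq: int
    se_mapq: int
    alt_mapq: int
    n_mismatches: int
    mismatch_qual_sum: int
    hits0_24bp: int
    hits1_24bp: int
    read_length: int
    sequence: str
    quality: str


def parse_mapview_line(line: str) -> MapviewRecord:
    """Split one mapview line (assumed TAB separated) into named fields."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 16:
        raise ValueError(f"expected 16 columns, got {len(f)}")
    return MapviewRecord(f[0], f[1], int(f[2]), f[3], int(f[4]), int(f[5]),
                         *map(int, f[6:14]), f[14], f[15])
```

A typical use is to stream out.aln.txt line by line and keep only records above a mapping-quality cut-off, for example `rec.mapq >= 30`.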
The sixth column, paired flag, is a bitwise flag. Its lower 4 bits give the orientation: 1 stands for FF, 2 for FR, 4 for RF, and 8 for RR, where FR means that the read with smaller coordinate is on the forward strand, and its mate is on the reverse strand. Only FR is allowed for a correct pair. The higher bits of this flag give further information. If the pair meets the paired end requirement, 16 will be set. If the two reads are mapped to different chromosomes, 32 will be set. If one of the two reads cannot be mapped at all, 64 will be set. The flag for a correct pair always equals 18.
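The bit layout of the paired flag can be decoded with a few mask operations. This Python sketch follows the values given above; the function name and the dictionary keys are hypothetical.

```python
ORIENTATION_BITS = {1: "FF", 2: "FR", 4: "RF", 8: "RR"}


def decode_paired_flag(flag: int) -> dict:
    """Decode the mapview paired flag into its documented components."""
    return {
        # Lower 4 bits: orientation (None if no single orientation bit is set).
        "orientation": ORIENTATION_BITS.get(flag & 0x0F),
        "meets_pair_requirement": bool(flag & 16),
        "different_chromosomes": bool(flag & 32),
        "one_read_unmapped": bool(flag & 64),
    }
```

Decoding 18 (16 + 2) yields the FR orientation with the paired end requirement met, which matches the statement that a correct pair always has flag 18.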
For reads aligned by the Smith-Waterman alignment afterwards,