choice. Nucleic Acids Research, 22:4673-4680.
--------------------------------------------------------------
What's New (March 1996) in Version 1.6 (since version 1.5).

1) Improved handling of sequences of unequal length. Previously, we
increased the gap extension penalties for both sequences if the two sequences
(or groups of previously aligned sequences) were of different lengths.
Now, we increase the gap opening and extension penalties for the shorter
sequence only. This helps prevent short sequences from being stretched out
along longer ones.
2) Added the "Gonnet" series of weight matrices (from Gaston Gonnet and
co-workers at the ETH in Zurich). Fixed a bug in the matrix
choice menu; PAM matrices can now be selected correctly.
3) Added secondary structure/gap penalty masks. These allow you to
include, in an alignment, a position-specific set of gap penalties.
You can either set a gap opening penalty at each position or specify
the secondary structure (if protein; alpha helix, beta strand or loop)
and have gap penalties set automatically. This is used to make
gaps harder to open inside helices or strands.

These masks are only used in the "profile alignment" menu. They may be read in
as part of an alignment in a special format (see the on-line help for
details) or associated with each sequence, if the sequences are in SwissProt
format and secondary structure information is given. All of the mask
parameters can be set from the profile alignment menu. The
mask is made up of a series of numbers between 1 and 9, one per position.
The gap opening penalty at a position is calculated as the starting penalty
multiplied by the mask value at that site.
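
The mask rule described above (penalty at a position = starting penalty times the mask value at that position) can be illustrated with a minimal Python sketch. This is not the ClustalW source code: the function name, the starting penalty of 10.0 and the example mask are assumptions chosen only for illustration.

```python
def masked_gap_open_penalties(start_penalty, mask):
    """Return one gap opening penalty per alignment position.

    start_penalty: the starting gap opening penalty (hypothetical value).
    mask: one integer between 1 and 9 per position, as described in the text.
    """
    for value in mask:
        if not 1 <= value <= 9:
            raise ValueError("mask values must be between 1 and 9")
    # Penalty at each site = starting penalty * mask value at that site.
    return [start_penalty * value for value in mask]


# Hypothetical mask: positions 3 to 5 lie inside a helix, so gaps there
# are made four times harder to open than in the flanking loop positions.
example_mask = [1, 1, 4, 4, 4, 1]
example_penalties = masked_gap_open_penalties(10.0, example_mask)
```

A mask value of 1 leaves the starting penalty unchanged, so only positions with larger values become harder to gap.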
4) Added command line options /profile and /sequences.

These allow users to choose between normal profile alignment, where the
two profiles (pre-existing alignments specified in the files
/profile1= and /profile2=) are merged/aligned with each other (/profile),
and the case where the individual sequences in /profile2 are aligned
sequentially with the alignment in /profile1 (/sequences).
5) Fixed bug in modified Myers and Miller algorithm - gap penalty score
was not always calculated properly for type 2 midpoints. This is the core
alignment algorithm.
6) Only allows one output file format to be selected from the command line
- i.e. multiple output alignment files are not allowed.
7) Fixed 'bad calls to ckfree' error during calculation of phylip distance
matrix.
8) Fixed command line options /gapopen /gapext /type=protein /negative.
9) Allowed user to change command line separator on UNIX from '/' to '-'.
This allows unix users to use the more conventional '-' symbol
for separating command line options. "/" can then be used in unix
filenames on the command line. The symbol that is used
is specified in the file clustalw.h, which must be edited if you
wish to change it (and the program must then be recompiled). Find the
block of code in clustalw.h that corresponds to the operating system you
are using. These blocks are started by one of the following:

#ifdef VMS
#elif MAC
#elif MSDOS
#elif UNIX

On the next line after each is the line:

#define COMMANDSEP '/'

Change this in the appropriate block of code (e.g. the UNIX block) to

#define COMMANDSEP '-'

if you wish to use the "-" character as command separator.
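
To show why the choice of separator matters, here is a small Python sketch that assembles option strings with a configurable separator. The helper and the file names are hypothetical; only the option names (profile1, profile2, sequences) come from the notes above, and the sketch makes no claim about how ClustalW itself parses its command line.

```python
def build_args(options, sep="/"):
    """Turn (name, value) pairs into ClustalW-style option strings.

    A value of None produces a bare switch such as /sequences.
    sep is the command separator, '/' by default or '-' after recompiling
    with the changed COMMANDSEP definition.
    """
    if sep not in ("/", "-"):
        raise ValueError("separator must be '/' or '-'")
    args = []
    for name, value in options:
        if value is None:
            args.append(f"{sep}{name}")
        else:
            args.append(f"{sep}{name}={value}")
    return args


# Hypothetical file names containing '/', which is what the '-' separator
# is meant to make possible on unix.
example_options = [
    ("profile1", "aligned/globins.aln"),
    ("profile2", "new/seqs.aln"),
    ("sequences", None),
]
example_args = build_args(example_options, sep="-")
# example_args == ['-profile1=aligned/globins.aln',
#                  '-profile2=new/seqs.aln', '-sequences']
```

With the default '/' separator the same file names would contain the separator character, which is the ambiguity the change is intended to avoid.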
What's New (April 1995) in Version 1.5 (since version 1.3).

1) Ported to MAC and PC. These versions are quite slow unless you
have a nice beefy machine. On a Power Mac or a Pentium box
it is nice and fast. Two precompiled versions are supplied for Macs
(Power Mac and old Mac versions).

Mac:        1500 residues by 100 sequences
Power Mac   3000     "     "   "      "
PC          1500     "     "   "      "
2) Alignment of new sequences to an alignment. Fixed a serious bug
which assigned weights to the wrong sequences. Sequences are now also
weighted according to their distance from the incoming sequence. The
new weights are: tree weights * similarity to incoming sequence.
The tree weights are the old weights that we derive from the tree
connecting all the sequences in the existing alignment.
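
The weighting rule above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the numbers are invented, and expressing similarity as a fraction between 0 and 1 is an assumption that the notes do not confirm.

```python
def reweight_for_incoming(tree_weights, similarity_to_incoming):
    """Combine tree weights with similarity to the incoming sequence.

    tree_weights: one weight per sequence in the existing alignment,
        derived from the tree connecting those sequences.
    similarity_to_incoming: one similarity score per sequence, measured
        against the new sequence (assumed here to be a fraction in [0, 1]).
    """
    if len(tree_weights) != len(similarity_to_incoming):
        raise ValueError("need one similarity score per tree weight")
    # New weight = tree weight * similarity to the incoming sequence.
    return [w * s for w, s in zip(tree_weights, similarity_to_incoming)]


# Hypothetical alignment of three sequences.
example_weights = reweight_for_incoming([0.5, 0.25, 0.25], [1.0, 0.5, 0.25])
```

Under this rule, sequences that are close to the incoming sequence keep most of their tree weight, while distant ones contribute less.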
3) For all platforms, output line length = 60.
4) Bootstrap files (*.phb): the "final" node (arbitrary trichotomy
at the end of the neighbor-joining process) is labelled as
TRICHOTOMY in the bootstrap output files. This is to help
link bootstrap figures with nodes when you reroot the tree.
5) Command line /bootstrap option now more robust.
INTRODUCTION

This document gives some BRIEF notes about usage of the Clustal W
multiple alignment program for UNIX and VMS machines. Clustal W
is a major update and rewrite of the Clustal V program which
was described in:

Higgins, D.G., Bleasby, A.J. and Fuchs, R. (1992)
CLUSTAL V: improved software for multiple sequence alignment.
Computer Applications in the Biosciences (CABIOS), 8(2):189-191.

The main new features are a greatly improved (more sensitive)
multiple alignment procedure for proteins and improved support
for different file formats. This software was described in:

Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994)
CLUSTAL W: improving the sensitivity of progressive multiple
sequence alignment through sequence weighting, position specific
gap penalties and weight matrix choice.
Nucleic Acids Research, 22(22):4673-4680.

The usage of Clustal W is largely the same as for
Clustal V, details of which are described in clustalv.doc. Details of the
new alignment algorithms are described in the manuscript by
Thompson et al. above, an ascii/text version of which is included
(clustalw.ms). This file lists some of the details not covered by either
of the above documents.
There are brief notes on the following topics:

1) Installation for VMS and UNIX and MAC and PC
2) File input
3) File output
4) Changes to the alignment algorithms
5) Minor modifications to the phylogenetic tree and bootstrapping methods
6) Summary of the command line usage.
-------------------------------------------------------------------
1) INSTALLATION (for Unix, VAX/VMS, PC and MAC)

***** IMPORTANT *****

If you wish to recompile the program (or compile it for the first
time; you will have to do this with UNIX or VAX):
first check the file CLUSTALW.H, which needs to be changed if you
move the code between unix and vms machines. At the top
of the file are four lines which define one of VMS, MSDOS, MAC or
UNIX to be 1. All of these EXCEPT one must be commented out
by enclosing them in /* ... */.

*******************
Unix
-----

Makefiles are supplied for unix machines. The code was compiled and
tested using Decstation (Ultrix), SUN (Gnu C compiler/gcc), Silicon
Graphics (IRIX) and DEC/Alpha (OSF1). We have not tested the code on any other
systems. Just use makefile to make on most systems. For Sun, you need to
have the Gnu C (gcc) compiler installed; use the file makefile.sun in this
case. You make the program with:

make      (or make -f makefile.sun)

This produces the file clustalw, which can be run by typing clustalw and
pressing return. The help file is called clustalw_help
VMS
----

There is a small DCL command file (VMSLINK.COM) to compile and link the
code for VMS machines (vax or alpha). This procedure just compiles the
source files and links using default settings. Run it using:

$ @vmslink

This produces Clustalw.exe, which can be run using the run command:

$ run clustalw

The intermediate object files can be deleted with:

$ del *.obj;

There is an extensive command line facility. To use this, you must
create a symbol to run the program (and put this in your login.com file).
e.g.

$ clustalw :== $$drive:[dir.dir]clustalw

where $drive is the drive on which the executable file is stored (clustalw.exe)
and [dir.dir] is the full directory specification. NOTE THE EXTRA DOLLAR SIGN.

Then the program can be run using the command:

$ clustalw

The help file is called clustalw.hlp; this must be defined to be
clustalw_help using the command:

$ define clustalw_help $drive:[dir.dir]clustalw.hlp

where $drive is the drive name and [dir.dir] is the name of the
directory where the help file is stored.
PC
--

We supply two executable files (Clustalw.exe and Clwbig.exe) which will run
using MSDOS. They will also run under Windows (as a DOS application)
*** IF you have a maths coprocessor ***. If you do not have a maths chip
(e.g. 80387), the program can only be run under MSDOS. In the latter case,
you must have the file EMU387.exe in the same directory as CLUSTALW.EXE.
This file emulates a maths chip if you do not have one.

We generated these executable files using gnu c for MSDOS.
It will also compile (with about 10,000 warning messages)
using Microsoft C, but we have not tested it and there appear to be problems
with the executable.

You will need to use a "memory extender" to allow the program to get at more
than 640 kb of memory.

Clustalw.exe:  up to 100 sequences of max. length 1500 residues (including GAPS)
Clwbig.exe:    up to 150 sequences of max. length 2600 residues (including GAPS)
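
As a quick way to decide which of the two PC executables a data set needs, the limits quoted above can be put into a small lookup. The following Python sketch is a convenience illustration only; the function name is hypothetical and nothing like it ships with the program.

```python
# (maximum number of sequences, maximum length including gaps),
# copied from the limits quoted in the text above.
PC_LIMITS = {
    "Clustalw.exe": (100, 1500),
    "Clwbig.exe": (150, 2600),
}


def pick_pc_executable(n_sequences, max_length):
    """Return the first executable whose limits fit, or None.

    max_length is the longest sequence length, counting gap characters.
    The smaller executable is listed first, so it is preferred when
    both would fit.
    """
    for name, (max_seqs, max_len) in PC_LIMITS.items():
        if n_sequences <= max_seqs and max_length <= max_len:
            return name
    return None


choice_small = pick_pc_executable(80, 1200)
choice_large = pick_pc_executable(120, 1200)
choice_none = pick_pc_executable(200, 1000)
```

A result of None means the data set exceeds the limits of both supplied executables.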
MAC
---
The code comp