An Introductory Tutorial on Protein Simulation with GROMACS
GROMACS Tutorial
Lysozyme in Water
Justin Lemkul
Department of Biochemistry, Virginia Tech
This example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will contain an explanation of input and output, using typical settings for general use.
This tutorial assumes you are using a GROMACS version in the series.
Step One: Prepare the Topology
We must download the protein structure file we will be working with. For this tutorial, we will utilize hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.
Once you have downloaded the structure, you can visualize it using a viewing program such as VMD, Chimera, PyMOL, etc. Once you've had a look at the molecule, you are going to want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Windows). Do not use word processing software!
Delete the lines corresponding to these molecules (residue "HOH" in the PDB file). Note that such a procedure is not universally appropriate (e.g., the case of a bound active site water molecule). For our intentions here, we do not need crystal water.
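The manual deletion described above can also be scripted. The following is a minimal sketch, assuming the standard fixed-column PDB layout in which the residue name occupies columns 18-20; the function name and the two sample records (including their coordinates) are made up for illustration and are not taken from 1AKI.

```python
def strip_crystal_waters(pdb_text: str) -> str:
    """Return PDB text with every ATOM/HETATM record of residue HOH removed."""
    kept = []
    for line in pdb_text.splitlines():
        is_coordinate_record = line.startswith(("ATOM", "HETATM"))
        # Columns 18-20 (0-based slice 17:20) hold the residue name.
        if is_coordinate_record and line[17:20] == "HOH":
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical sample records, for illustration only.
sample = (
    "ATOM      1  N   LYS A   1      35.365  22.342 -11.980  1.00 22.28           N\n"
    "HETATM 1002  O   HOH A 130      43.245  12.000   5.000  1.00 30.00           O\n"
    "END\n"
)
print(strip_crystal_waters(sample))
```

As noted above, removing every water is not always appropriate; a structurally important water would have to be excluded from this filter by hand.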
Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be absent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topologies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).
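The MISSING check can be automated with a few lines. This is only a sketch: it assumes that missing residues and atoms are reported on REMARK lines containing the word MISSING (in the PDB format these are REMARK 465 and REMARK 470), and the function name is hypothetical.

```python
from typing import List


def find_missing_remarks(pdb_text: str) -> List[str]:
    """Return every REMARK line that mentions MISSING residues or atoms."""
    return [
        line.rstrip()
        for line in pdb_text.splitlines()
        if line.startswith("REMARK") and "MISSING" in line
    ]


# Hypothetical header fragment, for illustration only.
header = "REMARK 465 MISSING RESIDUES\nATOM      1  N   LYS A   1\n"
for remark in find_missing_remarks(header):
    print(remark)
```

An empty result suggests the file carries no such remarks; it does not prove the structure is complete, so a visual inspection is still worthwhile.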
Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB file should contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:
1. The topology for the molecule.
2. A position restraint file.
3. A post-processed structure file.
The topology contains all the information necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.
Execute pdb2gmx by issuing the following command:
pdb2gmx -f -o -water spce
The structure will be processed by pdb2gmx, and you will be prompted to choose a force field:
Select the Force Field:
From '/usr/local/gromacs/share/gromacs/top':
 1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
 2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
 3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
 4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
 5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006)
 6: AMBER99SB-ILDN force field (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
 7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
 8: CHARMM27 all-atom force field (with CMAP) - version
 9: GROMOS96 43a1 force field
10: GROMOS96 43a2 force field (improved alkane dihedrals)
11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
15: [DEPRECATED] Encad all-atom force field, using full solvent charges
16: [DEPRECATED] Encad all-atom force field, using scaled-down vacuum charges
17: [DEPRECATED] Gromacs force field (see manual)
18: [DEPRECATED] Gromacs force field with hydrogens for NMR
The force field will contain the information that will be written to the topology. This is a very important choice!
You should always read thoroughly about each force field and decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by 'Enter'.
There are many other options that can be passed to pdb2gmx. Some are listed here:
-ignh: Ignore H atoms in the PDB file; especially useful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.
-ter: Interactively assign charge states for N- and C-termini.
-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides to Cys.
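When pdb2gmx is driven from a script, it can help to assemble the argument list in one place. The sketch below only builds the list and does not run GROMACS. The function name and the file names in the example call are hypothetical; -f, -o, -water, -ignh, -ter and -inter are the options discussed above, and -ff is the pdb2gmx option that selects the force field by name instead of through the interactive menu (the value "oplsaa" is assumed here to correspond to the OPLS-AA/L entry).

```python
from typing import List, Optional


def build_pdb2gmx_command(
    structure_in: str,
    structure_out: str,
    water: str = "spce",
    force_field: Optional[str] = None,
    ignore_hydrogens: bool = False,
    interactive_termini: bool = False,
    interactive_residues: bool = False,
) -> List[str]:
    """Assemble a pdb2gmx argument list without executing it."""
    command = ["pdb2gmx", "-f", structure_in, "-o", structure_out, "-water", water]
    if force_field is not None:
        command += ["-ff", force_field]  # skips the interactive force field menu
    if ignore_hydrogens:
        command.append("-ignh")
    if interactive_termini:
        command.append("-ter")
    if interactive_residues:
        command.append("-inter")
    return command


# Hypothetical file names, for illustration only.
print(" ".join(build_pdb2gmx_command(
    "protein_clean.pdb", "protein_processed.gro", force_field="oplsaa")))
```

On a machine with GROMACS installed, the returned list could be passed to subprocess.run(command, check=True). In GROMACS 5.0 and later the tools are invoked through the gmx wrapper (gmx pdb2gmx), so the first element would need adjusting there.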
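You have now generated three new files: a structure file, a topology file, and a position restraint file. The structure file is GROMACS-formatted and contains all the atoms defined within the force field (i.e., H atoms have been added to the amino acids in the protein). The topology file describes the system (more on this in a minute). The position restraint file contains information used to restrain the positions of heavy atoms (more on this later).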
Step Two: Examine the Topology
Let's look at what is in the output topology. Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:
#include ""
This line calls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is [ moleculetype ], below which you will find
; Name nrexcl
Protein_A 3
The name "Protein_A" defines the molecule name, based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.
The next section defines the [ atoms ] in the protein. The information is presented as columns:
[ atoms ]
; nr type resnr residue atom cgnr charge mass typeB chargeB massB
; residue 1 LYS rtp LYSH q +
1 opls_287 1 LYS N 1 ; qtot
2 opls_290 1 LYS H1 1 ; qtot
3 opls_290 1 LYS H2 1 ; qtot
4 opls_290 1 LYS H3 1 ; qtot
5 opls_293B 1 LYS CA 1 ; qtot
6 opls_140 1 LYS HA 1 ; qtot 1
The interpretation of this information is as follows:
nr: Atom number
type: Atom type
resnr: Amino acid residue number
residue: The amino acid residue name. Note that this residue was "LYS" in the PDB file; the use of .rtp entry "LYSH" indicates that the residue is protonated (the predominant state at neutral pH).
atom: Atom name
cgnr: Charge group number. Charge groups define units of integer charge; they aid in speeding up calculations.
charge: Self-explanatory. The "qtot" descriptor keeps a running count of the total charge on the molecule.
mass: Also self-explanatory
typeB, chargeB, massB: Used for free energy perturbation (not discussed here)
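The running "qtot" count can be reproduced from the charge column, which is a useful sanity check on a topology. The sketch below assumes the column order listed above (charge is the seventh field); the function name is hypothetical, and the charge and mass values in the sample rows are illustrative numbers supplied for this example, since those columns are not legible in the listing above.

```python
from typing import List, Tuple


def running_charge(atom_lines: List[str]) -> List[Tuple[str, float]]:
    """Return (atom name, cumulative charge) for each row of an [ atoms ] block."""
    totals = []
    qtot = 0.0
    for line in atom_lines:
        fields = line.split(";", 1)[0].split()  # drop any trailing comment
        if len(fields) < 7:
            continue  # directive header, comment-only line, or blank line
        qtot += float(fields[6])  # seventh column is the charge
        totals.append((fields[4], round(qtot, 4)))
    return totals


# Illustrative rows: charges and masses here are assumed sample values.
rows = [
    "[ atoms ]",
    "; nr type resnr residue atom cgnr charge mass",
    "1 opls_287  1 LYS N  1 -0.30 14.0067",
    "2 opls_290  1 LYS H1 1  0.33  1.008",
    "3 opls_290  1 LYS H2 1  0.33  1.008",
    "4 opls_290  1 LYS H3 1  0.33  1.008",
    "5 opls_293B 1 LYS CA 1  0.25 12.011",
    "6 opls_140  1 LYS HA 1  0.06  1.008",
]
for name, qtot in running_charge(rows):
    print(name, qtot)
```

With these sample values the first charge group sums to an integer charge at HA, which agrees with the "qtot 1" entry on the sixth row above.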
Subsequent sections include [ bonds ], [ pairs ], [ angles ], and [ dihedrals ]. Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chapter 5 of the GROMACS manual. Special 1-4 interactions are included under "pairs" (see the GROMACS manual).
The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The position restraint file was generated by pdb2gmx; it defines a force constant used to keep atoms in place during equilibration (more on this later).
; Include Position restraint file
#ifdef POSRES
#include ""
#endif
This ends the "Protein_A" moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing "-water spce" to pdb2gmx. Many other water models have been published, but be aware that not all of them are present within GROMACS.
; Include water topology
#include ""
#ifdef POSRES_WATER
; Position restraint for each water oxygen
[ position_restraints ]
; i funct fcx fcy fcz
1 1 1000 1000 1000
#endif
As you can see, water can also be position-restrained, using a force constant (k_pr) of 1000 kJ mol^-1 nm^-2.
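To get a feel for what this force constant means, the restraint energy can be estimated by hand. The sketch below assumes the usual harmonic form for position restraints, V = 1/2 k_pr (x - x0)^2 summed over the three axes, with positions in nm and the fcx, fcy, fcz values from the block above; the function name is hypothetical.

```python
from typing import Sequence


def position_restraint_energy(
    position: Sequence[float],
    reference: Sequence[float],
    force_constants: Sequence[float] = (1000.0, 1000.0, 1000.0),
) -> float:
    """Harmonic restraint energy in kJ/mol for one atom (positions in nm)."""
    return sum(
        0.5 * k * (x - x0) ** 2
        for x, x0, k in zip(position, reference, force_constants)
    )


# Energy penalty for a water oxygen displaced 0.1 nm along x from its reference.
print(position_restraint_energy((0.1, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

Under this assumption, a displacement of 0.1 nm along one axis costs about 5 kJ/mol, roughly twice the thermal energy kT at room temperature, so restrained atoms can still fluctuate slightly around their reference positions.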
Ion parameters are included next:
; Include generic topology for ions
#include ""
Finally come system-level definitions. The [ system ] directive gives the name of the system that will be written to output files during the simulation. The [ molecules ] directive lists all of the molecules in the system.
[ system ]
; Name
LYSOZYME
[ molecules ]
; Compound #mols
Protein_A 1
A few key notes about the [ molecules ] directive:
1. The order of the listed molecules must exactly match the order of the molecules in the coordinate (in this case, .gro) file.
2. The names listed must match the [ moleculetype ] name for each species, not residue names or anything else.
If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.
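The second requirement can be checked before grompp is ever run. The following is a rough sketch, not a full topology parser: it reads only the text it is given, ignores preprocessor lines such as #include and #ifdef, and therefore cannot see any [ moleculetype ] defined in an included file (water and ions, for example, would be reported as undefined). Both function names are hypothetical.

```python
import re
from typing import Dict, List


def read_directives(topology_text: str) -> Dict[str, List[List[str]]]:
    """Group the non-comment rows of a topology by the directive they follow."""
    sections: Dict[str, List[List[str]]] = {}
    current = None
    for raw in topology_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # strip comments
        if not line or line.startswith("#"):
            continue  # blank line or preprocessor statement
        header = re.fullmatch(r"\[\s*(\w+)\s*\]", line)
        if header:
            current = header.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line.split())
    return sections


def undefined_molecules(topology_text: str) -> List[str]:
    """List [ molecules ] names with no matching [ moleculetype ] in this text."""
    sections = read_directives(topology_text)
    defined = {row[0] for row in sections.get("moleculetype", [])}
    return [row[0] for row in sections.get("molecules", []) if row[0] not in defined]


topology = """
[ moleculetype ]
; Name nrexcl
Protein_A 3

[ system ]
LYSOZYME

[ molecules ]
; Compound #mols
Protein_A 1
"""
print(undefined_molecules(topology))
```

The first requirement (matching the order of molecules in the coordinate file) is not covered here, since it requires mapping each moleculetype onto the residues in the .gro file.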
Now that we have examined the contents of a topology file, we can continue building our system.
Step Three: Defining the Unit Cell & Adding Solvent
Now that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involve