Python Foreign-Language Literature


a python Environment for Tree Exploration

Reviewed by Jaime Huerta-Cepas (corresponding author) 1, Joaquín Dopazo 2, and Toni Gabaldón (corresponding author) 1

Abstract

Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale.

Keywords:

Python, spiking neurons, simulation, integrate and fire, teaching, neural networks, computational neuroscience, software

Background

Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, supports the extended newick format, provides an integrated node annotation system and allows trees to be linked to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations.

Conclusions

ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.

Trees are commonly used to represent the results of many bioinformatics analyses. In particular, this type of binary graph is ideal for describing the hierarchical relationships among a variety of biological entities. Some common examples are the evolutionary analysis of molecular sequences or the clustering of genes and proteins according to their properties. Besides the information encoded in the topology of trees, branch lengths can also be scaled to provide information on the distances between the different partitions. In phylogenetics, for instance, trees are used to illustrate the evolutionary relationships among species or molecular sequences, considering terminal nodes as extant Operational Taxonomic Units (OTU) and internal nodes as their corresponding ancestors. In such phylogenetic trees, branch lengths are usually proportional to the evolutionary distance among sequences. Other applications, such as the analysis of gene expression, use hierarchical clustering analysis to group genes or experimental conditions according to the similarity of their expression patterns. Likewise, trees are used by many protein classification methods and for the analysis of phylogenetic profiles. Thus, the analysis of tree data structures is a common task in many areas of bioinformatics and there is a need for analytical and visualization tools.

In this respect, a number of bioinformatic programs exist that assist in the exploration of hierarchical trees. Most of them, however, are standalone applications that focus on visualization and, occasionally, on performing specific tests. Some well-known examples are TreeView [1], a widely used program for inspecting phylogenetic trees; Cluster Treeview [2], an application for visualizing microarray clustering results; ATV [3], a java program used to explore phylogenies which also provides some editing options; MEGA [4], an evolutionary genetics analysis suite that includes a built-in tree viewer; and many other recent applications [5-8]. While all these programs are very useful for managing single trees, they can hardly be automated or adapted to specific needs. Thus, when the analysis of hundreds or thousands of trees is required, the use of standalone programs becomes restrictive, because a much higher level of automation is required. In such cases, programming toolkits represent a more adequate framework, since they provide tools and methods to handle data at a lower level. Using toolkits, bioinformaticians can easily create their own analysis pipelines and program custom tasks over large collections of data [9].

Several generic bioinformatic toolkits exist that cover a wide range of programming languages and scopes, with BioPerl [10] and BioPython [11] being the most extensively developed. Together with a broad range of other features, these toolkits allow a certain level of interaction with tree data structures. However, only basic actions are currently supported. Alternatively, the PyCogent [12] and P4 (http://bmnh.org/~pf/p4.html) python toolkits can be used to extend this functionality, although they are mostly focused on phylogenetic reconstruction. R [13], a general purpose statistical framework, does include several packages to perform statistical tests on clustering and phylogenetic trees. Nevertheless, these packages are focused on performing specific analyses rather than on providing tree handling and manipulation features. Finally, in contrast to the great number of standalone tree viewers, programming toolkits offer few, if any, graphical rendering possibilities. An intermediate alternative between standalone viewers and programmatic tree rendering is the TreeDyn program [14], which supports some scripting options and can be used to create fully annotated tree images.
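To make the role of branch lengths concrete, the following is a minimal sketch of how the distance between two terminal nodes could be computed by summing branch lengths along the path that joins them. The dictionary-based tree, the node names and both function names are illustrative assumptions, not part of any toolkit cited here.

```python
# Hypothetical toy representation: node -> (parent, branch length to parent).
# Encodes the tree ((human:0.1,chimp:0.2)anc1:0.3,mouse:0.9)root
tree = {
    "root": (None, 0.0),
    "anc1": ("root", 0.3),
    "human": ("anc1", 0.1),
    "chimp": ("anc1", 0.2),
    "mouse": ("root", 0.9),
}


def path_to_root(tree, node):
    """Return [(ancestor, distance from the start node), ...] up to the root."""
    path, dist = [], 0.0
    while node is not None:
        path.append((node, dist))
        parent, length = tree[node]
        dist += length
        node = parent
    return path


def leaf_distance(tree, a, b):
    """Sum of the branch lengths on the path joining nodes a and b."""
    dist_from_a = dict(path_to_root(tree, a))
    for node, dist_from_b in path_to_root(tree, b):
        if node in dist_from_a:  # first ancestor shared by both nodes
            return dist_from_a[node] + dist_from_b
    raise ValueError("nodes are not in the same tree")


if __name__ == "__main__":
    print(leaf_distance(tree, "human", "chimp"))
    print(leaf_distance(tree, "human", "mouse"))
```

Because the function only needs the parent links and branch lengths, the same call can be repeated over a whole collection of trees in a loop, which is the kind of automation that standalone viewers make difficult.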

In response to these limitations, we present here the Environment for Tree Exploration (ETE), a python programming toolkit to analyze, manipulate or visualize any kind of hierarchical tree. It extends the functionality in other toolkits and allows a high level of customization. ETE's drawing features, although less exhaustive than in standalone editors, rely on the Python scripting language, which makes it possible to combine advanced tree analyses and tree visualization into a single program. The toolkit includes methods to browse and manipulate tree topologies, provides support for the New Hampshire eXtended (NHX) format and allows advanced actions such as node annotation, automatic rooting, cut & paste of partitions, tree concatenation, node search, and branch distance related operations. In addition, ETE implements two specific modules to work with phylogenetic and clustering trees. The phylogenetic extension allows trees to be linked to their corresponding multiple sequence alignments, includes two orthology and paralogy prediction algorithms, implements the duplication dating method described in [15] and provides access to the PhylomeDB database [16]. Similarly, clustering trees can be linked to their source data, which allows tree partitions to be analyzed through several validation techniques. Additionally, ETE implements a fully programmable drawing engine that can be used to generate, dynamically, custom tree representations in PDF or PNG formats. This drawing engine is fully integrated with the built-in extensions, thus providing pre-defined visualization layouts for clustering trees and phylogenies. A Graphical User Interface is also included, which allows on-the-fly interaction with trees.
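As an illustration of node annotation and node search, the sketch below reads the `[&&NHX:key=value:...]` comment blocks used by the NHX format into plain dictionaries and then filters nodes by annotation. The helper names `parse_nhx_comment` and `search_nodes`, and the example gene names, are hypothetical and do not reflect ETE's own API.

```python
import re

# An NHX annotation is a bracketed comment: [&&NHX:key=value:key=value]
NHX_TAG = re.compile(r"\[&&NHX:([^\]]*)\]")


def parse_nhx_comment(text):
    """Extract key=value annotations from the first NHX comment in text."""
    match = NHX_TAG.search(text)
    if match is None:
        return {}
    fields = (f.split("=", 1) for f in match.group(1).split(":") if "=" in f)
    return {key: value for key, value in fields}


def search_nodes(nodes, **wanted):
    """Return names of nodes whose annotations match every wanted pair."""
    return [
        name
        for name, feats in nodes.items()
        if all(feats.get(k) == v for k, v in wanted.items())
    ]


# Hypothetical node labels as they might appear inside an NHX string.
# S = species, D = duplication flag, B = bootstrap support.
nodes = {
    "geneA": parse_nhx_comment("geneA:0.1[&&NHX:S=human]"),
    "geneB": parse_nhx_comment("geneB:0.2[&&NHX:S=mouse]"),
    "anc": parse_nhx_comment(":0.05[&&NHX:D=Y:B=100]"),
}

if __name__ == "__main__":
    print(search_nodes(nodes, S="human"))
    print(search_nodes(nodes, D="Y"))
```

Keeping annotations as a free-form mapping per node is one simple way to let later analysis steps attach and query arbitrary features.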

Currently, the ETE toolkit is used in diverse projects, including GEPAS [17], Phylemon [18] and PhylomeDB [15]. The ETE package and documentation can be accessed at http://ete.cgenomics.org

Implementation

ETE is entirely written in Python [19], a programming language that offers strong support for integration with other languages and tools, and whose popularity is rising among the bioinformatics community [20]. ETE's philosophy is to facilitate integration with other toolkits as well as to provide a scalable program architecture. Thus, ETE tree objects can be easily imported and expanded by incorporating custom methods and properties. The functionality of the ETE toolkit is divided into several python modules, which can be imported at convenience. A summary of the features of the different modules is shown in Table
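The idea of expanding tree objects with custom methods and properties can be sketched with ordinary Python subclassing. The `Node` base class below is a stand-in written for this example only, and `ExpressionNode` with its `profile` attribute is a hypothetical extension; neither is taken from ETE.

```python
class Node:
    """Stand-in for a toolkit tree node (hypothetical, not ETE's class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def iter_leaves(self):
        """Yield every terminal node at or below this node."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.iter_leaves()


class ExpressionNode(Node):
    """Custom extension: each leaf carries a numeric profile."""

    def __init__(self, name, children=(), profile=None):
        super().__init__(name, children)
        self.profile = profile

    def mean_profile(self):
        """Column-wise mean over the profiles of all leaves below this node."""
        profiles = [leaf.profile for leaf in self.iter_leaves()]
        return [sum(column) / len(column) for column in zip(*profiles)]


g1 = ExpressionNode("g1", profile=[1.0, 2.0])
g2 = ExpressionNode("g2", profile=[3.0, 4.0])
cluster = ExpressionNode("cluster", children=[g1, g2])

if __name__ == "__main__":
    print(cluster.mean_profile())
```

The subclass reuses the traversal inherited from the base class, so the custom method stays short and works on any subtree.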

Tree handling module

ETE's main module can read and render trees using the two most common formats: New Hampshire (NH) and New Hampshire eXtended (NHX). Moreover, it can generate random trees or create custom tree structures from scratch. In order to increase compatibility with other tools, several newick format standards are currently supported by ETE, both for reading and writing trees (see ETE's extended documentation). ETE's trees are internally encoded as a series of tree node instances connected following a parent-child relationship. Each node is encoded as an independent Python object, which provides many methods to manipulate its connections (e.g. add, remove, delete or detach nodes) and to easily browse its topology (e.g. tree traversal and getting terminal, children, sister or descendant nodes). As a consequence, each internal node is treated as a fully featured subtree, which allows different parts of trees to be analyzed separately. ETE's tree object implementation supports multifurcations and can be used to deal with very
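The parent-child encoding described in this section can be sketched in a few lines of plain Python. The `TreeNode` class and the `read_newick` function below are simplified, hypothetical stand-ins (names and branch lengths only, no NHX annotations, no error handling for malformed input); they are not ETE's implementation.

```python
class TreeNode:
    """Hypothetical minimal node; every node is also the root of a subtree."""

    def __init__(self, name="", dist=0.0):
        self.name = name
        self.dist = dist  # branch length to the parent
        self.parent = None
        self.children = []  # any number of children (multifurcations allowed)

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def detach(self):
        """Cut this node and its subtree from the parent."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None
        return self

    def traverse(self):
        """Yield this node and all descendants in preorder."""
        yield self
        for child in self.children:
            yield from child.traverse()

    def leaves(self):
        return [node for node in self.traverse() if not node.children]


def read_newick(text):
    """Parse a plain New Hampshire string into linked TreeNode objects."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        node = TreeNode()
        if text[pos] == "(":
            pos += 1  # skip "("
            while True:
                node.add_child(parse_node())
                if text[pos] == ",":
                    pos += 1
                else:
                    break
            pos += 1  # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        label, _, length = text[start:pos].partition(":")
        node.name = label
        node.dist = float(length) if length else 0.0
        return node

    return parse_node()


tree = read_newick("((A:1,B:2)X:0.5,(C:1,D:1,E:1)Y:0.5);")

if __name__ == "__main__":
    print([leaf.name for leaf in tree.leaves()])
```

Because `traverse` and `leaves` are defined on every node, an internal node such as `X` can be detached and then analyzed on its own exactly like a full tree.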
