Escherichia coli K-12: a cooperatively developed annotation snapshot - 2005

Monica Riley,* Takashi Abe,1 Martha B. Arnaud,2 Mary K.B. Berlyn,3 Frederick R. Blattner,4 Roy R. Chaudhuri,5 Jeremy D. Glasner,4 Takashi Horiuchi,6 Ingrid M. Keseler,7 Takehide Kosuge,1 Hirotada Mori,8,9 Nicole T. Perna,4 Guy Plunkett, III,4 Kenneth E. Rudd,10 Margrethe H. Serres, Gavin H. Thomas,11 Nicholas R. Thomson,12 David Wishart,13 and Barry L. Wanner14

Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA
1 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, Yata 1111, Mishima, Shizuoka 411-8540, Japan
2 Department of Genetics, Candida Genome Database, Stanford University School of Medicine, Stanford, CA 94305-5120, USA
3 Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06520-8103, USA
4 Genome Center of Wisconsin, 425 Henry Mall, University of Wisconsin, Madison, WI 53706, USA
5 Division of Immunity and Infection, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
6 National Institute for Basic Biology, Nishigonaka 38, Myodaiji, Okazaki 444-8585 Aichi, Japan
7 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493, USA
8 Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan
9 The Institute of Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0017, Japan
10 Department of Biochemistry and Molecular Biology, The University of Miami Miller School of Medicine, Miami, FL 33140, USA
11 Department of Biology, University of York, PO Box 373, York YO10 5YW, UK
12 The Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, UK
13 Department of Computing Science and Biological Sciences, 2-21 Athabasca Hall, University of Alberta, Edmonton, Alberta, Canada T6G 2E8
14 Department of Biological Sciences, Purdue University, 915 W. State Street, West Lafayette, IN 47907-2054, USA

* To whom correspondence should be addressed. Tel: +1 508 269 7388; Fax: +1 508 457 4727; Email: mriley@mbl.edu
Correspondence may also be addressed to Barry L. Wanner. Tel: +1 765 494 8034; Fax: +1 765 494 0876; Email: blwanner@purdue.edu

Received November 5, 2005; Revised December 5, 2005; Accepted December 5, 2005.

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact
journals.permissions@oxfordjournals.org

Abstract

The goal of this group project has been to coordinate and bring up to date information on all genes of Escherichia coli K-12. Annotation of the genome of an organism entails identification of genes, the boundaries of genes in terms of precise start and end sites, and description of the gene products. Known and predicted functions were assigned to each gene product on the basis of experimental evidence or sequence analysis. Since both kinds of evidence are constantly expanding, no annotation is complete at any moment in time. This is a snapshot analysis based on the most recent genome sequences of two E.coli K-12 bacteria. An accurate and up-to-date description of E.coli K-12 genes is of particular importance to the scientific community because experimentally determined properties of its gene products provide fundamental information for annotation of innumerable genes of other organisms. Availability of the complete genome sequences of two K-12 strains allows comparison of their genotypes and of the mutant status of alleles.

INTRODUCTION

Escherichia coli strain K-12 is arguably the single organism about which the most is known. Originally isolated in 1922, it was catapulted to prominence
by the discovery of strain K-12's ability to carry out genetic recombination by conjugation (1) and, soon after, by generalized transduction (2). Strain K-12 has been widely distributed to laboratories across the world. Over the ensuing years it became the primary model organism for basic biology, molecular genetics and physiology of bacteria, and was the founding workhorse of the biotechnology industry.

Annotation of E.coli has not only served the E.coli community, but has formed a basis for extrapolation of gene functions to virtually every other prokaryotic, as well as eukaryotic, genome through analogy based on protein sequence similarities. As such, the accuracy and completeness of the E.coli information is of great importance to the community of biologists working in all disciplines and with all organisms. We report here the work of a group of scientists dedicated to full review
and update of the annotation of E.coli K-12.

The entire genome sequence of K-12 strain MG1655 was first completed and annotated by a group assembled by F. R. Blattner (3). The genome of a second K-12 strain, W3110, was completed recently under the direction of Takashi Horiuchi at the National Institute for Basic Biology in Japan (4). At the same time the sequence of the genome of MG1655 was corrected and updated. MG1655 was chosen for its close relationship with the original E.coli strain K-12 (called EMG2), whereas W3110 was chosen because it has been widely used as a wild-type strain by many investigators worldwide since the 1950s. Both had been cured of the λ prophage and lack the F+ fertility factor of ancestral E.coli K-12 EMG2. MG1655 and W3110 are 1- and 2-step descendants of E.coli K-12 W1485 (F+, λ−), respectively, which is in turn a direct descendant of EMG2 (4,5).

By comparing and re-sequencing regions of discrepancies between MG1655 and W3110, highly accurate genomes have now been created for both strains (4). Corrections to the original MG1655 genome (3) are at 243 sites (totaling 358 nt), a correction rate 8 years later of 7 in 10^5. Work done by the participants of an E.coli annotation workshop held in November 2003 reconciled sequence differences, which led to the deposit of a corrected MG1655 genome sequence entry (GenBank U00096.2, released in June 2004). Subsequent work done in a March 2005 workshop introduced additional changes. The participants of these workshops have
co-authored this manuscript.

Although both MG1655 and W3110 are isolates of the E.coli K-12 strain, their genomes are not identical. The different lengths of the MG1655 (4 639 675 nt) and W3110 (4 646 332 nt) genomes reflect a larger number of insertion sequence (IS) elements and the absence of a defective phage in the W3110 genome. Other differences are found in the occurrence of mutations, reflecting changes that presumably occurred during maintenance of the cultures in separate laboratories.

Genome annotation, of necessity, is an ongoing process. In the interim since 1997, many scientists, not organized as a group but united intellectually by their interest in developing a unified vision of the organism, have continued to upgrade, update and collate new information about E.coli as it has emerged. This has resulted in a number of public databases with information on genes, genomics and proteins of E.coli K-12, none identical, each with a different emphasis. Other more general databases contain information relevant to many organisms, helpful in interpretation of gene sequences.

The goal of the current project was to consolidate the work of scientists who have been working independently
by developing our best consensus on the status and properties of each of the genes of E.coli K-12 at the present moment. The goal was decidedly not to create a new database but, instead, to present to the public a comprehensive, updated annotation of E.coli K-12 in both spreadsheet and simple flat-file formats. The latter can easily be parsed by computers and readers alike and can therefore be incorporated into extant databases by their providers. These are available as Supplementary Table 1.xls, Supplementary Table 1.txt and, to aid in interpreting the data, Supplementary Table 1 Explanatory Notes. Less extensive information from the new MG1655 and W3110 annotations has been included in new GenBank and DNA Data Bank of Japan (DDBJ) entries, accession numbers U00096.3 and AP009048, respectively.

We refer to this outcome as a snapshot to emphasize that information about E.coli genes and their products is a moving target, rapidly overtaken by more recent information. The authors have made no plans to develop this snapshot further. Highly desirable would be the establishment of an accessible community resource of data on E.coli K-12 with community participation, ongoing maintenance and continuous updating of all information. At this moment interested members of the E.coli community are applying to NIH for support to establish a K-12 information resource.

RESULTS AND DISCUSSION

The workshops

The need to consolidate the efforts of scientists who had been working independently was a subject of discussion at an informal E.coli consortium meeting organized by Barry Wanner an