Bioinformatics Course Review (生物信息学课程复习)

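2. Genome (基因组): all of the DNA (or RNA) in a cell, organelle, or virus. The genetic material of an organism, contained in one haploid set of chromosomes.

3. Proteome (蛋白质组): the full set of proteins in a cell, reflecting the protein expression profile, at the translational level, of a cell or tissue at a particular stage, in a particular environment, or in a particular state. The entire collection of proteins that are encoded by the genome of an organism. Initially the proteome is estimated by gene prediction and annotation methods, but eventually it will be revised as more information on the sequence of the expressed genes is obtained.

4. Proteomics (蛋白质组学): systematic analysis of protein expression of normal and diseased tissues that involves the separation, identification, and characterization of all of the proteins in an organism.

5. Functional genomics (功能基因组学): aims to explain the function of the genome and its control mechanisms. Core issues: genome diversity, expression and regulation, model organisms.

6. Comparative genomics (比较基因组学): compares the genomes of different species. It helps to analyze genome function by homology-based methods, to discover the essential differences between humans and other organisms, and to explore the genetic language.

7. Genome physical mapping (基因组物理作图): uses molecular biology techniques to examine DNA molecules directly and build a map that marks the positions of sequence features (genes, etc.) on the genome. Genetic maps have relatively low resolution and accuracy and need to be supplemented by physical maps. Unit: bp. There are many physical mapping methods, which fall roughly into three classes: restriction mapping (限制性酶切图谱); FISH (fluorescence in situ hybridization, 荧光原位杂交); STS mapping. The data volume is large, so the work can only be done by computer.

8. Genetic mapping (遗传作图): uses genetic techniques (crosses, pedigrees, etc.) to build a map that marks the positions of sequence features (genes, etc.) on the genome. Unit: cM. Markers: marker genes; molecular/DNA markers: RFLP, SSLP (minisatellites/VNTR, microsatellites), SNP. Genetic mapping method: linkage analysis.

9. Gene chip (基因芯片): also called DNA chip or microarray (微阵列). A probe array formed by densely arranging large numbers of DNA or oligonucleotide probes; short DNA fragments are attached to a solid surface (glass, plastic, silicon, etc.) to form the array. Its basic working principle is detection by hybridization. Biochips (生物芯片): miniaturized arrays of large numbers of molecular substrates, often oligonucleotides, in a defined pattern. They are also called DNA microarrays and microchips.

10. Sequence alignment (序列比对): the process of lining up two or more sequences so as to obtain maximum identity, in order to evaluate the degree of similarity or the likelihood of homology. Identity (一致性) is the degree to which sequences are the same; homology (同源性) is similarity that arises from descent from a common ancestor. A consensus sequence (共有序列, also called 一致性片段) describes the evolutionary conservation at each position of a functional site. Alignment (联配/比对): refers to the procedure of comparing two or more sequences by looking for a series of individual characters or character patterns that are in the same order in the sequences. Of the two types of alignment, local and global, a local alignment is generally the most useful. See also Local and Global alignments.

11. Global alignment (全局比对): compares two sequences from end to end so that a good overall match can be found. Needleman & Wunsch (1970) was the earliest application of dynamic programming (动态规划) to biological sequence alignment; the method builds up the optimal alignment step by step. Steps: build the matrix, score the matrix, obtain the optimal alignment.

12. Local alignment (局部比对): finds the best-matching subsequences and then extends them. Database searches almost always use local alignment. Smith & Waterman (1981) solved the local alignment problem; their algorithm is in fact a variant of the Needleman-Wunsch algorithm. Other algorithms (such as BLAST and FASTA) are faster, but at the cost of some accuracy.

13. Scoring matrix (计分矩阵): also called a substitution matrix (替换矩阵). Used to score an alignment in order to measure how similar two sequences are; derived from large training sets of alignments. The best known are PAM250 and BLOSUM62. BLOSUM62 is the default matrix of BLAST and performs well for both distantly and closely related sequences.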
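The three steps listed for global alignment (build the matrix, score it, recover the alignment) and the local variant can be sketched as follows. This is a minimal illustration, not the reference implementation of either paper: the function names `needleman_wunsch` and `smith_waterman_score` are made up for this example, and it assumes a simple match/mismatch score with a linear gap penalty instead of a PAM or BLOSUM matrix.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment. Returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # Step 1: build the matrix; the first row and column hold leading gaps.
    H = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        H[i][0] = i * gap
    for j in range(1, cols):
        H[0][j] = j * gap
    # Step 2: score every cell from its three neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
    # Step 3: trace back from the bottom-right corner to recover one
    # optimal alignment (ties are broken in favour of the diagonal).
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return H[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment score: same recurrence, but cells never drop below 0
    and the answer is the best cell anywhere in the matrix."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


print(needleman_wunsch("ACGT", "AGT"))
print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```

The only differences between the two functions are the zero floor and where the final score is read, which is why the local method is described as a variant of the global one.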

14. Functional site (功能位点): besides genes, a DNA sequence also contains other information, such as regulatory elements. The DNA segments that store this information are called functional sites.

15. Genetic regulatory networks (基因调控网络), GRN.

16. Protein interaction network (蛋白互作网络).

17. Metabolic network (代谢网络).

18. Phylogenetic studies (系统发育研究).

19. Paralogous (旁系同源): homologous sequences within a single species that arose by gene duplication. Genes that are related through gene duplication events. These events may lead to the production of a family of related proteins with similar biological functions within a species. Paralogous gene families within a species are identified by using an individual protein as a query in a database similarity search of the entire proteome of an organism. The process is repeated for the entire proteome, and the resulting sets of related proteins are then searched for clusters that are most likely to have a conserved domain structure and should represent a paralogous gene family.

20. Orthologous (直系同源): homologous sequences in different species that arose from a common ancestral gene during speciation; may or may not be responsible for a similar function. A pair of genes found in two species are orthologous when the encoded proteins are 60-80% identical in an alignment. The proteins almost certainly have the same three-dimensional structure, domain structure, and biological function, and the encoding genes have originated from a common ancestor gene at an earlier evolutionary time. Two orthologs I and II in genomes A and B, respectively, may be identified when the complete genomes of two species are available: (1) in a database similarity search of all of the proteome of B using I as a query, II is the best hit found, and (2) I is the best hit when II is used as a query of the proteome of A. The best hit is the database sequence with the lowest expect value (E). Orthology is also predicted by a very close phylogenetic relationship between sequences or by a cluster analysis. Compare to Paralogs. See also Cluster analysis.

21. Database (数据库): a computerized storehouse of data that provides a standardized way for locating, adding, removing, and changing data. See also Object-oriented database, Relational database.

22. Contig (序列重叠群/拼接序列): a set of clones that can be assembled into a linear order. A DNA sequence that overlaps with another contig. The full set of overlapping sequences (contigs) can be put together to obtain the sequence for a long region of DNA that cannot be sequenced in one run in a sequencing assay. Important in genetic mapping at the molecular level.

23. COG (直系同源簇): clusters of orthologous groups in a set of groups of related sequences in microorganisms and yeast (S. cerevisiae). These groups are found by whole proteome comparisons and include orthologs and paralogs. See also Orthologs and Paralogs.

24. Codon usage: analysis of the codons used in a particular gene or organism.

25. BAC clone (细菌人工染色体克隆): bacterial artificial chromosome vector carrying a genomic DNA insert, typically 100-200 kb. Most of the large-insert clones sequenced in the project were BAC clones.

26. Accession number (记录号): a unique identifier that is assigned to a single database entry for a DNA or protein sequence.

27. Alignment score (联配/比对值): an algorithmically computed score based on the number of matches, substitutions, insertions, and deletions (gaps) within an alignment. Scores for matches and substitutions are derived from a scoring matrix such as the BLOSUM and PAM matrices for proteins, and affine gap penalties suitable for the matrix are chosen. Alignment scores are in log odds units, often bit units (log to the base 2). Higher scores denote better alignments. See also Similarity score, Distance in sequence analysis.

28. Annotation (注释): the prediction of genes in a genome, including the location of protein-encoding genes, the sequence of the encoded proteins, any significant matches to other proteins of known function, and the location of RNA-encoding genes. Predictions are based on gene models; e.g., hidden Markov models of introns and exons in protein-encoding genes, and models of secondary structure in RNA.

29. Genome assembly (基因组组装): the process of assembling large numbers of short sequences into a complete genome.

30. FTP (File Transfer Protocol, 文件传输协议): allows a person to transfer files from one computer to another across a network using an FTP-capable client program. The FTP client program can only communicate with machines that run an FTP server. The server, in turn, will make a specific portion of its file system available for FTP access, providing that the client is able to supply a recognized user name and password to the server.

II. English-to-Chinese translation (英译汉)

Genomics: 基因组学
Proteomics: 蛋白质组学
Genetic regulatory networks (GRN): 基因调控网络
Protein interaction network: 蛋白互作网络
Metabolic network: 代谢网络
Comparative genomics: 比较基因组学
Structural genomics (2001): 结构基因组学
Functional genomics: 功能基因组学
Systems biology: 系统生物学
Phylogenetic reconstruction: 系统树重建
Functional sequence: 功能序列
Motif: 序列模式/模体/基元/基序
Signal: 信号
Promoter: 启动子
Terminator sequence: 基因终止序列
Splice site: 剪切位点
Enhancer: 增强子
Operator: 操纵基因
Transcription initiation site: 转录起始位点
Transcription stop site: 转录终止位点
Translation initiation: 翻译起始
Coding region (coding region of DNA, CDS): 编码区
Translation stop: 翻译终止
Recognition region: 识别区
5' UTR (untranslated region), 3' UTR
Codon usage: 密码子用法
Splicing: 剪接
Exon: 外显子
Intron: 内含子
Alignment: 联配/比对
DNA microarrays (biochips): 芯片
Conservation: 保守
Consensus: 一致序列. A single sequence that represents, at each subsequent position, the variation found within corresponding columns of a multiple sequence alignment.
DNA sequencing: DNA测序
Domain: 功能域
Dot matrix: 点标矩阵图
Draft genome sequence: 基因组序列草图
Expect value (E): E值
Expressed Sequence Tag (EST): 表达序列标签. Randomly selected, partial cDNA sequence; represents its corresponding mRNA. dbEST is a large database of ESTs at GenBank, NCBI.
FASTA: a major database search program (一种主要数据库搜索程序)
Full shotgun clone: 鸟枪法克隆. A large-insert clone for which full shotgun sequence has been produced.
Gap: 空位/间隙/缺口
Genetic map: 遗传图谱
Global alignment: 整体联配. Attempts to match as many characters as possible, from end to end, in a set of two or more sequences.
Local alignment: 局部联配. Attempts to align regions of sequences with the highest density of matches; in doing so, one or more islands of subalignments are created in the aligned sequences.
GSS: 基因综述/调查序列. Genome survey sequence.
HGMP: 人类基因组图谱计划. Human Genome Mapping Project.
Homology: 同源性. A similar component in two organisms (e.g., genes with strongly similar sequences) that can be attributed to a common ancestor of the two organisms during evolution.
HTGS/HGT: 高通量基因组序列. High-throughput genome sequences.
Identity: 相同性/相同率. The extent to which two (nucleotide or amino acid) sequences are invariant.
MMDB: 分子建模数据库. Molecular Modelling Database. A taxonomy assigned database of PDB files, and related information.
Multiple Sequence Alignment: 多序列联配. An alignment of three or more sequences with gaps inserted in the seq
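The Identity and Consensus entries above can be illustrated with a short sketch that works on sequences that are already aligned. The function names `percent_identity` and `consensus` are hypothetical. The sketch also makes two assumptions that the glossary does not specify: identity is reported relative to the alignment length (other conventions divide by the shorter sequence or by ungapped columns only), and the consensus takes the most frequent non-gap character in each column.

```python
from collections import Counter


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the
    same residue. Columns where both sequences have a gap are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    same = sum(1 for x, y in columns if x == y)
    return 100.0 * same / len(columns)


def consensus(alignment, gap="-"):
    """Majority-rule consensus of a multiple sequence alignment, given as a
    list of equal-length strings. A column made only of gaps yields a gap."""
    result = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


print(percent_identity("ACGT", "A-GT"))
print(consensus(["ACGT", "ACGA", "TCGT"]))
```

A plain majority vote discards how variable each column is; a profile or position-specific scoring matrix keeps the per-column frequencies instead of a single character.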
