MAQ Parameter Description

MAQ Key Commands

fasta2bfa
    maq fasta2bfa in.ref.fasta out.ref.bfa

    Convert sequences in FASTA format to Maq's BFA (binary FASTA) format.

fastq2bfq
    maq fastq2bfq [-n nreads] in.read.fastq out.read.bfq|out.prefix

    Convert reads in FASTQ format to Maq's BFQ (binary FASTQ) format.

    OPTIONS:
    -n INT      Number of reads per file [not specified]

map
    maq map [-n nmis] [-a maxins] [-c] [-1 len1] [-2 len2] [-d adap3] [-m mutrate] [-u unmapped] [-e maxerr] [-M c|g] [-N] [-H allhits] [-C maxhits] out.aln.map in.ref.bfa in.read1.bfq [in.read2.bfq] 2> out.map.log

    Map reads to the reference sequences.

    OPTIONS:
    -n INT      Maximum number of mismatches that can always be found [2]
    -a INT      Maximum outer distance for a correct read pair [250]
    -A INT      Maximum outer distance of two RF paired reads (0 to disable) [0]
    -c          Map reads in the colour space (for SOLiD only)
    -1 INT      Read length for the first read, 0 for auto [0]
    -2 INT      Read length for the second read, 0 for auto [0]
    -m FLOAT    Mutation rate between the reference sequences and the reads [0.001]
    -d FILE     Specify a file containing a single line of the 3'-adapter sequence [null]
    -u FILE     Dump unmapped reads and reads containing more than nmis mismatches to a separate file [null]
    -e INT      Threshold on the sum of mismatching base qualities [70]
    -H FILE     Dump multiple/all 0/1-mismatch hits to FILE [null]
    -C INT      Maximum number of hits to output. Unlimited if larger than 512. [250]
    -M c|g      Methylation alignment mode. All C (or G) on the forward strand will be changed to T (or A). This option is for testing only.
    -N          Store the mismatch positions in the output file out.aln.map. When this option is in use, the maximum allowed read length is 55bp.

    NOTE:
    * Paired end reads should be prepared in two files, one for each end, with reads sorted in the same order. This means the k-th read in the first file is mated with the k-th read in the second file. The corresponding read names must be identical up to the trailing /1 or /2. For example, such a pair of read names is allowed: EAS1_1_5_100_200/1 and EAS1_1_5_100_200/2. The trailing /[12] is usually generated by the GAPipeline to distinguish the two ends in a pair.
    * The output is a compressed binary file. It is affected by endianness.
    * The best way to run this command is to provide about 1 to 3 million reads as input. More reads consume more memory.
    * Option -n controls the sensitivity of the alignment. By default, a hit with up to 2 mismatches can always be found. A higher -n finds more hits and also improves the accuracy of mapping qualities, but at the cost of speed.
    * Alignments with many high-quality mismatches should be discarded as false alignments or possible contamination. This behaviour is controlled by option -e. The -e threshold is only calculated approximately because base qualities are divided by 10 at a certain stage of the alignment. The -Q option of the assemble command sets the threshold precisely.
    * A pair of reads is said to be correctly paired if and only if the orientation is FR and the outer distance of the pair is no larger than maxins. There is no limit on the minimum insert size. This setting is determined by the paired end alignment algorithm used in Maq. Requiring a minimum insert size would lead to some wrong alignments with highly overestimated mapping qualities.
    * Currently, read pairs from Illumina/Solexa long-insert libraries have RF read orientation. The maximum insert size is set by option -A. However, a long-insert library is also mixed with a small fraction of short-insert read pairs, so -a should also be set correctly.
    * Sometimes the 5'-end or even the entire 3'-adapter sequence may be sequenced. Providing -d lets Maq remove the adapter contamination.
    * Given 2 million reads as input, maq usually takes 800MB of memory.

mapmerge
    maq mapmerge out.aln.map in.aln1.map in.aln2.map [...]

    Merge a batch of read alignments together.

    NOTE:
    * In theory, this command can merge an unlimited number of alignments. However, as mapmerge reads all the inputs at the same time, it may hit the OS limit on the maximum number of open files. At present, this has to be solved manually by end users.
    * Command mapmerge can be used to merge alignment files with different read lengths. The subsequent analyses no longer assume a fixed read length.

rmdup
    maq rmdup out.rmdup.map in.ori.map

    Remove pairs with identical outer coordinates. In principle, pairs with identical outer coordinates should occur rarely. However, due to the amplification in sample preparation, this occurs much more frequently than by chance. Practical analyses show that removing duplicates helps to improve the overall accuracy of SNP calling.

assemble
    maq assemble [-sp] [-m maxmis] [-Q maxerr] [-r hetrate] [-t coef] [-q minQ] [-N nHap] out.cns in.ref.bfa in.aln.map 2> out.cns.log

    Call the consensus sequences from read mapping.

    OPTIONS:
    -t FLOAT    Error dependency coefficient [0.93]
    -r FLOAT    Fraction of heterozygotes among all sites [0.001]
    -s          Take single end mapping quality as the final mapping quality; otherwise paired end mapping quality will be used
    -p          Discard paired end reads that are not mapped in correct pairs
    -m INT      Maximum number of mismatches allowed for a read to be used in consensus calling [7]
    -Q INT      Maximum allowed sum of quality values of mismatched bases [60]
    -q INT      Minimum mapping quality allowed for a read to be used in consensus calling [0]
    -N INT      Number of haplotypes in the pool (>=2) [2]

    NOTE:
    * Option -Q sets a limit on the maximum sum of mismatching base qualities. Reads containing many high-quality mismatches should be discarded.
    * Option -N sets the number of haplotypes in a pool. It is designed for resequencing of samples by pooling multiple strains/individuals together. For diploid genome resequencing, this option equals 2.

indelpe
    maq indelpe in.ref.bfa in.aln.map > out.indelpe

    Call consistent indels from paired end reads. The output is TAB delimited, with each line consisting of: chromosome, start position, type of the indel, number of reads across the indel, size of the indel and inserted/deleted nucleotides (separated by a colon), number of indels on the reverse strand, number of indels on the forward strand, 5' sequence ahead of the indel, 3' sequence following the indel, number of reads aligned without indels, and three additional columns for filters.

    In the third column (type of the indel), a star (*) indicates that the indel is confirmed by reads from both strands, a plus (+) means the indel is hit by at least two reads but from the same strand, a minus (-) shows the indel is only found on one read, and a dot (.) means the indel is too close to another indel and is filtered out.

    Users are advised to run the output through maq.pl indelpe to correct the number of reads mapped without indels. For more details, see the maq.pl indelpe section.

indelsoa
    maq indelsoa in.ref.bfa in.aln.map > out.indelsoa

    Call potential homozygous indels and break points by detecting the abnormal alignment pattern around indels and break points. The output is also TAB delimited, with each line consisting of: chromosome, approximate coordinate, length of the abnormal region, number of reads mapped across the position, number of reads on the left-hand side of the position, and number of reads on the right-hand side. The last column can be ignored.

    The output contains many false positives. A recommended filter could be:

    awk '$5+$6-$4 >= 3 && $4 <= 1' in.indelsoa

mapview
    maq mapview [-bN] in.aln.map > out.aln.txt

    Display the read alignment in plain text. For reads aligned before the Smith-Waterman alignment, each line consists of: read name, chromosome, position, strand, insert size from the outer coordinates of a pair, paired flag, mapping quality, single-end mapping quality, alternative mapping quality, number of mismatches of the best hit, sum of qualities of mismatched bases of the best hit, number of 0-mismatch hits of the first 24bp, number of 1-mismatch hits of the first 24bp on the reference, length of the read, read sequence and its quality.

    The alternative mapping quality always equals the mapping quality if the reads are not paired. If the reads are paired, it equals the smaller mapping quality of the two ends. This alternative mapping quality is actually the mapping quality of an abnormal pair.

    The sixth column, paired flag, is a bitwise flag. Its lower 4 bits give the orientation: 1 stands for FF, 2 for FR, 4 for RF, and 8 for RR, where FR means that the read with the smaller coordinate is on the forward strand and its mate is on the reverse strand. Only FR is allowed for a correct pair. The higher bits of this flag give further information. If the pair meets the paired end requirement, 16 will be set. If the two reads are mapped to different chromosomes, 32 will be set. If one of the two reads cannot be mapped at all, 64 will be set. The flag for a correct pair always equals 18.

    For reads aligned by the Smith-Waterman alignment afterwards,

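The paired-flag bits described for the mapview output can be unpacked with a few bitwise operations. The following is only a minimal Python sketch based on the bit values stated in the text; the function name `decode_paired_flag`, the constant names, and the dictionary keys are illustrative assumptions and are not part of Maq itself.

```python
# Bit values as given in the paired-flag description of the mapview output.
# All names below are hypothetical helpers, not part of Maq.
ORIENTATIONS = {1: "FF", 2: "FR", 4: "RF", 8: "RR"}
CORRECT_PAIR = 16           # pair meets the paired end requirement
DIFFERENT_CHROMOSOMES = 32  # the two reads map to different chromosomes
ONE_READ_UNMAPPED = 64      # one of the two reads cannot be mapped at all


def decode_paired_flag(flag: int) -> dict:
    """Split a mapview paired flag into its orientation and status bits."""
    orientation_bits = flag & 0xF  # the lower 4 bits hold the orientation
    return {
        # None if no single orientation bit is set
        "orientation": ORIENTATIONS.get(orientation_bits),
        "correct_pair": bool(flag & CORRECT_PAIR),
        "different_chromosomes": bool(flag & DIFFERENT_CHROMOSOMES),
        "one_read_unmapped": bool(flag & ONE_READ_UNMAPPED),
    }


if __name__ == "__main__":
    # 18 = 16 (correct pair) + 2 (FR orientation)
    print(decode_paired_flag(18))
```

With this sketch, a flag of 18 decodes to FR orientation with the correct-pair bit set, which matches the statement that the flag for a correct pair always equals 18.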