Resource description
5 Open Problems in Bioinformatics
Pedigrees from Genomes
Comparative Genomics of Alternative Splicing
Viral Annotation
Evolving Turing Patterns
Protein Structure Evolution

Three Processes
Recombination
Choosing Parents
The Mutational Process
Pedigree process
Coalescent Recombination process
Sequence/Individual Boundary
From Yun Song

From genomes to pedigrees
Probability of data given a pedigree.
Elston-Stewart (1971), Temporal Peeling Algorithm: condition on parental states. Recombination and mutation are Markovian. (Diagram labels: Mother, Father.)
Lander-Green (1987), Genotype Scanning Algorithm: condition on paternal/maternal inheritance. Recombination and mutation are Markovian. (Diagram labels: Mother, Father.)
Comment: Obvious parallel to the Wiuf-Hein99 reformulation of Hudson's 1983 algorithm.

Benevolent Mutation and Recombination Process
Genomes with r and m/r tending to infinity (r: recombination rate, m: mutation rate).
Counting within a small interval would reveal the length of the path connecting the two segments.
Siblings are readily revealed, since they will have segments with 2m density of mutations.
The distribution of path lengths is readily observable between two sequences.
All embedded phylogenies are observable.

From Phylogenies to Pedigrees
Mike's counterexample, linkage and individuals.
(Diagram labels: grandparents, Individual 1, Individual 2.)
Different Pedigrees, Same Phylogenies
Gluing Phylogenies together (?)
Sibling sequences come from different parents.
A recombinant's parents are sister sequences.

Comparative Genomics of Alternative Splicing
From Transcripts to the AS-Graph
How well is the AS-graph known as a function of the number of transcripts? Given a family and distribution of transcripts, can they be explained by an AS-graph with probabilities at donor sites, or do we need probabilities for (donor, acceptor) pairs? Or possibly even more complicated situations.
And is sampling transcripts good enough to distinguish these situations?

Mini-project: reliability of AS-detection.
Choose an idealized AS-Graph (genome): choose donor and acceptor sites in random pairs. For each possible splice pair, assign a probability for choosing it. This should define a probability for all transcripts.
Generate a set of transcripts. Reconstruct the AS-Graph.
Key questions: How many transcripts must be sampled to detect AS? How well will the AS-Graph be recovered?

Optimal DAG (directed acyclic graph) under restrictions
Finding a set of annotations: find a set of paths maximizing the sum of scores.
The score of the minimal path must be above a threshold.
Two paths must differ significantly: for an enclosed area, the maximal height must be d higher than the boundary defining it.
Height(i,j) = di,j + di,j
Do known AS genes have more CTO structure than non-AS genes? Does the AS correspond to the CTO structure? Is the CTO structure evolutionarily conserved?

Phylogenetically related ASGs
Is the ASG conserved? What is conserved?
How does selection along a position depend on splicing status?

Virus Annotation
Classes of Gene Structures
Retroviridae Arrangements: http://www.tulane.edu/dmsander/WWW/335/Retroviruses.html
Papovaviridae Arrangement: http://www.tulane.edu/dmsander/WWW/335/Papovaviruses.html
Diarrhoea Causing Arrangements: http://www.tulane.edu/dmsander/WWW/335/Diarrhoea.html
Illustrating the 3 main classes of gene structures: Unidirectional, Convergent and Divergent.

The Problems of Viral Annotation
HMM gene structure generator (McCauley)
Gene Structure Evolution (de Groot)
Alignment (Caldeira, Lunter, Rocco)
Recombination (Lyngsø, Song)
Multiple constraints: RNA secondary structure, gene conservation, binding/transcriptional instructional sites.

Our 8 State HMM, which allows for Unidirectional overlapping gene structures
HMM States:
Non-coding
Coding RF1
Coding RF2
Coding RF3
Coding RF1,2
Coding RF1,3
Coding RF2,3
Coding RF1,2,3

Combining Levels of Selection.
Protein-Protein: Hein & Støvlbæk, 1995 (Codon Nucleotide Independence Heuristic); Jensen & Pedersen, 2001 (Contagious Dependence)
Assume multiplicativity: f_{A,B} = f_A * f_B
Protein-RNA: Doublets, Singlet, Contagious Dependence

Table illustrating the performance benefit in Sensitivity we obtain utilizing a Phylogenetic HMM. We extend the HMM model to include evolutionary information from 13 aligned HIV2 sequences.

GenBank: Centralized resource for publicly available viral sequence data.
http://www.ncbi.nlm.nih.gov/Genbank/
http://www.ncbi.nlm.nih.gov/genomes/VIRUSES/viruses.html
Entrez Genomes currently contains 2120 Reference Sequences for 1510 viral genomes and 36 Reference Sequences for viroids.
Within microbial genomes, one third of annotated genes contain some degree of overlap, and one third of these are either Convergent or Divergent.
Properties of overlapping genes are conserved across microbial genomes. Genome Res. 2004 Nov;14(11):2268-72.
Krakauer, D.C
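
The eight HMM states listed above can be read as the possible subsets of three reading frames that are coding at a given position, which is why there are 2^3 = 8 of them. The short Python sketch below only enumerates those state names under that reading. The function name `hmm_states` is hypothetical, and no transition or emission probabilities are implied, since the slides do not give them.

```python
from itertools import combinations

# Assumption: the three reading frames RF1..RF3 of one strand (unidirectional genes).
FRAMES = (1, 2, 3)

def hmm_states(frames=FRAMES):
    """Return one state name per subset of reading frames that is coding."""
    states = []
    for k in range(len(frames) + 1):
        for subset in combinations(frames, k):
            if not subset:
                states.append("Non-coding")  # empty subset: no frame is coding
            else:
                states.append("Coding RF" + ",".join(str(f) for f in subset))
    return states

if __name__ == "__main__":
    for name in hmm_states():
        print(name)
```

States with more than one frame (for example "Coding RF1,2") are the ones that represent overlapping genes.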
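
The AS-detection mini-project described earlier (choose an idealized AS-graph, generate transcripts, reconstruct the graph) could be prototyped roughly as follows. This is a minimal sketch under stated assumptions, not the authors' method. It assumes a transcript is produced by walking left to right along the genome; at each donor it splices to one of that donor's acceptors with the assigned probability, or otherwise reads through. All function names, the way probabilities are drawn, and the parameter values are hypothetical.

```python
import random

def make_as_graph(genome_len, n_pairs, rng):
    """Pick random (donor, acceptor) pairs with donor < acceptor and
    give every pair a splice probability (assumed: random, normalized per donor)."""
    pairs = set()
    while len(pairs) < n_pairs:
        donor, acceptor = sorted(rng.sample(range(1, genome_len), 2))
        pairs.add((donor, acceptor))
    graph = {}  # donor -> list of (acceptor, probability)
    for donor in sorted({d for d, _ in pairs}):
        acceptors = sorted(a for d, a in pairs if d == donor)
        # One extra weight is reserved for reading through without splicing.
        weights = [rng.random() for _ in range(len(acceptors) + 1)]
        total = sum(weights)
        graph[donor] = [(a, w / total) for a, w in zip(acceptors, weights)]
    return graph

def sample_transcript(graph, rng):
    """Walk along the genome and return the tuple of introns that were spliced out."""
    introns = []
    pos = 0
    for donor in sorted(graph):
        if donor < pos:
            continue  # this donor lies inside an intron that was already removed
        u = rng.random()
        for acceptor, p in graph[donor]:
            if u < p:
                introns.append((donor, acceptor))
                pos = acceptor
                break
            u -= p  # otherwise try the next acceptor; leftover mass = read-through
    return tuple(introns)

def reconstruct(transcripts):
    """Reconstructed AS-graph: every splice pair seen in at least one transcript."""
    return {intron for t in transcripts for intron in t}

def recovery(graph, n_transcripts, rng):
    """Return (fraction of true splice pairs recovered, whether AS was detected)."""
    true_edges = {(d, a) for d, accs in graph.items() for a, _ in accs}
    sample = [sample_transcript(graph, rng) for _ in range(n_transcripts)]
    found = reconstruct(sample)
    detected = len(set(sample)) > 1  # more than one distinct transcript observed
    return len(found & true_edges) / len(true_edges), detected

if __name__ == "__main__":
    rng = random.Random(0)
    graph = make_as_graph(genome_len=1000, n_pairs=6, rng=rng)
    for n in (5, 50, 500):
        frac, detected = recovery(graph, n, rng)
        print(n, round(frac, 2), detected)
```

Running the loop for increasing sample sizes gives a first empirical handle on the two key questions: how many transcripts are needed to detect AS, and how well the AS-graph is recovered. Note that some splice pairs can be unreachable in this model when their donor always falls inside an earlier intron.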