Computational Epigenomics
Michael Q. Zhang
Cold Spring Harbor Laboratory
Tsinghua University
Graduate Summer School on Bioinformatics of China (GSSBC07)

Special Session: Computational epigenetics and chromatin regulation
  2:30 p.m. - 3:00 p.m.  Insulator (CTCF) binding site motif and its distribution in the human genome (M. Q. Zhang, CSHL)
  3:00 p.m. - 3:30 p.m.  Allelic expression and genomic imprinting (A. Hartemink, Duke)
  3:30 p.m. - 4:00 p.m.  Coffee Break
  4:00 p.m. - 4:30 p.m.  Mapping the structure of human chromatin (W.S. Noble, UW)
  4:30 p.m. - 5:00 p.m.  A genomic code for nucleosome positioning (E. Segal, WI)
  5:00 p.m. - 5:30 p.m.  Realizing the medical potential of epigenomics by tailored algorithms and software (T. Lengauer/C. Bock, MPI)
ISMB, Vienna, Jul 23, 2007

Epigenomics and DNA Methylation Landscape

Epigenetics and epigenomics
  Epigenetics: Inheritable changes in gene expression that cannot be attributed to changes in DNA sequence. (Silencing refers to epigenetic loss of gene expression that is equivalent to an inactivating mutation.)
  Epigenomics: Genome-wide approach to studying epigenetics. The central goal is to define the DNA sequence features that direct epigenetic processes.
  Epigenetic mechanisms: RNAi (post-transcriptional); histone modifications/variation and DNA methylation (transcriptional).

Genomic Imprinting
  Associated with Beckwith-Wiedemann syndrome and with many types of tumors
  Temporal changes in methylation pattern during development
  Methylation pattern is largely fixed in the post-development stage
  [Figure: Gamete, Gamete, Zygote, Blastocyst]

Enzymatic Fractionation of the Human Genome Based On Methylation Status
  Unmethylated Compartment
  Methylated Compartment

CpG Island (CGI) thresholds: size 200 bp, C+G 0.50, CpG Obs/Exp 0.60 (Gardiner-Garden)
  ... or the evolution of Alu insertion site towards avoidance of U-domains
  CGI: 31.6/1.78, 18-fold!
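
The CpG island thresholds quoted above (size 200 bp, C+G 0.50, CpG Obs/Exp 0.60) can be checked for a single sequence with a few lines of Python. This is a minimal sketch, not the procedure used in the talk: the function names are hypothetical, the thresholds are assumed to be inclusive minimums (the comparison signs did not survive in the slide text), and Obs/Exp is assumed to be the usual ratio (CpG count x length) / (C count x G count).

```python
def cpg_stats(seq):
    """Return (length, G+C fraction, CpG observed/expected) for a DNA string."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (#C * #G) / length
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def looks_like_cgi(seq, min_len=200, min_gc=0.50, min_obs_exp=0.60):
    """Apply the three thresholds as inclusive minimums (an assumption)."""
    n, gc, oe = cpg_stats(seq)
    return n >= min_len and gc >= min_gc and oe >= min_obs_exp
```

A real scan would slide a window along the genome and merge qualifying windows; the sketch only evaluates one candidate region at a time.
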
75% promoters, 7% total CpGs

The C value paradox
  The enrichment of regulatory sequences in the relatively small unmethylated compartment suggests that cytosine methylation constrains the effective size of the genome by specific exposure of regulatory sequences and the sequestration of other sequences. This buffers the genome against changes in length and provides an explanation for the C value paradox, which applies only to those eukaryotes that contain methylated genomes.

Gene Expression
  GEO Dataset: single channel U133 platform for whole brain
  102 genes with unmethylated promoter
  32 genes with methylated promoter
  Wilcoxon rank sum test shows p-value of 0.0032
  Average rank U: 40, M: 60
  Genes with methylated promoters are under-expressed

Motifs at the boundary (in addition to AluYs)

Promoters
  Promoter: 500 bp upstream and 100 bp downstream around TSS
  Includes RefSeq, EPD and FirstEF prediction
  Methylated promoters have lower GC content and CpG density
  [Figure: G+C content (400 window); series McrBC_Group I, McrBC_Group II, RE]

Classifier Design
  Feature selection
    Choosing a subset of features that achieves a low error rate
  Select a method of prediction
    PCA
    K-means clustering
    Linear Discriminant
    Logistic Regression
    Support Vector Machine
  Estimate the error rate of each model and choose the model with the lowest error rate

Features
  Total 102 features
    G+C content, 16 dinucleotides, 64 trinucleotides
    20 hexamers: 10 from U and 10 from M (masked Alu)
    Alu coverage
  Choose those hexamers that give better scores/counts from DME
  Data Normalization
    Standard normalization

Feature Selection by Recursive Feature Elimination (RFE)
  RFE algorithm: Guyon ...

... absent from primate, bovine genomes
  KRAB-ZNF repressor: regulates fetal growth factors, placental hormones
  Loss may be related to evolution of singleton versus multiparous births?
ZNF71: Conserved in most mammals, but lost in rodent genomes
  KRAB-ZNF repressor: Targets play a role in blood vessel and heart development, placenta structure
LLNL765: a novel human gene
  Single ortholog in dog, rat and mouse; duplicated to a family of 4 in primates
  SCAN-ZNF protein: Targets are mostly transcription factors including proteins with roles in angiogenesis, placenta and heart development
  Transgenic mice display specific structural defects in heart and amniotic sac
[Figure: ZNF264, LLNL765 (mouse homolog)]
L. Stubbs

Paralogs diverge through positive selection on critical amino acids and deletions and duplications within the tandem finger arrays
  ZNF221*  DVL YAI QHT SAV HQE VRR QAS CDK WCN TRS RDF WCK QQS - SNM WCK
  ZNF155*  dvl YAV QHT RAV HQE FRS QAR GDK WCN TQS TNL RSN
  ZNF230*  dvl YAI qht CIV HQK LSR - DDK WYI - SGL RSN
  ZNF222*  dvl YAI QRT CAV HQK LSR - ddk WYV - SGF HSN
  ZNF223*  dvl YAI QHT CAV HQK LSR - ddk RYI - TGL QSN
  ZNF284   dvt yai QQT RGV HQE SNR QAS CDK WCR tqs - - - - SNL WGR EKF WTT
  ZNF224   dvy YAI QHT RAV HQE GRR QAS CDT WCK TQS RDF RCK QHS - SNM WCK QQS WTT eht RNM QSL
  ZNF225   dvy YAI QHI RGV HQE SNR LAS YDK WGR TQS RDF WCN QQT QQS SNM RSN EQS WTT
More t
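
Returning to the classifier slides: the sequence-composition part of the 102-feature vector (G+C content, 16 dinucleotides, 64 trinucleotides, 81 values in all) and the normalization step can be sketched as below. This is an illustration under stated assumptions, not the pipeline from the talk. The function names are hypothetical, "standard normalization" is assumed to mean a column-wise z-score, k-mer values are assumed to be frequencies over sliding windows, and each sequence is assumed to be at least 3 bases long. The 20 DME-selected hexamers and Alu coverage need external inputs and are left out.

```python
from itertools import product
from statistics import mean, pstdev

BASES = "ACGT"
# 16 dinucleotides followed by 64 trinucleotides, in a fixed order
KMERS = ["".join(p) for k in (2, 3) for p in product(BASES, repeat=k)]


def composition_features(seq):
    """G+C content plus dinucleotide and trinucleotide frequencies (81 values)."""
    s = seq.upper()
    gc = (s.count("G") + s.count("C")) / len(s)
    feats = [gc]
    for kmer in KMERS:
        k = len(kmer)
        windows = len(s) - k + 1
        # Slide one base at a time so overlapping occurrences are counted
        hits = sum(1 for i in range(windows) if s[i:i + k] == kmer)
        feats.append(hits / windows)
    return feats


def standardize(rows):
    """Column-wise z-score: subtract the mean, divide by the standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns map to 0, not NaN
    return [[(x - m) / sd for x, m, sd in zip(r, mus, sds)] for r in rows]
```

The standardized rows could then be passed to any of the prediction methods listed on the classifier slide, with recursive feature elimination repeatedly dropping the lowest-ranked features.
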