Microarray Data Standardisation
Microarray Gene Expression Database group - MGED
December, 2000

Public data repositories for microarray data
  There is a growing consensus in the life science community that public repositories of gene expression data are needed, analogous to DDBJ/EMBL/GenBank for sequences.
  Some of the reasons:
    Gradually building up gene expression profiles for various organisms, tissues, cell types, developmental stages and states, and under the influence of various compounds
    Building up systematic knowledge about gene functions and networks through links to other genomics databases
    Comparison of profiles, and access to and analysis of data by third parties
    Cross-validation of results and platforms (quality control)

Systematic gene expression profiling initiatives in public domain
  The International Life Sciences Institute (ILSI) is coordinating a program undertaken by 25 pharmaceutical and food companies to:
    generate toxicity-related gene expression data under defined experimental conditions
    evaluate gene expression profiles in standardised test systems following exposure to toxicants
    relate changes in gene expression to other measures of toxicity

Microarray data handling and analysis - a major bottleneck (calculations by Jerry Lanfear)
  Experiments:
    100 000 genes in human
    320 cell types
    2000 compounds
    3 time points
    2 concentrations
    2 replicates
  Data:
    8 x 10^11 data points
    1 x 10^15 bytes = 1 petabyte of data

Expression data repository projects
  Public repositories in the making:
    GEO - NCBI
    GeneX - NCGR
    ArrayExpress - EBI
  In-house databases: Stanford, MIT, University of Pennsylvania
  Organism-specific databases: mouse at the Jackson Laboratory
  Proprietary databases: Gene Logic, NCI

Difficulties
  Raw data are images
  What is needed for higher-level analysis and mining is a gene expression matrix (genes / samples / gene expression levels)
  Lack of standard measurement units for gene expression
  Lack of standards for sample annotation

Raw data - images
  Treated sample labeled red (Cy5)
  Control sample labeled green (Cy3)
  Competitive hybridization onto the chip
  Red dot - gene overexpressed in treated sample
  Green dot - gene underexpressed in treated sample
  Yellow dot - equally expressed
  Intensity - "absolute" level
  red/green - ratio of expression
    2 - 2x overexpressed
    0.5 - 2x underexpressed
  log2(red/green) - "log ratio"
    1 - 2x overexpressed
    -1 - 2x underexpressed
  [Figure: cDNA spotted microarray, Stanford University (yeast, 1997)]

Gene expression matrix
  [Figure labels: Samples, Genes, Gene expression levels]

Gene expression levels
  What we would like to have:
    gene expression levels expressed in some standard units (e.g. molecules per cell)
    a reliability measure associated with each value (e.g. standard deviation)
  What we do have:
    each experiment using different units
    no reliability information

Comparing expression data

Measurement units
  In perspective:
    standard controls for experiments (on chips and in the samples)
    replicate measurements
  Temporary solution:
    storing intermediate analysis results (including the images) and annotations of how they were obtained, i.e. the evidence

Comparing expression data - problem 2
  How do gene names relate across different data matrices?
  How do samples relate across different data matrices?

Sample annotation
  Gene expression data have meaning only in the context of the experimental conditions of the target system
  Controlled vocabularies and ontologies (species, cell types, compound nomenclature, treatments, etc.) are needed for unambiguous sample annotation
  Sample annotations in current public databases are typically useless

In perspective
  Standard units for gene expression measurements
  Standards for sample annotation

More immediate actions
  Understand what information about microarray experiments should be captured to make the descriptions reasonably self-contained
  Develop a data exchange format able to capture this minimum information
  Develop recommendations on how data should be normalised and what controls should be used

MGED group
  The MGED group is an open discussion group initially established at the Microarray Gene Expression Database meeting MGED 1 (14-15 November, 1999, Cambridge, UK).
  The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods.
  The underlying goal is to facilitate the establishment of gene expression data repositories, the comparability of gene expression data from different sources, and the interoperability of different gene expression databases and data analysis software.
  Since 1999 the group has had two general meetings, and a third is planned for 2001.
  For more, see www.mged.org

MGED participants include
  Affymetrix
  Berkeley
  DDBJ
  DKFZ
  EMBL
  Gene Logic
  Incyte
  Max Planck Institute
  NCBI
  NCGR
  NHGRI
  Sanger Centre
  Stanford
  Uni Pennsylvania
  Uni Washington
  Whitehead Institute
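
The two pieces of arithmetic in the slides, the red/green log ratio and the data-point estimate, can be checked with a minimal Python sketch. The function names and the DIMENSIONS dictionary are illustrative assumptions, not part of any MGED software; the numbers are the ones quoted in the slides.

```python
import math

# Experiment dimensions as quoted in the slides (calculations by Jerry Lanfear).
# The dictionary and its key names are illustrative, not from the original.
DIMENSIONS = {
    "genes": 100_000,
    "cell_types": 320,
    "compounds": 2000,
    "time_points": 3,
    "concentrations": 2,
    "replicates": 2,
}


def log_ratio(red, green):
    """Return log2(red / green) for one spot.

    red   - intensity of the treated sample (Cy5 channel)
    green - intensity of the control sample (Cy3 channel)
    A positive result means overexpression in the treated sample,
    a negative result means underexpression.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


def count_data_points(dimensions):
    """Multiply all experiment dimensions to get the number of measurements."""
    return math.prod(dimensions.values())


if __name__ == "__main__":
    print(log_ratio(2.0, 1.0))            # 1.0  (2x overexpressed)
    print(log_ratio(1.0, 2.0))            # -1.0 (2x underexpressed)
    print(count_data_points(DIMENSIONS))  # 768000000000, about 8 x 10^11
```

The log ratio is symmetric around zero, so 2x over- and underexpression map to +1 and -1, whereas the plain ratio maps them to 2 and 0.5. The slides do not state how the 1 petabyte figure follows from the data-point count, so the sketch does not attempt to reproduce it.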