资源预览内容
第1页 / 共15页
第2页 / 共15页
第3页 / 共15页
第4页 / 共15页
第5页 / 共15页
第6页 / 共15页
第7页 / 共15页
第8页 / 共15页
第9页 / 共15页
第10页 / 共15页
亲,该文档总共15页,到这儿已超出免费预览范围,如果喜欢就下载吧!
资源描述
本科毕业论文外文文献及译文文献、资料题目:Cluster AnalysisBasicConceptsand Algorithms文献、资料来源:文献、资料发表(出版)日期:院 (部): 土木工程学院专 业: 土木工程班 级:姓 名:学 号: 指导教师:翻译日期:外文文献:Cluster AnalysisBasicConceptsand AlgorithmsCluster analysis divides data into groups (clusters) that are meaningful, useful,or both. If meaningful groups are the goal, then the clusters should capture thenatural structure of the data. In some cases, however, cluster analysis is only auseful starting point for other purposes, such as data summarization. Whetherfor understanding or utility, cluster analysis has long played an importantrole in a wide variety of elds: psychology and other social sciences, biology,statistics, pattern recognition, information retrieval, machine learning, anddata mining.There have been many applications of cluster analysis to practical problems. We provide some specic examples, organized by whether the purposeof the clustering is understanding or utility.Clustering for UnderstandingClasses, or conceptually meaningful groupsof objects that share common characteristics, play an important role in howpeople analyze and describe the world. Indeed, human beings are skilled atdividing objects into groups (clustering) and assigning particular objects tothese groups (classication). For example, even relatively young children canquickly label the objects in a photograph as buildings, vehicles, people, animals, plants, etc. In the context of understanding data, clusters are potentialclasses and cluster analysis is the study of techniques for automatically ndingclasses. The following are some examples:Biology. Biologists have spent many years creating a taxonomy (hierarchical classication) of all living things: kingdom, phylum, class,order, family, genus, and species. Thus, it is perhaps not surprising thatmuch of the early work in cluster analysis sought to create a disciplineof mathematical taxonomy that could automatically nd such classication structures. More recently, biologists have applied clustering toanalyze the large amounts of genetic information that are now available.For example, clustering has been used to nd groups of genes that havesimilar functions. Information Retrieval. The World Wide Web consists of billions ofWeb pages, and the results of a query to a search engine can returnthousands of pages. Clustering can be used to group these search results into a small number of clusters, each of which captures a particularaspect of the query. For instance, a query of “movie” might returnWeb pages grouped into categories such as reviews, trailers, stars, andtheaters. Each category (cluster) can be broken into subcategories (sub-clusters), producing a hierarchical structure that further assists a users exploration of the query results. Climate. Understanding the Earths climate requires nding patternsin the atmosphere and ocean. To that end, cluster analysis has beenapplied to nd patterns in the atmospheric pressure of polar regions andareas of the ocean that have a signicant impact on land climate. Psychology and Medicine. An illness or condition frequently has anumber of variations, and cluster analysis can be used to identify thesedifferent subcategories. For example, clustering has been used to identifydifferent types of depression. Cluster analysis can also be used to detectpatterns in the spatial or temporal distribution of a disease. Business. Businesses collect large amounts of information on current and potential customers. Clustering can be used to segment customers into a small number of groups for additional analysis and marketing activities.Clustering for Utility:Cluster analysis provides an abstraction from individual data objects to the clusters in which those data objects reside. Additionally, some clustering techniques characterize each cluster in terms of a cluster prototype; i.e., a data object that is representative of the other objects in the cluster. These cluster prototypes can be used as the basis for a number of data analysis or data processing techniques. Therefore, in the context of utility, cluster analysis is the study of techniques for nding the most representative cluster prototypes. Summarization. Many data analysis techniques, such as regression or PCA, have a time or space complexity of O(m2) or higher (where m is the number of objects), and thus, are not practical for large data sets. However, instead of applying the algorithm to the entire data set, it can be applied to a reduced data set consisting only of cluster prototypes. Depending on the type of analysis, the number of prototypes, and the accuracy with which the prototypes represent the data, the results can be comparable to those that would have been obtained if all the data could have been used. Compression. Cluster prototypes can also be used for data compres-sion. In particular, a table is created that consists of the prototypes for each cluster; i.e., each prototype is
收藏 下载该资源
网站客服QQ:2055934822
金锄头文库版权所有
经营许可证:蜀ICP备13022795号 | 川公网安备 51140202000112号