Friday, September 14, 2018. Data Mining: Concepts and Techniques

Knowledge Mining Theory and Technology (知识挖掘理论和技术)
Chapter 1
Wei Wei (魏玮)
School of Computer Science and Software, Hebei University of Technology
weiweiscse.hebut.edu.cn

Textbook: Data Mining: Concepts and Techniques (数据挖掘:概念与技术)
Jiawei Han
Department of Computer Science, University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/hanj
© 2006 Jiawei Han and Micheline Kamber. All rights reserved.
Chinese edition translated by Fan Ming and Meng Xiaofeng, China Machine Press, 2008

Data Mining: Concepts and Techniques
- 2nd ed., to be published in Jan. 2006
- Seven chapters will be covered in the Fall semester
- E-preprints will be used in this class

Reference Books
- Zhu Ming (ed.). Data Mining (数据挖掘). University of Science and Technology of China Press, 2002.
- Jiawei Han, Micheline Kamber (Canada). Data Mining: Concepts and Techniques (English edition, 2nd ed.). China Machine Press, April 2006.
- Mehmed Kantardzic (USA). Data Mining: Concepts, Models, Methods, and Algorithms (数据挖掘概念、模型、方法和算法). Translated by Shan Siqing, Chen Yin, Cheng Yan, et al. Tsinghua University Press, 2003.
- Richard J. Roiger, Michael W. Geatz (USA). Data Mining Tutorial (数据挖掘教程). Translated by Weng Jingnong. Tsinghua University Press, 2003.

Disciplinary Coverage
- Introduction
- Data Preprocessing
- Data Warehouse and OLAP Technology: An Introduction
- Advanced Data Cube Technology and Data Generalization
- Mining Frequent Patterns, Association and Correlations
- Classification and Prediction
- Cluster Analysis

Chapter 1. Introduction
- Motivation: Why data mining?
- What is data mining?
- Data mining: On what kind of data?
- Data mining functionality
- Are all the patterns interesting?
- Classification of data mining systems
- Data mining task primitives
- Integration of a data mining system with a DB and DW system
- Major issues in data mining

1.1 Motivation: Why Data Mining? Necessity Is the Mother of Invention
- Data explosion problem
  - Automated data collection tools, widely used database systems, a computerized society, and the Internet have led to tremendous amounts of data being accumulated and awaiting analysis in databases, data warehouses, the WWW, and other information repositories
- We are drowning in data, but starving for knowledge!
- Solution: Data warehousing and data mining
  - Data warehousing and on-line analytical processing (OLAP)
  - Mining interesting knowledge (rules, regularities, patterns, constraints) from data in large databases

Fig 1.1 The evolution of database system technology

Evolution of Database Technology
- 1960s: Data collection, database creation, IMS and network DBMS
- 1970s: Relational data model, relational DBMS implementation
- 1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.); application-oriented DBMS (spatial, scientific, engineering, etc.)
- 1990s: Data mining, data warehousing, multimedia databases, and Web databases
- 2000s: Stream data management and mining; data mining and its applications; Web technology (XML, data integration) and global information systems

Fig 1.2 We are data rich, but information poor

1.2 What Is Data Mining?
- Data mining (knowledge discovery from data)
  - Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data
  - Data mining: a misnomer?
- Alternative names
  - Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.

Why Data Mining? Potential Applications
- Data analysis and decision support
  - Market analysis and management: target marketing, customer relationship management (CRM), market basket analysis, cross selling, market segmentation
  - Risk analysis and management: forecasting, customer retention, improved underwriting, quality control, competitive analysis
  - Fraud detection and detection of unusual patterns (outliers)
- Other applications
  - Text mining (newsgroups, email, documents) and Web mining
  - Stream data mining
  - Bioinformatics and bio-data analysis

Example 1: Market Analysis and Management
- Where does the data come from?
  - Credit card transactions, loyalty cards, discount coupons, customer complaint calls, plus (public) lifestyle studies
- Target marketing
  - Find clusters of "model" customers who share the same characteristics: interest, income level, spending habits, etc.
  - Determine customer purchasing patterns over time
- Cross-market analysis
  - Find associations/correlations between product sales, and predict based on such associations
- Customer profiling
  - What types of customers buy what products (clustering or classification)
- Customer requirement analysis
  - Identify the best products for different customers
  - Predict what factors will attract new customers
- Provision of summary information
  - Multidimensional summary reports
  - Statistical summary information (data central tendency and variation)
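The cross-market analysis item above (finding associations between product sales) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not from the slides: the five toy transactions, the function name pair_rules, and the thresholds min_support and min_confidence. It only counts item pairs, so it is a simplified stand-in for the frequent-pattern mining methods the course covers later, not an implementation of any of them.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records; each set is the products in one basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) for single-item pairs.

    support    = fraction of baskets containing both items
    confidence = fraction of baskets with lhs that also contain rhs
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so each unordered pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


for lhs, rhs, support, confidence in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data the strongest rule is beer -> diapers: every basket that contains beer also contains diapers. A retailer could use rules of this kind for cross selling, subject to the usual caveat that a high-confidence rule is not automatically an interesting one.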