Cisco China on Baidu Wenku: http://wenku.baidu.com/org/view?org=ciscochina
Cisco Interactive Network home page: www.cisco.com/go/cn/cin

Big Data Technology Seminar
Watch the concurrent online seminar: https://grs.cisco.com/grsx/cust/grsCustomerSurvey.html?SurveyCode=9082&KeyCode=000238223&ad_id=bdc100

Agenda
The Big Data Era
Big Data Technology Overview
Cisco CPA Big Data Architecture

What is big data?

IDC: "Big data refers not only to data itself but also to a set of technologies designed to collect, manage, mine, and analyze large collections of information to solve complex problems."

Amazon Web Services (AWS): At a recent Big Data and High Performance Computing Summit in Boston hosted by AWS, data scientist John Rauser offered a simple definition: any amount of data that's too big to be handled by one computer. Some say that's too simplistic. Others say it's spot on.

McKinsey Global Institute (MGI): "Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. MGI also presents strong evidence that big data can play a significant economic role, to the benefit not only of private commerce but also of national economies and their citizens. Data can create significant value for the world economy, enhancing the productivity and competitiveness of companies and the public sector and creating substantial economic surplus for consumers.
McKinsey Global Institute, Foundation Research and Analytics Team

Information technology permeates human life

Big Data use cases (figure labels): fraud, offline auditing, public search keywords, network leaks, suspicious events, behavior analysis, crime anticipation, fuel savings, predicting the future, daily cost estimation, redeveloped areas, unfamiliar products, reputation, understanding the present.
Big Data stages: collect, store, process, analyze, act.

Platform diagram (labels): in-memory, MPP database, big data, GPU image processing. Servers and applications; each server provides processing, I/O, network, and storage. User interface, enterprise application systems, business functions. Virtual servers and physical servers. Intelligent resource scheduling mechanism; general-purpose platform resource pool; stand-by.

Database landscape

Structured data (applications): sales, products, process, inventory, finance, payroll, shipping, tracking, authorization, customers, profile.
Unstructured data: machine logs, sensor data, call data records, web click stream data, satellite feeds, GPS data, sales data, blogs, emails, pictures, video.

Traditional Database (transaction): Oracle, DB2, SQL Server, MySQL, SAP HANA
MPP Database (analyze): GreenPlum, Netezza, SAP HANA
"Big Data" Store and Analyze (analyze): Hadoop, MapR
"Big Data" Real-Time Capture, Read and Update Operations, NoSQL (transaction): HBase, Oracle NoSQLDB, Cassandra, MongoDB, CouchDB, Redis, Membase, Neo4j

Chart axes: Structured (RDBMS) to Unstructured (NoSQL DB); Commercial to Open source.

Hadoop Distributed File System
Data is not centrally located.
Data is stored across all data nodes in the cluster.
Data is stored in large blocks (128 MB or larger).
Data is stored reliably by replication.
Diagram: a file split into Block 1 through Block 6, with copies of the blocks spread across different data nodes.

MapReduce and the network
Two main workload patterns: BI and ETL.
The complexity of a job (map and reduce) varies greatly depending on the use case and has a large impact on the network.
Programmers write the Map and Reduce functions, and their complexity varies.
Challenges for the network: bursty I/O and stable connections.

Hadoop and NoSQL vendors
Hadoop distribution (similar to what Red Hat does for Linux). Services and support model.
Spin-out from Yahoo. Services and support model for Apache Hadoop. Main customer is Yahoo, which it supports.
Rewrote Hadoop with many optimizations (rewrote HDFS as a C++ file system and distributed the metadata).
EMC Greenplum Hadoop distribution. Uses MapR.
Hadoop distribution and NoSQL-like offering to be announced at this year's Oracle OpenWorld. Very similar to HBase and other NoSQL offerings. Based on Berkeley DB.
Other NoSQL-like offerings.
Various others.

A Distributed, Scalable Key-Value Database
Simple Data
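
The HDFS storage model described earlier in the deck (a file cut into large blocks, blocks spread over the data nodes, each block stored more than once) can be illustrated with a small sketch. This is only a toy model under stated assumptions: the function names, the node names, and the round-robin placement are hypothetical and chosen for illustration; real HDFS uses its own, rack-aware placement policy. The 128 MB block size comes from the slide, and the replication factor is left as a parameter because the slide does not state one.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the block size mentioned on the slide


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return a list of (block_id, length_in_bytes) covering file_size bytes."""
    count = -(-file_size // block_size)  # integer ceiling division
    blocks = []
    for i in range(count):
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - i * block_size)
        blocks.append((i + 1, length))
    return blocks


def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes.

    Assumption: simple round-robin placement, NOT the real HDFS policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {node: [] for node in nodes}
    for idx, block_id in enumerate(block_ids):
        for r in range(replication):
            # Consecutive node indices modulo len(nodes) are distinct
            # because replication <= len(nodes).
            node = nodes[(idx + r) % len(nodes)]
            placement[node].append(block_id)
    return placement


if __name__ == "__main__":
    # Hypothetical 700 MB file on a hypothetical six-node cluster.
    blocks = split_into_blocks(700 * 1024 * 1024)
    nodes = ["node%d" % n for n in range(1, 7)]
    layout = place_replicas([block_id for block_id, _ in blocks], nodes)
    for node, held in layout.items():
        print(node, held)
```

Because every block lives on several nodes, losing one node in this model still leaves at least one other copy of each block it held, which is the reliability property the slide refers to.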