Cloud computing platform, infrastructure and theory

Jiaheng Lu
Key Lab of Data Engineering and Knowledge Engineering
Renmin University of China
www.jiahenglu.net

Outline
- Overview of cloud computing
- Google cloud computing technologies: GFS, Bigtable and MapReduce
- Introduction to the open-source platform Hadoop
- Cloud computing theory: transaction processing theory and Datalog theory

New textbook: Overview of Distributed Systems and Cloud Computing
- A textbook giving a systematic treatment of distributed systems and cloud computing
- Published by Tsinghua University Press, September 2010
- You are welcome to use it and to send your comments!

In-class assignments and programming exercises
- During class, please carefully answer three questions on cloud computing.
- After class, submit your MapReduce program at cloudcomputing.ruc.edu.cn.
- The test results are one of the important bases for awarding the graduation certificate and for recommending students to Microsoft Research.

Cloud computing

Why do we use cloud computing?

Case 1:
- Write a file, then save it
- If the computer goes down, the file is lost
- Files that are always stored in the cloud are never lost

Case 2:
- Use IE: download, install, use
- Use QQ: download, install, use
- Use C++: download, install, use
- Instead, get the service from the cloud

What is cloud and cloud computing?

Cloud: on-demand resources or services over the Internet, with the scale and reliability of a data center.

Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.

Characteristics of cloud computing
- Virtual: software, databases, Web servers, operating systems, storage and networking are provided as virtual servers.
- On demand: processors, memory, network bandwidth and storage can be added and removed as needed.

Types of cloud service
- IaaS: Infrastructure as a Service
- PaaS: Platform as a Service
- SaaS: Software as a Service

SaaS: Software delivery model
- No hardware or software to manage
- Service delivered through a browser
- Customers use the service on demand
- Instant scalability

SaaS: Examples
- Your current CRM package is not managing the load, or you simply don't want to host it in-house. Use a SaaS provider such as Salesforce.com.
- Your email is hosted on an Exchange server in your office and it is very slow.
Outsource this using Hosted Exchange.

PaaS: Platform delivery model
- Platforms are built upon infrastructure, which is expensive
- Estimating demand is not a science!
- Platform management is not fun!

PaaS: Examples
- You need to host a large file (5Mb) on your website and make it available to 35,000 users for only two months. Use CloudFront from Amazon.
- You want to start storage services on your network for a large number of files and you do not have the storage capacity. Use Amazon S3.

IaaS: Computer infrastructure delivery model
- A platform virtualization environment
- Computing resources, such as storage and processing capacity
- Virtualization taken a step further

IaaS: Examples
- You want to run a batch job but you don't have the infrastructure necessary to run it in a timely manner. Use Amazon EC2.
- You want to host a website. Use Google App Engine.

Cloud computing and other computing techniques

The 21st Century Vision Of Computing
Sun Microsystems co-founder Bill Joy

Definitions: cloud, grid, cluster, utility

Utility computing is the packaging of computing resources, such as computation and storage, as a metered service similar to a traditional public utility.

A computer cluster is a group of linked computers, working together closely so that in many respects they form a single computer.
Grid computing is the application of several computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data.

Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

Grid Computing
- Define methods by which consumers discover, request and use resources provided by the central facilities;
- Implement the often highly parallel computations that execute on those resources.

GFS chooses and returns the offset it writes to, and appends the data to each replica at least once.
- Heavily used by Google's distributed applications.
- No need for a distributed lock manager
- GFS chooses the offset, not the client

Atomic Record Append: How?
- Follows a similar control flow to mutations
- The primary tells secondary replicas to append at the same offset as the primary
- If the append fails at any replica, it is retried by the client.
- So replicas of the same chunk may contain different data, including duplicates, whole or in part, of the same record

Atomic Record Append: How?
- GFS does not guarantee that all replicas are bitwise identical.
- It only guarantees that data is written at least once as an atomic unit.
- Data must be written at the same offset on all chunk replicas for success to be reported.

Detecting Stale Replicas
- The master keeps a chunk version number to distinguish up-to-date and stale replicas
- Increase version when grant
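
The record-append behaviour described on the "Atomic Record Append" slides can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not GFS code: the names `Replica`, `write_at` and `record_append` are hypothetical, each chunk is modelled as a list of record slots rather than bytes, and failures are injected by attempt number. It shows how a retry makes the primary choose a new offset, so that replicas end up differing (a duplicate on some, padding on others) while the record is still present at one common offset everywhere.

```python
from typing import List, Optional, Sequence


class Replica:
    """Hypothetical in-memory stand-in for one replica of a chunk."""

    def __init__(self, fail_on_attempts: Sequence[int] = ()):
        self.slots: List[Optional[str]] = []   # None marks a padded slot
        self.fail_on_attempts = set(fail_on_attempts)
        self.attempts = 0

    def write_at(self, offset: int, record: str) -> bool:
        """Write the record at the given offset; False simulates a failure."""
        self.attempts += 1
        if self.attempts in self.fail_on_attempts:
            return False
        while len(self.slots) < offset:        # pad any gap left by earlier failures
            self.slots.append(None)
        if len(self.slots) == offset:
            self.slots.append(record)
        else:
            self.slots[offset] = record
        return True


def record_append(primary: Replica, secondaries: Sequence[Replica],
                  record: str, max_retries: int = 5) -> int:
    """Append at least once; return the offset at which every replica holds the record."""
    for _ in range(max_retries):
        offset = len(primary.slots)            # the primary picks the offset, not the client
        ok = primary.write_at(offset, record)
        for replica in secondaries:            # secondaries write at the primary's offset
            ok = replica.write_at(offset, record) and ok
        if ok:
            return offset                      # success only if all replicas wrote there
    raise RuntimeError("record append failed after retries")


if __name__ == "__main__":
    p, s1, s2 = Replica(), Replica(), Replica(fail_on_attempts=[1])
    print(record_append(p, [s1, s2], "A"))
    print(p.slots, s1.slots, s2.slots)
```

In this sketch the first attempt fails on one secondary, so the retried record lands at the next offset: the primary holds the record twice, the failed secondary holds a padded slot followed by the record, and only the returned offset is guaranteed to contain it on all replicas.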