Lecture 10: Assessing and Understanding Performance
Reading: 4.1-4.6, 4.7*
Homework: 4.1, 4.2, 4.3, 4.6, 4.7, 4.14, 4.19-4.22
This courseware is based on the slides of Prof. John Nestor, Lafayette College, USA.
Computer Organization and Architecture
ECE 313 Fall 2006, Lecture 10 - Performance

Roadmap for the term: major topics
- Computer Systems Overview
- Technology Trends
- Instruction Sets (and Software)
- Logic & Arithmetic
- Performance (this lecture)
- Processor Implementation
- Memory Systems
- Input/Output

Performance Outline
- Motivation (current section)
- Defining Performance
- Common Performance Metrics
- Benchmarks
- Amdahl's Law

Performance
- Goal: learn to measure, report, and summarize the performance of computer systems
- Why study performance?
  - To make better-informed decisions when choosing a system, e.g. when buying a computer
  - To make better-informed decisions when designing a system
  - To understand the factors that affect implementation decisions
- Challenges
  - How do we measure performance accurately?
  - How do we compare performance fairly?

What's a good measure of performance?
- What does it mean for one computer to be better than another?
- Execution time (a.k.a. response time, latency): what an individual user cares about
  - How long it takes to complete a single task
  - Example: "How long does it take to rip an MP3 file?"
- Throughput: what a data center manager cares about
  - How many tasks are completed per unit time
  - Example: "How many MP3 files can I rip per hour?"
- In practice, the metric we use depends on the application

Execution Time vs. Throughput
- Analogy: passenger aircraft (book Figure 4.1)
- Concorde - fastest "response time" for an individual user
- Boeing 747 - highest passenger throughput

Example: Throughput vs. Response Time
- Do the following changes to a computer system increase throughput, reduce response time, or both?
  1. Replace the processor with a faster one.
  2. Some computers contain multiple processors, with different processors handling different tasks (e.g. an Internet search system). What is the effect of adding more processors to such a system?
- For change 1, both response time and throughput improve.
- For change 2, no single task runs faster, but throughput increases. If demand exceeds the available throughput, requests queue up in the system and wait to be processed.
- In many real systems, the two affect each other.

Performance Outline
- Motivation
- Defining Performance (current section)
- Common Performance Metrics
- Benchmarks
- Amdahl's Law

Measuring Execution Time
- Wall-clock time or elapsed time: the time from when the program starts until it exits, i.e. the response time
  - Includes I/O waiting
  - Includes time while the OS runs other jobs
- CPU time - measured by the OS
  - User CPU time: time spent running the user program's code
  - System CPU time: time spent in OS code that has to run on behalf of the user program
- Measuring CPU time: the Unix/Linux time command

    time myprog
    90.7u 12.9s 2:39 65%

  - 90.7u: user time (seconds)
  - 12.9s: system time (seconds)
  - 2:39: wall-clock time (159 seconds)
  - 65%: CPU utilization = (90.7 + 12.9) / 159

Defining Performance Using Execution Time
- For a given program on machine X: $\text{Performance}_X = 1 / \text{Execution Time}_X$
- Comparing the performance of machines: $\text{Performance}_X > \text{Performance}_Y$ if $\text{Execution Time}_X < \text{Execution Time}_Y$

[...]

- [...] $> t_{prop} + t_{setup}$
- $t_{clock} = t_{prop} + t_{setup} + t_{slack}$
- The clock needs a slack time $t_{slack}$

Average Clock Cycles per Instruction (CPI)
- Clock Cycles per Instruction (CPI)
- Consider the 68HC11
  - ADDA - 3 cycles (IMM), 5 cycles (IND, Y)
  - MUL - 10 cycles
  - IDIV - 41 cycles
- More complex processors have other issues
  - Pipelining - parallel execution, but sometimes has to stall
  - Memory system issues: cache misses, page faults, etc.
- How do we combine all of these into a single metric?
- [Figure: instruction sequence add, lda, mul, and, sta making up the total execution time]

Definition: Clock Cycles per Instruction (CPI)
- Average number of clock cycles per instruction
- Measured for an entire program
- $\text{CPI} = \text{Clock Cycles} / \text{Instruction Count}$

Example - CPI
- What is the CPI of A?
- What is the CPI of B?

Definition - MIPS
- MIPS - millions of instructions per second
- Once widely used as a general-purpose performance metric
- But not useful for comparing different architectures
- Often mocked as a "meaningless indicator of performance"

Example - MIPS
- What is the MIPS of A?
- What is the MIPS of B?
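
The metrics defined on the slides above (CPU utilization from the `time` output, CPI, and MIPS) can be sketched in a few lines of Python. The utilization figures are taken from the `time myprog` example. The cycle count, instruction count, and clock rate used for CPI and MIPS are made-up illustration values, because the data for machines A and B is not available here. All function and variable names are hypothetical.

```python
def cpu_utilization(user_s, system_s, wall_clock_s):
    """Fraction of elapsed (wall-clock) time the CPU spent on this program."""
    return (user_s + system_s) / wall_clock_s


def cpi(clock_cycles, instruction_count):
    """Average clock cycles per instruction, measured over a whole program."""
    return clock_cycles / instruction_count


def mips(instruction_count, execution_time_s):
    """Millions of instructions executed per second."""
    return instruction_count / (execution_time_s * 1e6)


# Values from the `time myprog` example: 90.7u 12.9s 2:39
wall_clock_s = 2 * 60 + 39                  # 2:39 is 159 seconds
util = cpu_utilization(90.7, 12.9, wall_clock_s)
print(f"CPU utilization: {util:.0%}")       # matches the 65% reported by time

# Assumed values for illustration only (not from the slides)
cycles = 20_000_000
instructions = 10_000_000
clock_rate_hz = 400e6
exec_time_s = cycles / clock_rate_hz        # CPU time = cycles / clock rate

print(f"CPI:  {cpi(cycles, instructions):.1f}")
print(f"MIPS: {mips(instructions, exec_time_s):.0f}")
```

Note that MIPS depends on the instruction count of a particular instruction set, which is one reason it does not compare fairly across architectures.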

Relating the Metrics - The Performance Equation
- The "Iron Law" of Performance
- $\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time}$

Clock Cycles and Performance - Example (page 163)
- Program runs on Computer A:
  - CPU time: 10 seconds
  - Clock: 400 MHz
- Computer B can run its clock faster
  - But it requires 1.2 times as many clock cycles to perform the same task
  - Desired CPU time: 6 seconds
- What clock rate is needed to reach this target?
- Key to the approach: the performance equation
  - $\text{Clock Cycles} = \text{Instruction Count} \times \text{CPI}$

Clock Cycles and Performance - Example (cont'd)
- First step: find the clock cycles executed by Computer A
- Second step: find the clock cycles executed by Computer B
- Third step: given clock cycles and CPU time, solve for the clock rate of Computer B

Performance Tradeoffs
- Program instruction count - impacted by
  - The instructions available to perform basic functions (depends on the instruction set architecture)
  - The quality of the code generated by the compiler
- CPI - impacted by
  - The chip implementation
  - The quality of the code generated by the compiler
  - The performance of the memory system
- Clock rate - impacted by
  - Delay characteristics of the IC technology
  - Logical structure of the implementation

Factors Affecting Performance
- The basic components of computer performance, their meaning, and their units of measurement
  - Figure 4-2
- Program performance depends on many things, including the algorithm, programming language, compiler, program structure, and hardware
  - See the summary on page 165
- The example on page 165 shows that judging computer performance by a single factor is inappropriate
- Self-check questions on pages 161 and 166
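
The three steps of the Computer A / Computer B clock-rate example can be worked through as follows. This is a minimal sketch that uses only the numbers given on the slide (10 s at 400 MHz, 1.2 times the cycles, 6 s target) and the relation CPU time = clock cycles / clock rate. The helper names are hypothetical.

```python
def clock_cycles(cpu_time_s, clock_rate_hz):
    """Clock cycles executed = CPU time x clock rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles, target_cpu_time_s):
    """Clock rate needed to execute `cycles` within the target CPU time."""
    return cycles / target_cpu_time_s


# Step 1: clock cycles executed by Computer A (10 s at 400 MHz)
cycles_a = clock_cycles(10.0, 400e6)

# Step 2: Computer B needs 1.2 times as many cycles for the same task
cycles_b = 1.2 * cycles_a

# Step 3: solve for B's clock rate given the 6-second target
rate_b_hz = required_clock_rate(cycles_b, 6.0)

print(f"Cycles on A: {cycles_a:.2e}")
print(f"Cycles on B: {cycles_b:.2e}")
print(f"Clock rate for B: {rate_b_hz / 1e6:.0f} MHz")
```

Computer B has to run at twice Computer A's clock rate to gain a 10/6 speedup, because 20% of the extra clock speed is spent on the additional cycles.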