MPI and OpenMP Paradigms and Their Application to Solving Large-Scale Banded Linear Systems

Lei Xu, Hanyuan Zheng, Zhixiang Liu, Weibing Feng, Wu Zhang*
(School of Computer Science, Shanghai University, Shanghai 200072)

Foundations: Key Project of Science and Technology Commission of Shanghai Municipality No. 10510500600, Shanghai Leading Academic Discipline Project No. J50103, and Ph.D. Programs Fund of Ministry of Education of China No. 200802800007.
Brief author introduction: Lei Xu (1987-), male, high performance computing.
Correspondence author: Wu Zhang (1957-), male, doctor, professor, high performance computing.

Abstract: This paper discusses the performance of the MPI+OpenMP hybrid programming paradigm and its different implementations. We design a multi-granularity parallel algorithm for solving large-scale banded linear systems and compare its performance with a pure MPI algorithm on the high-performance computer of Shanghai University. The results indicate that the hybrid algorithm has better scalability and speedup.

Key words: hybrid paradigm; banded linear systems; MPI; OpenMP

0 Introduction

At present, clusters have become the de facto standard in parallel processing due to their high performance-to-price ratio. The distributed shared-memory parallel machine is the main trend in the development of high-performance computers. Its main features are as follows: each node is a shared-memory multiprocessor, and the nodes are connected by a network (Myrinet, InfiniBand, Ethernet, etc.). In order to take full advantage of a distributed shared-memory system, it is important to find an effective way to use this multi-layer architecture. On such systems there are a variety of parallel programming models, including the pure MPI model, which relies solely on message passing, and the OpenMP model running on SMP nodes. Owing to the two-level storage structure, shared memory within a node and distributed memory across nodes, combining shared-memory access inside a node (OpenMP) with message passing between nodes (MPI) can make better use of a distributed shared-memory system. The hybrid programming model has been applied to many scientific applications [1], and in some areas it has been very successful. Many practical problems [2,3] require the solution of large banded linear systems, so a combined MPI and OpenMP parallel algorithm is of practical significance [4].

The remainder of this paper is organized as follows. First, we introduce the MPI and OpenMP hybrid programming model and its implementation. Then we establish a multi-grain MPI+OpenMP hybrid parallel algorithm for solving large banded linear systems and compare its performance with a pure MPI algorithm on the high-performance computing cluster at Shanghai University. Finally, we draw conclusions.

1 MPI and OpenMP

1.1 MPI

MPI (Message Passing Interface) is a standard specification for a message-passing interface, allowing portable message-passing programs to be written in Fortran and C [5]. It provides users with a message-passing library of wide portability and scalability. MPI processes are heavyweight, single-threaded processes. MPI can be used in both shared-memory and distributed-memory parallel computing environments; it is easily compatible with distributed-memory multicomputers, shared-memory multiprocessors, and combinations of these elements.

However, MPI has the following deficiencies: the decomposition, development, and debugging of applications are relatively difficult, and extensive code changes are usually required. Moreover, communication may incur large overhead, and in order to minimize this delay, a larger code size is typically needed.
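To make the message-passing style concrete, the following is a minimal sketch (illustrative only, not code from this paper) in which every worker process sends one integer to the root process; all calls are standard MPI routines.

    /* Minimal MPI example (illustrative sketch):
     * each non-root process sends its rank to process 0,
     * which receives and prints them.
     * Compile with: mpicc mpi_example.c */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank != 0) {
            /* Each worker sends one integer to the root process. */
            MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else {
            for (int src = 1; src < size; src++) {
                int msg;
                MPI_Recv(&msg, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("root received rank %d\n", msg);
            }
        }
        MPI_Finalize();
        return 0;
    }

Even in this small sketch, the programmer must spell out the data decomposition and the matching send/receive pairs explicitly, which illustrates why MPI development and debugging are relatively difficult.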
1.2 OpenMP

The OpenMP (Open Multi-Processing) Application Program Interface (API) supports multi-platform shared-memory parallel programming in C/C++ and Fortran on most architectures, including Unix platforms and Windows NT platforms [6]. OpenMP consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior. The core elements of OpenMP are the constructs for thread creation, workload distribution (work sharing), data-environment management, thread synchronization, user-level runtime routines, and environment variables. Because communication is hidden, programming with OpenMP is relatively simple, and parallelism is easy to implement. At the same time, OpenMP makes good use of shared-memory architectures, avoiding the overhead of message passing and providing both fine-grained and coarse-grained parallelism.
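As an illustration of this directive-based style, here is a minimal sketch (not from this paper) of a parallel dot product; the work-sharing loop and reduction clause are standard OpenMP, and the thread count can be controlled through the OMP_NUM_THREADS environment variable.

    /* Minimal OpenMP example (illustrative sketch):
     * a parallel dot product using a work-sharing loop with a reduction.
     * Compile with: gcc -fopenmp omp_example.c */
    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        static double x[N], y[N];
        double sum = 0.0;

        for (int i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

        /* The directive distributes loop iterations among threads;
         * the reduction clause combines the per-thread partial sums. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += x[i] * y[i];

        printf("dot = %f (threads available: %d)\n",
               sum, omp_get_max_threads());
        return 0;
    }

Note that, unlike the MPI sketch above, no explicit communication appears anywhere: the shared arrays are visible to all threads, which is what makes OpenMP comparatively simple to program.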
1.3 MPI+OpenMP hybrid programming paradigm

By combining MPI with OpenMP, a hybrid programming model can make full use of the distributed shared-memory architecture of high-performance computers. The main advantage of this hybrid programming model is that it matches the two-level memory hierarchy of SMP clusters: OpenMP threads exploit the shared memory within each node, while MPI handles message passing between nodes. As the results reported in this paper indicate, this yields better scalability and speedup than a pure MPI implementation.
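The following sketch (illustrative only, not the paper's banded-solver algorithm) shows the typical shape of such a hybrid program: each MPI process computes a node-local partial sum with OpenMP threads, and MPI combines the partial results across nodes.

    /* Hybrid MPI+OpenMP sketch (illustrative only):
     * one MPI process per node exchanges data, while OpenMP threads
     * share the work inside the node.
     * Compile with: mpicc -fopenmp hybrid_example.c */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(int argc, char **argv)
    {
        int rank, provided;
        /* MPI_THREAD_FUNNELED: only the master thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        static double a[N];
        double local = 0.0, global = 0.0;

        for (int i = 0; i < N; i++) a[i] = rank + 1.0;

        /* Fine-grained parallelism inside the node: OpenMP threads. */
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < N; i++)
            local += a[i];

        /* Coarse-grained parallelism across nodes: MPI message passing. */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);
        if (rank == 0)
            printf("global sum = %f\n", global);

        MPI_Finalize();
        return 0;
    }

This two-level structure, MPI between nodes and OpenMP within a node, is the multi-granularity organization on which the banded-solver algorithm in the following sections is built.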