# Courses / DescriptionsCOMP_ENG 358: Intro to Parallel Computing

### Quarter Offered

Spring : 2-3:20 TuTh ; G. Memik### Prerequisites

COMP_ENG 361 and COMP_SCI 211 or COMP_SCI 230 or have graduate student standing### Description

Introduction to parallel computing for scientists and engineers. Shared memory parallel architectures and programming, distributed memory, message-passing data-parallel architectures, and programming.

**REQUIRED TEXT:** Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar. (2003). *Introduction to Parallel Computing, 2nd Edition*. Pearson Education/Addison Wesley. ISBN-13: 9780201648652

**REFERENCE TEXTS:**

- Ian Foster. (1995).
*Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering*, Addison Wesley. ISBN-13: 978-0201575941 - Barr E. Bauer. (1992).
*Practical Parallel Programming*, Academic Press. ISBN-13: 978-1493306039 - William Gropp et al. (1994).
*Using MPI: Portable Parallel Programming with the Message Passing**Interface*, MIT Press. - Charles H. Koelbel et al. (1994).
*The High-Performance Fortran Handbook*, MIT Press. ISBN-13: 978-0262610940

**COURSE COORDINATOR: Prof. Gokhan Memik**

**COURSE GOALS: **To provide an introduction to the field of parallel computing. The goals are to provide an overview of the three basic types of parallel computing: shared memory, distributed memory message-passing, and data parallel computing, with hands-on experience with real parallel programming on actual parallel machines.

**PREREQUISITES BY TOPIC:**

- An overview of computer architecture
- Basic concepts of processors, ALUs, memories, caches, input-output
- Basic to intermediate concepts on programming of serial computers using C or Fortran
- Simple concepts of data structures like arrays and link lists in programs
- Some knowledge of scientific and engineering applications

**DETAILED COURSE TOPICS:**

**Week 1**: Introduction to parallel computing: motivation for parallel computing, options of parallel computing, economics of parallel computing, basic concepts of parallel algorithms. Introduction to parallel programming: data and task parallelism, coarse and fine grain parallelism, performance of parallel programs, load balancing and scheduling, analysis of simple parallel programs.**Week 2**: Overview of shared memory parallel architectures: memory organization, interconnect organization, cache coherence, case studies of machines such as SGI Challenge, IBM J-30, HP/Convex Exemplar. Introduction to shared memory parallel programming: shared memory model, process creation and destruction, mutual exclusion, locks, barriers.**Week 3:**Explicit shared memory programming: loop scheduling, static and dynamic, loop parallelization strategies. Shared memory parallel programming: use of PTHREADS libraries, case studies of explicit parallel programming, computation of PI, matrix multiplication, solution of partial differential equations.**Week 4**: Implicit shared memory parallel programming: use of compiler directives for parallel programming, DOALL and DOACROSS and PRAGMA directives for loop level parallelism, parallel programming examples using directives.**Week 5**: Distributed memory multicomputer architectures: overview of distributed memory parallel machines, message passing schemes, store and forward versus wormhole routing, interconnection networks, case studies of parallel machines such as Intel Paragon, IBM SP-2, Thinking Machine CM-5. Global Communication operations in distributed memory machines: one-to-all broadcast, reduction, shift, scatter, gather operations, analysis of performance of above operations on various parallel architectures.**Week 6**: Introduction to message-passing programming: basics of message passing, global and local addresses, single-program multiple data (SPMD programs) introduction to Message Passing Interface (MPI). Intermediate concepts in message passing programming: global and local addresses, loop scheduling for parallel loops. Advanced message-passing concepts: topologies, and decompositions, case studies of example applications, matrix multiplication, solution of partial differential equations.**Week 7**: Introduction to SIMD parallel architectures: Single-instruction multiple data stream architectures, control and data units, interconnection networks, case studies of machines such as Thinking Machines CM-2, CM-5 and Masspar MP-2. Introduction to data parallel programming: Fortran-90, array sections, array operations, array intrinsic operations.**Week 8**: Introduction to High Performance Fortran (HPF): FORALL directives, INDEPENDENT directives, simple parallel programs. High Performance Fortran data distribution and alignment directives, simple parallel programming examples, matrix multiplication, solution of partial differential equations.**Week 9**: Methodology for Parallel Algorithm Design: concurrency, locality, scalability, modularity; partitioning, agglomeration, communication, mapping, performance analysis of parallel algorithms. Parallel Matrix Algorithms: matrix representations, parallel dense matrix operations, matrix-vector, matrix-matrix multiplication, solutions of linear system of equations.**Week 10**: Parallel Sparse Matrix Solvers: sparse matrix representations, parallel iterative methods, parallel direct methods. Parallel Search Algorithms: optimization methods, parallel best first search, parallel depth-first search, speedup anomalies.

**COMPUTER USAGE: **Students get hands-on parallel programming experience on 3 parallel machines at the Electrical and Computer Engineering Department, including a 16 processor IBM SP-2 distributed memory machine, an 8 processor IBM J-40 shared memory machine, and an 8-processor SGI Origin 2000 distributed shared memory multiprocessor. In addition, students will use the machines in the Wilkinson Lab as a cluster as well as access a dedicated cluster with 16 processors for the final project.

**HOMEWORK ASSIGNMENTS:**

**Homework 1:**Design problems dealing with shared memory parallel programming, examples of program transformations to parallelize loops, use of explicit and implicit parallel programs using both the SGI directives and PTHREADS.**Homework 2:**Design problems dealing with distributed memory message-passing parallel programming, use of MPI, analysis of communication patterns.**Homework 3:**Design programs related to data parallel programming, use of High Performance Fortran, data layouts and alignments.**Homework 4:**Design of parallel algorithms for various problems including matrix operations on dense and sparse matrices, analysis of parallel algorithms.

**LABORATORY PROJECTS:**

**Lab 1:**Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using both explicit parallel programs with PTHREADS and implicit parallel programs using SGI directions, experiments on SGI Origin 2000 and IBM J-40 shared memory multiprocessors.**Lab 2:**Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using message-passing parallel programming using C or Fortran with MPI message passing, and experiments on IBM SP-2 distributed memory and SGI Origin 2000 multiprocessor for portability.**Lab 3:**Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using data parallel programming with High Performance Fortran and experiments on the IBM SP-2 distributed memory multiprocessor.

**GRADES:**

- Four homeworks - 20 %
- Three labs - 20 %
- Midterm exam - 30%
- Final exam - 30%

**COURSE OBJECTIVES: When a student completes this course, s/he should be able to:**

- Solve a given problem using parallel computing. Analyze the problem for various ways of parallelization, and design the best parallel algorithm.
- Have a broad understanding of shared memory parallel architectures and programming.
- Design a shared memory parallel program for a given parallel algorithm using both explicit and implicit parallel programming, measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
- Have a broad understanding of distributed memory parallel architectures and programming.
- Design a message-passing distributed memory parallel program for a given parallel algorithm using the portable Message-Passing Interface (MPI), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
- Have a broad understanding of data parallel architectures and programming.
- Design a data parallel program for a given parallel algorithm using High Performance Fortran (HPF), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.

**ABET CONTENT CATEGORY: **100% Engineering (Design component).