EECS 358: Intro to Parallel Computing

Quarter Offered

Spring: 2-3:20 TuTh; G. Memik

Prerequisites

EECS 361 and either EECS 211 or EECS 230, or graduate student standing

Description

Introduction to parallel computing for scientists and engineers. Shared memory parallel architectures and programming; distributed memory, message-passing parallel architectures and programming; and data-parallel architectures and programming.

REQUIRED TEXTS: A. Grama, A. Gupta, G. Karypis, and V. Kumar, Introduction to Parallel Computing, Addison Wesley, 2nd edition, 2003.

REFERENCE TEXTS:

  • I. Foster, Designing and Building Parallel Programs, Addison Wesley, 1995.
  • B. Bauer, Practical Parallel Programming, Academic Press, 1992.
  • W. Gropp et al., Using MPI: Portable Parallel Programming with the Message Passing Interface, MIT Press, 1994.
  • C. Koelbel et al., The High-Performance Fortran Handbook, MIT Press, 1994.

COURSE COORDINATOR: Prof. Gokhan Memik

COURSE GOALS: To provide an introduction to the field of parallel computing: an overview of the three basic types of parallel computing (shared memory, distributed memory message-passing, and data parallel), together with hands-on experience programming actual parallel machines.

PREREQUISITES BY TOPIC:

  1. An overview of computer architecture
  2. Basic concepts of processors, ALUs, memories, caches, input-output
  3. Basic to intermediate concepts of programming serial computers in C or Fortran
  4. Simple concepts of data structures such as arrays and linked lists
  5. Some knowledge of scientific and engineering applications

DETAILED COURSE TOPICS:

  • Week 1: Introduction to parallel computing: motivation for parallel computing, options for parallel computing, economics of parallel computing, basic concepts of parallel algorithms. Introduction to parallel programming: data and task parallelism, coarse- and fine-grain parallelism, performance of parallel programs, load balancing and scheduling, analysis of simple parallel programs.
  • Week 2: Overview of shared memory parallel architectures: memory organization, interconnect organization, cache coherence, case studies of machines such as the SGI Challenge, IBM J-30, and HP/Convex Exemplar. Introduction to shared memory parallel programming: shared memory model, process creation and destruction, mutual exclusion, locks, barriers.
  • Week 3: Explicit shared memory programming: loop scheduling, static and dynamic, loop parallelization strategies. Shared memory parallel programming: use of the PTHREADS library, case studies of explicit parallel programming, computation of PI, matrix multiplication, solution of partial differential equations (a minimal PTHREADS sketch of the PI computation appears after this topic list).
  • Week 4: Implicit shared memory parallel programming: use of compiler directives for parallel programming, DOALL, DOACROSS, and PRAGMA directives for loop-level parallelism, parallel programming examples using directives.
  • Week 5: Distributed memory multicomputer architectures: overview of distributed memory parallel machines, message passing schemes, store-and-forward versus wormhole routing, interconnection networks, case studies of parallel machines such as the Intel Paragon, IBM SP-2, and Thinking Machines CM-5. Global communication operations in distributed memory machines: one-to-all broadcast, reduction, shift, scatter, and gather operations, analysis of the performance of these operations on various parallel architectures.
  • Week 6: Introduction to message-passing programming: basics of message passing, global and local addresses, single-program multiple-data (SPMD) programs, introduction to the Message Passing Interface (MPI). Intermediate concepts in message-passing programming: global and local addresses, loop scheduling for parallel loops. Advanced message-passing concepts: topologies and decompositions, case studies of example applications, matrix multiplication, solution of partial differential equations (a minimal MPI sketch appears after this topic list).
  • Week 7: Introduction to SIMD parallel architectures: single-instruction multiple-data-stream architectures, control and data units, interconnection networks, case studies of machines such as the Thinking Machines CM-2 and CM-5 and the MasPar MP-2. Introduction to data parallel programming: Fortran 90, array sections, array operations, array intrinsic operations.
  • Week 8: Introduction to High Performance Fortran (HPF): FORALL directives, INDEPENDENT directives, simple parallel programs. High Performance Fortran data distribution and alignment directives, simple parallel programming examples, matrix multiplication, solution of partial differential equations.
  • Week 9: Methodology for parallel algorithm design: concurrency, locality, scalability, modularity; partitioning, agglomeration, communication, mapping, performance analysis of parallel algorithms. Parallel matrix algorithms: matrix representations, parallel dense matrix operations, matrix-vector and matrix-matrix multiplication, solution of linear systems of equations.
  • Week 10: Parallel sparse matrix solvers: sparse matrix representations, parallel iterative methods, parallel direct methods. Parallel search algorithms: optimization methods, parallel best-first search, parallel depth-first search, speedup anomalies.
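
To make the shared memory material of Weeks 2-3 concrete, the following is a minimal sketch, not course-supplied code, of the PI computation written with PTHREADS; the thread and interval counts, and names such as partial_sum, are arbitrary illustrative choices, and a mutex provides the mutual exclusion discussed in Week 2.

    /* Minimal sketch: estimating PI with the midpoint rule using Pthreads.
     * Illustrative only; thread and interval counts are arbitrary choices. */
    #include <pthread.h>
    #include <stdio.h>

    #define NUM_THREADS 4
    #define NUM_STEPS   1000000

    static double step;                 /* width of each interval             */
    static double pi_sum = 0.0;         /* shared accumulator                 */
    static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Each thread integrates 4/(1+x^2) over a strided subset of intervals. */
    static void *partial_sum(void *arg)
    {
        long id = (long)arg;
        double local = 0.0;
        for (long i = id; i < NUM_STEPS; i += NUM_THREADS) {
            double x = (i + 0.5) * step;
            local += 4.0 / (1.0 + x * x);
        }
        pthread_mutex_lock(&sum_lock);   /* mutual exclusion on the shared sum */
        pi_sum += local;
        pthread_mutex_unlock(&sum_lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_THREADS];
        step = 1.0 / NUM_STEPS;
        for (long t = 0; t < NUM_THREADS; t++)
            pthread_create(&threads[t], NULL, partial_sum, (void *)t);
        for (long t = 0; t < NUM_THREADS; t++)
            pthread_join(threads[t], NULL);
        printf("pi is approximately %.10f\n", pi_sum * step);
        return 0;
    }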
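
Similarly, for the message-passing material of Weeks 5-6, here is a minimal SPMD sketch, again illustrative rather than course-supplied, that uses MPI's one-to-all broadcast and all-to-one reduction collectives to compute the same PI estimate.

    /* Minimal SPMD sketch: broadcast a problem size, compute partial sums,
     * and combine them with a reduction. Illustrative values only. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        long n = 1000000;                /* number of intervals (arbitrary)  */
        double local = 0.0, pi = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* One-to-all broadcast of the problem size from rank 0. */
        MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

        /* Each rank integrates 4/(1+x^2) over a strided subset of intervals. */
        double step = 1.0 / (double)n;
        for (long i = rank; i < n; i += size) {
            double x = (i + 0.5) * step;
            local += 4.0 / (1.0 + x * x);
        }

        /* All-to-one reduction: sum the partial results onto rank 0. */
        MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("pi is approximately %.10f\n", pi * step);

        MPI_Finalize();
        return 0;
    }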

COMPUTER USAGE: Students get hands-on parallel programming experience on three parallel machines at the Electrical and Computer Engineering Department: a 16-processor IBM SP-2 distributed memory machine, an 8-processor IBM J-40 shared memory machine, and an 8-processor SGI Origin 2000 distributed shared memory multiprocessor. In addition, students will use the machines in the Wilkinson Lab as a cluster and will have access to a dedicated 16-processor cluster for the final project.

HOMEWORK ASSIGNMENTS:

  • Homework 1: Design problems dealing with shared memory parallel programming: examples of program transformations to parallelize loops, and explicit and implicit parallel programs using both SGI directives and PTHREADS (an illustrative directive-based loop sketch appears after this homework list).
  • Homework 2: Design problems dealing with distributed memory message-passing parallel programming, use of MPI, analysis of communication patterns.
  • Homework 3: Design programs related to data parallel programming, use of High Performance Fortran, data layouts and alignments.
  • Homework 4: Design of parallel algorithms for various problems including matrix operations on dense and sparse matrices, analysis of parallel algorithms.
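
As referenced in Homework 1, the sketch below illustrates the implicit, directive-based loop parallelism of Week 4. The homework itself uses SGI's native directives; since those are machine-specific, this sketch substitutes the standard OpenMP pragma, which expresses the same loop-level parallelism, and the array size is an arbitrary illustrative value.

    /* Sketch of implicit, directive-based loop parallelism. The course labs
     * use SGI's native directives; this sketch uses the equivalent standard
     * OpenMP pragma purely for illustration. Compile with an OpenMP-capable
     * compiler, e.g. cc -fopenmp. */
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        static double a[N], b[N], c[N];
        for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2.0 * i; }

        /* The directive asks the compiler to split the loop iterations among
         * threads; no explicit thread creation or locking is written by hand. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = b[i] + c[i];

        printf("a[N-1] = %f\n", a[N - 1]);
        return 0;
    }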

LABORATORY PROJECTS:

  • Lab 1: Development of a parallel program that solves a system of linear equations by Gaussian elimination, using both explicit parallel programming with PTHREADS and implicit parallel programming with SGI directives; experiments on the SGI Origin 2000 and IBM J-40 shared memory multiprocessors.
  • Lab 2: Development of a parallel program that solves a system of linear equations by Gaussian elimination, using message-passing parallel programming in C or Fortran with MPI; experiments on the IBM SP-2 distributed memory machine and the SGI Origin 2000 multiprocessor to demonstrate portability (a sketch of the row-distributed elimination phase appears after this lab list).
  • Lab 3: Development of a parallel program that solves a system of linear equations by Gaussian elimination, using data parallel programming with High Performance Fortran; experiments on the IBM SP-2 distributed memory multiprocessor.
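
As referenced in Lab 2, the following is a hedged sketch of the forward-elimination phase of Gaussian elimination with a row-cyclic distribution over MPI ranks. It is not the lab's required solution: partial pivoting and back-substitution are omitted, the matrix is replicated rather than distributed, and the problem size is an arbitrary illustrative value.

    /* Sketch of the forward-elimination phase of Gaussian elimination with a
     * row-cyclic distribution over MPI ranks. Partial pivoting and the
     * back-substitution phase are omitted to keep the sketch short. */
    #include <mpi.h>
    #include <stdio.h>

    #define N 8                          /* matrix dimension (illustrative)  */

    int main(int argc, char **argv)
    {
        int rank, size;
        static double a[N][N + 1];       /* augmented matrix [A | b]         */
        double pivot_row[N + 1];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Every rank builds the same diagonally dominant test system; in a
         * real lab the matrix would be distributed, not replicated. */
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++)
                a[i][j] = (i == j) ? N : 1.0;
            a[i][N] = 2.0 * N - 1.0;     /* right-hand side so that x = 1    */
        }

        /* Forward elimination: rank (k % size) owns pivot row k and
         * broadcasts it; every rank updates only the rows it owns. */
        for (int k = 0; k < N; k++) {
            int owner = k % size;
            if (rank == owner)
                for (int j = 0; j <= N; j++)
                    pivot_row[j] = a[k][j];
            MPI_Bcast(pivot_row, N + 1, MPI_DOUBLE, owner, MPI_COMM_WORLD);

            for (int i = k + 1; i < N; i++) {
                if (i % size != rank)
                    continue;            /* this rank does not own row i     */
                double factor = a[i][k] / pivot_row[k];
                for (int j = k; j <= N; j++)
                    a[i][j] -= factor * pivot_row[j];
            }
        }

        if (rank == 0)
            printf("forward elimination done; upper-triangular system ready "
                   "for back-substitution\n");

        MPI_Finalize();
        return 0;
    }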

GRADES:

  • Four homeworks - 20%
  • Three labs - 20%
  • Midterm exam - 30%
  • Final exam - 30%

COURSE OBJECTIVES: When a student completes this course, s/he should be able to:

  1. Solve a given problem using parallel computing: analyze the problem for various ways of parallelization and design an appropriate parallel algorithm.
  2. Have a broad understanding of shared memory parallel architectures and programming.
  3. Design a shared memory parallel program for a given parallel algorithm using both explicit and implicit parallel programming, measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
  4. Have a broad understanding of distributed memory parallel architectures and programming.
  5. Design a message-passing distributed memory parallel program for a given parallel algorithm using the portable Message-Passing Interface (MPI), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.
  6. Have a broad understanding of data parallel architectures and programming.
  7. Design a data parallel program for a given parallel algorithm using High Performance Fortran (HPF), measure real speedups, identify bottlenecks, and devise improvements to the parallel program.

ABET CONTENT CATEGORY: 100% Engineering (Design component).