CSCI 3366 (Parallel and Distributed Programming), Fall 2021:
Homework 4

Credit: 30 points.

Reading

Be sure you have read, or at least skimmed, chapters 1 through 5 of the textbook.

Your mission for this assignment is to write a parallel version of mathematician John Conway's “Game of Life”, as described briefly in class. (You can also find more information on the Web. The Wikipedia article seems good.)

The Game of Life is not so much a game in the usual sense as a set of rules for a cellular automaton: There are no players, and once the initial configuration is established, everything that happens is determined by the game's rules. The game is “played” on a rectangular grid of cells. Some cells are “live” (contain a simulated organism); others are “dead” (empty). At each time step, a new configuration is computed from the old configuration according to the following rules:

For each cell, we look at its eight neighbors (top, bottom, left, right, and the four diagonal neighbors) and count the number of cells that are live. (Note that this count is based on the configuration at the start of the time step.)
A dead cell with exactly three live neighbors becomes live; otherwise it stays dead.
A live cell with two or three live neighbors stays live; otherwise it becomes dead (of isolation or overcrowding).
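
The rules above can be sketched in C roughly as follows. This is only an illustration, not the starter code: the function names (count_live_neighbors, next_state, life_step) and the flat row-major board layout are assumptions made for the sketch, and it assumes that cells beyond the edge of the board count as dead, which may or may not match how game-of-life.c handles edges.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: board is rows*cols shorts in row-major order,
 * 1 = live, 0 = dead. Cells outside the board are treated as dead
 * (an assumption of this sketch). */
int count_live_neighbors(const short *board, int rows, int cols, int r, int c)
{
    int count = 0;
    for (int dr = -1; dr <= 1; ++dr) {
        for (int dc = -1; dc <= 1; ++dc) {
            if (dr == 0 && dc == 0)
                continue;               /* a cell is not its own neighbor */
            int nr = r + dr, nc = c + dc;
            if (nr >= 0 && nr < rows && nc >= 0 && nc < cols)
                count += board[(size_t)nr * (size_t)cols + (size_t)nc];
        }
    }
    return count;
}

/* Apply the two rules: birth on exactly 3, survival on 2 or 3. */
short next_state(short current, int live_neighbors)
{
    if (current)
        return (short)(live_neighbors == 2 || live_neighbors == 3);
    return (short)(live_neighbors == 3);
}

/* Compute one time step from old_board into new_board and return the
 * number of live cells in new_board. The two boards must be distinct,
 * because every count is based on the configuration at the start of
 * the step. */
int life_step(const short *old_board, short *new_board, int rows, int cols)
{
    assert(old_board != new_board);
    int live = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            size_t i = (size_t)r * (size_t)cols + (size_t)c;
            int n = count_live_neighbors(old_board, rows, cols, r, c);
            new_board[i] = next_state(old_board[i], n);
            live += new_board[i];
        }
    }
    return live;
}
```

Writing into a second board and then swapping the two after each step is one simple way to make sure all counts use the old configuration.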

This problem clearly(?) fits our Geometric Decomposition pattern and is fairly straightforward to parallelize. However, it's unlikely that parallelization will improve performance unless the board size is large, and for large boards inputting and displaying (or printing) board configurations gets unwieldy. But it might be interesting to experiment with randomly-generated board configurations and observe how the number of live cells changes over time (does it settle down to a stable number? if so, what number, and how soon?), so we'll do that.

Details

Sequential program

(5 points)

To help you get started, I wrote a sequential C program with a simple text interface, with command-line arguments that specify:

input source (an input file or the keyword “random” followed by size, fraction of cells that should initially be “live”, and seed)
number of steps
print interval (P means to output updated board every P steps)
optionally, a name for an output file to contain initial and updated boards

The program prints (to standard output) only counts of “live” cells at each step; if an output file is specified, it also writes initial and updated board configurations to it.

The starter code defines data structures, gets input, sets up the board, and writes output, but omits the code that implements the actual algorithm. (Comments with the word “FIXME” show you where you need to make changes/additions.)

Code: game-of-life.c. You will also need timer.h to compile the program.
Sample input file: input_8x8.
Sample output file (using the above input and executing for 4 steps): output_8x8.

Start by filling in the parts of the code I left out and running the result a few times, to test that you understand how to do the computational part of the game. Note that this program is almost identical to one of the CSCI 1120 homeworks, so you probably have the basic logic. What's different:

I use short for the elements of the board rather than bool, since MPI lacks a Boolean data type.
I allow boards that are not square, so that in the MPI version each process can store just its own part of the board rather than the whole thing.

Parallel programs

(20 points)

Your next job is to write two parallel versions of this application, one for shared memory using OpenMP and one for distributed memory using MPI. Both parallel programs should produce exactly the same results as the original sequential program, except for timing information.

OpenMP program

This one should be fairly straightforward. As with the OpenMP programs for previous homeworks, have the program get the number of threads to use from environment variable OMP_NUM_THREADS and print with the timing information the number of threads used.
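
One possible shape for the OpenMP version of a time step is sketched below. It is a sketch under assumptions, not the expected solution: the names life_step_omp, live_neighbors, and thread_count are made up for illustration, the board is assumed to be a flat row-major array of short, and cells beyond the edge are assumed dead. The rows of the new board are independent of one another, so the outer loop can be split among threads, with a reduction to combine the per-thread counts of live cells.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Number of threads a parallel region would use by default; this is
 * what OMP_NUM_THREADS controls. Falls back to 1 when the program is
 * compiled without OpenMP support. */
int thread_count(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Count live neighbors of (r, c); off-board cells count as dead
 * (an assumption of this sketch). */
int live_neighbors(const short *board, int rows, int cols, int r, int c)
{
    int count = 0;
    for (int dr = -1; dr <= 1; ++dr) {
        for (int dc = -1; dc <= 1; ++dc) {
            int nr = r + dr, nc = c + dc;
            if ((dr != 0 || dc != 0) &&
                nr >= 0 && nr < rows && nc >= 0 && nc < cols)
                count += board[(size_t)nr * (size_t)cols + (size_t)nc];
        }
    }
    return count;
}

/* One time step: threads share old_board (read-only) and write
 * disjoint rows of new_board, so no locking is needed. */
int life_step_omp(const short *old_board, short *new_board, int rows, int cols)
{
    int live = 0;
    #pragma omp parallel for reduction(+:live)
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            size_t i = (size_t)r * (size_t)cols + (size_t)c;
            int n = live_neighbors(old_board, rows, cols, r, c);
            new_board[i] = old_board[i] ? (short)(n == 2 || n == 3)
                                        : (short)(n == 3);
            live += new_board[i];
        }
    }
    return live;
}
```

The value returned by thread_count() is what could be printed along with the timing information. Compiled without OpenMP enabled, the pragma is ignored and the function simply runs sequentially, which is handy for checking that both versions give the same counts.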

MPI program

This one is less straightforward, but doable. As with the MPI programs for previous homeworks, have the program print with the timing information the number of processes used. Suggestions:

Distribute the “board” among processes so that each process has a block of rows of the original board. (For example, with two processes one process would have the top half of the board and the other the bottom half.) (It's somewhat more customary when distributing a 2D data structure to distribute square blocks, but that makes exchanging boundary information much more complicated, with MPI at least; distributing blocks of rows is simpler.)
Have only process 0 open the output file and print to it. Other processes can send “their” rows to process 0, which can collect and print them. Since there's no need to do that if there's no output file, you might have process 0 broadcast a value indicating whether there is an output file.
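
The block-of-rows distribution suggested above comes down to some index arithmetic, sketched here in plain C (no MPI calls, so it can be tried on its own). The function names block_row_count and block_first_row are hypothetical, and giving the leftover rows to the lowest-numbered processes is just one reasonable choice when the number of rows is not a multiple of the number of processes.

```c
#include <assert.h>

/* Number of board rows owned by process `rank` when `rows` rows are
 * split among `nprocs` processes. The first (rows % nprocs) processes
 * get one extra row each (a choice made for this sketch). */
int block_row_count(int rows, int nprocs, int rank)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    int base = rows / nprocs;
    int extra = rows % nprocs;
    return base + (rank < extra ? 1 : 0);
}

/* Index (in the full board) of the first row owned by process `rank`. */
int block_first_row(int rows, int nprocs, int rank)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    int base = rows / nprocs;
    int extra = rows % nprocs;
    /* every lower-numbered rank holds `base` rows, and the first
     * `extra` of them hold one more */
    return rank * base + (rank < extra ? rank : extra);
}
```

For example, with 8 rows and two processes, process 0 would own rows 0 through 3 and process 1 rows 4 through 7. To count neighbors for its first and last rows, each process also needs a copy of the adjacent row owned by the process above and the one below; one common approach (an assumption here, not a requirement) is to allocate two extra “ghost” rows per process and refresh them at the start of every step, for instance with MPI_Sendrecv using MPI_SHORT as the data type.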

Performance of parallel programs

(5 points)

Once you have working parallel code, experiment with input values until you get a problem size/configuration big enough to make it reasonable to hope for good speedups with multiple UEs. (Think a little about what will affect this most -- size of board, number of steps, interval between printing results.) Then time your two parallel programs for this problem size/configuration and different numbers of UEs and plot the results, as in Homework 2.

What to turn in and how

Turn in the following:

Source code for your sequential and parallel programs.
Results of measuring performance. For each of these programs, tell me what inputs you used for the program, which machine(s) you ran it on, and send me:
- A plot showing how execution time depends on number of UEs.
- Input data for the plot. A text file or files is fine for this.

Submit your program source code, plots, and input data by putting them in your “turn-in” folder.

Essay and pledge

Include with your assignment the following information.

For programming assignments, please put it in a separate file. (I strongly prefer plain text, but if you insist you can put it in a PDF -- just no word-processor documents or Google Drive links please.) For written assignments, please put it in your main document.

Pledge

This should include the Honor Code pledge, or just the word “pledged”, plus at least one of the following about collaboration and help (as many as apply). Text in italics is explanatory or something for you to fill in; you don't need to repeat it!

I did not get outside help aside from course materials, including starter code, readings, sample programs, and the instructor.
I worked with names of other students on this assignment.
I got help with this assignment from source of help -- ACM tutoring, another student in the course, etc. (Here, “help” means significant help, beyond a little assistance with tools or compiler errors.)
I got help from outside source -- a book other than the textbook (give title and author), a Web site (give its URL), etc. (Here too, you only need to mention significant help -- you don't need to tell me that you looked up an error message on the Web, but if you found an algorithm or a code sketch, tell me about that.)
I provided help to names of students on this assignment. (And here too, you only need to tell me about significant help.)

Essay

This should be a brief essay (a sentence or two is fine, though you can write as much as you like) telling me what if anything you think you learned from the assignment, and what if anything you found interesting, difficult, or otherwise noteworthy.

2021-12-04
