CS 302 - Programming Assignment 5

This assignment makes use of the files contained in this zip file.

The zip file contains the outlines of four C source files, serialWorker.c, processWorker-parent.c, processWorker-child.c, and threadWorker.c, which together make up three programs. It also contains ten data files called snark0.txt through snark9.txt.

The C programs are meant to simulate the workload that a server program might have. The "workload" that these programs are supposed to handle is filtering input files to output files, where "filtering" means copying an input file to an output file while changing the case of every n'th character (just like in the first homework assignment). The goal of the assignment is to implement and study three different strategies for doing a lot of filtering as quickly as possible.
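As a reminder of what "filtering" means, here is a minimal sketch that works on an in-memory buffer rather than on files. The function name toggle_every_nth is hypothetical, and the sketch assumes that counting starts at 1 and that every character (including spaces and newlines) counts toward n; check the first homework assignment for the exact rule.

```c
#include <ctype.h>
#include <stddef.h>

/* Toggle the case of every n-th character in buf (1-based count).
 * Assumption: every byte counts toward n, not only letters.
 * Non-letters at an n-th position are left unchanged. */
void toggle_every_nth(char *buf, size_t len, size_t n)
{
    if (n == 0)
        return;                      /* nothing sensible to do */

    for (size_t i = n - 1; i < len; i += n) {
        unsigned char c = (unsigned char)buf[i];
        if (islower(c))
            buf[i] = (char)toupper(c);
        else if (isupper(c))
            buf[i] = (char)tolower(c);
    }
}
```

With n equal to 3, the string "abcdefgh" becomes "abCdeFgh".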

The first program is serialWorker.c and it is supposed to implement the simplest strategy for filtering a large number of files. The strategy is to filter one file at a time. As explained in the file serialWorker.c, the program is supposed to randomly choose input files and filter them, one after another.
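Choosing an input file at random might look something like the sketch below. The helper name pick_input_name is hypothetical; the outline in serialWorker.c may already provide its own way of doing this, in which case follow the outline.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write the name of a randomly chosen data file, one of
 * snark0.txt ... snark9.txt, into name (at most cap bytes). */
void pick_input_name(char *name, size_t cap)
{
    int k = rand() % 10;             /* 0 through 9 */
    snprintf(name, cap, "snark%d.txt", k);
}
```

The serial strategy would call a helper like this once per file, filter that file completely, and only then move on to the next one. Calling srand once at program start (for example with the current time) makes each run choose a different sequence of files.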

The second program is made up of the two files processWorker-parent.c and processWorker-child.c. Together, these files are supposed to implement a strategy of creating a separate worker process to do the filtering of each file that needs to be filtered. The file processWorker-parent.c will be the (master) process that creates the worker processes, and processWorker-child.c is the program that each worker process is instantiated from. The hope here is that many worker processes can run at the same time, thereby accomplishing the filtering of a large number of files in less time than the serial filtering strategy. In other words, this is a way of "parallelizing" the serial filtering strategy.
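The general shape of the parent is "create all the workers, then wait for all of them." The sketch below illustrates that shape using fork, execlp, and wait. It is only an illustration: the function name run_workers is hypothetical, and passing the input file name to the child as its single command-line argument is an assumption, so use whatever interface the outline files specify.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start one child process per input name, then wait for all of them.
 * Returns the number of children that exited with status 0.
 * Assumption: the child program takes the input file name as argv[1]. */
int run_workers(const char *child_prog, const char *const inputs[], int count)
{
    int started = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                   /* could not fork; stop creating workers */
        if (pid == 0) {
            /* In the child: replace this process with the worker program. */
            execlp(child_prog, child_prog, inputs[i], (char *)NULL);
            _exit(127);              /* reached only if exec failed */
        }
        started++;                   /* in the parent: keep going, do not wait yet */
    }

    int succeeded = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            succeeded++;
    }
    return succeeded;
}
```

The important design point is that the parent does not wait inside the first loop. If it waited for each child right after creating it, the workers would run one at a time and the program would be no faster than the serial version.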

The last program, threadWorker.c, parallelizes the filtering work by creating multiple threads, one thread for each input file that needs to be filtered.
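The thread version follows the same "start everything, then wait for everything" pattern, with pthread_create and pthread_join in place of fork and wait. The sketch below is a simplified illustration that filters in-memory strings rather than files; the names struct job, worker, and run_jobs are hypothetical and are not taken from threadWorker.c.

```c
#include <ctype.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One unit of work: a NUL-terminated text and the n to filter it with. */
struct job {
    char  *text;
    size_t n;
};

/* Thread body: toggle the case of every n-th character of one job. */
static void *worker(void *arg)
{
    struct job *j = arg;
    size_t len = strlen(j->text);

    if (j->n == 0)
        return NULL;
    for (size_t i = j->n - 1; i < len; i += j->n) {
        unsigned char c = (unsigned char)j->text[i];
        if (islower(c))
            j->text[i] = (char)toupper(c);
        else if (isupper(c))
            j->text[i] = (char)tolower(c);
    }
    return NULL;
}

/* Start one thread per job, then join them all.
 * Returns the number of threads that were started and joined,
 * or -1 if memory for the thread ids could not be allocated. */
int run_jobs(struct job *jobs, int count)
{
    pthread_t *tids = malloc((size_t)(count > 0 ? count : 1) * sizeof *tids);
    if (tids == NULL)
        return -1;

    int started = 0;
    for (int i = 0; i < count; i++) {
        if (pthread_create(&tids[i], NULL, worker, &jobs[i]) != 0)
            break;                   /* stop creating threads on failure */
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return started;
}
```

Each thread gets a pointer to its own job, so no two threads touch the same data and no locking is needed. On most systems a program like this is compiled with the -pthread option.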

Follow the directions in each of the four C files and implement these three worker strategies. When your programs are completed, try them on small, medium, and large runs of input files (small is 5 to 15 files, medium is 15 to 30, and large is anything above 30). How do they perform? Which strategy is fastest? Does the data you collect from doing many runs of these programs agree with what you were expecting? How large a run can you do?
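If the outline files do not already time the runs for you, one possible way to measure a run is with clock_gettime and the monotonic clock, as in the sketch below. The helper name elapsed_seconds is hypothetical. Wall-clock time is used here because the standard clock function reports processor time for the calling process, which does not reflect how long a run with several processes or threads actually took.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Seconds elapsed between two readings of a clock. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Read the monotonic clock; intended to be called before and after a run. */
struct timespec now_monotonic(void)
{
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return t;
}
```

Take one reading just before the first file is started and one just after the last file is finished, and repeat each run several times, since a single measurement can vary a lot from run to run.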

Turn in a zip file containing your versions of the four C files and an explanation of what you found when you ran the three different strategies. (Please do not send me the output files generated by the programs.)

This assignment is due Wednesday, March 9.

