Goals:
1. To gain hands-on experience with fork(), exec(), and wait() system calls.
2. To master the basics of multi-process application development.
3. To appreciate the performance and fault-tolerance benefits of multi-process applications.
4. To implement a multi-process downloader application.
Overview
File downloaders are programs used for downloading files from the Internet. In this assignment
you will implement two different types of multi-process downloaders (i.e., file downloaders that
comprise multiple processes):
1. a serial file downloader which downloads files one by one.
2. a parallel file downloader which downloads multiple files in parallel.
You will then compare the performance of the two types of downloaders.
Both downloaders will use the Linux wget program in order to perform the actual downloading.
The usage of wget is simple: wget <URL>. For example, running the following command from the
command line:
wget http://releases.ubuntu.com/15.04/ubuntu-15.04-desktop-amd64.iso
will download the Ubuntu Linux iso image to the current directory. Before proceeding with the
assignment, you may want to take a moment to experiment with the wget command.
In your program, the parent process shall first read the file, urls.txt, containing the URLs of
the files to be downloaded. urls.txt shall have the following format, with one URL per line:
<URL1>
<URL2>
.
.
.
<URLN>
For example:
http://releases.ubuntu.com/15.04/ubuntu-15.04-desktop-amd64.iso
http://releases.ubuntu.com/14.10/ubuntu-14.10-server-i386.iso
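One possible way to read urls.txt in the format above is sketched below in C. The function name read_urls and the limits MAX_URLS and MAX_URL_LEN are assumptions made for illustration; they are not part of the assignment.

```c
#include <stdio.h>
#include <string.h>

#define MAX_URLS 64       /* assumed upper bound on the number of URLs */
#define MAX_URL_LEN 2048  /* assumed maximum length of one line */

/* Reads one URL per line from `path` into `urls`.
 * Returns the number of URLs read, or -1 if the file cannot be opened.
 * Lines longer than MAX_URL_LEN - 1 characters are not handled here. */
int read_urls(const char *path, char urls[][MAX_URL_LEN], int max_urls)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror("fopen");
        return -1;
    }

    int count = 0;
    while (count < max_urls && fgets(urls[count], MAX_URL_LEN, fp) != NULL) {
        /* Strip the trailing newline (and a carriage return, if present). */
        urls[count][strcspn(urls[count], "\r\n")] = '\0';
        if (urls[count][0] != '\0')  /* keep the slot only for non-blank lines */
            count++;
    }

    fclose(fp);
    return count;
}
```

A caller would typically declare a buffer such as static char urls[MAX_URLS][MAX_URL_LEN]; and terminate the program if read_urls() returns -1.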
Next, the parent process shall fork the child processes. Each created child process shall use the
execlp() system call to replace its executable image with that of the wget program. The two
types of downloaders are described in detail below.
The two downloaders shall be implemented as separate programs. The serial downloader
program shall be called serial.c (or .cpp extension if you use C++). The parallel downloader
program shall be called parallel.c (or .cpp extension if you use C++).
Serial Downloader
The serial downloader shall download files one by one. After the parent process has read and
parsed the urls.txt file, it shall proceed as follows:
1. The parent process forks off a child process.
2. The child uses the execlp("/usr/bin/wget", "wget", <URL1>, NULL) system call
in order to replace its program with the wget program, which will download the first file in
urls.txt (i.e., the file at URL <URL1>).
3. The parent executes a wait() system call and blocks until the child exits.
4. The parent forks off another child process which downloads the next file specified in
urls.txt.
5. Repeat the above steps until all files are downloaded.
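The fork-exec-wait cycle described in these steps might look roughly like the following C sketch. The function name download_serial and its program parameter are hypothetical: the parameter exists only so the loop can be tried with any command, whereas the assignment itself calls for the wget invocation shown in step 2.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `program <url>` once per URL, one child at a time.
 * Returns the number of children that exited with status 0,
 * or -1 if fork() or wait() fails. */
int download_serial(const char *program, char *const urls[], int n)
{
    int ok = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }

        if (pid == 0) {
            /* Child: replace this process image with the downloader. */
            execlp(program, program, urls[i], (char *)NULL);
            perror("execlp");  /* reached only if execlp() failed */
            _exit(127);
        }

        /* Parent: block until the child just created has exited. */
        int status;
        if (wait(&status) < 0) {
            perror("wait");
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }

    return ok;
}
```

Because the parent waits inside the loop, at most one child exists at any time, which is what makes this variant serial.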
Parallel Downloader
1. The parent forks off n children, where n is the number of URLs in urls.txt.
2. Each child executes the execlp("/usr/bin/wget", "wget", <URLi>, NULL) system
call, where each <URLi> is a distinct URL in urls.txt.
3. The parent calls wait() (n times in a row) and waits for all children to terminate.
4. The parent exits.
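As a rough illustration of the steps above, the C sketch below starts every child before waiting for any of them. The function name download_parallel and its program parameter are hypothetical and exist only so the sketch can be exercised with any command; the assignment itself uses the wget invocation from step 2.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts one child per URL, then waits for all of them.
 * Returns the number of children that exited with status 0,
 * or -1 if fork() or wait() fails. */
int download_parallel(const char *program, char *const urls[], int n)
{
    int started = 0;

    /* Phase 1: create all children without waiting in between. */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;  /* still reap the children already started */
        }
        if (pid == 0) {
            /* Child: replace this process image with the downloader. */
            execlp(program, program, urls[i], (char *)NULL);
            perror("execlp");  /* reached only if execlp() failed */
            _exit(127);
        }
        started++;
    }

    /* Phase 2: call wait() once for every child that was started. */
    int ok = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0) {
            perror("wait");
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }

    return started == n ? ok : -1;
}
```

The only structural difference from a serial loop is that wait() is moved out of the fork loop, so all downloads run concurrently.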
Please note:
- While the parallel downloader executes, the outputs from different children may intermingle.
This is acceptable.
- The fork.c file posted on Titanium provides an example of using the fork(), execlp(), and
wait() system calls. Please feel free to modify it in order to complete the above tasks.
- Please make sure to error-check all system calls. This is very important in practice and can
also save you hours of debugging frustration. fork(), execlp(), and wait() will return -1
on error. Hence, you need to always check their return values and terminate your program
if the return value is -1. For example:
pid_t pid = fork();
if (pid < 0) {
    perror("fork");
    exit(1);
}
The perror() function above will print out "fork", followed by a colon and the explanation of
the error.
Performance Comparison
Use the time program to measure the execution time for the two downloaders. For example:
time ./serial
real 0m10.009s
user 0m0.008s
sys 0m0.000s
The row labeled real gives the elapsed (wall-clock) execution time, shown in minutes and
seconds. Please get the execution times for both downloaders using the following urls.txt file:
http://releases.ubuntu.com/15.04/ubuntu-15.04-desktop-amd64.iso
http://releases.ubuntu.com/14.10/ubuntu-14.10-server-i386.iso
Your execution times should be submitted along with your code (see the section titled
Submission Guidelines). In your submission, please include the answers to the following
questions (you may need to do some research):
1. In the output of time, what is the difference between real, user, and sys times?
2. Which is longer: user time or sys time? Use your knowledge to explain why.
3. When downloading the files above, which downloader finishes faster? Why do you think
that is?
4. Repeat the experiment for 10 files (any reasonably large-sized files, e.g., 100 MB, will do).
Is the downloader in the previous question still faster? If not, why do you think that is?
Technical Details
The program shall be run using the following command line:
./multi-search <FILE NAME> <KEY> <NUMBER OF PROCESSES>
where <FILE NAME> is the name of the file containing the strings, <NUMBER OF PROCESSES> is
the number of child processes, and <KEY> is the string to search for. For example, ./multi-search
strings.txt abcd 10 tells the program to split the task of searching for string abcd in file
strings.txt amongst 10 child processes.
SUBMISSION GUIDELINES:
- This assignment MUST be completed using C or C++ on Linux.
- Please hand in your source code electronically.
- You must make sure that the code compiles and runs correctly.
- Write a README file (text file, do not submit a .doc file) which contains:
  - Your name and email address.
  - The programming language you used (i.e. C or C++).
  - How to execute your program.
  - The execution times for both downloaders.
  - The answers to all questions above.
  - Whether you implemented the extra credit.
  - Anything special about your submission that we should take note of.
- Place all your files under one directory with a unique name (such as p1-[userid] for
assignment 1, e.g. p1-m1).
- Tar the contents of this directory using the following command:
tar cvf [directory name].tar [directory name]
E.g. tar -cvf p1-m1.tar p1-m1/