Name:
Answer Key

Write programs to solve each of these problems. Include comments in each program with your name, "CompSci 491 Exam 2", and the date.

Upload each program separately to the course website under "Exam 2".

  1. [25 pts] Write a program that uses OpenMP to run as many threads as possible. Each thread must print the letter 'A', the total number of threads, and its own thread number; then use the sleep() function (from "unistd.h") to sleep for three seconds; then print a blank line, followed by the letter 'B', the total number of threads, and its own thread number again.

    Here is an example run of the instructor's solution:

    $
    $ ./sleeper-omp
    parent thread starts:
    A: 8 total threads; this is thread 7
    A: 8 total threads; this is thread 4
    A: 8 total threads; this is thread 2
    A: 8 total threads; this is thread 5
    A: 8 total threads; this is thread 1
    A: 8 total threads; this is thread 3
    A: 8 total threads; this is thread 0
    A: 8 total threads; this is thread 6
    
    B: 8 total threads; this is thread 7
    
    B: 8 total threads; this is thread 2
    
    B: 8 total threads; this is thread 4
    
    B: 8 total threads; this is thread 5
    
    B: 8 total threads; this is thread 1
    
    B: 8 total threads; this is thread 3
    
    B: 8 total threads; this is thread 6
    
    B: 8 total threads; this is thread 0
    parent thread done.
    $
    

    (Notice that the order of the threads changes from their first outputs to their second.)
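
    For reference, here is a minimal sketch of one way this program could be written (not necessarily the instructor's solution); the output format follows the example run above:

    /*
     * sleeper-omp.c -- sketch: each OpenMP thread prints, sleeps three
     * seconds, then prints again.
     * Compile with:  gcc -Wall -fopenmp -o sleeper-omp sleeper-omp.c
     */
    #include <stdio.h>
    #include <unistd.h>     // sleep()
    #include <omp.h>        // omp_get_num_threads(), omp_get_thread_num()

    int main(void)
    {
        printf("parent thread starts:\n");

        #pragma omp parallel
        {
            int nthreads = omp_get_num_threads();
            int id = omp_get_thread_num();

            printf("A: %d total threads; this is thread %d\n", nthreads, id);
            sleep(3);
            printf("\nB: %d total threads; this is thread %d\n", nthreads, id);
        }

        printf("parent thread done.\n");
        return 0;
    }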

     
  2. [25 pts] Write a program that uses Pthreads to run five threads. Each thread must print the letter 'A' and the value of its single parameter, cast to a character string; then use the sleep() function (from "unistd.h") to sleep for two seconds; then print a blank line, followed by the letter 'B' and its parameter again.

    Here is an example run of the instructor's solution:

    $
    $ ./sleeper-pthread
    parent thread starts:
    A: alpha
    A: gamma
    A: beta
    A: delta
    A: epsilon
    
    B: gamma
    
    B: epsilon
    
    B: alpha
    
    B: delta
    
    B: beta
    parent thread done.
    $
    

    (Notice that the order of the threads changes from their first outputs to their second.)
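
    Again for reference, a minimal sketch of one possible structure (not necessarily the instructor's solution); the parameter strings match the example run above:

    /*
     * sleeper-pthread.c -- sketch: five pthreads, each handed a string
     * as its single argument.
     * Compile with:  gcc -Wall -o sleeper-pthread sleeper-pthread.c -lpthread
     */
    #include <stdio.h>
    #include <pthread.h>
    #include <unistd.h>     // sleep()

    #define NTHREADS 5

    static void *sleeper(void *arg)
    {
        char *name = (char *)arg;       // the single parameter, cast to a character string

        printf("A: %s\n", name);
        sleep(2);
        printf("\nB: %s\n", name);
        return NULL;
    }

    int main(void)
    {
        char *names[NTHREADS] = { "alpha", "beta", "gamma", "delta", "epsilon" };
        pthread_t threads[NTHREADS];

        printf("parent thread starts:\n");

        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&threads[i], NULL, sleeper, names[i]);

        for (int i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);

        printf("parent thread done.\n");
        return 0;
    }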

     
  3. [25 pts] Here is the main body of a program that creates a large array of random values, then calls a function to sum it up.

    Copy it as-is:

    /*
     * Generate an array of integers, then call a function that sums them up.
     * Exam 2, 2019-11-10
     */
    #include <stdio.h>
    #include <stdlib.h>     // malloc(), atoi()
    #include <time.h>       // clock()
    
    long unsigned sumup(long unsigned *array, long unsigned size);
    
    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <power of 2>  [seed]\n", argv[0]);
            return 1;
        }
    
        int power = atoi(argv[1]);
        long unsigned nvalues = 1L << (long unsigned)power;
    
        int seed = (argc >= 3) ? atoi(argv[2]) : 7;
        srandom(seed);
    
        fprintf(stderr, "%lu values / %lu bytes (seed %i)\n",
                nvalues, nvalues*sizeof(long unsigned), seed);
    
        long unsigned start = clock();
    
        long unsigned *data = malloc(nvalues * sizeof(long unsigned));   // can make BIG arrays
    
        for (long unsigned i = 0; i < nvalues; i++)
            data[i] = (long unsigned)random() >> 2 ;
    
    
        // FIND THE SUM OF THE ARRAY VALUES
        long unsigned sumit = clock();
    
        long unsigned sum = sumup( data, nvalues );
        double average = (double)sum / (double)nvalues;
    
        long unsigned stop = clock();
    
        fprintf(stdout, "Sum of %lu values is %lu\n", nvalues, sum);
        fprintf(stdout, "Average of %lu values is %lf\n", nvalues, average);
    
        fprintf(stderr, "%f total compute seconds: %f filling seconds, %f sumup seconds.\n",
                (float)(stop - start)/(float)CLOCKS_PER_SEC,
                (float)(sumit - start)/(float)CLOCKS_PER_SEC,
                (float)(stop - sumit)/(float)CLOCKS_PER_SEC);
        return 0;
    }
    

    Write the parallelized function long unsigned sumup(long unsigned *array, long unsigned size) to sum up the values (the easy part), matching the prototype declared above. Your function must use OpenMP.

    • DO NOT parallelize the loop that generates the random data; you should make no changes to main(). Just write a parallelized function to do the summing.
    • Hint: You will need a simple critical section; a sketch of this approach follows this list.
    • You can put your function in a separate file, or modify the main file if you prefer.
    • You don't have to write a serial version, but it might be helpful to you. (It's trivial, just a for loop.)
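
    For reference, here is a sketch of one possible sumrands-omp.c (not the instructor's solution): each thread accumulates a private partial sum over its share of the loop, then folds it into the shared total inside a critical section. The "Using N threads." line mirrors the output shown below.

    /*
     * sumrands-omp.c -- sketch of the parallel summing function.
     */
    #include <stdio.h>
    #include <omp.h>

    long unsigned sumup(long unsigned *array, long unsigned size)
    {
        long unsigned sum = 0;

        #pragma omp parallel
        {
            #pragma omp single
            fprintf(stderr, "Using %d threads.\n", omp_get_num_threads());

            long unsigned partial = 0;              // private to this thread

            #pragma omp for
            for (long unsigned i = 0; i < size; i++)
                partial += array[i];

            #pragma omp critical
            sum += partial;                         // one thread at a time updates the shared total
        }

        return sum;
    }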

    Here are timings for the instructor's solution on a laptop, for reference:

    256M values

    The program's single argument is a power of 2, used to determine the number of values. An argument of "28" should run fairly quickly, for testing purposes:

    $ gcc -Wall  -o sumrands-serial  sumrands-main.c  sumrands-serial.c
    $ gcc -Wall  -o sumrands-omp  sumrands-main.c  sumrands-omp.c  -fopenmp
    $
    $ time ./sumrands-serial 28
    268435456 values / 2147483648 bytes (seed 7)
    2.667056 total compute seconds: 1.987923 filling seconds, 0.679133 sumup seconds.
    Sum of 268435456 values is 72057918657108261
    Average of 268435456 values is 268436665.300683
    
    real	0m2.729s
    user	0m2.288s
    sys	0m0.440s
    $
    $ time ./sumrands-omp 28
    268435456 values / 2147483648 bytes (seed 7)
    2.852838 total compute seconds: 1.966147 filling seconds, 0.886691 sumup seconds.
    Using 8 threads.
    Sum of 268435456 values is 72057918657108261
    Average of 268435456 values is 268436665.300683
    
    real	0m2.144s
    user	0m2.428s
    sys	0m0.500s
    $
    

    The OpenMP version uses slightly more clocks to sum up, but slightly less real time. (About two-thirds of the user time is spent in the serial array-filling loop at the beginning.)

    2G values

    An argument of "31" produces more interesting times for comparison. (An argument of "32" or larger makes even more interesting times, but don't try this until you're finished with the problem!)

    $
    $ time ./sumrands-serial 31
    2147483648 values / 17179869184 bytes (seed 7)
    38.613407 total compute seconds: 16.428843 filling seconds, 22.184566 sumup seconds.
    Sum of 2147483648 values is 576446126628202761
    Average of 2147483648 values is 268428645.389249
    
    real	3m0.697s
    user	0m24.348s
    sys	0m15.076s
    $
    $ time ./sumrands-omp 31
    2147483648 values / 17179869184 bytes (seed 7)
    27.672680 total compute seconds: 18.896782 filling seconds, 8.775898 sumup seconds.
    Using 8 threads.
    Sum of 2147483648 values is 576446126628202761
    Average of 2147483648 values is 268428645.389249
    
    real	1m6.100s
    user	0m22.192s
    sys	0m6.052s
    $
    

    The serial version shows more sumup time, contributing to a higher total compute time than the OpenMP version; it also has an anomalously long real time.

     
  4. [25 pts] Make another copy of the "sumrands-main.c" program above, and name it "sumrands-pthreads.c". Modify the program to use 16 pthreads.
    • The timing code and the array-creating code in main() should be left alone.
    • Replace the single call to the sumup() function with the code needed to start the pthreads.
    • Add a struct that contains a pointer to the data array, the size of the data array, and anything else the thread may need.
    • You can put your function in a separate file or not, as you prefer.
    • You can use a critical section and a mutex to communicate partial sums back to the parent thread, but you can also include the partial sum in the struct that holds the function arguments and just let the parent collect all the partial sums from those structs. (A sketch of this second approach follows this list.)
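
    For reference, one possible shape for the thread function and its argument struct is sketched below. The problem asks you to put the thread-launching code directly in main() in place of the sumup() call; this sketch keeps a sumup()-style wrapper only so all the pieces appear in one place, and the names (struct chunk, sum_chunk) are illustrative, not required.

    /*
     * Sketch for sumrands-pthreads.c (names are illustrative): each of 16
     * threads sums one slice of the array and writes its partial sum back
     * into its argument struct; the parent adds the partials after joining.
     */
    #include <stdio.h>
    #include <pthread.h>

    #define NTHREADS 16

    struct chunk {
        long unsigned *array;       // start of this thread's slice of the data
        long unsigned size;         // number of values in the slice
        long unsigned partial;      // partial sum, written back by the thread
    };

    static void *sum_chunk(void *arg)
    {
        struct chunk *c = (struct chunk *)arg;
        long unsigned sum = 0;

        for (long unsigned i = 0; i < c->size; i++)
            sum += c->array[i];

        c->partial = sum;           // parent reads this after pthread_join()
        return NULL;
    }

    /* Stand-in for the code that replaces the sumup() call in main(). */
    long unsigned sumup(long unsigned *array, long unsigned size)
    {
        pthread_t threads[NTHREADS];
        struct chunk chunks[NTHREADS];
        long unsigned per = size / NTHREADS;

        fprintf(stderr, "Using %d threads.\n", NTHREADS);

        for (int t = 0; t < NTHREADS; t++) {
            chunks[t].array = array + t * per;
            chunks[t].size = (t == NTHREADS - 1) ? size - t * per : per;    // last thread takes any leftovers
            chunks[t].partial = 0;
            pthread_create(&threads[t], NULL, sum_chunk, &chunks[t]);
        }

        long unsigned total = 0;
        for (int t = 0; t < NTHREADS; t++) {
            pthread_join(threads[t], NULL);
            total += chunks[t].partial;             // collect each partial sum from its struct
        }
        return total;
    }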

    Here are timings for the instructor's solution on a laptop, for reference:

    Test runs: 256M and 2G values:

    These times are comparable to the "OpenMP" times shown in the previous problem.

    $ gcc -Wall  -o sumrands-pthreads  sumrands-pthreads.c  -lpthread
    $
    $ time ./sumrands-pthreads 28
    268435456 values / 2147483648 bytes (seed 7)
    5.075000 total compute seconds: 2.047424 filling seconds, 3.027576 sumup seconds.
    Using 16 threads.
    Sum of 268435456 values is 17726168783105492820
    Average of 268435456 values is 66035124596.601326
    
    real	0m2.526s
    user	0m4.708s
    sys	0m0.428s
    $
    $ time ./sumrands-pthreads 31
    2147483648 values / 17179869184 bytes (seed 7)
    44.982281 total compute seconds: 20.212025 filling seconds, 24.770256 sumup seconds.
    Using 16 threads.
    Sum of 2147483648 values is 12682107303619871384
    Average of 2147483648 values is 5905566412.778511
    
    real	1m4.665s
    user	0m39.332s
    sys	0m6.284s
    $
    

    The 16-pthreads version compares closely in real time to the OpenMP version (and much better than the serial version), although its apparent sumup time is substantially more.