Chapter 1 solutions - 1.3, 1.4, 1.6, 1.7 #---------------------------------------------------------------- 1.3. divisor = 2; core_difference = 1; sum = my_value; while ( divisor <= number of cores ) { if ( my_rank % divisor == 0 ) { partner = my_rank + core_difference; receive value from partner core; sum += received value; } else { partner = my_rank - core_difference; send my sum to partner core; } divisor *= 2; core_difference *= 2; } #---------------------------------------------------------------- 1.4. bitmask = 1; divisor = 2; sum = my_value; while ( bitmask < number of cores ) { partner = my_rank ^ bitmask; if ( my_rank % divisor == 0 ) { receive value from partner core; sum += received value; } else { send my_sum to partner core; } bitmask <<= 1; divisor *= 2; } #---------------------------------------------------------------- 1.6. (a) The number of receives is p − 1, and the number of additions is p − 1. (b) The number of receives is log2(p), and the number of additions is log2(p). (c) p Original Tree-Structured 2 1 1 4 3 2 8 7 3 16 15 4 32 31 5 64 63 6 128 127 7 256 255 8 512 511 9 1024 1023 10 #---------------------------------------------------------------- 1.7. The example is a combination of task- and data- parallelism. In each phase of the tree-structured global sum, the cores are computing partial sums. This can be seen as data-parallelism. Also, in each phase, there are two types of tasks. Some cores are sending their sums and some are receiving another cores partial sum. This can be seen as task-parallelism. #----------------------------------------------------------------