# An Introduction to Parallel Programming

Chapter01 - Home
Exercises: 1 - 2 - 3 - 4 - 5 - 6 - 7

Exercise. This is the solution to exercise 1.1 in the book.

Solution. Consider that the cores are numbered $0\dots p-1$ ($p$ cores) and the numbers to be added have indices $0\dots n-1$ ($n$ numbers, $n\ge p$). We would obviously like to have as smallest a difference in the number of sums executed by any one processor. This difference is zero when $n$ is divisible by $p$ and one when $n$ is not divisible by $p$. If there are ${p}_{1}$ cores executing $a$ sums and ${p}_{2}$ cores executing $a+1$ sums, then:

$\begin{array}{rcll}p& =& {p}_{1}+{p}_{2}& \text{}\\ n& =& {p}_{1}\cdot a+{p}_{2}\cdot \left(a+1\right)& \text{}\end{array}$

We want to choose $a$ such that $|{p}_{1}-{p}_{2}|$ is minimized. We have that:

$\begin{array}{rcll}n& =& a\cdot p+{p}_{2}& \text{}\\ n& =& a\cdot p+p-{p}_{1}& \text{}\end{array}$

Therefore:

${p}_{2}-{p}_{1}=2n-p-2a\cdot p$

thus we need that:

$|2n-\left(2a+1\right)p|=min$

This expression is minimized when:

$a=⌊\frac{n}{p}⌋$

Therefore,

$\begin{array}{rcll}{p}_{2}& =& n-p⌊\frac{n}{p}⌋=n\phantom{\rule{0.2em}{0ex}}mod\phantom{\rule{0.2em}{0ex}}p& \text{}\\ {p}_{1}& =& p-{p}_{1}=p-\left(n\phantom{\rule{0.2em}{0ex}}mod\phantom{\rule{0.2em}{0ex}}p\right)& \text{}\end{array}$

and thus, the first $n\phantom{\rule{0.2em}{0ex}}mod\phantom{\rule{0.2em}{0ex}}p$ cores execute $⌈\frac{n}{p}⌉$ sums, while the remaining cores execute $⌊\frac{n}{p}⌋$ sums.

which can be simplified to:

and even further to:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>

int main(int argc, char **argv){

int N_numbers, N_cores, i, floor_value, ceil_value, remainder_value;

if (argc<5){
printf("Usage: %s -N <numbers> -p <cores>\n", argv[0]);
return -1;
}

for (i=1; i<argc; i++){
if ( (0==strncmp(argv[i],"-N",2)) && (argc>(i+1)) && isdigit(argv[i+1][0]) ){
sscanf(argv[i+1], "%d", &N_numbers);
i++;
} else
if ( (0==strncmp(argv[i],"-p",2)) && (argc>(i+1)) && isdigit(argv[i+1][0]) ){
sscanf(argv[i+1], "%d", &N_cores);
i++;
} else {
printf("%s\n", "Unexpected argument. Exiting ...");
printf("Usage: %s -N <numbers> -p <cores>\n", argv[0]);
return -1;
}
}
floor_value = floor((double)N_numbers/(double)N_cores);
ceil_value = ceil((double)N_numbers/(double)N_cores);
remainder_value = N_numbers % N_cores;

N_cores = (N_numbers >= N_cores) ? N_cores : N_numbers;

for (i=0; i<N_cores; i++){
if (i <= remainder_value-1){
printf("Core %2d:%3d...%3d\n", i, i*ceil_value, (i+1)*ceil_value-1);
}else{
printf("Core %2d:%3d...%3d\n", i, i*floor_value+remainder_value, (i+1)*floor_value+remainder_value-1);
}
}
return 0;
}