Random Access Machine (RAM)
The random access machine is a theoretical model of computation used to analyze sequential algorithms. It rests on three assumptions:
- Infinite Memory: the machine has an unbounded number of memory cells;
- Arbitrary Integer Size: each memory cell can store an integer of arbitrary size;
- Uniform Cost Model: each basic operation (arithmetic, memory access, etc.) takes exactly one time unit.
RAM Complexity Analysis
The complexity of an algorithm is measured by:
- Time complexity (T(n)): the number of steps needed to complete the algorithm on an input of size n.
- Space complexity (S(n)): the number of memory cells used during the execution of the algorithm on an input of size n.
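To make the accounting concrete, here is a minimal Python sketch that tallies steps for summing an array under the uniform cost model; array_sum is a hypothetical example, and the exact per-operation charges are a modelling choice, not part of the definition.

```python
def array_sum(a):
    """Sum n numbers on a RAM under the uniform cost model."""
    steps = 0
    total = 0              # one extra memory cell -> S(n) = Theta(1)
    for x in a:            # n iterations
        total += x         # 1 read + 1 add + 1 write, each costing 1 unit
        steps += 3
    return total, steps    # T(n) = Theta(n)
```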
Parallel Random Access Machine (PRAM)
The parallel random access machine is a theoretical model of computation used to analyze parallel algorithms. It is defined as the tuple:
M' = \langle M, X, Y, A \rangle
- M: a set of RAM machines (processors), each able to recognize its own index;
- X: a set of input cells shared by all the processors;
- Y: a set of output cells shared by all the processors;
- A: a set of shared memory cells.
At each time step, each processor can simultaneously:
- Read from shared memory (A, X);
- Perform an internal computation;
- Write to shared memory (A, Y).
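As an illustration of this lockstep read/compute/write cycle, here is a minimal Python sketch that serially simulates an EREW PRAM summing n numbers with a tree reduction in O(log n) parallel steps; pram_sum is a hypothetical example, and the simulation itself runs sequentially, with only the phase structure mirroring the model.

```python
def pram_sum(x):
    """Simulate an EREW PRAM tree reduction: O(log n) parallel steps.

    Each simulated time step is split into the PRAM phases: all active
    processors read, then compute, then write, in lockstep.
    """
    a = list(x)                          # shared memory A, loaded from input X
    n = len(a)
    step = 1
    while step < n:
        active = range(0, n - step, 2 * step)
        # read phase: each active processor reads two distinct cells (ER)
        reads = [(a[i], a[i + step]) for i in active]
        # compute + write phase: each writes to a distinct cell (EW)
        for i, (u, v) in zip(active, reads):
            a[i] = u + v
        step *= 2
    return a[0]                          # result placed in the output cell Y

print(pram_sum(range(8)))                # 28, in log2(8) = 3 parallel steps
```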
PRAM Classification
PRAMs are classified by how they handle concurrent access to the shared memory.
Reads from the shared memory can happen in two ways:
- Exclusive Read (ER): at most one processor can read a given memory cell at a time; simultaneous reads must target distinct cells.
- Concurrent Read (CR): two or more processors can read the same memory cell at the same time.
Writes to the shared memory can happen in two ways:
- Exclusive Write (EW): at most one processor can write to a given memory cell at a time; simultaneous writes must target distinct cells.
- Concurrent Write (CW): two or more processors can write to the same memory cell at the same time. The conflict is resolved by one of the following policies:
  - Priority: each processor has a priority level; the one with the highest priority is allowed to write.
  - Common: all processors must write the same value.
  - Random: one of the processors is chosen at random to write.
The classification is done by combining a read policy with a write policy (e.g., EREW, CREW, CRCW).
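The concurrent-write policies can be made concrete with a small Python sketch; resolve_cw is a hypothetical helper, and the convention that a lower index means higher priority is an assumption made here for illustration.

```python
import random

def resolve_cw(writes, policy):
    """Resolve concurrent writes to a single cell on a CRCW PRAM.

    `writes` is a list of (processor_index, value) pairs that all
    target the same shared cell; returns the value actually stored.
    """
    if policy == "priority":     # assumption: lower index = higher priority
        return min(writes)[1]
    if policy == "common":       # all processors must write the same value
        values = {v for _, v in writes}
        assert len(values) == 1, "Common CW requires identical values"
        return values.pop()
    if policy == "random":       # an arbitrary writer wins
        return random.choice(writes)[1]
    raise ValueError(f"unknown policy: {policy}")
```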
PRAM Complexity
The complexity of a PRAM algorithm is measured by:
| Metric | Symbol | Formula | Context / Meaning |
|---|---|---|---|
| Elapsed Time | T_p(n) | - | The number of steps (time) to complete the algorithm of size n using p processors. |
| Sequential Time | T^*(n) | - | The time complexity of the best known sequential algorithm for the same problem. \mathbf{T^*(n) \neq T_1(n)}. |
| Speedup | SU_p(n) | \frac{T^*(n)}{T_p(n)} | Measures how much faster the parallel algorithm is compared to the best sequential one. Ideally, SU_p(n) \approx p (linear speedup). |
| Efficiency | E_p(n) | \frac{SU_p(n)}{p} = \frac{T^*(n)}{p \cdot T_p(n)} | Measures the average utilization of the p processors. 0 \leq E_p(n) \leq 1. An efficiency close to 1 is considered optimal. |
| Cost / Work | C_p(n) = W(n) | p \cdot T_p(n) | The total number of operations performed by all p processors during the execution. |
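A minimal Python sketch of these formulas (pram_metrics is a hypothetical helper; the example numbers assume summing n = 1024 values with T^*(n) = n, T_p(n) = log2(n), and p = n/2 processors):

```python
def pram_metrics(t_star, t_p, p):
    """Compute speedup, efficiency, and cost from the table's formulas."""
    speedup = t_star / t_p      # SU_p(n) = T*(n) / T_p(n)
    efficiency = speedup / p    # E_p(n) = SU_p(n) / p
    cost = p * t_p              # C_p(n) = W(n) = p * T_p(n)
    return speedup, efficiency, cost

print(pram_metrics(1024, 10, 512))  # (102.4, 0.2, 5120): fast but inefficient
```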
Scaling
Scaling is the ability of an algorithm to make efficient use of an increasing number of processors. It describes the relationship between three parameters:
- Problem Size (n): the size of the input data;
- Number of Processors (p): the number of processors used in the computation;
- Time (T): the time taken to complete the computation.
Scaling is classified into two types:
Strong Scaling
In strong scaling the problem size (n) is kept constant while the number of processors (p) is increased. The goal is to reduce the time (T) taken to complete the computation.
To analyze strong scaling, we use Amdahl’s Law:
SU_p(n) = \frac{1}{(1 - f) + \frac{f}{p}}
- f: fraction of the algorithm that can be parallelized;
- 1 - f: fraction of the algorithm that is sequential (cannot be parallelized).
Due to the sequential part, there is a limit to the speedup that can be achieved by adding more processors.
\lim_{p \to \infty} SU_p(n) = \frac{1}{1 - f}
This means that even with an infinite number of processors, the speedup is limited by the sequential portion of the algorithm.
Ideally, for strong scaling, we want:
T \propto \frac{1}{p}
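A short Python sketch of Amdahl's Law (amdahl_speedup is an illustrative helper), showing the speedup saturating toward 1/(1 - f); here f = 0.95, so the limit is 20:

```python
def amdahl_speedup(f, p):
    """Amdahl's Law: f = parallelizable fraction, p = processor count."""
    return 1.0 / ((1.0 - f) + f / p)

for p in (1, 10, 100, 1000):
    print(p, round(amdahl_speedup(0.95, p), 2))
# 1.0, 6.9, 16.81, 19.63 -> approaching the limit of 1 / (1 - 0.95) = 20
```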
Weak Scaling
In weak scaling, the problem size (n) is increased proportionally with the number of processors (p), keeping the workload per processor constant. The goal is to maintain a constant time (T) as both n and p increase.
To analyze weak scaling, we use Gustafson’s Law:
SU_p(n) = s + p \cdot (1 - s)
- s: fraction of the algorithm that is sequential (cannot be parallelized);
- 1 - s: fraction of the algorithm that can be parallelized.
This law suggests that, as the number of processors increases, the scaled speedup grows linearly with p, provided the sequential fraction s stays small.
Ideally, for weak scaling, we want the time to remain constant as n grows with p:
T \approx \text{const}
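For comparison with Amdahl's Law, a short Python sketch of Gustafson's Law (gustafson_speedup is an illustrative helper; here s = 0.05), where the scaled speedup keeps growing with p instead of saturating:

```python
def gustafson_speedup(s, p):
    """Gustafson's Law: s = sequential fraction, p = processor count."""
    return s + p * (1.0 - s)

for p in (1, 10, 100, 1000):
    print(p, gustafson_speedup(0.05, p))
# 1.0, 9.55, 95.05, 950.05 -> grows linearly with p
```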