All modern processors can compute much faster than
they can fetch data from main memory.
Techniques to improve the performance of algorithms have to take this into account
(simplified crash course):
- Efficient use of registers:
Lots of operations with the same scalars are good.
- Cache optimization:
Lots of operations with the same vectors are good.
- Vectorization (on dedicated vector machines or PCs):
"Smooth" (contiguous, regular-stride) access to arrays and simple operations are good.
- Parallelization:
Local access to array regions is good.
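
The first three points can be illustrated with a small loop-tiling sketch. It is only an illustration of the loop structure, not a benchmark: in pure Python the interpreter overhead hides most cache effects, and the function names, the flat row-major storage, and the tile size bs are assumptions made for this example.

```python
def matmul_naive(a, b, n):
    """C = A * B for n x n matrices stored row-major in flat lists."""
    c = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            s = 0.0  # scalar accumulator, reused n times (register-friendly)
            for k in range(n):
                s += a[i * n + k] * b[k * n + j]  # b is read with stride n
            c[i * n + j] = s
    return c


def matmul_blocked(a, b, n, bs=2):
    """Same product, computed tile by tile (bs is a hypothetical tile size)."""
    c = [0.0] * (n * n)
    for i0 in range(0, n, bs):
        for k0 in range(0, n, bs):
            for j0 in range(0, n, bs):
                # Work on one bs x bs tile so the same data is reused
                # while it is still likely to be in cache.
                for i in range(i0, min(i0 + bs, n)):
                    for k in range(k0, min(k0 + bs, n)):
                        aik = a[i * n + k]  # scalar reused in the inner loop
                        for j in range(j0, min(j0 + bs, n)):
                            # unit-stride ("smooth") access to b and c
                            c[i * n + j] += aik * b[k * n + j]
    return c
```

Both functions compute the same product; the blocked version only reorders the work so that scalars and small tiles are reused and the innermost loop walks through memory contiguously.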
Hydrodynamics and parallelization:
- A bounded numerical domain of dependence (each cell update needs only nearby cells) allows straightforward domain decomposition
and distribution over different machines.
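
A minimal sketch of such a domain decomposition, assuming a 1D periodic grid and a first-order upwind advection step as a stand-in for a hydrodynamics update. The function names, the coefficient c, and the list-based "communication" are hypothetical; a real code would exchange ghost cells with a message-passing library.

```python
def step(u, c=0.5):
    """One explicit upwind step on a periodic 1D grid.

    The new u[i] depends only on u[i] and u[i-1]:
    a bounded numerical domain of dependence.
    """
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]


def step_decomposed(u, nparts, c=0.5):
    """Same step, computed independently on nparts subdomains."""
    n = len(u)
    assert n % nparts == 0, "sketch assumes equally sized subdomains"
    size = n // nparts
    chunks = [u[p * size:(p + 1) * size] for p in range(nparts)]

    # "Communication": every subdomain receives a single ghost cell
    # from its left neighbour (periodic wrap-around for p = 0).
    ghosts = [chunks[p - 1][-1] for p in range(nparts)]

    result = []
    for p in range(nparts):  # each iteration could run on its own machine
        local = [ghosts[p]] + chunks[p]
        result.extend(local[i] - c * (local[i] - local[i - 1])
                      for i in range(1, len(local)))
    return result
```

The amount of exchanged data per subdomain (one cell here) does not grow with the subdomain size, which is why this kind of scheme distributes well.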
Radiation transport and parallelization:
- Non-local radiation transport (integration along rays) requires more communication
between processors and limits the performance on parallel computers.
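
For contrast, a sketch of a ray integration over the same kind of decomposition, assuming a single ray along a 1D grid and the optical depth as the integrated quantity. The names and the simple summation rule are assumptions for illustration only.

```python
def optical_depth(kappa, dx):
    """Cumulative optical depth along a ray: tau[i] = sum_{j<=i} kappa[j]*dx."""
    tau, total = [], 0.0
    for k in kappa:
        total += k * dx
        tau.append(total)
    return tau


def optical_depth_decomposed(kappa, dx, nparts):
    """Same integral with the ray cut into nparts subdomains."""
    n = len(kappa)
    assert n % nparts == 0, "sketch assumes equally sized subdomains"
    size = n // nparts
    chunks = [kappa[p * size:(p + 1) * size] for p in range(nparts)]

    # Local part: independent on every subdomain.
    local = [optical_depth(chunk, dx) for chunk in chunks]

    # Non-local part: each subdomain needs the accumulated depth of
    # ALL upstream subdomains, not just of its direct neighbour.
    offsets, running = [], 0.0
    for p in range(nparts):
        offsets.append(running)
        running += local[p][-1]

    return [off + t for off, loc in zip(offsets, local) for t in loc]
```

Unlike the ghost-cell exchange of a local stencil, the offset step couples every subdomain to all others along the ray, so it needs a chain of messages or a global scan.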