The 12-processor Hewlett-Packard machine ``zeipel'' is a ``V2500 PA 2.0'' system.
A first attempt to make the compiler accept the OpenMP directives in CO5BOLD
has succeeded. Yet, when running on several processors, only some routines
(e.g. rhd_shortrad_dirsimple1) in CO5BOLD benefit, while others
(rhd_shortrad_dirsimple2, rhd_shortrad_dirsimple3) are significantly slower
than on one processor.
In addition, the single-processor performance is not very good, partly because
the achievable optimization level is not very high.
Some macros that seem to be necessary:
+U77
: Link the proper library so that the machine understands, e.g., call flush(6).
+cpp=yes
: Switch on the C preprocessor. Note that all Fortran90 files have to
end with ``.f90''. The ``.F90'' suffix does not seem to work.
+Oparallel +Oopenmp +Onoautopar
: Try to enable parallelization with
OpenMP directives and disable auto-parallelization.
+Onoinline
: Disables inlining. This can simplify things. With a proper
choice of routine versions, inlining is not really necessary anymore.
+O3 +Olimit
: General optimization with limited resource usage
during compilation. Some modules should only be compiled with +O2, while others
compile even with +O3 +Onolimit.
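As a rough illustration of how the switches listed above might be combined on one command line, the following sketch only prints the resulting compiler calls instead of running them. The compiler driver name f90, the -c switch, the variable and function names, and the two file names are assumptions made for this example; they are not taken from the CO5BOLD makefiles.

```shell
#!/bin/sh
# Sketch only: driver name, -c, variable names and file names are assumptions.
F90=f90                                      # assumed name of the HP Fortran 90 driver
BASEFLAGS="+U77 +cpp=yes +Onoinline"         # flush() library, C preprocessor, no inlining
OMPFLAGS="+Oparallel +Oopenmp +Onoautopar"   # OpenMP directives on, auto-parallelization off
OPTFLAGS="+O3 +Olimit"                       # default optimization level

# Print the compile command for one source file.
#   $1: source file (has to end in .f90, not .F90)
#   $2: optional replacement for the default optimization switches
compile_cmd() {
    echo "$F90 $BASEFLAGS $OMPFLAGS ${2:-$OPTFLAGS} -c $1"
}

compile_cmd example_module.f90           # hypothetical module, default +O3 +Olimit
compile_cmd fragile_module.f90 "+O2"     # hypothetical module that only tolerates +O2
```

Keeping the optimization switches in a separate variable makes it easy to lower the level for individual modules without touching the OpenMP or preprocessor settings.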
The UIO modules and the string handling module should be compiled in debug mode. A proposed compilation sequence is (MSrad does not compile; all other modules are activated):