CO5BOLD has been compiled and tested on up to 8 processors on the SGI 2000 machine at TAC in Copenhagen and the SGI 3800 machine at the NSC in Linköping. More recently, the code was used on the UKAFF machines (see Sect. 3.7.17) and the computers at CINES.
See e.g. the excellent SGI Fortran90 manual. Information about the CINES machines can be found under CINES, or CINES : Introduction au calcul scientifique.
Important switches are:
-macro_expand
: Enable macro expansion
-mp
: Enable parallelization with OpenMP directives
-INLINE:aggressive=ON -INLINE:list -INLINE:preempt=ON
: General keywords for inlining
-INLINE:must=...
: Optimization: routines that should be inlined (see Sect. 3.7.2).
-Ofast -OPT:Olimit=0
: General optimization. On older compiler versions -O3 was the achievable optimum.
-IPA:plimit=5500
: Even more optimization. This option requires lots of memory (> 1 GByte). To get that much memory, it might be necessary to ask for more than one processor for the compilation (especially on the CINES machines).
-CG:longbranch_limit=60000
: This switch limits the compiler resources needed. It is suggested by the compiler itself (on the CINES machines).
-Drhd_roe1d_step_l01=1
: Slight performance improvement
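
The switches listed above could be collected into a single flag string along the following lines. This is only a minimal sketch: the variable names F90 and F90FLAGS, the compiler driver name f90, and the file name example.f90 are assumptions for illustration and are not taken from the CO5BOLD build system. The -INLINE:must=... switch is left out because its routine list is given elsewhere (see Sect. 3.7.2).

```shell
#!/bin/sh
# Sketch only: assemble the switches from the list above into one string.
# F90 and F90FLAGS are hypothetical names chosen for this example.
F90="f90"   # assumed name of the compiler driver

# Macro expansion and OpenMP parallelization
F90FLAGS="-macro_expand -mp"
# General inlining keywords
F90FLAGS="$F90FLAGS -INLINE:aggressive=ON -INLINE:list -INLINE:preempt=ON"
# -INLINE:must=... is omitted here; its routine list is in Sect. 3.7.2.
# General optimization
F90FLAGS="$F90FLAGS -Ofast -OPT:Olimit=0"
# Interprocedural optimization (needs > 1 GByte of memory at compile time)
F90FLAGS="$F90FLAGS -IPA:plimit=5500"
# Limit on compiler resources, as suggested by the compiler itself
F90FLAGS="$F90FLAGS -CG:longbranch_limit=60000"
# Preprocessor define giving a slight performance improvement
F90FLAGS="$F90FLAGS -Drhd_roe1d_step_l01=1"

# Print the command line instead of running it, so that the sketch
# also works on a machine without the compiler (example.f90 is hypothetical).
echo "$F90 $F90FLAGS -c example.f90"
```

With older compiler versions, -Ofast would have to be replaced by -O3 in such a flag string.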