CO5BOLD has been compiled and tested on up to 8 processors on the SGI 2000 machine at TAC in Copenhagen and the SGI 3800 machine at the NSC in Linköping. The code was used on the UKAFF machines (see Sect. 3.7.17) and the computers at CINES.
See e.g. the excellent SGI Fortran90 manual (or here). Information about the CINES machines can be found under CINES, or CINES : Introduction au calcul scientifique.
Important switches are:
-macro_expand: Enable macro expansion
-mp: Enable parallelization with OpenMP directives
-INLINE:aggressive=ON -INLINE:list -INLINE:preempt=ON: General keywords for inlining
-INLINE:must=...: Optimization: routines that should be inlined (see Sect. 3.7.2).
-Ofast -OPT:Olimit=0: General optimization. On older compiler versions, -O3 was the achievable optimum.
-IPA:plimit=5500: Even more optimization. This option requires a lot of memory (> 1 GByte). To obtain that much memory, it may be necessary to request more than one processor for the compilation (especially on the CINES machines).
-CG:longbranch_limit=60000: This switch limits the resources needed by the compiler. It is suggested by the compiler itself (on the CINES machines).
-Drhd_roe1d_step_l01=1: Slight performance improvement
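As an illustration only, the switches listed above could be collected into a single flag string along the following lines. This is a minimal sketch, not the actual CO5BOLD build setup: the compiler command `f90`, the variable names `F90` and `F90FLAGS`, and the source file name `rhd.F90` are assumptions made for the example. The -INLINE:must=... switch is left out on purpose, because its routine list is given in Sect. 3.7.2.

```shell
#!/bin/sh
# Sketch only: assemble the switches from the list above into one flag string.
# F90, F90FLAGS and rhd.F90 are hypothetical names chosen for this example.
F90=${F90:-f90}

# Macro expansion and OpenMP parallelization
F90FLAGS="-macro_expand -mp"
# General inlining keywords (-INLINE:must=... omitted, see Sect. 3.7.2)
F90FLAGS="$F90FLAGS -INLINE:aggressive=ON -INLINE:list -INLINE:preempt=ON"
# General optimization (on older compiler versions, -O3 instead)
F90FLAGS="$F90FLAGS -Ofast -OPT:Olimit=0"
# Further optimization; needs more than 1 GByte of memory at compile time
F90FLAGS="$F90FLAGS -IPA:plimit=5500"
# Limit on compiler resources, as suggested by the compiler itself
F90FLAGS="$F90FLAGS -CG:longbranch_limit=60000"
# Slight performance improvement
F90FLAGS="$F90FLAGS -Drhd_roe1d_step_l01=1"

# Print the command instead of running it, so the sketch does not need the SGI compiler.
echo "$F90 $F90FLAGS -c rhd.F90"
```

Printing the command rather than executing it makes it easy to inspect the full flag set before submitting a memory-hungry compilation job.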