To ensure good performance on modern CPUs, vectorization has to be enabled, usually by selecting a sufficiently high optimization level, for instance with -O3, -Ofast, -fast, etc.; see Sect. 4.6.
Performance can be improved further by choosing the optimal branches in CO5BOLD, i.e., by determining the optimal values for:
rhd_box_arrays01: pointer or (better) allocatable arrays in the data structure "box", see Sect. 4.4.1.4. The classical implementation with pointer arrays works in most (all?) cases and is the default. The implementation with allocatable arrays is slightly faster, but it causes runtime errors with several current compilers.
gasinter_l01: array treatment in the EOS routines, see Sect. 4.4.3.1.
gasinter_l02: pointer or (better) allocatable arrays in the data structure for the EOS tables, see Sect. 4.4.3.2.
opta_switch_l01: array treatment in the opacity routines, see Sect. 4.4.4.1.
rhd_hyd_entropyfix_p01: loop with "if..then..else" or masks in rhd_hyd_module.F90, see Sect. 4.4.5.2.
rhd_hyd_upwind_p01: loop with "if..then..else" or masks in rhd_hyd_module.F90, see Sect. 4.4.5.3.
rhd_roe1d_flux_l01: treatment of upwind direction in rhd_hyd_module.F90, see Sect. 4.4.5.4.
rhd_shortrad_operator_l01: operator version in rhd_shortrad_module.F90, see Sect. 4.4.7.10.
rhd_shortrad_operator_l02: way of operator inlining in rhd_shortrad_module.F90, see Sect. 4.4.7.11.
rhd_shortrad_dtauop_l01: operator version in rhd_shortrad_module.F90, see Sect. 4.4.7.12.
rhd_shortrad_dtauop_l02: way of operator inlining in rhd_shortrad_module.F90, see Sect. 4.4.7.13.
rhd_shortrad_dir1_l01: direct integration or transpose for nearly vertical rays in rhd_shortrad_module.F90, see Sect. 4.4.7.15.
MSrad_raytas: loop type in MSrad3D.F90, see Sect. 4.4.7.20.
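
Two of the entries above (rhd_hyd_entropyfix_p01 and rhd_hyd_upwind_p01) choose between a loop with an explicit branch and a loop with masks. The following is a minimal sketch of what such a pair of variants can look like. It is written in C for illustration only and is not taken from CO5BOLD, whose source is Fortran. The function names, the argument names, and the upwind-style selection rule are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, NOT CO5BOLD code.
   Variant 1: explicit if/else inside the loop.
   For each cell, pick the left state ql when the velocity v is positive,
   otherwise the right state qr, and multiply by v. */
void select_branch(const double *v, const double *ql, const double *qr,
                   double *f, size_t n)
{
    assert(v && ql && qr && f);
    for (size_t i = 0; i < n; ++i) {
        if (v[i] > 0.0) {
            f[i] = v[i] * ql[i];
        } else {
            f[i] = v[i] * qr[i];
        }
    }
}

/* Variant 2: the same selection written with an arithmetic mask.
   Both states are always read and blended, so the loop body has no
   control flow. Results match Variant 1 for finite inputs; an Inf or
   NaN in the unselected state would contaminate the blended value. */
void select_mask(const double *v, const double *ql, const double *qr,
                 double *f, size_t n)
{
    assert(v && ql && qr && f);
    for (size_t i = 0; i < n; ++i) {
        double m = (double)(v[i] > 0.0);   /* 1.0 where v > 0, else 0.0 */
        f[i] = v[i] * (m * ql[i] + (1.0 - m) * qr[i]);
    }
}
```

Which form runs faster depends on the compiler and the CPU: some compilers vectorize the branch form on their own, while others only do so for the mask form, at the cost of extra arithmetic. This is an assumption about why such alternatives are worth timing on the target machine, not a statement about the CO5BOLD implementation.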