3.4 Compiler Macros

Some of the modules of the COBOLD code (those with suffix ``.F90'') employ compiler macros
to switch between code versions at compile time.
Typically you define at least one of the three switches `rhd_r01`, `rhd_r02`, or `rhd_r03` to choose a radiation transport module.
The other macros have reasonable default values.
To find the combination with the optimal performance, see Sect. 3.5.

`rhd_r01`: in `rhd.F90` (``rhd radiation 01'').

Switch to include LHDrad radiation transport module. It uses long characteristics and is restricted to an equidistant grid and open boundaries at all surfaces (old ``supergiant module''). Values:

`undefined`: (default) LHDrad routines are deactivated.

`1`: LHDrad routines are recognized by the compiler.

`rhd_r02`: in `rhd.F90` (``rhd radiation 02'').

Switch to include MSrad radiation transport module. It uses long characteristics. The lateral boundaries have to be periodic. Top and bottom can be closed or open (``solar module''). Values:

`undefined`: (default) MSrad routines are deactivated.

`1`: MSrad routines are recognized by the compiler.

`rhd_r03`: in `rhd.F90` (``rhd radiation 03'').

Switch to include SHORTrad radiation transport module. It uses short characteristics and is restricted to an equidistant grid and open boundaries at all surfaces (new ``supergiant module''). Values:

`undefined`: (default) SHORTrad routines are deactivated.

`1`: SHORTrad routines are recognized by the compiler.
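The effect of the three radiation switches `rhd_r01`, `rhd_r02`, and `rhd_r03` can be sketched with ordinary preprocessor conditionals. The sketch is written in C rather than Fortran so that it is self-contained; the macro names are taken from the text, while the function names and the numeric module codes are hypothetical. The switch is set inside the source here only for the demonstration; a real build would normally pass it to the preprocessor from outside.

```c
/* Assumption: the switch is set here only for the demonstration. */
#define rhd_r02 1

/* Hypothetical numeric codes for the three radiation transport modules. */
enum { RAD_NONE = 0, RAD_LHDRAD = 1, RAD_MSRAD = 2, RAD_SHORTRAD = 3 };

/* Count how many radiation modules were activated at compile time. */
int rad_modules_compiled(void)
{
    int n = 0;
#ifdef rhd_r01
    n++;   /* LHDrad code would be compiled in here */
#endif
#ifdef rhd_r02
    n++;   /* MSrad code would be compiled in here */
#endif
#ifdef rhd_r03
    n++;   /* SHORTrad code would be compiled in here */
#endif
    return n;
}

/* Return the code of the first activated module, or RAD_NONE. */
int rad_first_module(void)
{
#if defined(rhd_r01)
    return RAD_LHDRAD;
#elif defined(rhd_r02)
    return RAD_MSRAD;
#elif defined(rhd_r03)
    return RAD_SHORTRAD;
#else
    return RAD_NONE;
#endif
}
```

With only `rhd_r02` defined, the LHDrad and SHORTrad branches are never seen by the compiler, which is the meaning of ``deactivated'' above.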

`IDF`: in `rhd_hyd_module.F90` (``Integer Delta Flux'').

Number of padding cells for flux-like variables. This number was introduced to check whether the increase of the size of vectors for flux-like quantities (defined at cell boundaries) can improve the performance (especially on a CRAY machine). The gain is marginal (if present at all). The parameter is usually set to zero or left undefined. Values:

`0`: (default) no padding cells

`1`, `2`, `3`, ...: extra padding cells
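As a rough illustration of what padding a flux-like vector means, the following C sketch sizes an array for quantities defined at cell boundaries. Only the macro name `IDF` and its default come from the text; the function names and the assumption that the padding is simply appended to the end of the vector are hypothetical.

```c
#include <stdlib.h>

/* As described above, an undefined IDF behaves like 0 (no padding). */
#ifndef IDF
#define IDF 0
#endif

/* n cells along one axis have n + 1 cell boundaries. IDF extra
 * elements are appended as padding (assumed placement). */
size_t flux_length(size_t ncells)
{
    return ncells + 1 + (size_t)IDF;
}

/* Allocate a zero-initialized flux vector; the caller frees it. */
double *flux_alloc(size_t ncells)
{
    return calloc(flux_length(ncells), sizeof(double));
}
```

The padding changes only the storage layout, not the number of boundaries that carry meaningful fluxes.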

`rhd_hyd_gravcorr_p01`: in `rhd_hyd_module.F90` (``rhd hydrodynamics gravitation correction parameter 01'').

This parameter controls the way the Roe solver handles the source terms due to gravity. A different choice changes the simulation results and not just the speed of the code. The problem is that the original Roe solver interprets the pressure gradient in a hydrostatic stratification as a fluctuation due to shock waves. In case of strong stratification this can lead to weird effects. With activated correction the Roe solver treats only the deviations from a hydrostatic stratification as due to waves (or shocks). Several correction formulas have been tried. The latest is the recommended default. Values:

`0`: No pressure correction terms in Roe solver.

`1`: Simple correction with rhomean, no new average pressure.

`2`: Simple correction with rhomean, new average pressure.

`3`: Correction with local rho, limited, new average pressure.

`4`: Correction with local rho, new (different formula) average pressure.

`5`: (default) Correction with local rho, new limit, new average pressure.
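The idea behind the correction can be sketched as follows. This is not any of the five formulas listed above, which the text does not spell out; it is a minimal C illustration, under the assumption of a simple mean-density estimate, of how removing the hydrostatic part of a pressure jump leaves only the deviation for the wave solver. The function name and argument list are hypothetical.

```c
/* Pressure jump across one cell interface as seen by a wave solver.
 *
 * gx is the component of the gravitational acceleration along the axis,
 * so hydrostatic equilibrium means dp/dx = rho * gx.
 * correct == 0: return the raw jump (no correction).
 * correct != 0: subtract the hydrostatic jump estimated with the mean
 *               density of the two cells (illustrative formula only). */
double wave_pressure_jump(double p_left, double p_right,
                          double rho_left, double rho_right,
                          double gx, double dx, int correct)
{
    double jump = p_right - p_left;
    if (correct) {
        double rhomean = 0.5 * (rho_left + rho_right);
        jump -= rhomean * gx * dx;
    }
    return jump;
}
```

For a stratification that is exactly hydrostatic, for example `rho = 2`, `gx = -10`, `dx = 0.5`, and pressures 100 and 90, the raw jump is -10 while the corrected jump is zero, so no spurious wave is generated.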

`rhd_hyd_entropyfix_p01`: in `rhd_hyd_module.F90` (``rhd hydrodynamics entropy fix parameter 01'').

The entropy fix can be done in one of two ways to get optimum performance (with essentially the same results). Values:

`0`: (default) ``if...then...else'' construction

`1`: use a mask and the signum function

`rhd_hyd_upwind_p01`: in `rhd_hyd_module.F90` (``rhd hydrodynamics upwind parameter 01'').

The determination of the upwind direction can be done in one of two ways to get optimum performance (with essentially the same results). Values:

`0`: (default) ``if...then...else'' construction

`1`: use a mask and the signum function
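The difference between an ``if...then...else'' construction and a mask built from the signum function can be sketched in C as below. The function names and the choice of a scalar upwind selection are hypothetical; `copysign(1.0, v)` plays the role of the signum function here.

```c
#include <math.h>

/* Variant 0: explicit branch on the sign of the velocity. */
double upwind_branch(double v, double q_left, double q_right)
{
    if (v >= 0.0) {
        return q_left;
    } else {
        return q_right;
    }
}

/* Variant 1: branch-free selection with a mask.
 * copysign(1.0, v) is +1.0 or -1.0, so mask is exactly 1.0 or 0.0. */
double upwind_mask(double v, double q_left, double q_right)
{
    double mask = 0.5 * (1.0 + copysign(1.0, v));
    return mask * q_left + (1.0 - mask) * q_right;
}

/* Compile-time selection, mirroring the macro values described above. */
#ifndef rhd_hyd_upwind_p01
#define rhd_hyd_upwind_p01 0
#endif
#if rhd_hyd_upwind_p01 == 0
#define upwind_value upwind_branch
#else
#define upwind_value upwind_mask
#endif
```

The mask variant has no branch in the loop body, which can help vectorization. The two variants agree for ordinary finite input; they can differ for a velocity of negative zero or when the discarded value is infinite or NaN.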

`rhd_hyd_roe1d_l01`: in `rhd_hyd_module.F90` (``rhd hydrodynamics roe 1 dimension loop 01'').

The computation of the Roe fluxes can be done by either of two sets of routines to find the set which gives optimum performance (with essentially the same results). Values:

`0`: (default) lots of small routines acting on scalars, inlining needed, cache reuse is optimized

`1`: routines acting on arrays, more temporary arrays necessary, vectorization is easier
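The trade-off between the two routine styles can be illustrated with a toy computation. The C sketch below does not compute Roe fluxes; it evaluates a made-up interface quantity (mean times jump) once with small scalar helpers and once with array-wide loops and temporaries. All names are hypothetical.

```c
#include <stddef.h>

/* Style of value 0: tiny scalar helpers that rely on inlining. */
static inline double mean2(double a, double b) { return 0.5 * (a + b); }
static inline double jump2(double a, double b) { return b - a; }

/* One pass over the n - 1 interfaces, no temporary arrays. */
void toy_flux_scalar(const double *q, double *f, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        f[i] = mean2(q[i], q[i + 1]) * jump2(q[i], q[i + 1]);
    }
}

/* Style of value 1: each step is a simple loop over whole arrays,
 * at the cost of two caller-provided temporaries of length n - 1. */
void toy_flux_array(const double *q, double *f,
                    double *tmp_mean, double *tmp_jump, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) tmp_mean[i] = 0.5 * (q[i] + q[i + 1]);
    for (size_t i = 0; i + 1 < n; i++) tmp_jump[i] = q[i + 1] - q[i];
    for (size_t i = 0; i + 1 < n; i++) f[i] = tmp_mean[i] * tmp_jump[i];
}
```

The scalar version touches each input once while it is still in cache; the array version consists of short, regular loops that a vectorizing compiler handles easily but needs extra memory traffic for the temporaries.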

`rhd_bound_t01`: in `rhd_hyd_module.F90` (``rhd bound timing 01'').

Produce timing information for ``inner boundary'' routine (central potential) or lower and upper boundary routines (constant gravitation). It can be used together with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time
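A timing switch of the ``undefined/defined'' kind can be sketched in C as follows. Only the macro name `rhd_bound_t01` comes from the text; the toy boundary routine, the accumulator variables, and the use of the standard `clock()` function (which reports processor time and serves here as a portable stand-in for an elapsed-time routine) are assumptions for illustration.

```c
#include <time.h>

/* Assumption: defined here only for the demonstration. */
#define rhd_bound_t01

/* Hypothetical accumulators, updated only when timing is compiled in. */
double bound_seconds = 0.0;
long   bound_calls   = 0;

/* Toy boundary routine: copy the neighbouring interior value into
 * the first and the last cell of q (requires n >= 2). */
void toy_boundary(double *q, int n)
{
#ifdef rhd_bound_t01
    clock_t t0 = clock();          /* start of the timed region */
#endif
    q[0] = q[1];
    q[n - 1] = q[n - 2];
#ifdef rhd_bound_t01
    bound_seconds += (double)(clock() - t0) / CLOCKS_PER_SEC;
    bound_calls++;
#endif
}
```

When the macro is left undefined, the timing statements vanish from the compiled code, so the default build carries no measurement overhead.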

`rhd_roe1d_flux_t01`: in `rhd_hyd_module.F90` (``rhd roe 1 dimension flux timing 01'').

Produce timing information for the routine which computes the Roe fluxes. It should not be used in conjunction with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_roe1d_step_t01`: in `rhd_hyd_module.F90` (``rhd roe 1 dimension step timing 01'').

Produce timing information for the routine which performs the Roe step. It should not be used in conjunction with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_h02`: in `rhd.F90`, `rhd_hyd_module.F90` (``rhd hydrodynamics 02'').

Values:

`undefined`: (default) Skip routine `vanLeer1d_step` during compilation

`defined`: Compile routine `vanLeer1d_step`

`rhd_vis_density_p01`: in `rhd_vis_module.F90` (``rhd viscosity density parameter 01'').

Choose formula for density average at cell boundary in tensor viscosity routines. Values:

`0`: rhomean=min(rholeft,rhoright)

`1`: (default) rhomean=0.5 * (rholeft + rhoright)
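The two density averages translate directly into code. The C sketch below implements both formulas as given above and selects one at compile time; the function names are hypothetical, and the default of 1 follows the text.

```c
/* Value 0: the smaller of the two neighbouring densities. */
double rhomean_min(double rholeft, double rhoright)
{
    return (rholeft < rhoright) ? rholeft : rhoright;
}

/* Value 1: arithmetic mean of the two neighbouring densities. */
double rhomean_avg(double rholeft, double rhoright)
{
    return 0.5 * (rholeft + rhoright);
}

/* Compile-time selection; 1 is the default named in the text. */
#ifndef rhd_vis_density_p01
#define rhd_vis_density_p01 1
#endif
#if rhd_vis_density_p01 == 0
#define rhomean rhomean_min
#else
#define rhomean rhomean_avg
#endif
```

At a steep density jump, for example densities 1 and 3, the minimum gives 1 while the arithmetic mean gives 2, so the choice changes the boundary density that enters the viscosity routines.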

`rhd_vis_t01`: in `rhd_vis_module.F90` (``rhd viscosity timing 01'').

Produce timing information for 2D/3D tensor viscosity routines. It should not be used in conjunction with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_rad3d_toray_l01`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions to ray loop 01'').

There might be a performance gain by splitting the main loop in routine `rhd_rad3d_toray` into three separate loops. Typically, one big loop is to be preferred. Values:

`undefined`: (default) One big loop

`defined`: Three smaller loops
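Loop splitting of this kind can be sketched in C as below. The actual contents of `rhd_rad3d_toray` are not described in the text, so the three gathered quantities, the index array, and the function names are hypothetical; the sketch only shows that the fused and the split form produce the same result.

```c
#include <stddef.h>

/* One big loop: three quantities are gathered onto the ray in one pass. */
void to_ray_fused(const double *rho, const double *temp, const double *kappa,
                  const size_t *idx, double *ray_rho, double *ray_temp,
                  double *ray_kappa, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ray_rho[i]   = rho[idx[i]];
        ray_temp[i]  = temp[idx[i]];
        ray_kappa[i] = kappa[idx[i]];
    }
}

/* Three smaller loops: each loop reads one source and writes one target. */
void to_ray_split(const double *rho, const double *temp, const double *kappa,
                  const size_t *idx, double *ray_rho, double *ray_temp,
                  double *ray_kappa, size_t n)
{
    for (size_t i = 0; i < n; i++) ray_rho[i]   = rho[idx[i]];
    for (size_t i = 0; i < n; i++) ray_temp[i]  = temp[idx[i]];
    for (size_t i = 0; i < n; i++) ray_kappa[i] = kappa[idx[i]];
}
```

The fused loop reads the index array once per element, whereas the split version reads it three times but works on fewer data streams at a time; which one is faster depends on the machine.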

`rhd_rad3d_fromray_l01`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions from ray loop 01'').

There might be a performance gain by splitting a big loop in routine `rhd_rad3d_fromray` into two separate loops. Typically, one big loop is to be preferred. Values:

`undefined`: (default) One big loop

`defined`: Two smaller loops

`rhd_rad3d_r02`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions radiation 02'').

Module `rhd_lhdrad_module` contains a routine for the handling of periodic boundaries. It is in an experimental state and is deactivated by default. Values:

`undefined`: (default) Skip routine `rhd_rad3d_dirper` during compilation

`defined`: Compile routine `rhd_rad3d_dirper`

`rhd_rad3d_solve_t01`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions solve timing 01'').

Produce timing information for the routine which solves the 1D radiation transport equation along a single ray. This routine is called very frequently. The timing measurement might slow it down somewhat. It should not be used in conjunction with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_rad3d_dir_t01`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions direction timing 01'').

Produce timing information for the routine which solves the radiation transport equation for one direction field. The timing routines are called very frequently and might slow down the code. It should not be used in conjunction with OpenMP. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_rad3d_step_t01`: in `rhd_lhdrad_module.F90` (``rhd radiation 3 dimensions step timing 01'').

Produce timing information for the main 3D radiation transport routine. It can be used together with OpenMP and should cause no noticeable performance loss. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_shortrad_operator_l01`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation operator loop 01'').

Choose type of short characteristics operator. Values:

`0`: simple test operator, fast but results are utterly useless!

`1`: (default) case distinction with ``if..then..else'' construct

`2`: case distinction with masks (weights 0.0 or 1.0)

`rhd_shortrad_formal_l01`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation formal loop 01'').

Select version of loop splitting for exp(-dtau) computation. Values:

`0`: (default) `dtauhalf`, `exp_mdtauhalf`, `expl2t_mdtauhalf` are computed in single loop

`1`: (`dtauhalf`, `exp_mdtauhalf`), (`expl2t_mdtauhalf`) are computed in separate loops. This prevents the SUN1 machine (Sunfire, Solaris, Forte 6.2) from doing some performance degrading optimization.

`rhd_shortrad_dir1_l01`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation direction 1 loop 01'').

Choose routine version for rays in x1 direction. Values:

`0`: (default) Use routine with permuted indices for rays in x1 direction. In this case the innermost loop index is the third array index. The transposition of arrays is not needed but some machines (e.g. SUN1) do not like this index arrangement.

`1`: Transpose arrays and use routine `rhd_shortrad_dir3` for rays in x1 direction. The extra step for the transposition of some arrays (and the reverse procedure) needs some time. But now the routine with the optimum index ordering can be used.
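The two strategies can be sketched on a small two-dimensional example in C. Note that C stores arrays in row-major order, so the roles of ``fast'' and ``slow'' index are mirrored relative to Fortran; the running sum used as the operation, and all function names, are hypothetical stand-ins for the actual ray routines.

```c
#include <stddef.h>
#include <stdlib.h>

/* Routine with the optimum index ordering: running sum along the
 * contiguous (last) index of an nrows x ncols array. */
static void cumsum_fast_axis(double *a, size_t nrows, size_t ncols)
{
    for (size_t r = 0; r < nrows; r++)
        for (size_t c = 1; c < ncols; c++)
            a[r * ncols + c] += a[r * ncols + c - 1];
}

/* Style of value 0: work along the slow (first) index directly.
 * No transposition, but the innermost loop strides through memory. */
void cumsum_slow_axis_direct(double *a, size_t n1, size_t n2)
{
    for (size_t j = 0; j < n2; j++)
        for (size_t i = 1; i < n1; i++)
            a[i * n2 + j] += a[(i - 1) * n2 + j];
}

/* Style of value 1: transpose, reuse the fast routine, transpose back.
 * Returns 0 on success and -1 if the scratch array cannot be allocated. */
int cumsum_slow_axis_transposed(double *a, size_t n1, size_t n2)
{
    double *t = malloc(n1 * n2 * sizeof *t);
    if (t == NULL) return -1;
    for (size_t i = 0; i < n1; i++)
        for (size_t j = 0; j < n2; j++)
            t[j * n1 + i] = a[i * n2 + j];
    cumsum_fast_axis(t, n2, n1);
    for (size_t i = 0; i < n1; i++)
        for (size_t j = 0; j < n2; j++)
            a[i * n2 + j] = t[j * n1 + i];
    free(t);
    return 0;
}
```

Both functions give identical results; the second one pays for two extra copies of the data in exchange for contiguous access in the main computation.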

`rhd_shortrad_dir_l02`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation direction loop 02'').

Determine position of PARALLEL statement relative to outer loop in `rhd_shortrad_dirX`. Both settings give the same results but might show a different performance on a specific machine. Values:

`0`: (default) PARALLEL statement inside of outer loop

`1`: PARALLEL statement outside of outer loop
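Assuming that the PARALLEL statement refers to an OpenMP parallel region (the text does not say so explicitly), the two placements can be sketched in C as follows. The marching recurrence and the function names are hypothetical. Without OpenMP support enabled in the compiler the pragmas are ignored and both functions run serially with the same result.

```c
/* March along k: a[k][i] = a[k-1][i] + s[k][i], with a[0][i] = s[0][i].
 * Both arrays are stored as [k * ni + i]. */

/* Style of value 0: a parallel region is opened inside the outer loop,
 * i.e. once per outer iteration. */
void toy_sweep_inside(const double *s, double *a, int nk, int ni)
{
    for (int i = 0; i < ni; i++) a[i] = s[i];
    for (int k = 1; k < nk; k++) {
        #pragma omp parallel for
        for (int i = 0; i < ni; i++)
            a[k * ni + i] = a[(k - 1) * ni + i] + s[k * ni + i];
    }
}

/* Style of value 1: one parallel region encloses the outer loop. Every
 * thread runs the k loop (k is private because it is declared inside the
 * region); the inner loop is shared out, and the implicit barrier at the
 * end of each work-sharing loop keeps the steps in order. */
void toy_sweep_outside(const double *s, double *a, int nk, int ni)
{
    for (int i = 0; i < ni; i++) a[i] = s[i];
    #pragma omp parallel
    {
        for (int k = 1; k < nk; k++) {
            #pragma omp for
            for (int i = 0; i < ni; i++)
                a[k * ni + i] = a[(k - 1) * ni + i] + s[k * ni + i];
        }
    }
}
```

The first form pays the cost of starting a parallel region in every outer iteration, the second pays only for a barrier; which is cheaper depends on the OpenMP runtime and the machine.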

`rhd_shortrad_formal_t01`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation formal timing 01'').

Produce timing information for routine which gives the formal solution of the radiation transport equation with the help of short characteristics. It can be used together with OpenMP and should cause no noticeable performance loss. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time

`rhd_shortrad_step_t01`: in `rhd_shortrad_module.F90` (``rhd short-characteristics radiation step timing 01'').

Produce timing information for main short characteristics routine. It can be used together with OpenMP and should cause no noticeable performance loss. Values:

`undefined`: (default) no timing information

`defined`: call subroutines to measure elapsed time