Code style guide
MOM6 makes use of Fortran 2003 with some judicious use of Fortran 2008 constructs, but it does not use any extensions beyond what is in those standards. To help keep the code readily understandable, this page makes recommendations about how to use or not use the various features of the modern Fortran language.
Code style is typically a personal choice, but when styles clash it can lead to discord. These standards have been adopted in an attempt to promote harmony and clarity.
- No tabs
- No trailing white space
- Indents should be consistent (same amount used in contiguous blocks)
- Preferred indent is 2 spaces
  - "preferred" might understate the reaction invoked by other indent amounts! 😉
- Continuation lines should be indented at least 4 spaces, but more space can be used if it helps align lines
- No white space between a function name and the opening parenthesis
- White space after all language tokens
  - `if(a==0)` is legal Fortran but bad style. Use `if (a==0)` instead.
  - `if (a == 0)` is even better, since `==` is a language token.
- Use space around the equal sign in variable assignment, but not when using a named optional argument
  - `a = b` is strongly preferred over `a=b`
    - One exception is loop indices, where `do i=is,ie` is acceptable
  - `call fn(arg1, arg_name=a)` is strongly preferred over `call fn(arg1, arg_name = a)`
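The white space rules above can be seen together in a minimal sketch. The program and the `accumulate` helper are hypothetical names invented for this illustration; they are not part of MOM6.

```fortran
program whitespace_style
  implicit none
  integer :: i, is, ie  ! Loop index and loop bounds
  real :: a, b          ! Arbitrary demonstration values

  is = 1 ; ie = 3
  a = 0.0
  b = 2.0
  do i=is,ie                ! Compact spacing is acceptable for loop indices
    if (a == 0.0) a = b     ! Space after "if" and around the "==" token
    call accumulate(a, increment=b)  ! No spaces around "=" for a named argument
  enddo
  print *, a

contains

  !> Add an increment to a running total (hypothetical helper for this sketch)
  subroutine accumulate(total, increment)
    real,           intent(inout) :: total     !< A running total
    real, optional, intent(in)    :: increment !< The amount to add, or 1.0 if absent

    if (present(increment)) then
      total = total + increment
    else
      total = total + 1.0
    endif
  end subroutine accumulate

end program whitespace_style
```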
Some compilers handle very long lines of code gracefully, but MOM6 needs to adhere to the Fortran standard, which limits lines of code to 132 characters, after any macro expansion. MOM6 does use macros for some memory declarations, so we need to build in some added space in setting MOM6 guidelines:
- The target maximum length for MOM6 code lines (exclusive of comments) is 100 characters
- 80 character lines can be much easier to read if printed; smaller lines are encouraged where they make sense
- No lines of MOM6 code should exceed 120 characters, including comments
- Local variable declarations appear after all the dummy argument declarations, and in the case of a function the return value. We often use `! Local variables` to delineate between the argument and local variable declarations.
- Local variables should preferably be descriptive multi-character names meaningful in their context, e.g. `del_rho_int` (delta rho at interface).
- If using a highly abbreviated or short name, the declaration MUST be commented.
- Multi-word names should use snake_case (e.g. `delta_rho`).
- `do` and `if` constructs should be terminated with the combined `enddo` and `endif` statements, respectively.
- All other block end statements separate the `end` token (e.g. `end program [label]`)
  - Examples: `program`, `module`, `type`, `subroutine`, `function`, `interface`, `select`
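As a sketch of how the declaration order and block terminators fit together, the following example uses `enddo` and `endif` for loops and conditionals and a separated `end` token for everything else. The module, function, and variable names are hypothetical and chosen only for this illustration.

```fortran
module end_statement_demo
  implicit none ; private
  public :: count_positive

contains

  !> Return the number of positive entries in a list
  function count_positive(vals) result(n_pos)
    real, dimension(:), intent(in) :: vals !< The values to examine
    integer :: n_pos !< The number of entries of vals that are greater than zero
    ! Local variables
    integer :: n  ! A loop index over the entries of vals

    n_pos = 0
    do n=1,size(vals)
      if (vals(n) > 0.0) then
        n_pos = n_pos + 1
      endif
    enddo
  end function count_positive

end module end_statement_demo

program end_statement_driver
  use end_statement_demo, only : count_positive
  implicit none

  print *, count_positive((/ -1.0, 2.0, 3.0 /))
end program end_statement_driver
```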
- `i`, `j`, `k` are used for cell-center, layer-center references, e.g. `h(i,j,k)`, `T(i+1,j,k)`.
- `I`, `J` are used for staggered, cell-edge references, e.g. `u(I,j,k)`, `v(i,J,k)`, `q(I,J,k)`, `u(I-1,j,k)`. We use a north-east staggering convention so that `I` means i+1/2 and `I-1` means i-1/2.
- `K` is used for the interface (between layer) references, e.g. `del_t(i,j,K) = T(i,j,K+1) - T(i,j,K)`. The vertical staggering is such that interface `K=1` is above layer `k=1` so that `K` means k-1/2 and `K+1` means k+1/2.
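A small sketch of the horizontal convention may help. It computes a tracer difference at u-points, where `I` sits between cells `i` and `i+1`. The array names and domain size are hypothetical; since Fortran is case-insensitive, the capitalization is only a signal to the reader.

```fortran
program stagger_demo
  implicit none
  integer, parameter :: ni = 4, nj = 2  ! A hypothetical domain size
  real :: T(ni,nj)       ! A tracer at cell centers, in arbitrary units
  real :: dT_u(ni-1,nj)  ! The zonal tracer difference at u-points
  integer :: i, j

  do j=1,nj ; do i=1,ni
    T(i,j) = real(i + 10*j)
  enddo ; enddo

  ! I and i are the same variable to the compiler; the capital letter tells
  ! the reader that dT_u lives at i+1/2, between cells i and i+1.
  do j=1,nj ; do I=1,ni-1
    dT_u(I,j) = T(i+1,j) - T(i,j)
  enddo ; enddo

  print *, dT_u(1,1), dT_u(ni-1,nj)
end program stagger_demo
```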
- Module data (variables declared at module scope): absolutely NO!
  - There are a few exceptions which are strictly for debugging non-shared memory applications. Do not use these as an excuse for adding module data.
- Modules may use interfaces, data-types, and constant parameters from other modules via module use statements
- Modules may not use variables from other modules via use statements
  - All MOM variables are passed around as explicit arguments in interfaces.
- All module use statements must include the `, only` modifier
- All MOM6 modules must declare `implicit none ; private`
  - Top-level drivers (i.e., files declaring a program main()) only need `implicit none`
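The rules on module data, `use` statements, and `implicit none ; private` can be sketched as follows. State that might otherwise be module data lives in a derived type owned by the caller and is passed as an explicit argument. The module, type, and routine names are hypothetical and do not correspond to any actual MOM6 module.

```fortran
module demo_scaling
  implicit none ; private
  public :: scaling_init, apply_scaling

  !> A hypothetical control structure holding what might otherwise be module data
  type, public :: scaling_CS ; private
    real :: scale = 1.0 !< A nondimensional scaling factor
  end type scaling_CS

contains

  !> Store the scaling factor in the control structure
  subroutine scaling_init(CS, scale)
    type(scaling_CS), intent(inout) :: CS    !< The control structure to set up
    real,             intent(in)    :: scale !< A nondimensional scaling factor
    CS%scale = scale
  end subroutine scaling_init

  !> Return a value multiplied by the stored scaling factor
  function apply_scaling(CS, x) result(y)
    type(scaling_CS), intent(in) :: CS !< The control structure
    real,             intent(in) :: x  !< The value to rescale, in arbitrary units
    real :: y !< The rescaled value, in the same units as x
    y = CS%scale * x
  end function apply_scaling

end module demo_scaling

program scaling_driver
  use demo_scaling, only : scaling_CS, scaling_init, apply_scaling
  implicit none
  type(scaling_CS) :: CS  ! The state is owned by the caller and passed explicitly

  call scaling_init(CS, scale=2.0)
  print *, apply_scaling(CS, 3.0)
end program scaling_driver
```

Because the components of `scaling_CS` are private, the driver treats the type as opaque and works only through the public routines.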
- We do not permit scalar-style expressions without the colon notation, e.g.
  - `tv%S = 0.` is forbidden.
- We do allow array syntax for whole array initialization, e.g. `tv%S(:,:,:) = 0.`
- We do allow array syntax for identical copies, e.g. `S_tmp(:,:,:) = tv%S(:,:,:)`
- We do not allow whole array-syntax for math expressions that include halos because halos are not guaranteed to have valid data:
  - `tmp(:,:) = 1.0 / G%areaT(:,:)` might have zeros in the halo region.
  - `call post_data(id_AT, G%areaT(:,:)*tv%T(:,:,1))` is wrong because it can use uninitialized data in halos.
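One way to respect the halo restriction is to confine math to explicit loops over the computational domain, as in this sketch. The index names follow common MOM6 usage (`is:ie` for the computational domain, `isd:ied` for the data domain), but the sizes, the one-point halo width, and the local arrays are assumptions made for this illustration.

```fortran
program halo_loop_demo
  implicit none
  ! Hypothetical index ranges: the data domain (isd:ied,jsd:jed) adds a
  ! one-point halo around the computational domain (is:ie,js:je).
  integer, parameter :: is = 1, ie = 4, js = 1, je = 3
  integer, parameter :: isd = is-1, ied = ie+1, jsd = js-1, jed = je+1
  real :: areaT(isd:ied,jsd:jed)   ! The cell areas [m2]
  real :: IareaT(isd:ied,jsd:jed)  ! The reciprocal cell areas [m-2]
  integer :: i, j

  areaT(:,:) = 0.0   ! Whole-array initialization is allowed
  IareaT(:,:) = 0.0
  do j=js,je ; do i=is,ie
    areaT(i,j) = 100.0
  enddo ; enddo

  ! The division is restricted to points that are known to hold valid data,
  ! so the zeros left in the halo are never used as a denominator.
  do j=js,je ; do i=is,ie
    IareaT(i,j) = 1.0 / areaT(i,j)
  enddo ; enddo

  print *, IareaT(is,js), IareaT(isd,jsd)
end program halo_loop_demo
```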
- All needed data is passed via arguments to subroutines and functions, or as the returned value of a function.
- All arguments must have declared intent, with the exception of pointers: i.e. `intent(in)`, `intent(out)`, `intent(inout)`.
- Opaque types are preferred, i.e. referencing members of types defined in other modules is discouraged.
- Document the code when you are writing it in the first place!
- All subroutines, functions, arguments, and elements of public types should be described with Doxygen comments.
- All real variables should have a full physical description, including units.
- All comments should be clearly written and grammatically correct; American spelling is preferred.
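A sketch of what documented code with declared intents might look like is given below, using the `!>` and `!<` Doxygen comment markers and units in square brackets. The function, its arguments, and the module name are hypothetical examples rather than MOM6 code.

```fortran
module buoyancy_demo
  implicit none ; private
  public :: reduced_gravity

contains

  !> Return the reduced gravity across an interface between two layers
  function reduced_gravity(g_Earth, Rho0, del_rho_int) result(g_prime)
    real, intent(in) :: g_Earth     !< The gravitational acceleration [m s-2]
    real, intent(in) :: Rho0        !< A reference density [kg m-3]
    real, intent(in) :: del_rho_int !< The density difference across the interface [kg m-3]
    real :: g_prime !< The reduced gravity [m s-2]
    ! Local variables
    real :: I_Rho0  ! The reciprocal of the reference density [m3 kg-1]

    I_Rho0 = 1.0 / Rho0
    g_prime = g_Earth * (del_rho_int * I_Rho0)
  end function reduced_gravity

end module buoyancy_demo

program buoyancy_driver
  use buoyancy_demo, only : reduced_gravity
  implicit none

  print *, reduced_gravity(9.8, 1000.0, 2.0)
end program buoyancy_driver
```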
Divisions are prone to NaNs and relatively expensive. An optimizing compiler will often rearrange math, which makes divisions by zero harder to track down.
- Many common reciprocals are pre-computed
  - Use `Q(i,j) * G%IareaT(i,j)` instead of `Q(i,j) / G%areaT(i,j)`.
- Never write `B / C * D` which is ambiguous to humans (not the compiler)
  - Use `( B * D ) / C`
- Never double divide: `A / ( A + B / C)`
  - Use `( A * C ) / ( A * C + B)`
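The two rewrites above can be tried in a short sketch. The values are arbitrary and chosen only so that every denominator is nonzero.

```fortran
program division_demo
  implicit none
  real :: A, B, C, D  ! Arbitrary positive demonstration values
  real :: r1, r2      ! The results of the rewritten expressions

  A = 3.0 ; B = 2.0 ; C = 4.0 ; D = 5.0

  ! Instead of B / C * D, group the product and divide once.
  r1 = (B * D) / C
  ! Instead of A / (A + B / C), multiply through by C to remove the inner division.
  r2 = (A * C) / (A * C + B)

  print *, r1, r2
end program division_demo
```

Each rewritten expression contains a single division, so there is only one denominator to check when hunting for a division by zero.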
Floating point operations are sensitive to the order of operations (associativity), which cannot generally be guaranteed due to compiler serialization and optimization.
Addition operations must be done in pairs. When more than one addition is required, the order should be specified using parentheses.
- This is bad: `z = a + b + c`
- This is good: `z = (a + b) + c`
Ideally, the order of operation should be chosen to give the best accuracy. For example, if `a = 1.`, `b = -1.`, and `c = 1.e-20`, then the order should be chosen to preserve the residual value.
- This is bad: `a + (b + c) == 0.`
- This is good: `(a + b) + c == 1.e-20`
Not only does this impact reproducibility, but the second choice is more accurate and avoids a potential division by zero.
All operations should be ordered, but no particular ordering is enforced. Contributors are encouraged to consider the most accurate order of operations.
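The example values given above can be checked with a short program. This is a sketch that assumes IEEE-754 arithmetic and a compiler that honors parentheses, as the Fortran standard requires; the variable names `z_bad` and `z_good` are invented for this illustration.

```fortran
program addition_order_demo
  implicit none
  real :: a, b, c        ! The three terms from the example in the text
  real :: z_bad, z_good  ! The sums evaluated in the two possible orders

  a = 1. ; b = -1. ; c = 1.e-20

  z_bad = a + (b + c)   ! c is lost when it is added to b, which is of order 1
  z_good = (a + b) + c  ! a and b cancel exactly first, so c is preserved

  print *, z_bad, z_good
end program addition_order_demo
```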
We avoid the Fortran `sum()` intrinsic since the result is dependent on the order of operations within the summation. Using explicit loops allows us to define the order of summation. So `a = sum(b(:))` should be

    a = 0.
    do k = 1, nz
      a = a + b(k)
    enddo

The `product()` and `matmul()` intrinsics should also not be used.
Floating point operations across MPI ranks are volatile, since the order can change depending on the state of the network. Functions such as `MPI_Reduce` will not generally be reproducible when used for floating point arithmetic.
When performing summations over MPI ranks, use the `reproducing_sum` function.

    use MOM_coms, only: reproducing_sum
    ...
    sum = reproducing_sum(array(:,:))

Multiplication is also non-associative and thus not reproducible, but the impact is typically small. Results may depend on the order of operations, most often in the least significant bit of the fractional component.
In single precision, if `a = b = 1 + 2**-23` and `c = 1.5`, then the following calculations differ:

    (a*b)*c = 1.50000036
    a*(b*c) = 1.50000048

Parentheses in multiplication operations are currently not enforced, but contributors should consider using them when applicable.
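The single-precision example can be reproduced with the following sketch, which assumes IEEE-754 single-precision arithmetic with round-to-nearest and no value-changing optimization flags. The `real32` kind comes from the standard `iso_fortran_env` module.

```fortran
program multiply_order_demo
  use, intrinsic :: iso_fortran_env, only : real32
  implicit none
  real(kind=real32) :: a, b, c  ! The three factors from the example in the text

  a = 1.0_real32 + 2.0_real32**(-23)  ! One unit in the last place above 1
  b = a
  c = 1.5_real32

  ! The two groupings round at different intermediate steps, so the
  ! printed products can differ in the last digits.
  print '(2F12.8)', (a*b)*c, a*(b*c)
end program multiply_order_demo
```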
Transcendental functions, such as trigonometric functions, non-integer powers, and logarithms, are often implementation-dependent and should be avoided when possible.
The exponent operator, `a**b`, should be used sparingly, since compilers will often internally replace it with `pow(a, b)`, which is often computed as a transcendental function, `exp(b * log(a))`. Even small integral powers, such as `a**3`, have been known to be replaced with `pow(a, 3)`. To maximize reproducibility, integral powers should be explicitly computed, e.g. `a3 = a * a * a`.
Square roots (`a**0.5`) should always use the `sqrt()` intrinsic. An IEEE-754 compliant `sqrt` function must be exactly rounded.
Cube roots should not be computed as `a**(1./3.)`; the MOM6 intrinsic `cuberoot` should be used instead. This is not exactly rounded, but it is reproducible.
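A brief sketch of the power and square root recommendations follows. The parentheses in the cube are an addition in the spirit of the multiplication guidance above, not a stated requirement, and the variable names are hypothetical. The MOM6 `cuberoot` function is not shown because this sketch is meant to compile on its own.

```fortran
program power_demo
  implicit none
  real :: a     ! An arbitrary positive value
  real :: a3    ! The cube of a
  real :: rt_a  ! The square root of a

  a = 2.0
  a3 = (a * a) * a  ! Explicit products instead of a**3
  rt_a = sqrt(a)    ! The sqrt intrinsic instead of a**0.5

  print *, a3, rt_a
end program power_demo
```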