Continuous Integration with Nvidia compiler #435

Closed
127 commits
8107c70
Foundation for Nvidia CI script
dustinswales Feb 29, 2024
271710b
Update CI
dustinswales Feb 29, 2024
2321196
Update CI
dustinswales Feb 29, 2024
0bfbe99
Update CI
dustinswales Feb 29, 2024
c651163
Update CI
dustinswales Feb 29, 2024
915bd81
Update CI
dustinswales Feb 29, 2024
6aba5e8
Update CI
dustinswales Feb 29, 2024
4ec0f3d
Update CI
dustinswales Feb 29, 2024
3474475
Update CI
dustinswales Feb 29, 2024
2065f86
Update CI
dustinswales Feb 29, 2024
abf0755
Update CI
dustinswales Feb 29, 2024
53260ae
Update CI
dustinswales Feb 29, 2024
09bad92
Update CI
dustinswales Feb 29, 2024
95fac48
Update CI
dustinswales Feb 29, 2024
79a3e84
Update CI
dustinswales Feb 29, 2024
4e242d6
Update CI
dustinswales Feb 29, 2024
0621092
Update CI
dustinswales Feb 29, 2024
0913488
Update CI
dustinswales Feb 29, 2024
e3ef930
Update CI
dustinswales Feb 29, 2024
27c6225
Update CI
dustinswales Feb 29, 2024
9eeb2cc
Update CI
dustinswales Feb 29, 2024
ab09969
Update CI
dustinswales Feb 29, 2024
29d6e4e
Update CI
dustinswales Mar 1, 2024
17f0e73
Update CI
dustinswales Mar 1, 2024
4ca35b2
Update CI
dustinswales Mar 1, 2024
69d0945
Update CI
dustinswales Mar 1, 2024
66c067f
Update CI
dustinswales Mar 1, 2024
584ae01
Update CI
dustinswales Mar 1, 2024
98eaa8a
Update CI
dustinswales Mar 1, 2024
e95f704
Update CI
dustinswales Mar 1, 2024
62a4ca6
Update CI
dustinswales Mar 1, 2024
78365ad
Update CI
dustinswales Mar 1, 2024
7cf9e40
Update CI
dustinswales Mar 1, 2024
eb53bde
Update CI
dustinswales Mar 1, 2024
55921f7
Update CI
dustinswales Mar 1, 2024
cf076f9
Update CI
dustinswales Mar 1, 2024
1ef18b3
Update CI
dustinswales Mar 1, 2024
20e4948
Update CI
dustinswales Mar 1, 2024
629908e
Update CI
dustinswales Mar 1, 2024
31cae53
Update CI
dustinswales Mar 1, 2024
880165d
Update CI
dustinswales Mar 1, 2024
3fc5f81
Update CI
dustinswales Mar 1, 2024
2194f2e
Update CI
dustinswales Mar 1, 2024
3bd3796
Update CI
dustinswales Mar 1, 2024
7d9c4af
Update CI
dustinswales Mar 1, 2024
19081c6
Update CI
dustinswales Mar 1, 2024
45b76c4
Update CI
dustinswales Mar 1, 2024
0d8c9e4
Update CI
dustinswales Mar 1, 2024
3365312
Update CI
dustinswales Mar 1, 2024
8ded5a0
Update CI
dustinswales Mar 1, 2024
d06dd31
Update CI
dustinswales Mar 1, 2024
52a22f2
Update CI
dustinswales Mar 1, 2024
fe42708
Update CI
dustinswales Mar 1, 2024
a6955ab
Update CI
dustinswales Mar 1, 2024
c94ba5a
Update CMakeLists for Nvidia support
dustinswales Mar 1, 2024
7ff6081
Update CMakeLists for Nvidia support
dustinswales Mar 1, 2024
b448eb4
Update CI
dustinswales Mar 1, 2024
b268880
Update CI
dustinswales Mar 1, 2024
f18a65a
Update CI
dustinswales Mar 1, 2024
af05846
Update CI
dustinswales Mar 1, 2024
47f0a6c
Updated CI
dustinswales Mar 1, 2024
d2c04ed
Update cmakelists
dustinswales Mar 4, 2024
90b7fea
Update CI
dustinswales Mar 4, 2024
ddb3591
Update CI
dustinswales Mar 4, 2024
7ef27e8
Update CI
dustinswales Mar 4, 2024
dbc4334
Update CI
Mar 5, 2024
a38f5c0
Update CI
Mar 5, 2024
3f79dea
Update CI
Mar 5, 2024
0089031
Update CI
Mar 5, 2024
8031de3
Update CI
Mar 5, 2024
789711c
Update CI
Mar 5, 2024
28267f8
Update CI
Mar 5, 2024
1574900
Update CI
Mar 5, 2024
923fa46
Update CI
Mar 5, 2024
9768b94
Update CI
Mar 5, 2024
e93fb03
Update CI
Mar 5, 2024
74428b9
Update CI
Mar 5, 2024
2f6c166
Update CI
Mar 5, 2024
5e97c33
Update CI
Mar 5, 2024
9286df0
Update CI
Mar 5, 2024
fa714f4
Update CI
Mar 5, 2024
de7cde1
Update CI
Mar 5, 2024
22f18fc
Update CI
Mar 5, 2024
990e36b
Update CI
Mar 5, 2024
bf45ad5
Update CI
Mar 5, 2024
f6673e7
Update CI
Mar 5, 2024
8f57ff0
Update CI
Mar 5, 2024
407280d
Update CI
Mar 5, 2024
3d10750
Update CI
Mar 6, 2024
cf8830f
Update CI
Mar 6, 2024
bb798ac
Update CI
Mar 6, 2024
398a292
Update CI
Mar 6, 2024
6bc8277
Update CI
Mar 6, 2024
24e091e
Update CI
Mar 6, 2024
b1ce93c
Update CI
Mar 6, 2024
9df5652
Update CI
Mar 6, 2024
fb7f34f
Update CI
Mar 6, 2024
af52b76
Update CI
Mar 6, 2024
035a765
Update CI
Mar 6, 2024
d7a6bff
Update CI
Mar 6, 2024
fbf5856
Update CI
Mar 6, 2024
897cfca
Update CI
Mar 6, 2024
3eab670
Update CI
Mar 6, 2024
f030d72
Update CI
Mar 6, 2024
442f354
Update CI
Mar 6, 2024
b901cd4
Update CI
Mar 6, 2024
7970fdc
Update CI
Mar 6, 2024
09928ca
Update CI
dustinswales Mar 6, 2024
5a46115
Add file for Nvidia RTS
dustinswales Mar 6, 2024
f89b1d6
Update CI
dustinswales Mar 6, 2024
51b7baa
Update CI
dustinswales Mar 6, 2024
d712d56
Revert change to CMakeLists
dustinswales Mar 6, 2024
0b2ee57
Revert change to CMakeLists
dustinswales Mar 6, 2024
16c7938
Revert change to CMakeLists
dustinswales Mar 6, 2024
9fe09eb
Revert change to CMakeLists
dustinswales Mar 6, 2024
659d780
Update CI
dustinswales Mar 6, 2024
229473c
Update CI
dustinswales Mar 6, 2024
ba034f4
Update CI
dustinswales Mar 6, 2024
59b73cd
Update CI
dustinswales Mar 6, 2024
c12544c
Update CI
dustinswales Mar 6, 2024
f82f995
Update CI
dustinswales Mar 6, 2024
8d6a1b5
Update CI
dustinswales Mar 6, 2024
9521536
Update CI
dustinswales Mar 7, 2024
999d6cb
Update CI
dustinswales Mar 7, 2024
4d71d1e
Update CI
dustinswales Mar 7, 2024
208bed3
Update CI
dustinswales Mar 7, 2024
7e58c6d
Update CI
dustinswales Mar 7, 2024
264 changes: 264 additions & 0 deletions .github/workflows/ci_build_scm_ubuntu_22.04_nvidia.yml
@@ -0,0 +1,264 @@
name: CI test to build the CCPP-SCM on ubuntu v22.04

on: [push, pull_request, workflow_dispatch]

jobs:

  build_scm:
    # The type of runner that the job will run on
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        fortran-compiler: [nvfortran]
        build-type: [Release] # Debug disabled: fails on a naming conflict (mersenne_twister)
        enable-gpu-acc: [False, True]
        py-version: [3.7.13, 3.9.12]

    # Environment variables
    env:
      NETCDF: /home/runner/netcdf
      bacio_ROOT: /home/runner/bacio
      sp_ROOT: /home/runner/NCEPLIBS-sp
      w3emc_ROOT: /home/runner/myw3emc
      SCM_ROOT: /home/runner/work/ccpp-scm/ccpp-scm
      zlib_ROOT: /home/runner/zlib
      HDF5_ROOT: /home/runner/hdf5
      suites: SCM_GFS_v15p2,SCM_GFS_v16,SCM_GFS_v17_p8,SCM_HRRR,SCM_RRFS_v1beta,SCM_RAP,SCM_WoFS_v0
      suites_ps: SCM_GFS_v15p2_ps,SCM_GFS_v16_ps,SCM_GFS_v17_p8_ps,SCM_HRRR_ps,SCM_RRFS_v1beta_ps,SCM_RAP_ps,SCM_WoFS_v0_ps

    # Workflow steps
    steps:

    #######################################################################################
    # Cleanup space
    #######################################################################################
    - name: Check space (pre)
      run: |
        df -h

    - name: Free Disk Space (Ubuntu)
      uses: jlumbroso/free-disk-space@main
      with:
        # this might remove tools that are actually needed,
        # if set to "true" but frees about 6 GB
        tool-cache: false

        # all of these default to true, but feel free to set to
        # "false" if necessary for your workflow
        android: false
        dotnet: false
        haskell: true
        large-packages: true
        docker-images: false
        swap-storage: false

    - name: Check space (post)
      run: |
        df -h

    #######################################################################################
    # Initial
    #######################################################################################
    - name: Checkout SCM code (into /home/runner/work/ccpp-scm/)
      uses: actions/checkout@v3

    - name: Initialize submodules
      run: git submodule update --init --recursive

    #######################################################################################
    # Python setup
    #######################################################################################
    - name: Set up Python
      uses: actions/setup-python@v3
      with:
        python-version: ${{matrix.py-version}}

    - name: Add conda to system path
      run: |
        echo $CONDA/bin >> $GITHUB_PATH

    - name: Install NetCDF Python libraries
      run: |
        # Quote the version spec; unquoted, the shell treats ">=3.4" as a redirect to a file named "=3.4"
        conda install --yes -c conda-forge "h5py>=3.4" netCDF4 f90nml

    #######################################################################################
    # Install Nvidia.
    #######################################################################################

    - name: Nvidia setup compilers.
      env:
        NVCOMPILERS: /home/runner/hpc_sdk
        NVARCH: Linux_x86_64
        NVHPC_SILENT: true
        NVHPC_INSTALL_DIR: /home/runner/hpc_sdk
        NVHPC_INSTALL_TYPE: network
        NVHPC_INSTALL_LOCAL_DIR: /home/runner/hpc_sdk
      run: |
        mkdir /home/runner/hpc_sdk && cd /home/runner/hpc_sdk
        wget -q https://developer.download.nvidia.com/hpc-sdk/24.1/nvhpc_2024_241_Linux_x86_64_cuda_12.3.tar.gz
        tar xpzf nvhpc_2024_241_Linux_x86_64_cuda_12.3.tar.gz
        nvhpc_2024_241_Linux_x86_64_cuda_12.3/install
        export PATH=${PATH}:${NVCOMPILERS}/${NVARCH}/24.1/compilers/bin
        export MANPATH=${MANPATH}:${NVCOMPILERS}/${NVARCH}/24.1/compilers/man
        echo "The nvfortran installed is:"
        nvfortran --version
        echo "The path to nvfortran is:"
        command -v nvfortran
        echo "Removing tarball"
        rm nvhpc_2024_241_Linux_x86_64_cuda_12.3.tar.gz

    - name: Set environment for Nvidia compiler.
      run: |
        echo "CC=/home/runner/hpc_sdk/Linux_x86_64/24.1/compilers/bin/nvc" >> $GITHUB_ENV
        echo "FC=/home/runner/hpc_sdk/Linux_x86_64/24.1/compilers/bin/nvfortran" >> $GITHUB_ENV
        echo "CMAKE_C_COMPILER=/home/runner/hpc_sdk/Linux_x86_64/24.1/compilers/bin/nvc" >> $GITHUB_ENV
        echo "CMAKE_Fortran_COMPILER=/home/runner/hpc_sdk/Linux_x86_64/24.1/compilers/bin/nvfortran" >> $GITHUB_ENV

    #######################################################################################
    # Install FORTRAN dependencies
    #######################################################################################

    - name: Install zlib
      env:
        CFLAGS: -fPIC
      run: |
        wget https://github.com/madler/zlib/releases/download/v1.2.13/zlib-1.2.13.tar.gz
        tar -zxvf zlib-1.2.13.tar.gz
        cd zlib-1.2.13
        ./configure --prefix=${zlib_ROOT}
        make
        make install
        echo "LD_LIBRARY_PATH=$zlib_ROOT/lib:$LD_LIBRARY_PATH" >> $GITHUB_ENV

    - name: Install HDF5
      env:
        # Values in an env block are not shell-expanded, so read zlib_ROOT from the env context
        CPPFLAGS: -I${{ env.zlib_ROOT }}/include
        LDFLAGS: -L${{ env.zlib_ROOT }}/lib
      run: |
        wget https://github.com/HDFGroup/hdf5/archive/refs/tags/hdf5-1_14_1-2.tar.gz
        tar -zxvf hdf5-1_14_1-2.tar.gz
        cd hdf5-hdf5-1_14_1-2
        ./configure --prefix=${HDF5_ROOT} --with-zlib=${zlib_ROOT}
        make -j4
        make install
        echo "LD_LIBRARY_PATH=$HDF5_ROOT/lib:$LD_LIBRARY_PATH" >> $GITHUB_ENV
        echo "PATH=$HDF5_ROOT/lib:$PATH" >> $GITHUB_ENV

    - name: Install Curl
      run: |
        sudo apt-get install curl
        sudo apt-get install libssl-dev libcurl4-openssl-dev

    - name: Cache NetCDF C library
      id: cache-netcdf-c
      uses: actions/cache@v3
      with:
        path: /home/runner/netcdf-c
        key: cache-netcdf-c-${{matrix.fortran-compiler}}-key

    - name: Install NetCDF C library
      if: steps.cache-netcdf-c.outputs.cache-hit != 'true'
      run: |
        wget https://github.com/Unidata/netcdf-c/archive/refs/tags/v4.7.4.tar.gz
        tar -zvxf v4.7.4.tar.gz
        cd netcdf-c-4.7.4
        CPPFLAGS="-I/home/runner/hdf5/include -I/home/runner/zlib/include" LDFLAGS="-L/home/runner/hdf5/lib -L/home/runner/zlib/lib" ./configure --prefix=${NETCDF}
        make
        make install
        echo "LD_LIBRARY_PATH=$NETCDF/lib:$LD_LIBRARY_PATH" >> $GITHUB_ENV
        echo "PATH=$NETCDF/lib:$PATH" >> $GITHUB_ENV

    - name: Cache NetCDF Fortran library
      id: cache-netcdf-fortran
      uses: actions/cache@v3
      with:
        path: /home/runner/netcdf-fortran
        key: cache-netcdf-fortran-${{matrix.fortran-compiler}}-key

    - name: Install NetCDF Fortran library
      if: steps.cache-netcdf-fortran.outputs.cache-hit != 'true'
      run: |
        wget https://github.com/Unidata/netcdf-fortran/archive/refs/tags/v4.6.1.tar.gz
        tar -zvxf v4.6.1.tar.gz
        cd netcdf-fortran-4.6.1
        FCFLAGS="-fPIC" FFLAGS="-fPIC" CPPFLAGS="-I/home/runner/hdf5/include -I/home/runner/zlib/include -I/home/runner/netcdf/include" LDFLAGS="-L/home/runner/hdf5/lib -L/home/runner/zlib/lib -L/home/runner/netcdf/lib" ./configure --prefix=${NETCDF}
        make
        make install

    - name: Cache bacio library v2.4.1
      id: cache-bacio-fortran
      uses: actions/cache@v3
      with:
        path: /home/runner/bacio
        key: cache-bacio-fortran-${{matrix.fortran-compiler}}-key

    - name: Install bacio library v2.4.1
      if: steps.cache-bacio-fortran.outputs.cache-hit != 'true'
      run: |
        git clone --branch v2.4.1 https://github.com/NOAA-EMC/NCEPLIBS-bacio.git bacio
        cd bacio && mkdir build && cd build
        cmake -DCMAKE_INSTALL_PREFIX=${bacio_ROOT} ../
        make -j2
        make install
        echo "bacio_DIR=/home/runner/bacio/lib/cmake/bacio" >> $GITHUB_ENV

    - name: Cache SP-library v2.3.3
      id: cache-sp-fortran
      uses: actions/cache@v3
      with:
        path: /home/runner/NCEPLIBS-sp
        key: cache-sp-fortran-${{matrix.fortran-compiler}}-key

    - name: Install SP-library v2.3.3
      if: steps.cache-sp-fortran.outputs.cache-hit != 'true'
      run: |
        git clone --branch v2.3.3 https://github.com/NOAA-EMC/NCEPLIBS-sp.git NCEPLIBS-sp
        cd NCEPLIBS-sp && mkdir build && cd build
        cmake -DCMAKE_INSTALL_PREFIX=${sp_ROOT} ../
        make -j2
        make install
        echo "sp_DIR=/home/runner/NCEPLIBS-sp/lib/cmake/sp" >> $GITHUB_ENV

    - name: Cache w3emc library v2.9.2
      id: cache-w3emc-fortran
      uses: actions/cache@v3
      with:
        path: /home/runner/myw3emc
        key: cache-w3emc-fortran-${{matrix.fortran-compiler}}-key

    - name: Install w3emc library v2.9.2
      if: steps.cache-w3emc-fortran.outputs.cache-hit != 'true'
      run: |
        git clone --branch v2.9.2 https://github.com/NOAA-EMC/NCEPLIBS-w3emc.git NCEPLIBS-w3emc
        cd NCEPLIBS-w3emc && mkdir build && cd build
        cmake -DCMAKE_INSTALL_PREFIX=${w3emc_ROOT} ../
        make -j2
        make install
        echo "w3emc_DIR=/home/runner/myw3emc/lib/cmake/w3emc" >> $GITHUB_ENV

    #######################################################################################
    # Build and run SCM regression tests (ccpp-scm/test/rt_test_cases.py)
    #######################################################################################

    - name: Configure build with CMake
      run: |
        cd ${SCM_ROOT}/scm
        mkdir bin && cd bin
        cmake -DCCPP_SUITES=${suites},${suites_ps} -DCMAKE_BUILD_TYPE=${{matrix.build-type}} -DENABLE_NVIDIA_OPENACC=${{matrix.enable-gpu-acc}} ../src

    - name: Build SCM.
      run: |
        cd ${SCM_ROOT}/scm/bin
        make -j4

    - name: Download data for SCM
      run: |
        cd ${SCM_ROOT}
        ./contrib/get_all_static_data.sh
        ./contrib/get_thompson_tables.sh

    - name: Run SCM RTs (w/o GPU)
      if: contains(matrix.enable-gpu-acc, 'False')
      run: |
        cd ${SCM_ROOT}/scm/bin
        ./run_scm.py --file /home/runner/work/ccpp-scm/ccpp-scm/test/rt_test_cases.py --runtime_mult 0.1 -v
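
For reference, the `strategy.matrix` in this workflow expands to the Cartesian product of its axes. The following is a minimal Python sketch of that expansion, with the axis values copied from the workflow; the `expand_matrix` helper is hypothetical and only illustrates the counting, it is not part of the PR.

```python
from itertools import product

# Axis values copied from the workflow's strategy.matrix.
MATRIX = {
    "fortran-compiler": ["nvfortran"],
    "build-type": ["Release"],
    "enable-gpu-acc": [False, True],
    "py-version": ["3.7.13", "3.9.12"],
}


def expand_matrix(matrix):
    """Return one dict per job: the Cartesian product of all matrix axes."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


jobs = expand_matrix(MATRIX)
for job in jobs:
    print(job)
```

With these values the sketch yields four jobs, and only the two with `enable-gpu-acc` set to false satisfy the condition on the "Run SCM RTs (w/o GPU)" step.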
Collaborator Author

@grantfirl @mkavulich @scrasmussen
This is incomplete: it only runs the RTs and doesn't download baselines or compare against them. We should add these steps and store a copy of the Nvidia baselines on the DTC FTP server, analogous to the GNU-based ones we have there.
So instead of rt-baselines-Release.zip and rt-baselines-Debug.zip on the DTC FTP server, we will have:
rt-baselines-Debug.GNU.zip, rt-baselines-Release.GNU.zip, rt-baselines-Debug.NVHPC.zip, rt-baselines-Release.NVHPC.zip.
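
A rough sketch of what the missing comparison step could look like, assuming the baselines have already been downloaded and unpacked next to the run output. The directory layout, the function names, and the byte-for-byte comparison are all assumptions made for illustration; an actual check would more likely compare the netCDF variables themselves.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_to_baseline(run_dir, baseline_dir):
    """Compare every file under baseline_dir with its counterpart in run_dir.

    Returns a dict listing the relative paths that are missing from the run
    and those whose contents differ from the baseline (hypothetical layout).
    """
    run_dir, baseline_dir = Path(run_dir), Path(baseline_dir)
    missing, different = [], []
    for base_file in sorted(p for p in baseline_dir.rglob("*") if p.is_file()):
        rel = base_file.relative_to(baseline_dir)
        candidate = run_dir / rel
        if not candidate.is_file():
            missing.append(str(rel))
        elif file_digest(candidate) != file_digest(base_file):
            different.append(str(rel))
    return {"missing": missing, "different": different}
```

A CI step could call `compare_to_baseline` after the RTs finish and fail the job when either list is non-empty.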

Collaborator Author

Sorry, no rt-baselines-Debug.NVHPC.zip: the Debug test is off because it fails due to a naming conflict (mersenne_twister).
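
The naming scheme from the comments above, with the NVHPC Debug combination left out, can be sketched as follows. The constant and function names are hypothetical; only the archive name pattern and the disabled combination come from the discussion.

```python
# Compiler tags and build types taken from the archive names in the comments above.
COMPILERS = ["GNU", "NVHPC"]
BUILD_TYPES = ["Debug", "Release"]

# Combinations with no baseline: the NVHPC Debug test is switched off.
DISABLED = {("NVHPC", "Debug")}


def baseline_archives():
    """List the baseline archive names, skipping disabled combinations."""
    return [
        f"rt-baselines-{build}.{compiler}.zip"
        for compiler in COMPILERS
        for build in BUILD_TYPES
        if (compiler, build) not in DISABLED
    ]


print(baseline_archives())
```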

31 changes: 31 additions & 0 deletions scm/src/CMakeLists.txt
@@ -186,6 +186,37 @@ elseif (${CMAKE_Fortran_COMPILER_ID} MATCHES "Intel")
set(CMAKE_Fortran_FLAGS_RELEASE "-O2 -fPIC" CACHE STRING "" FORCE)
set(CMAKE_C_FLAGS_BITFORBIT "-O2 -fPIC" CACHE STRING "" FORCE)
set(CMAKE_Fortran_FLAGS_BITFORBIT "-O2 -fPIC" CACHE STRING "" FORCE)

elseif (${CMAKE_Fortran_COMPILER_ID} MATCHES "NVHPC")
if(ENABLE_NVIDIA_OPENACC MATCHES "true")
set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -acc -Minfo=accel")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -acc -Minfo=accel")
else()
set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS}")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS}")
endif()

if(NOT 32BIT)
set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -r8")
endif()

if (${CMAKE_BUILD_TYPE} MATCHES "Debug")
set(CMAKE_Fortran_FLAGS_DEBUG "${CMAKE_Fortran_FLAGS_DEBUG} -O0 -g")
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -O0 -g")
else()
set(CMAKE_Fortran_FLAGS "${CMAKE_Fortran_FLAGS} -O2")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O2")
endif()

set(MPI_C_COMPILER mpicc)
set(MPI_CXX_COMPILER mpicxx)
set(MPI_Fortran_COMPILER mpif90)

set(CMAKE_C_FLAGS_RELEASE "-O2 -fPIC" CACHE STRING "" FORCE)
set(CMAKE_Fortran_FLAGS_RELEASE "-O2 -fPIC" CACHE STRING "" FORCE)
set(CMAKE_C_FLAGS_BITFORBIT "-O2 -fPIC" CACHE STRING "" FORCE)
set(CMAKE_Fortran_FLAGS_BITFORBIT "-O2 -fPIC" CACHE STRING "" FORCE)

else (${CMAKE_Fortran_COMPILER_ID} MATCHES "GNU")
message (FATAL_ERROR "This program has only been compiled with gfortran and ifort. If another compiler is needed, the appropriate flags must be added in ${CMAKE_SOURCE_DIR}/CMakeLists.txt")
endif (${CMAKE_Fortran_COMPILER_ID} MATCHES "GNU")
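
To make the branch logic of the NVHPC block easier to follow, here is a small Python sketch that mirrors how it extends `CMAKE_Fortran_FLAGS`. The function and its parameter names are hypothetical, and it deliberately ignores the per-configuration variables (`CMAKE_Fortran_FLAGS_DEBUG`, `CMAKE_Fortran_FLAGS_RELEASE`, `CMAKE_Fortran_FLAGS_BITFORBIT`) that CMake appends separately.

```python
def nvhpc_fortran_flags(base_flags="", enable_openacc=False, use_32bit=False,
                        build_type="Release"):
    """Mirror how the NVHPC branch extends CMAKE_Fortran_FLAGS (sketch only)."""
    flags = base_flags.split()
    if enable_openacc:
        # OpenACC offloading plus accelerator compile-time feedback
        flags += ["-acc", "-Minfo=accel"]
    if not use_32bit:
        # Promote default reals to 8 bytes
        flags.append("-r8")
    if "Debug" not in build_type:
        # The Debug branch writes to CMAKE_Fortran_FLAGS_DEBUG instead
        flags.append("-O2")
    return " ".join(flags)


print(nvhpc_fortran_flags(enable_openacc=True))
print(nvhpc_fortran_flags(build_type="Debug"))
```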
9 changes: 9 additions & 0 deletions test/rt_test_cases_nvidia.py
@@ -0,0 +1,9 @@
run_list = [
    #----------------------------------------------------------------------------------------------------------------------------------------------
    # CCPP-SCM v6 supported suites
    #----------------------------------------------------------------------------------------------------------------------------------------------
    {"case": "arm_sgp_summer_1997_A", "suite": "SCM_RAP"},
    {"case": "twpice",                "suite": "SCM_RAP"},
    {"case": "bomex",                 "suite": "SCM_RAP"},
    {"case": "astex",                 "suite": "SCM_RAP"},
    {"case": "LASSO_2016051812",      "suite": "SCM_RAP"}]
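
Since every entry in this list names a suite that must have been compiled into the executable, a quick consistency check against the workflow's `suites` value can catch mismatches before the RTs run. The sketch below is illustrative only: `check_run_list` is a hypothetical helper, and the suite list is copied from the workflow's `env` block.

```python
# Suites compiled by the workflow (the `suites` value in its env block).
BUILT_SUITES = (
    "SCM_GFS_v15p2,SCM_GFS_v16,SCM_GFS_v17_p8,SCM_HRRR,"
    "SCM_RRFS_v1beta,SCM_RAP,SCM_WoFS_v0"
).split(",")

# Same entries as test/rt_test_cases_nvidia.py.
run_list = [
    {"case": "arm_sgp_summer_1997_A", "suite": "SCM_RAP"},
    {"case": "twpice", "suite": "SCM_RAP"},
    {"case": "bomex", "suite": "SCM_RAP"},
    {"case": "astex", "suite": "SCM_RAP"},
    {"case": "LASSO_2016051812", "suite": "SCM_RAP"},
]


def check_run_list(entries, built_suites):
    """Return a list of problems; an empty list means every entry is usable."""
    problems = []
    for i, entry in enumerate(entries):
        missing = {"case", "suite"} - entry.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
        elif entry["suite"] not in built_suites:
            problems.append(f"entry {i}: suite {entry['suite']} was not built")
    return problems


print(check_run_list(run_list, BUILT_SUITES))
```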