Add script to test S1 performance #24

Open · wants to merge 16 commits into base: main
35 changes: 35 additions & 0 deletions .github/workflows/test_S1_performance.yml
@@ -0,0 +1,35 @@
# documentation: https://help.github.com/en/articles/workflow-syntax-for-github-actions
name: Test performance of public S1 servers (native access)
on: [pull_request]
# Declare default permissions as read only.
permissions: read-all
jobs:
  pilot_repo_native:
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        APPLICATION:
          - TensorFlow
    steps:
      - name: Check out repository
        uses: actions/checkout@93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8  # v3.1.0
        with:
          persist-credentials: false

      - name: Mount EESSI CernVM-FS pilot repository
        uses: cvmfs-contrib/github-action-cvmfs@d4641d0d591c9a5c3be23835ced2fb648b44c04b  # v3.1
        with:
          cvmfs_config_package: https://github.com/EESSI/filesystem-layer/releases/download/latest/cvmfs-config-eessi_latest_all.deb
          cvmfs_http_proxy: DIRECT
          cvmfs_repositories: pilot.eessi-hpc.org

      - name: Run public S1 performance test
        run: |
          APPLICATION=${{matrix.APPLICATION}} ./scripts/test_S1_performance.sh

      - name: Archive S1 performance results
        uses: actions/upload-artifact@0b7f8abb1508181956e8e162db84b466c27e18ce  # v3.1.2
        with:
          name: S1-performance-results
          path: S1_performance_check.json
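Once the workflow has run, the uploaded artifact can be inspected locally; a minimal sketch, assuming the artifact has been downloaded and unzipped into the current directory (the jq filters match the json layout produced by scripts/test_S1_performance.sh below):

# Pretty-print the full results
jq '.' S1_performance_check.json
# List just the measured wall-clock time per Stratum 1 server
jq -r 'to_entries[].value[] | to_entries[] | "\(.key): \(.value.time)s"' S1_performance_check.json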
2 changes: 1 addition & 1 deletion Bioconductor/run.sh
@@ -2,4 +2,4 @@

module load R-bundle-Bioconductor/3.11-foss-2020a-R-4.0.0

-time Rscript dna.R
+time -p Rscript dna.R
2 changes: 1 addition & 1 deletion GROMACS/run.sh
@@ -12,4 +12,4 @@ fi
rm -f ener.edr logfile.log

# note: downscaled to just 1k steps (full run is 10k steps)
-time gmx mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 1000 -g logfile
+time -p gmx mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 1000 -g logfile
2 changes: 1 addition & 1 deletion OpenFOAM/run.sh
@@ -90,7 +90,7 @@ foamDictionary -entry runTimeModifiable -set "false" system/controlDict
foamDictionary -entry functions -set "{}" system/controlDict

mpirun --oversubscribe -np $NP potentialFoam -parallel 2>&1 | tee log.potentialFoam
-time mpirun --oversubscribe -np $NP simpleFoam -parallel 2>&1 | tee log.simpleFoam
+time -p mpirun --oversubscribe -np $NP simpleFoam -parallel 2>&1 | tee log.simpleFoam

echo "cleanup..."
rm -rf $WORKDIR
2 changes: 1 addition & 1 deletion TensorFlow/run.sh
@@ -2,4 +2,4 @@

module load TensorFlow/2.3.1-foss-2020a-Python-3.8.2

-time python TensorFlow-2.x_mnist-test.py
+time -p python TensorFlow-2.x_mnist-test.py
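All four run.sh changes swap time for time -p for the same reason: the POSIX output format writes each metric on its own line to stderr, which is what lets scripts/test_S1_performance.sh below grep out the real value. Illustrative output:

$ time -p sleep 1
real 1.01
user 0.00
sys 0.00

With the default bash format (real 0m1.010s on a tab-separated line), the awk '{print $2}' extraction would pick up the 0m...s form instead of a plain seconds value.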
74 changes: 74 additions & 0 deletions scripts/test_S1_performance.sh
@@ -0,0 +1,74 @@
#!/bin/bash

# Check for pre-existing configuration
local_eessi_config="/etc/cvmfs/domain.d/eessi-hpc.org.local"
if [ -f "$local_eessi_config" ]; then
    echo "File $local_eessi_config exists, moving it temporarily"
    sudo mv "$local_eessi_config" "${local_eessi_config}.tmp"
    restore_eessi_config=true
else
    echo "File $local_eessi_config does not exist."
    restore_eessi_config=false
fi

# Add the capability to clean up after ourselves
function cleanup()
{
    if [ "$restore_eessi_config" = true ] ; then
        echo "Restoring original $local_eessi_config"
        sudo mv "${local_eessi_config}.tmp" "$local_eessi_config"
    else
        echo "Removing $local_eessi_config"
        sudo rm "$local_eessi_config"
    fi
    # disable our traps
    trap - SIGINT
    trap - EXIT
    # exit as normal
    echo "Finished cleaning up, exiting"
    exit
}
trap cleanup SIGINT
trap cleanup EXIT
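# With both traps in place, cleanup runs on Ctrl-C as well as on any normal or
# abnormal exit, so the temporary config change never outlives the test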

# Function used to help stitch json objects together
function join_by {
    local d=${1-} f=${2-}
    if shift 2; then
        printf %s "$f" "${@/#/$d}"
    fi
}
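# Example (illustrative): join_by ,$'\n' '{"a":1}' '{"b":2}' prints the two
# objects separated by a comma and a newline; this is how the per-server json
# objects are stitched into one array at the bottom of this script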

# Test a particular S1 and return a valid json object
function test_S1 {
    # Point the local config file at the single S1 server passed as $1
    echo 'CVMFS_SERVER_URL="'"$1"'"' | sudo tee "$local_eessi_config" > /dev/null
    # Reconfigure CVMFS so the new server list takes effect
    sudo cvmfs_config setup
Member Author: Should have a way to check the user has the required sudo rights.

Collaborator: Wasn't the idea to run this in a container, with bind mounts to empty directories for /var/lib/cvmfs & co (so empty cache), which would totally alleviate the need for sudo?

Member Author: It wasn't my idea...

Member Author: It actually doesn't seem like running it in a container can work:

Apptainer> cvmfs_config stat pilot.eessi-hpc.org    
pilot.eessi-hpc.org not mounted
Apptainer> source /cvmfs/pilot.eessi-hpc.org/latest/init/bash 
Found EESSI pilot repo @ /cvmfs/pilot.eessi-hpc.org/versions/2021.12!
archspec says aarch64/graviton2
Using aarch64/graviton2 as software subdirectory.
Using /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/aarch64/graviton2/modules/all as the directory to be added to MODULEPATH.
Found Lmod configuration file at /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/aarch64/graviton2/.lmod/lmodrc.lua
Initializing Lmod...
Prepending /cvmfs/pilot.eessi-hpc.org/versions/2021.12/software/linux/aarch64/graviton2/modules/all to $MODULEPATH...
Environment set up to use EESSI pilot software stack, have fun!
[EESSI pilot 2021.12] $ cvmfs_config stat pilot.eessi-hpc.org
pilot.eessi-hpc.org not mounted
[EESSI pilot 2021.12] $ cvmfs_config wipecache
root privileges required
[EESSI pilot 2021.12] $ cvmfs_config setup    
root privileges required

I can get away with not wiping the cache since I control it, and I guess I can just overwrite the EESSI configuration file with a bind mount, but I am only left with timing the command if I can't run cvmfs_config stat

Member Author (@ocaisa, Jun 9, 2023):
Actually, more than that, I need to be able to run cvmfs_config setup to put the new configuration in place. Otherwise I need to bind mount a new configuration and run each S1 in a separate container execution step.

I really don't see the worth: we would lose a lot of valuable info, and this implementation will run just fine as a CI job.

    # Wipe the cache so every server is measured from a cold start
    sudo cvmfs_config wipecache >& /dev/null
    # Run the example for the selected application (from github.com/EESSI/eessi-demo)
    cd "$(dirname "$(realpath "$BASH_SOURCE")")/../$2"
    # Capture just the real time that 'time -p' reports on stderr
    realtime=$({ ./run.sh > /dev/null ; } 2> >(grep real | awk '{print $2}'))
    # cvmfs_config stat prints a header row and a value row; hide every column
    # except the one of interest, so word 0 is the unit (header) and word 1 the value
    bandwidth=( $(cvmfs_config stat pilot.eessi-hpc.org | column -t -H 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20) )
    cache_usage=( $(cvmfs_config stat pilot.eessi-hpc.org | column -t -H 1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,17,18,19,20) )
    total_files=$(sudo cvmfs_talk -i pilot.eessi-hpc.org cache list | wc -l)
    software_files=$(sudo cvmfs_talk -i pilot.eessi-hpc.org cache list | grep '/software/' | wc -l)
    # Print json output
    echo -n "{\"$1\": {\"time\":\"$realtime\",\"speed\":\"${bandwidth[1]}\",\"speed_unit\":\"${bandwidth[0]}\",\"data\":\"${cache_usage[1]}\",\"data_unit\":\"${cache_usage[0]}\",\"application\":\"$2\",\"total_files\":\"${total_files}\",\"software_files\":\"${software_files}\",\"arch\":\"$EESSI_SOFTWARE_SUBDIR\" }}"
}
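# A single call emits one object keyed on the server URL, roughly (all values
# illustrative):
# {"http://aws-eu-central-s1.eessi-hpc.org/cvmfs/pilot.eessi-hpc.org":
#   {"time":"52.81","speed":"...","speed_unit":"...","data":"...","data_unit":"...",
#    "application":"TensorFlow","total_files":"...","software_files":"...","arch":"x86_64/generic"}}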

# Initialise EESSI
source /cvmfs/pilot.eessi-hpc.org/latest/init/bash > /dev/null
# Which application's run.sh to test (defaults to TensorFlow)
application="${APPLICATION:-TensorFlow}"
# Grab the date we do this and use it as our key for the json output
date=$(date -I)
json_array=()
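# The domain config lists all S1 servers in a single semicolon-separated value,
# e.g. (illustrative): CVMFS_SERVER_URL="http://aws-eu-central-s1.eessi-hpc.org/cvmfs/@fqrn@;http://azure-us-east-s1.eessi-hpc.org/cvmfs/@fqrn@"
# so the pipeline in the loop below splits it into one URL per iteration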
for s1server in $(grep CVMFS_SERVER_URL /etc/cvmfs/domain.d/eessi-hpc.org.conf | grep -o '".*"' | sed 's/"//g' | tr ';' '\n'); do
    json_array+=("$(test_S1 "$s1server" "$application")")
done
# Store all the (json) output in a single string so we can also stick it in a file
json_output="$(echo -e "{\"$date\":[\n")$(join_by ,$'\n' "${json_array[@]}")$(echo -e "\n]}")"
echo -e "$json_output" > "$(dirname "$(realpath "$BASH_SOURCE")")/../S1_performance_check.json"
echo -e "$json_output"
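To try the script outside CI (assuming a host with CVMFS and the EESSI configuration installed, plus the sudo rights discussed above), something like:

APPLICATION=GROMACS ./scripts/test_S1_performance.sh
cat S1_performance_check.json

should exercise every S1 server in turn and leave the combined json in the repository root.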