# Cold Flow Runs
Note: starting tet meshes are in Box: `icp-torch/meshes/cold-flow-mod2-outlet` (I uploaded new versions with higher-order elements [order=3] for the 50K and 117K sizes). Until the precision discrepancy between CPU and GPU results is better understood, the recommendation is to start these runs on Quartz. Suggest focusing on the following two run configs; make sure to save HDF5 restart files to backup directories periodically.
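The periodic-backup step can be scripted; a minimal sketch, assuming the run directory holds the `.h5` restart files (`RUN_DIR` and `BACKUP_ROOT` are placeholder paths, not from the actual run setup):

```sh
# Sketch: snapshot HDF5 restart files into a timestamped backup directory.
# RUN_DIR and BACKUP_ROOT are hypothetical; point them at the actual run tree.
RUN_DIR=${RUN_DIR:-$PWD}
BACKUP_ROOT=${BACKUP_ROOT:-$RUN_DIR/restart-backups}
STAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_ROOT/$STAMP"
# cp -p preserves timestamps so it is clear which snapshot a file came from;
# find avoids a glob error when no .h5 files exist yet
find "$RUN_DIR" -maxdepth 1 -name '*.h5' -exec cp -p {} "$BACKUP_ROOT/$STAMP/" \;
echo "backed up restart files to $BACKUP_ROOT/$STAMP"
```

Something like this can be run from cron or tacked onto the end of each batch script.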
- Flow rate - 40 slpm
  - mesh -> cold-flow-mod2-outlet.order3.50K.msh
  - pressure-based outlet
  - Example Quartz timing(s), Git version: 4fde4fa
    - 9 nodes/320 MPI tasks
      - 0.074 secs/iteration (p1), initial time step: 1.7903731e-07 s (CFL=0.5)
      - 0.152 secs/iteration (p2), initial time step: 1.0742239e-07 s (CFL=0.3)
      - 0.302 secs/iteration (p3), initial time step: 4.2968955e-08 s (CFL=0.12)
    - 20 nodes/320 tasks
      - 0.037 secs/iteration (p1) [~3.1 wall-clock days/flow thru]
      - 0.072 secs/iteration (p2) [~10.1 wall-clock days/flow thru]
      - 0.141 secs/iteration (p3) [~49.4 wall-clock days/flow thru]
    - 40 nodes/1440 MPI tasks <--- let's start here
      - 0.021 secs/iteration (p1) [~1.76 wall-clock days/flow thru]
      - 0.039 secs/iteration (p2) [~5.5 wall-clock days/flow thru]
      - 0.078 secs/iteration (p3) [~27 wall-clock days/flow thru]
        - reduced to 0.06418 secs/iteration (p3) with be4bbfb version
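The bracketed days-per-flow-thru figures follow from the per-iteration cost and the time step; as a sanity check, a flow-through time of roughly 1.3 s (an inferred value, not stated in the notes) reproduces the quoted numbers:

```sh
# days/flow-thru = (T_flow / dt) iterations * secs/iteration / 86400 secs/day
# T_flow ~ 1.3 s is an inferred assumption that matches the quoted figures.
days_per_flowthru() {
  awk -v s="$1" -v dt="$2" -v T="${3:-1.3}" \
      'BEGIN { printf "%.2f\n", (T / dt) * s / 86400 }'
}
days_per_flowthru 0.021 1.7903731e-07   # p=1 on 40 nodes -> 1.76
days_per_flowthru 0.039 1.0742239e-07   # p=2 on 40 nodes -> 5.46
```

These land on ~1.76 and ~5.5 days, matching the 40-node entries above.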
- [Startup Run 1] - start here, can restart with .h5 files from Karl [COMPLETE]
  - Start with CFL=0.5, p=1 on 40 nodes/1440 MPI tasks
    - With p=1, have 823,060 total unknowns
  - Run ~1-2 flow thrus
  - [Update 7/26/21]:
    - Ran ~4.4M iterations until it NaN'd using the non-reflecting pressure outlet
    - Restarted from ~4.2M iterations using the reflecting pressure outlet BC and ran to 12M iterations (t=1.77 secs)
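When restarting from the `.h5` files, something like the following picks out the newest restart file to resume from (the flat run-directory layout is an assumption):

```sh
# Pick the most recently written HDF5 restart file in the current directory.
latest_restart=$(ls -1t *.h5 2>/dev/null | head -n 1)
if [ -n "$latest_restart" ]; then
  echo "restart candidate: $latest_restart"
else
  echo "no .h5 restart files found"
fi
```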
- [Continuation Run 2]
  - Pick up where Run 1 completed and raise the order; will need to add the RESTART_FROM_AUX option to the runfile
  - CFL=0.3, p=2
    - With p=2, have 2,057,650 total unknowns
  - Run this for an additional flow thru
  - Enable averaging and gather some results; decide whether to then run p=3 or an alternate mesh
  - [Update 7/26/21]:
    - Restarting from p=1 at t=1.77 secs
    - 0.039 secs/iteration; dt=8.716927e-08
  - [Update 8/6/21]:
    - Completed 28.5M iterations (0.033287 secs/iter with two sets of perf mods)
    - t=3.193 secs, so have a full flow thru at p=2 -> 1.42 secs
  - [Update 8/17/21]:
    - Enabled averaging at i=28500010 (every 25th timestep), t=3.19 secs
    - Ran till i=47500000 iters, t=4.82 secs (1.63 secs with averaging)
- [Continuation Run 3]
  - Pick up where Run 2 completed and raise the order
  - CFL=0.12, p=3
    - With p=3, have 4,115,300 total unknowns
  - [Update 8/4/21]:
    - Restarting from p=2 at t=2.89306
    - On 60 nodes (2160 cores) -> 0.0488 secs/iteration, dt=3.4272543e-08
      - [~16.5 wall-clock days/flow thru]
    - Enabled averaging at i=32000000 (every 25th timestep), t=3.13 secs
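The three unknown counts quoted for p=1/2/3 are mutually consistent with a discretization carrying 5 conserved variables and (p+1)(p+2)(p+3)/6 basis functions per tet; the element count of 41,153 and the 5-variable count below are inferred assumptions (not stated in the notes) that happen to reproduce all three numbers:

```sh
# unknowns(p) = n_elem * n_vars * (p+1)(p+2)(p+3)/6 for a tet basis of order p.
# n_elem=41153 and n_vars=5 are inferred assumptions, not from the notes.
unknowns() {
  awk -v p="$1" 'BEGIN { printf "%d\n", 41153 * 5 * (p+1)*(p+2)*(p+3)/6 }'
}
unknowns 1   # 823060  (matches the p=1 count)
unknowns 2   # 2057650 (matches the p=2 count)
unknowns 3   # 4115300 (matches the p=3 count)
```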
- [Run 1] - 40 slpm
  - mesh -> cold-flow-mod2-outlet.order3.117K.msh
  - pressure-based outlet
To build, I just run Singularity on the login node. I have a convenience script that launches the container with the following contents:
```sh
CONTAINER=mfem-mv2-psm:4.2.tps.r1
singularity exec -B /p/lustre2/$USER:/home/ohpc \
    -B /var/run/munge/:/var/run/munge/ \
    -B /etc/slurm:/etc/slurm \
    --env SLURM_JOB_ID=$SLURM_JOB_ID \
    --cleanenv /usr/workspace/utaustin/containers/$CONTAINER /bin/bash --login
```
Once in the container, I change to the directory in my Quartz $HOME where tps is cloned and configure similarly to Marvin as follows:
```sh
./configure CXXFLAGS="-g -O3 -I$PETSC_INC" LDFLAGS="-L$PETSC_LIB -lpetsc"
```
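Putting the pieces together, a hypothetical end-to-end build inside the container might look like the following (the `tps` clone path and the `-j` level are assumptions, not from the notes):

```sh
# Hypothetical build sequence inside the container; the clone path and
# parallelism level are assumptions.
cd $HOME/tps
./configure CXXFLAGS="-g -O3 -I$PETSC_INC" LDFLAGS="-L$PETSC_LIB -lpetsc"
make -j 8
```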