
Cold Flow Runs

Karl W. Schulz edited this page Aug 17, 2021 · 21 revisions

Cold Flow Validation Run Startups (July 14)

Note: starting tet meshes are in Box under icp-torch/meshes/cold-flow-mod2-outlet (I uploaded new versions with higher-order elements [order=3] for the 50K and 117K sizes). Until the precision discrepancy between CPU and GPU results is better understood, the recommendation is to start these runs on Quartz. Suggest focusing on the following two run configs; make sure to save HDF5 restart files to backup directories periodically.
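One way to script the periodic restart backups mentioned above (a minimal sketch: the RUNDIR variable and the *.h5 glob are placeholders, adjust them to your actual run directory and restart filenames):

```shell
#!/bin/sh
# Snapshot HDF5 restart files into a timestamped backup directory.
# RUNDIR and the *.h5 glob are assumptions -- match your run layout.
RUNDIR=${RUNDIR:-.}
BACKUP="$RUNDIR/backup/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP"
for f in "$RUNDIR"/*.h5; do
  [ -e "$f" ] && cp -p "$f" "$BACKUP/"
done
echo "backed up $(ls "$BACKUP" | wc -l) file(s) to $BACKUP"
```

Run it from cron or between job submissions so a corrupted restart never costs more than one backup interval.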

Config A: 50K mesh

  • Flow rate - 40slpm

    • mesh -> cold-flow-mod2-outlet.order3.50K.msh
    • pressure-based outlet
    • Example Quartz timing(s), Git Version: 4fde4fa
      • 9 nodes/320 MPI tasks
        • 0.074 secs/iteration (p1), Initial time-step: 1.7903731e-07s (CFL=0.5)
        • 0.152 secs/iteration (p2), Initial time-step: 1.0742239e-07s (CFL=0.3)
        • 0.302 secs/iteration (p3), Initial time-step: 4.2968955e-08s (CFL=0.12)
      • 20 nodes/320 tasks
        • 0.037 secs/iteration (p1) [~3.1 wall clock days/flow thru]
        • 0.072 secs/iteration (p2) [~10.1 wall clock days/flow thru]
        • 0.141 secs/iteration (p3) [~49.4 wall clock days/flow thru]
      • 40 nodes/1440 MPI tasks <--- let's start here
        • 0.021 secs/iteration (p1) [~1.76 wall clock days/flow thru]
        • 0.039 secs/iteration (p2) [~5.5 wall clock days/flow thru]
        • 0.078 secs/iteration (p3) [~27 wall clock days/flow thru]
          • reduced to 0.06418 secs/iteration (p3) with be4bbfb version
  • [Startup Run 1] - start here, can restart with .h5 files from karl [COMPLETE]

    • Start with CFL=0.5, p=1 on 40 nodes/1440 MPI tasks
    • With p=1, have 823,060 total unknowns
    • Run ~1-2 flow thrus
    • [Update 7/26/21]:
      • Ran ~4.4M iterations until the solution NaN'd using the non-reflecting pressure outlet
      • Restarted from ~4.2M iterations using the reflecting pressure outlet BC and ran to 12M iterations (t=1.77 secs)
  • [Continuation Run 2]

    • Pick up where Run 1 completed and raise the order; will need to add the RESTART_FROM_AUX option to the runfile
    • CFL=0.3, p=2
    • With p=2, have 2,057,650 total unknowns
    • Run this for an additional flow thru
    • Enable averaging and gather some results; decide on whether to then run p=3 or alternate mesh
    • [Update 7/26/21]:
      • Restarting from p=1 at t=1.77 secs
      • 0.039 secs/iteration; dt=8.716927e-08
    • [Update 8/6/21]:
      • Completed 28.5M iterations (0.033287 secs/iter with two sets of perf mods)
      • t=3.193, so have a full flow thru at p=2 -> 1.42 secs
    • [Update 8/17/21]:
      • Enabled averaging at i=28500010 (every 25th timestep), t=3.19 secs
      • Ran till i=47500000 iters, t=4.82 secs (1.63 secs with averaging)
  • [Continuation Run 3]

    • Pick up where Run 2 completed and raise order
    • CFL=0.12, p=3
    • With p=3, have 4,115,300 total unknowns
    • [Update 8/4/21]:
      • Restarting from p=2 at t=2.89306
        • On 60 nodes (2160 cores) -> 0.0488 secs/iteration, dt=3.4272543e-08
        • [~16.5 wall clock days/flow thru]
    • enabled averaging at i=32000000 (every 25th timestep), t=3.13 secs
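The unknown counts and wall-clock estimates above hang together arithmetically, which is a useful sanity check before launching a long run. A tet element carries 4, 10, and 20 nodes at orders 1, 2, and 3, so the p2 and p3 unknown counts should be 2.5x and 5x the p1 count; and days-per-flow-thru follows from secs/iteration and the initial dt. A quick sketch (the ~1.3 s flow-through time is back-solved from the tabulated estimates, not taken from any run input, and dt grows as the flow develops, so treat these as lower bounds):

```shell
#!/bin/sh
# Check that unknown counts scale as tet node counts (4:10:20 from p1).
awk 'BEGIN {
  p1 = 823060
  printf "p2 expected: %d\n", p1 * 10 / 4   # matches 2,057,650
  printf "p3 expected: %d\n", p1 * 20 / 4   # matches 4,115,300
}'

# Estimate wall-clock days per flow-thru on 40 nodes / 1440 tasks:
#   days = (T_flow / dt) * secs_per_iter / 86400
# T_flow ~ 1.3 s is an assumption back-solved from the table above.
awk 'BEGIN {
  T = 1.3
  printf "p1: %.1f days\n", (T / 1.7903731e-07) * 0.021 / 86400
  printf "p3: %.1f days\n", (T / 4.2968955e-08) * 0.078 / 86400
}'
```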

Config B: 117K mesh

  • [Run 1] - 40slpm
    • mesh -> cold-flow-mod2-outlet.order3.117K.msh
    • pressure-based outlet

TPS build options I use with container on Quartz:

To build, I just run singularity on the login node. I have a convenience script to launch the container with the following contents:

CONTAINER=mfem-mv2-psm:4.2.tps.r1

singularity exec -B /p/lustre2/$USER:/home/ohpc \
	    -B /var/run/munge/:/var/run/munge/ \
	    -B /etc/slurm:/etc/slurm \
	    --env SLURM_JOB_ID=$SLURM_JOB_ID \
	    --cleanenv /usr/workspace/utaustin/containers/$CONTAINER /bin/bash --login

Once in the container, I change to the directory in my Quartz $HOME where tps is cloned and configure similarly to Marvin as follows:

./configure CXXFLAGS="-g -O3 -I$PETSC_INC" LDFLAGS="-L$PETSC_LIB -lpetsc"
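After configure succeeds inside the container, the build is the usual autotools cycle (a sketch: the parallel job count is arbitrary, and `make check` assumes the project wires tests into automake; plain `make` suffices if unsure):

```shell
# Typical autotools build/verify cycle once ./configure has succeeded.
make -j 8          # parallel build
make check         # run the test suite, if the project defines one
```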