SDSC: PKG - expanse/0.17.3/cpu/b - Missing gaussian (example application) #37

Open
nwolter opened this issue Mar 8, 2023 · 14 comments

Comments

@nwolter

nwolter commented Mar 8, 2023

This is an example application

@mkandes
Member

mkandes commented Mar 9, 2023

@nwolter - When you get the chance, can you please add the spack_cpu, spack_gpu, spack_dev_cpu, spack_dev_gpu users to the gaussian group? Also, please add spack_dev_cpu, spack_dev_gpu to use300 as well.

@nwolter
Author

nwolter commented Mar 9, 2023

@mkandes Which allocation(s) in use300? I assume you only need the ACCESS Expanse nodes, NOT industry rack or comet?
Added user to the gaussian group; this will update by 8:05am (PT) today. Added spack_dev_gpu to sddxgpu, and spack_dev_cpu to sddxps.

@nwolter nwolter changed the title from "SDSC: PKG - expanse/0.17.3/cpu/a - Missing gaussian" to "SDSC: PKG - expanse/0.17.3/cpu/a - Missing gaussian (example application)" on Mar 9, 2023
@mkandes
Member

mkandes commented Mar 10, 2023

@nwolter - What are sddxps and sddxgpu?

@nwolter
Author

nwolter commented Mar 10, 2023

It is a local discretionary fund on the NSF nodes.

@mkandes
Member

mkandes commented Mar 10, 2023

@jerrypgreenberg - Gaussian 16-C.02 build is failing on a patch file. Have you had any success with it?

Input spec
--------------------------------
[    ]  gaussian@16-C.02%nvhpc@21.3~binary~cuda

Concretized
--------------------------------
tuc6kpe  [    ]  gaussian@16-C.02%nvhpc@21.3 cflags="-fast" cxxflags="-fast" fflags="-fast" ~binary~cuda cuda_arch=none patches=9ebaf94ac170596aaa32e43d665e81ea94b6bfed083dce8f0a8587de5d226a00,b864b2723afd9cbb50ef6139def38cd9ed97986b1c15c8bbb850595c33e41281 arch=linux-rocky8-zen2

==> Installing gaussian-16-C.02-tuc6kpe2fs2huaqn4hg7nexw66p3zr5z
==> No binary for gaussian-16-C.02-tuc6kpe2fs2huaqn4hg7nexw66p3zr5z found: installing from source
==> Warning: Expected user 527834 to own /scratch/spack_cpu, but it is owned by 0
The text leading up to this was:
--------------------------
|--- gaussian-16.C.01/g16/bsd/gau-hname 2016-08-02 06:22:04.000000000 -0700
|+++ patch-files/gau-hname      2019-08-13 13:03:50.716835428 -0700
--------------------------
File to patch: 
Skip this patch? [y] 
1 out of 1 hunk ignored
==> Fetching file:///cm/shared/apps/spack/0.17.3/cpu/a/etc/spack/sdsc/expanse/0.17.3/cpu/specs/nvhpc%4021.3/gaussian-16-C.02.tar.gz
==> Patch /cm/shared/apps/spack/0.17.3/cpu/a/var/spack/repos/sdsc/packages/gaussian/gau-hname.patch failed.
==> Error: ProcessError: Command exited with status 1:
    '/bin/patch' '-s' '-p' '1' '-i' '/cm/shared/apps/spack/0.17.3/cpu/a/var/spack/repos/sdsc/packages/gaussian/gau-hname.patch' '-d' '.'
==> Error: Terminating after first install failure: ProcessError: Command exited with status 1:
    '/bin/patch' '-s' '-p' '1' '-i' '/cm/shared/apps/spack/0.17.3/cpu/a/var/spack/repos/sdsc/packages/gaussian/gau-hname.patch' '-d' '.'
real 40.70
user 38.62
sys 2.97
ERROR: spack install failed.
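
Editorial note: the "File to patch:" prompt in the log above is what `patch` prints when none of the file names from the patch header exist after path stripping. The following is a minimal sketch of the `-p 1` stripping applied to the two header paths shown in the log. The helper name `strip_components` is hypothetical, and whether `g16/bsd/gau-hname` actually exists in the unpacked 16-C.02 stage directory is not known from this thread.

```python
from pathlib import PurePosixPath


def strip_components(header_path: str, strip: int) -> str:
    """Mimic patch's -pN option: drop the first N leading path components."""
    parts = PurePosixPath(header_path).parts
    if strip >= len(parts):
        raise ValueError(f"cannot strip {strip} components from {header_path!r}")
    return str(PurePosixPath(*parts[strip:]))


# Paths copied from the '---' and '+++' lines of the failing patch header.
old_name = "gaussian-16.C.01/g16/bsd/gau-hname"
new_name = "patch-files/gau-hname"

for name in (old_name, new_name):
    # patch resolves these candidates relative to the directory given by -d ('.')
    print(name, "->", strip_components(name, 1))
# gaussian-16.C.01/g16/bsd/gau-hname -> g16/bsd/gau-hname
# patch-files/gau-hname -> gau-hname
```

If neither `g16/bsd/gau-hname` nor `gau-hname` exists relative to the stage directory where Spack runs `patch`, the hunk cannot be applied, which would be consistent with the "1 out of 1 hunk ignored" message. The `16.C.01` directory name in the header is stripped by `-p 1`, so on its own it should not matter.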

@jerrypgreenberg
Member

jerrypgreenberg commented Mar 11, 2023 via email

@jerrypgreenberg
Member

jerrypgreenberg commented Mar 11, 2023 via email

@mkandes
Member

mkandes commented Mar 11, 2023 via email

@mkandes
Member

mkandes commented Mar 24, 2023

@jerrypgreenberg - I'm still running into issues with the new gaussian 16-C.02 package on Expanse. I'll make another pass at it tomorrow before the meeting. But let me summarize some initial issues I noticed with the GPU build.

The first indication of a problem with the build in Spack is this OSError.

# ==> Error: OSError: No such file or directory: 'g16'
#
# /cm/shared/apps/spack/0.17.3/gpu/a/var/spack/repos/sdsc/packages/gaussian/package.py:49, in install:
#         46                install_tree('basis',join_path(prefix.g16,'basis'))
#         47                mkdirp('tests')
#         48                install_tree('tests',join_path(prefix.g16,'tests'))
#  >>     49                Executable('find')('.','-maxdepth','1','-type','f','-executable','-exec','cp','{}',prefix.g16,';')
#         50        else:
#         51            install_tree('.',prefix.g16)
#         52        if '+cuda' in spec:
#
# See build log for details:
#  /scratch/spack_gpu/job_21309243/spack-stage/spack-stage/spack-stage-gaussian-16-C.02-czgfng2ak7cssotpwuw5eeqzi47jg76s/spack-build-out.txt

However, I believe this is a bit of a red herring, as Spack does appear to unpack the tarball and attempt to build the package. But there are a few problems with the build process once you check the errors in the spack-build-out.txt file. Probably the most important is that the build process cannot find 'pgcc' ...

cp ../bsd/mdutil.c bsd
make -f ../bsd/g16.make JUNK1=JUNK mdutil.o
pgcc  -I/scratch/spack_gpu/job_21309428/spack-stage/spack-stage/spack-stage-gaussian-16-C.02-czgfng2ak7cssotpwuw5eeqzi47jg76s/spack-src/g16 -DDEFMAXRES=25000 -DDEFMAXSEC=2500 -DI64 -DP64 -DPACK64 -DUSE_I2 -DGAUSS_PAR -DGAUSS_THPAR -D_OPENMP_ -D_OPENMP_MM_ -DCHECK_ARG_OVERLAP -DDEFMAXSHL=250000 -DDEFMAXATM=250000 -D_EM64T_ -DNO_SBRK '-DX86_TYPE=S' -DDEFMXGWIN=64 -DDEFMAXNZ=250000 -DDEFNREPFD=32 -DDEFNVDIM=257 -DR4ETIME   -DDEFARCREC=1024 -DMERGE_LOOPS -D_I386_   -DLITTLE_END -DSTUPID_ATLAS -DDEFMAXXCVAR=150 -DDEFMAXIOP=200 -DDEFMAXCOORDINFO=32 -DDEFMAXSUB=80 -DDEFMAXCHR=1024 -DDEFMAXKJL=8 -DDEFMOMEGA=5 -DDEFNOMEGA=6 -DDEFNSCADF=12 -DDEFMAXXCNAME=25 -DDEFLMAX=13 -DDEFMINB1P=100000000 -DDEFXGN3MIN=1 -DDEFISEC=16 -DDEFJSEC=512 -DDEFKSEC=128 -DDEFN3MIN=10 -DDEFNBOMAXBAS=10000 -DDEFMAXHEV=2000 -DDEFCACHE=88 -DDEFLWORK=1 -DDEFMAXLECP=10 -DDEFMAXFUNIT=5 -DDEFMAXFFILE=10000 -DDEFMAXFPS=1300 -DDEFMAXINFO=200 -DDEFMAXOP=384 -DDEFMAXTIT=100 -DDEFMAXRTE=4000 -DDEFMAXREDTYPE=3 -DDEFMAXREDINDEX=4 -DDEFMAXOV=500 -DDEFMXDNXC=8 -DDEFMXTYXC=10 -DDEFICTDBG=0 -D_ALIGN_CORE_ -DCA1_DGEMM -DCA2_DGEMM -DCAB_DGEMM -DLV_DSP -DO_BKSPEF -DSETCDMP_OK -DDEFMXTS=2500 -DDEFMXBOND=12 -DDEFMXSPH=250 -DDEFMXINV=2500 -DDEFMXSLPAR=300 -DDEFMXSATYP=4 -DEXT_LSEEK -DAPPEND_ACC -DGAUSS_ACC  -DGCONJG=DConjg -DGCMPLX=DCmplx -DGREAL=DReal -DGIMAG=DImag -DGConjg=DConjg -DGCmplx=DCmplx -DGReal=DReal -DGImag=DImag      -O3   -Mcuda=cuda10.0,flushz,unroll,nollvm,fma -acc=nowait -ta=tesla:cuda10.0,cc35,cc60,cc70,flushz,nollvm,unroll,fma,v32mode  -c bsd/mdutil.c
make: pgcc: Command not found
make: *** [../bsd/g16.make:801: mdutil.o] Error 127

... even though it should be available for the build ...

[mkandes@login01 ~]$ module list

Currently Loaded Modules:
  1) slurm/expanse/21.08.8   2) nvhpc/21.9/4xco23d   3) cuda/11.2.2/wk2625v

[mkandes@login01 ~]$ which pgcc
/cm/shared/apps/spack/0.17.3/gpu/a/opt/spack/linux-rocky8-skylake_avx512/gcc-8.5.0/nvhpc-21.9-4xco23dnhjhodonoufuduj4obaewort7/Linux_x86_64/21.9/compilers/bin/pgcc
[mkandes@login01 ~]$
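
Editorial note: `which pgcc` succeeding in a login shell only shows that the module puts `pgcc` on that shell's PATH. The PATH seen by `make` inside the Spack build can differ. Below is a minimal sketch for checking compiler drivers against whatever PATH is in effect. The function name `find_tools` is hypothetical, and the driver names other than `pgcc` are assumptions based on the usual NVHPC naming.

```python
import os
import shutil


def find_tools(tools, path=None):
    """Return {tool: resolved path or None}, searching the given PATH string."""
    return {tool: shutil.which(tool, path=path) for tool in tools}


if __name__ == "__main__":
    # 'pgcc' comes from the build log; the other names are assumed for illustration.
    wanted = ["pgcc", "pgc++", "pgfortran"]
    for tool, location in find_tools(wanted, os.environ.get("PATH")).items():
        print(f"{tool}: {location or 'NOT FOUND on PATH'}")
```

Running the same script once in the login shell and once inside the build environment (for example through `spack build-env`, if that command is available in this Spack version) would show whether the module's PATH entries reach the build.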

Secondarily, you'll also notice that this build step is attempting to compile against CUDA 10.0 and attempting to support a bunch of CUDA compute capabilities I did not ask Spack to build gaussian for. This is the actual Spack spec ...

[email protected] % [email protected] ~binary +cuda cuda_arch=70,80 "^[email protected]/$(spack find --format '{hash:7}' [email protected] % ${SPACK_COMPILER})"

Do you have the standard output file generated by your gaussian 16-C.02 spec build script?
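
Editorial note: the mismatch described above can be checked mechanically. The following is a minimal sketch that pulls the CUDA toolkit version and compute capabilities out of the `-ta=tesla:...` option and compares them with the `cuda_arch=70,80` values from the spec. The function name `parse_ta_flag` is hypothetical, and the compile line is abbreviated from the build log.

```python
import re


def parse_ta_flag(compile_line):
    """Extract (cuda_version, {compute capabilities}) from a -ta=tesla:... option."""
    match = re.search(r"-ta=tesla:(\S+)", compile_line)
    if match is None:
        return None, set()
    cuda_version = None
    archs = set()
    for token in match.group(1).split(","):
        if re.fullmatch(r"cc\d+", token):
            archs.add(token[2:])          # 'cc70' -> '70'
        elif re.fullmatch(r"cuda\d+\.\d+", token):
            cuda_version = token[4:]      # 'cuda10.0' -> '10.0'
    return cuda_version, archs


# Abbreviated from the pgcc line in spack-build-out.txt (the -D options are omitted).
line = ("pgcc -O3 -Mcuda=cuda10.0,flushz,unroll,nollvm,fma -acc=nowait "
        "-ta=tesla:cuda10.0,cc35,cc60,cc70,flushz,nollvm,unroll,fma,v32mode "
        "-c bsd/mdutil.c")
requested = {"70", "80"}  # from cuda_arch=70,80 in the Spack spec

cuda_version, built = parse_ta_flag(line)
print("CUDA toolkit in flags:", cuda_version)                    # 10.0
print("requested but not targeted:", sorted(requested - built))  # ['80']
print("targeted but not requested:", sorted(built - requested))  # ['35', '60']
```

Under these assumptions, the flags target cc35 and cc60, which were not requested, and omit cc80, which was. That suggests the `-ta` options come from a fixed setting in the Gaussian build files rather than from the spec.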

@jerrypgreenberg
Member

jerrypgreenberg commented Mar 25, 2023 via email

@mkandes
Member

mkandes commented Mar 25, 2023

@jerrypgreenberg - Not yet. I did notice you mentioned this is a possible key piece. However, it's not clear to me how this is different from, say, including module load nvhpc/21.3 in the *.sh spec build script, which I did try. But I'll take another look at the details here. Can you explain why this is required? While this is probably okay for this one-off package / compiler combination, we'd again rather not make central changes to a *.yaml file for one package if we don't have to.

@jerrypgreenberg
Member

jerrypgreenberg commented Mar 25, 2023 via email

@mkandes
Member

mkandes commented Mar 25, 2023

@jerrypgreenberg - I actually already defined the cc, c++, f77, and fc lines in the yaml file to point explicitly at the pgi compiler symlinks as well --- in addition to explicitly loading the nvhpc/21.3 module within the spec build script. Anyhow, I'll give the module line a try and see if that helps. Note, however, I'm still a bit more concerned about the -ta=tesla:cuda10.0,cc35,cc60,cc70,flushz,nollvm,unroll,fma,v32mode options being incorrect when compared to the requirements in the Spack spec.

@jerrypgreenberg
Member

jerrypgreenberg commented Mar 27, 2023 via email

@mkandes mkandes changed the title from "SDSC: PKG - expanse/0.17.3/cpu/a - Missing gaussian (example application)" to "SDSC: PKG - expanse/0.17.3/cpu/b - Missing gaussian (example application)" on May 5, 2023

3 participants