meeting 2024 06 06

Bob Dröge edited this page Jun 6, 2024 · 3 revisions

Notes for 2024-06-06 meeting

  • date & time: Thu 6 June 2024 - 14:00 CEST (12:00 UTC)
    • (every first Thursday of the month)
  • venue: (online, see mail for meeting link, or ask in Slack)
  • agenda:
    • Quick introduction by new people
    • EESSI-related meetings and events in last month
    • Progress update per EESSI layer
    • Update on build-and-deploy bot
    • Update on EESSI production repository software.eessi.io
    • Update on EESSI documentation
    • Update on EESSI test suite
    • Additional EESSI repositories: dev.eessi.io, riscv.eessi.io
    • EESSI on macOS
    • AWS/Azure sponsorship update
    • Upcoming/recent events: ISC’24
    • Q&A

Slides

Meeting notes

(by Bob/Kenneth)

Quick introduction by new people

EESSI-related meetings in last month

(see slides)

Progress update per EESSI layer

Filesystem layer

(see slides)

  • Ansible playbook for Stratum-1 now uses our fork of ansible-cvmfs role
  • Bob is planning to kick-start a discussion on setting up proper monitoring for our Stratum 1 servers (disk usage, network bandwidth usage, load, etc.)

Compatibility layer

(see slides)

Software layer

(see slides)

  • We still need to document the EESSI-extend module, but it works really well
  • Both Kenneth and Julian are working on Extrae, but tests are failing in the make check step
  • Thomas is working on PyTorch-bundle, but it's failing due to librosa not being able to find a library, as Python's ctypes library doesn't return full paths to libraries. Related issues:
  • The --from-commit feature doesn't fully work yet (it has trouble finding dependencies); this is currently being fixed in EasyBuild
  • Installing GPU software
    • (Caspar) some software will require a newer CUDA CC than 6.0, which would be a problem for the fallback installations in generic/accel
      • maybe drop cc60 part from generic/accel/nvidia/cc60?
      • also consider fat builds under generic/accel?
    • (Kurt) PyTorch may require newer CUDA CC or GPU drivers
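
The librosa failure mentioned above hinges on how library lookup through Python's ctypes behaves. The snippet below is only a minimal sketch of that behaviour, not the actual librosa or EESSI code: it assumes the lookup goes through `ctypes.util.find_library`, and the helper name `describe_library` is made up for illustration. On Linux, `find_library` typically returns just a file name (for example a soname) rather than a full path, and it returns `None` when nothing is found; the exact result depends on the platform.

```python
import ctypes.util
import os


def describe_library(name):
    """Look up a library by short name (e.g. 'c' or 'm').

    Returns a tuple (found, is_full_path):
      - found is whatever ctypes.util.find_library reports (or None),
      - is_full_path tells whether that result is an absolute path.
    Hypothetical helper, for illustration only.
    """
    found = ctypes.util.find_library(name)
    if found is None:
        return None, False
    return found, os.path.isabs(found)


if __name__ == "__main__":
    # The output is platform-dependent, so no expected value is shown here.
    print(describe_library("c"))
```

Code that needs an actual location on disk cannot rely on the returned name alone when `is_full_path` is false, which matches the kind of problem described in the notes.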

Build-and-deploy bot

(see slides)

  • Automatic cleanup moves the job directories of merged PRs to some sort of trash bin, which can be purged later
  • v0.5.0 of bot not used yet on EESSI build cluster
    • waiting for the PRs that update the bot configuration to be merged

software.eessi.io repository

(see slides)

EESSI documentation

(see slides)

  • Lots of work was done on the documentation, partly thanks to the hackathon, which had a strong focus on merging documentation PRs
  • The documentation now includes an automatically generated page with the available software: https://www.eessi.io/docs/available_software/overview/
    • Sites can easily use the same tooling to make similar overviews for their stacks
  • And we've also added a blog: https://www.eessi.io/docs/blog/
  • We should mention the mailing list on the website

EESSI test suite

(see slides)

Additional EESSI repositories: dev.eessi.io, riscv.eessi.io

(see slides)

EESSI on macOS

(see slides)

AWS/Azure sponsored credits

(see slides)

  • (Alan) we should look into splitting up the way sponsored credits are consumed in both AWS & Azure
    • to avoid people getting access to things they don't need, and make

Events

(see slides)

Q&A

  • Next meeting: July 4