Dataset consistency, validation, execution comparison and benchmarking between hosting environments #46

Open · 19 tasks
JimCircadian opened this issue Apr 6, 2023 · 1 comment
Labels: enhancement (New feature or request)

@JimCircadian (Member) commented Apr 6, 2023

The previous dev16 runs, a set intended to be comparative across multiple HPC platforms (BAS and JASMIN), were marred by various issues: limited wall times, an underlying data consistency problem, IO issues and the like.

On reflection, those runs were abandoned in favour of a development push that allows multiple environments to be validated, step by step, when executed with identical configurations on different underlying platforms. This issue captures the tasks needed to improve various elements of running such workflows, implement consistency checking between the data stores and the assets generated by executions on different HPC platforms, and demonstrate the workflow through a notebook that can be run on HPCa and HPCb, with those runs then compared using the tooling.

We are creating a new run that is smaller and consistent on both HPCs whilst we solve the problems that stopped dev16 from working (it was fairly large!). There are also requests to do full-tilt training runs for a conservation project, which means several long-running pipeline issues need sorting.

In the first instance we should use demonstrators that are small and to the point, as there are future runs that will scale usage considerably. High-level discussion should be captured in this issue, with functional requirements, detailed discussion and performance improvements addressed in the issues spread across the repositories.

There is a lot to capture here and many issues can be absorbed into this project, so they might not all be linked in yet.

  • Dataset validation
  • Data / execution pipeline issues to address
    • Automated analysis reporting for data in pipeline environments as part of runs (assuming low cost/impact on performance)
    • Missing dates wrap-up and documentation - need to ensure that outputs for this are comparable
    • Linear trend overproduction issue - ensure linear trend outputs are comparable between environments
    • Parameter validation for ENV files and configuration comparisons - if these don't match, we shouldn't expect the preprocessed or cached data to match either!
    • Ensuring metadata adequately captures the source platform and is displayed in downstream applications (e.g. icenet-application)
  • Add new dataset definition that works for dual hemisphere training runs - this is captured somewhere in the pipeline/library as the original configurations can't easily encompass dual hemisphere runs
  • Add basic benchmarking framework at various execution stages
    • icenet CLI commands, start to finish (see the timing sketch after this list)
    • Clarify/review outputs from dataset generation
    • icenet-pipeline reference implementation
  • Solve pipeline issues relating to wall times (JASMIN) and premature / unstable training runs - if a run ends prematurely, we need to pick up and go again
    • Automate resubmission: model-ensemble can repeat based on external conditions, so for the reference implementation this can be automated
  • Demonstrator notebook for validating consistency of environments across multiple platforms (TODO: capture in icenet-notebooks issue)
    • Run end-to-end in BAS
    • Run end-to-end in JASMIN
    • Perform consistency check across all available HPCs
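
A minimal sketch of the benchmarking idea above, assuming stages are invoked as CLI commands via `subprocess`; the stage commands and output filename are illustrative placeholders, not confirmed icenet entry points:

```python
# benchmark_stages.py - hypothetical wall-clock timing wrapper for pipeline stages.
# The stage commands below are placeholders; substitute the actual icenet /
# icenet-pipeline invocations used in a given run.
import json
import platform
import subprocess
import time

STAGES = {
    "download": ["echo", "download stage placeholder"],
    "preprocess": ["echo", "preprocess stage placeholder"],
    "train": ["echo", "train stage placeholder"],
}

results = {"host": platform.node(), "stages": {}}

for name, cmd in STAGES.items():
    start = time.monotonic()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    results["stages"][name] = {
        "command": " ".join(cmd),
        "returncode": proc.returncode,
        "wall_seconds": round(time.monotonic() - start, 3),
    }

# Machine- and human-parsable output that is easy to transfer between HPCs
with open(f"benchmark_{platform.node()}.json", "w") as fh:
    json.dump(results, fh, indent=2)
```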

Some rules of thumb:

  • This is to be part of the 0.3 development push; don't retrofit it to the existing 0.2.* series of developments
  • Direct file checking is not possible: all validation and comparison must be some level of naive statistical comparison, since we have to account for acceptable differences across files due to the platform (see the sketch after this list)
  • Never rely on pinned underlying environments; we don't want that to be a prerequisite, as different HPCs will have differing requirements for hosting the pipeline
  • Outputs should be machine- and human-parsable and, if possible, easy to transfer for comparison (e.g. JSON)
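
To illustrate the statistical (rather than byte-wise) comparison rule, a rough sketch assuming xarray/numpy are available; the file paths, variable name and tolerances are placeholders:

```python
# compare_outputs.py - hypothetical consistency check between two platforms' outputs.
# Paths, the variable name and tolerances are placeholders for illustration.
import json
import sys

import numpy as np
import xarray as xr

def summarise(path, var="sic_mean"):
    """Reduce one output file to summary statistics for a named variable."""
    values = xr.open_dataset(path)[var].values
    return {
        "mean": float(np.nanmean(values)),
        "std": float(np.nanstd(values)),
        "min": float(np.nanmin(values)),
        "max": float(np.nanmax(values)),
    }

def compare(path_a, path_b, rtol=1e-5, atol=1e-8):
    a, b = summarise(path_a), summarise(path_b)
    return {
        "hpc_a": a,
        "hpc_b": b,
        # Naive statistical check: summary statistics agree within tolerance,
        # rather than requiring bit-identical files across platforms.
        "consistent": all(np.isclose(a[k], b[k], rtol=rtol, atol=atol) for k in a),
    }

if __name__ == "__main__":
    print(json.dumps(compare(sys.argv[1], sys.argv[2]), indent=2))
```
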
@JimCircadian JimCircadian self-assigned this Apr 6, 2023
@JimCircadian JimCircadian changed the title from "New model run for comparsions between single and dual hemisphere training" to "Dev16 fixes and model run for comparsions between single and dual hemisphere training" Apr 6, 2023
@JimCircadian JimCircadian changed the title from "Dev16 fixes and model run for comparsions between single and dual hemisphere training" to "Dataset fixups, data validation and comparsion model runs between single and dual hemisphere training" Jun 21, 2023
@JimCircadian JimCircadian removed their assignment Jul 24, 2023
@JimCircadian JimCircadian changed the title from "Dataset fixups, data validation and comparsion model runs between single and dual hemisphere training" to "Dataset consistency, validation and comparsion between environments and hosts" Dec 29, 2023
@JimCircadian JimCircadian added the enhancement (New feature or request) label and added and removed the bug (Something isn't working) label Dec 29, 2023
@JimCircadian (Member, Author) commented:
@bnubald we should have a chat about this, but I've reworked the issue to explain the primary goal. This links into various other streams of work but will be the priority moving forward. Ping me a DM to discuss further.

All contributions to individual issues welcome from others! 😆

@JimCircadian JimCircadian changed the title from "Dataset consistency, validation and comparsion between hosting environments" to "Dataset consistency, validation, execution comparison and benchmarking between hosting environments" Jan 2, 2024
@bnubald bnubald moved this to Ready in IceNet Roadmap Aug 6, 2024