
Prototype for T6.5

Matteo Bunino edited this page Apr 18, 2024 · 21 revisions

Notes 📝

Emulate DT workflows in an end-to-end (e2e) manner. Validate AI workflow components by iteratively integrating interTwin use cases.

Split the DT workflow into two main steps: pre-processing and ML training. Optionally, we are going to integrate the quality/validation capabilities provided by SQAaaS. ML deployment will be included in the future.

```mermaid
graph TD
 %% Nodes
 preproc(Pre-processing)
 ai(ML training)
 ai_depl(ML deployment)
 qual(Quality\nSQAaaS)
 reg[(Models registry:\npre-trained ML models)]

 %% Workflow
    preproc --> ai --> ai_depl --> qual

 %% Connections
 ai -.-> |Saves to| reg
 ai_depl -.-> |Loads from| reg
 qual -.-> |Validates| ai_depl

 %%click preproc href "obsidian://vault/CERN/InterTwin/Proj/Prototype/Preprocessing"
 %%click ai href "obsidian://vault/CERN/InterTwin/Proj//Prototype/ML%20training"
```
  • Solid arrows: workflow direction
  • Dashed arrows: interactions

Analysis of ML training workflow

A training workflow is a composition of parametrized transformations applied to some input data $X_i$. For instance:

  • Pre-processing: $f(\,\cdot\,;\gamma_i)$
  • ML training: $g(\,\cdot\,;\theta_i)$ ($\theta_i$ are the hyperparameters, or hyperparameter ranges)
  • Quality and validation: $h(\,\cdot\,;\phi_i)$
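The composition of these transformations can be sketched as follows. This is a minimal, hypothetical example: the function names, the toy ridge-regression "training", and all parameter values are illustrative assumptions, not part of any interTwin API.

```python
import numpy as np

def preprocess(X, gamma):
    """Pre-processing f(. ; gamma): here, a simple rescaling."""
    return X * gamma

def train(X, theta):
    """ML training g(. ; theta): a toy ridge regression against a fixed
    target, where theta plays the role of a hyperparameter."""
    y = np.ones(len(X))
    # Solve (X^T X + theta * I) w = X^T y for the weight vector w.
    w = np.linalg.solve(X.T @ X + theta * np.eye(X.shape[1]), X.T @ y)
    return w

def validate(model, phi):
    """Quality/validation h(. ; phi): check a goodness criterion
    parametrized by phi."""
    return bool(np.linalg.norm(model) < phi)

rng = np.random.default_rng(0)
X = rng.random((10, 3))

# The workflow is the composition h(g(f(X; gamma); theta); phi).
model = train(preprocess(X, gamma=0.5), theta=0.1)
ok = validate(model, phi=10.0)
```

In a real pipeline each stage would be a separate workflow step, but the structure is the same: each transformation consumes the previous stage's output, with its own parameter set.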


When the data $X_i$ is fixed, we tune the parameters to improve some "goodness" metric. Unlike standard workflows, ML training has features that make it nontrivial to express in DAG notation:

  • it has loops
  • it has a human in the loop
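The two points above can be illustrated with a minimal sketch of a hyperparameter tuning loop. All names, the toy loss, and the thresholds are hypothetical; the point is only that training re-enters itself until a goodness criterion is met, which a plain DAG cannot express.

```python
def train(X, lr):
    # Toy "training": the loss shrinks as the learning rate nears 0.1.
    return abs(lr - 0.1)

def tune(X, candidates, target_loss=0.05):
    best = None
    for lr in candidates:            # loop over hyperparameter choices
        loss = train(X, lr)          # re-run training: the loop a DAG lacks
        if best is None or loss < best[1]:
            best = (lr, loss)
        if loss <= target_loss:      # stop once "good enough"
            break
    # In practice a human may inspect `best` at this point and restart the
    # loop with new candidates: the human-in-the-loop edge.
    return best

best_lr, best_loss = tune(X=None, candidates=[1.0, 0.5, 0.12, 0.1])
```

Real tuners (grid search, Bayesian optimization, ...) follow the same shape: an outer loop around the training step, driven by a metric and, often, by a human.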

Conversely, deployment requires no tuning and is therefore more streamlined.

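The deployment path reduces to the two dashed edges in the diagram: training saves a model to the registry, deployment loads it and serves predictions, with no loop. The sketch below models the registry as a plain dict; the actual registry backend and all names here are assumptions for illustration only.

```python
registry = {}

def save_model(name, model):
    """'Saves to' edge: ML training publishes a model to the registry."""
    registry[name] = model

def deploy(name):
    """'Loads from' edge: ML deployment fetches a pre-trained model and
    wraps it in an inference function. No tuning loop is involved."""
    model = registry[name]
    def predict(x):
        return model * x             # toy inference
    return predict

save_model("demo", model=2.0)
predict = deploy("demo")
```

Because there is no feedback edge, this half of the workflow is a straight pipeline and maps cleanly onto DAG notation.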

Clone this wiki locally