Tip: Use the `ml` script of the root package for cleaner terminal output, e.g., `pnpm run ml train:gnn`.
All `.sh` scripts must be run from this directory.

```sh
pnpm run conda:load
```
Note: It is not possible to activate a Conda environment via a `package.json` script. Instead, use the following command to activate the environment. Running it is not necessary if you are using the `turbo` commands.

```sh
source scripts/conda-activate.sh
```
Note: Make sure that the `cm2ml` environment is activated.

```sh
pnpm run conda:save
```
Use the `encode:*` and `train:*` `turbo` tasks to encode and train the model.
- Create a script for the encoding in the `scripts` directory, e.g., `encode-{ENCODING}.sh`. For reduced execution time, consider using Bun. Example for encoding UML raw graphs:

  ```sh
  bun node_modules/@cm2ml/cli/bin/cm2ml.mjs batch-uml-raw-graph ../models/uml/dataset
  ```
- Create the `package.json` script.

  ```json
  {
    "scripts": {
      "encode:{ENCODING}": "source scripts/encode-{ENCODING}.sh"
    }
  }
  ```
- Create the Turbo task in `turbo.json`. It must depend on `^build` to ensure that it uses the latest version of the framework. It must also use the corresponding script as input and the generated dataset files as outputs.

  ```json
  {
    "pipeline": {
      "encode:{ENCODING}": {
        "inputs": ["scripts/encode-{ENCODING}.sh"],
        "outputs": [".input/{ENCODING}_train.json", ".input/{ENCODING}_validation.json", ".input/{ENCODING}_test.json"],
        "dependsOn": ["^build"]
      }
    }
  }
  ```
- Implement your evaluation in `./{EVALUATION}/src/{EVALUATION}.py`.
- Create a script for the evaluation in the `scripts` directory, e.g., `train-{EVALUATION}.sh`. Example:

  ```sh
  source scripts/conda-activate.sh
  python {EVALUATION}/src/{EVALUATION}.py {ENCODING}_train.json {ENCODING}_validation.json {ENCODING}_test.json
  ```
- Create the `package.json` script.

  ```json
  {
    "scripts": {
      "train:{EVALUATION}": "source scripts/train-{EVALUATION}.sh"
    }
  }
  ```
- Create the Turbo task in `turbo.json`. It must depend on the encoding task of the dataset it uses, and it must use the corresponding script, the output of the encoding task, and the Python sources as inputs.

  ```json
  {
    "pipeline": {
      "train:{EVALUATION}": {
        "inputs": ["scripts/train-{EVALUATION}.sh", ".input/{ENCODING}_train.json", ".input/{ENCODING}_validation.json", ".input/{ENCODING}_test.json", "{EVALUATION}/src/**"],
        "dependsOn": ["encode:{ENCODING}"]
      }
    }
  }
  ```
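As shown in the `train-{EVALUATION}.sh` example above, the evaluation script receives the three encoded dataset files as positional command-line arguments. A minimal sketch of how `{EVALUATION}/src/{EVALUATION}.py` might consume them (hypothetical; the actual evaluation logic and any cm2ml-specific helpers are not shown here):

```python
import json
import sys


def load_datasets(train_path, validation_path, test_path):
    """Load the encoded train/validation/test splits produced by the encode:{ENCODING} task."""
    datasets = {}
    for split, path in (
        ("train", train_path),
        ("validation", validation_path),
        ("test", test_path),
    ):
        with open(path) as file:
            datasets[split] = json.load(file)
    return datasets


if __name__ == "__main__" and len(sys.argv) == 4:
    # The turbo task invokes this script as:
    #   python {EVALUATION}/src/{EVALUATION}.py {ENCODING}_train.json {ENCODING}_validation.json {ENCODING}_test.json
    train_path, validation_path, test_path = sys.argv[1:4]
    datasets = load_datasets(train_path, validation_path, test_path)
    # ... train on datasets["train"], tune on datasets["validation"], report on datasets["test"] ...
```

Because the Turbo task lists the encoded files and `{EVALUATION}/src/**` as inputs, the training step is re-run only when the encoding or the evaluation code changes.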