Fix/mlnet rebased on master
#527
base: master
Assuming that any exception during `run_cmd` is unrecoverable.
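A minimal sketch of what treating a failed command as unrecoverable could look like. It does not reproduce the benchmark's own `run_cmd` helper, whose signature is not shown in this thread; `run_or_abort` and `UnrecoverableCommandError` are hypothetical names, and the sketch relies only on Python's standard `subprocess` module.

```python
import subprocess
import sys


class UnrecoverableCommandError(RuntimeError):
    """Hypothetical error type: the command failed and the run should stop."""


def run_or_abort(args):
    """Run a command; convert any failure into a single unrecoverable error."""
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        completed = subprocess.run(args, capture_output=True, text=True, check=True)
    except (OSError, subprocess.SubprocessError) as exc:
        # No retry and no fallback: every failure is reported the same way.
        raise UnrecoverableCommandError(f"command failed: {args!r}") from exc
    return completed.stdout


# Usage: a command that succeeds returns its standard output.
ok_output = run_or_abort([sys.executable, "-c", "print('ok')"])
```

Collapsing all failure modes into one exception type keeps the calling code simple, at the cost of not distinguishing a missing executable from a non-zero exit status.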
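@LittleLittleCloud I rebased your work on the latest `master`, which resulted in clearer error messages.
The error I got was about an incorrect invocation of `mlnet` and it not expecting the `--test-dataset` parameter. I don't understand why it complains about the `--test-dataset` argument though; from the documentation it seems like a valid argument.
However, I also have a question about the usage of `--test-dataset` here. It reads to me like `--test-dataset` will actually be used to perform model selection (e.g., train a bunch of models on `--dataset`, then select the best based on results on `--test-dataset`). Is that correct? If so, that is not in line with the benchmark design. Model selection should happen without knowledge of the test dataset. I think the correct usage is for MLNet to be invoked with `--dataset` only, and consequently do model selection based on an internally chosen validation set (e.g., k-fold cv). Then the test dataset is only provided for the `mlnet predict` command. I applied this "fix" on this branch. Please let me know if I understand the usage correctly and if this seems like the right (new) invocation.
Anyway, even after the change there still seem to be failures. On GitHub I get error code 143 (likely memory); on AWS the `mlnet predict` step fails because the model file `/tmp/tmp3_b7uplj/0/0.zip` does not exist.
ps. Is there a way to influence the inference time of the final models? E.g., a constraint or preset which favors faster models, or having inference time explicitly as a secondary objective? If we do get the opportunity to benchmark MLNet, we might be interested in evaluating the system with different trade-offs.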
Hey Pieter
As for your concern about `--test-dataset`: it is only used to evaluate the selected model after AutoML has finished, so it does not take part in model selection. If `--test-dataset` is not provided, the model is evaluated on a validation dataset that is subsampled from `--dataset`.
The error looks like the model file was not found. Is there a way for me to confirm that the model file is generated and located at `/tmp/tmp3_b7uplj/0/0.zip`?
From: Pieter ***@***.***>
Sent: Wednesday, May 31, 2023 9:40 AM
To: ***@***.***>
Cc: XiaoYun ***@***.***>; ***@***.***>
Subject: Re: [openml/automlbenchmark] Fix/mlnet rebased on `master` (PR #527)
@LittleLittleCloud<https://github.com/LittleLittleCloud> I rebased your work on the latest master, which resulted in clearer error messages.
The error I got<https://github.com/openml/automlbenchmark/actions/runs/5134583469/jobs/9238807788#step:7:211> was about an incorrect invocation of mlnet and it not expecting the --test-dataset parameter. I don't understand why it complains about the --test-dataset argument though, from the documentation<https://learn.microsoft.com/en-us/dotnet/machine-learning/reference/ml-net-cli-reference#test-dataset> it seems like a valid argument.
However, I also have a question about the usage of --test-dataset here. It reads to me like the --test-dataset will actually be used to perform model selection (e.g., train a bunch of models on --dataset, then select the best based on results on --test-dataset). Is that correct? If so, that is not in line with the benchmark design. Model selection should happen without knowledge of the test dataset. I think the correct usage is for MLNet to be invoked with --dataset only, and consequently do model selection based on an internally chosen validation set (e.g., k-fold cv). Then the test dataset is only provided for the mlnet predict command. I applied this "fix" on this branch. Please let me know if I understand the usage correctly and if this seems like the right (new) invocation.
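The invocation described above can be sketched as two argument lists. The flags are the ones that appear in the AWS log in this thread; the helper names `build_train_cmd` and `build_predict_cmd` and the example paths are hypothetical, and this is not the integration script itself.

```python
def build_train_cmd(mlnet, train_csv, label_col, output_dir, name, train_time):
    """Training sees only the training split; no --test-dataset is passed."""
    return [
        mlnet, "classification",
        "--dataset", train_csv,
        "--train-time", str(train_time),
        "--label-col", str(label_col),
        "--output", output_dir,
        "--name", name,
    ]


def build_predict_cmd(mlnet, model_zip, test_csv, label_col):
    """The test split is introduced only at prediction time."""
    return [
        mlnet, "predict",
        "--task-type", "classification",
        "--model", model_zip,
        "--dataset", test_csv,
        "--label-col", str(label_col),
    ]


# Placeholder paths, for illustration only.
train_cmd = build_train_cmd("mlnet", "train.csv", 0, "/tmp/out", "0", 600)
predict_cmd = build_predict_cmd("mlnet", "/tmp/out/0/0.zip", "test.csv", "class")
```

Keeping the two commands in separate builders makes it easy to check that the test file path never reaches the training step.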
Anyway, even after the change there still seem to be failures. On GitHub I get error code 143 (likely memory); on AWS I get the following:
[INFO] [openml.datasets.dataset:15:59:47.186] pickle write APSFailure
[INFO] [frameworks.MLNet.exec:15:59:54.601] train dataset: /s3bucket/input/org/openml/www/datasets/41138/dataset_train_0.csv
[INFO] [frameworks.MLNet.exec:15:59:54.601] test dataset: /s3bucket/input/org/openml/www/datasets/41138/dataset_test_0.csv
[INFO] [amlb.utils.process:15:59:54.601] Running cmd `/repo/frameworks/MLNet/lib/mlnet classification --dataset /s3bucket/input/org/openml/www/datasets/41138/dataset_train_0.csv --train-time 600 --label-col 0 --output /tmp/tmp3_b7uplj --name 0 --verbosity diag --log-file-path /tmp/tmp3_b7uplj/0/log.txt`
[INFO] [amlb.print:15:59:54.972] Set log level to Trace
[INFO] [amlb.print:15:59:54.972] Set log file path to /tmp/tmp3_b7uplj/0/log.txt
[INFO] [amlb.print:15:59:59.934] Start Training
[INFO] [amlb.print:16:00:00.019] start multiclass classification
[INFO] [amlb.print:16:00:00.021] Evaluate Metric: MacroAccuracy
[INFO] [amlb.print:16:00:00.021] Available Trainers: LGBM,FASTFOREST,FASTTREE,LBFGS,SDCA
[INFO] [amlb.print:16:00:00.023] Training time in second: 600
[INFO] [amlb.print:16:00:12.420] | Trainer MacroAccuracy Duration |
[INFO] [amlb.print:16:00:12.420] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:00:12.422] |0 FastTreeOva 0.6877 12.1490 |
[INFO] [amlb.print:16:00:12.430] found best trial - trial id: 0
[INFO] [amlb.print:16:00:28.029] |1 FastForestOva 0.8392 15.5940 |
[INFO] [amlb.print:16:00:28.032] found best trial - trial id: 1
[INFO] [amlb.print:16:00:58.398] |2 LbfgsMaximumEntropyMulti 0.8336 30.3630 |
...
<manually removed remainder of training output>
...
[INFO] [amlb.print:16:09:59.936] [Source=AutoMLExperiment, Kind=Info] cancel training because cancellation token is invoked...
[INFO] [amlb.print:16:09:59.941] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.943] | Experiment Results |
[INFO] [amlb.print:16:09:59.943] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.946] | Summary |
[INFO] [amlb.print:16:09:59.946] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.949] |ML Task: multiclass classification |
[INFO] [amlb.print:16:09:59.949] |Dataset: /s3bucket/input/org/openml/www/datasets/41138/dataset_train_0.csv|
[INFO] [amlb.print:16:09:59.951] |Label : class |
[INFO] [amlb.print:16:09:59.954] |Total experiment time : 599.0000 Secs |
[INFO] [amlb.print:16:09:59.954] |Total number of models explored: 19 |
[INFO] [amlb.print:16:09:59.958] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.958] | Top 5 models explored |
[INFO] [amlb.print:16:09:59.961] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.961] | Trainer MacroAccuracy Duration |
[INFO] [amlb.print:16:09:59.963] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:09:59.965] |12 FastTreeOva 0.8784 22.6530 |
[INFO] [amlb.print:16:09:59.966] |9 FastTreeOva 0.8504 14.7040 |
[INFO] [amlb.print:16:09:59.968] |5 LbfgsLogisticRegressionOva 0.8413 58.8520 |
[INFO] [amlb.print:16:09:59.968] |17 FastForestOva 0.8394 16.8820 |
[INFO] [amlb.print:16:09:59.971] |1 FastForestOva 0.8392 15.5940 |
[INFO] [amlb.print:16:09:59.971] |--------------------------------------------------------------------|
[INFO] [amlb.print:16:10:00.080] save 0.mbconfig to /tmp/tmp3_b7uplj/0
[INFO] [amlb.print:16:10:01.453] Generating a console project for the best pipeline at location : /tmp/tmp3_b7uplj/0
[INFO] [amlb.print:16:10:01.474]
[INFO] [amlb.print:16:10:01.474]
[INFO] [amlb.print:16:10:01.475]
[INFO] [amlb.utils.process:16:10:01.476] Running cmd `/repo/frameworks/MLNet/lib/mlnet predict --task-type classification --model /tmp/tmp3_b7uplj/0/0.zip --dataset /s3bucket/input/org/openml/www/datasets/41138/dataset_test_0.csv --label-col class > /tmp/tmp3_b7uplj/0/prediction.txt`
[ERROR] [amlb.utils.process:16:10:01.849] File does not exist: /tmp/tmp3_b7uplj/0/0.zip
ps. Is there a way to influence the inference time of the final models? E.g., a constraint or preset which favors faster models, or having inference time explicitly as a secondary objective? If we do get the opportunity to benchmark MLNet, we might be interested in evaluating the system with different trade-offs.
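On the exit code 143 mentioned above: by common shell convention, a status above 128 means the process was terminated by signal number `status - 128`. A small sketch for decoding such a status follows; `describe_exit_status` is a hypothetical helper, and the result only says which signal ended the process, not why it was sent.

```python
import signal


def describe_exit_status(status):
    """Decode a shell-style exit status into a signal name when applicable."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            # The number does not correspond to a signal known on this platform.
            return f"unknown signal {status - 128}"
    return f"exit code {status}"


decoded = describe_exit_status(143)
```

Under this convention 143 corresponds to SIGTERM. An out-of-memory kill on Linux more commonly shows up as 137 (SIGKILL), so memory pressure may be only one of several possible causes here.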
You might be able to test the integration script locally (I currently cannot) and poke around. From my understanding, it should also be possible to log in to the EC2 instance to verify the existence of the file, but unfortunately I'm too busy to do that myself for the next few days.
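One way to poke around, sketched under the assumption that the output layout matches the paths in the log (`<output>/<name>/<name>.zip`). `find_model_artifacts` is a hypothetical helper, not part of the benchmark or of MLNet.

```python
from pathlib import Path


def find_model_artifacts(output_dir, name):
    """Return (expected model path, whether it exists, all .zip files found)."""
    expected = Path(output_dir) / name / f"{name}.zip"
    # Search the whole output directory in case the model was saved elsewhere.
    zips = sorted(str(p) for p in Path(output_dir).rglob("*.zip"))
    return expected, expected.is_file(), zips
```

Calling `find_model_artifacts("/tmp/tmp3_b7uplj", "0")` between the training and prediction steps would show whether `0.zip` was written at all, or written under a different name or directory.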
Rebase #522 on top of master. Did not want to force push a rebase on someone else's branch so set up a new one instead.