title: Training and Cross-Validating Machine Learning Pipelines with Limited Memory
openreview: 4LkaPSHUQQ
abstract: 'While automated machine learning (AutoML) can save human labor in finding well-performing pipelines, it often suffers from two problems: overfitting and using excessive resources. Unfortunately, the solutions are often at odds: cross-validation helps reduce overfitting at the expense of more resources; conversely, preprocessing on a separate compute cluster and then cross-validating only the final predictor saves resources at the expense of more overfitting. This paper shows how to train and cross-validate entire pipelines on a single moderate machine with limited memory by using monoids, which are associative, thus providing a flexible way for handling large data one batch at a time. To facilitate AutoML, our approach is designed to support the common sklearn APIs used by many AutoML systems for pipelines, training, cross-validation, and several operators. Abstracted behind those APIs, our approach uses task graphs to extend the benefits of monoids from operators to pipelines, and provides a multi-backend implementation. Overall, our approach lets users train and cross-validate pipelines on simple and inexpensive compute infrastructure.'
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: hirzel24a
month: 0
tex_title: Training and Cross-Validating Machine Learning Pipelines with Limited Memory
firstpage: 13/1
lastpage: 25
page: 13/1-25
order: 13
cycles: false
bibtex_author: Hirzel, Martin and Kate, Kiran and Mandel, Louis and Shinnar, Avraham
author:
- given: Martin
  family: Hirzel
- given: Kiran
  family: Kate
- given: Louis
  family: Mandel
- given: Avraham
  family: Shinnar
date: 2024-10-09
address:
container-title: Proceedings of the Third International Conference on Automated Machine Learning
volume: 256
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 10
  - 9
pdf: https://raw.githubusercontent.com/mlresearch/v256/main/assets/hirzel24a/hirzel24a.pdf
extras:
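
The abstract's central idea, that an associative (monoid) combine step lets an operator be trained one batch at a time, can be illustrated with a small sketch. This is not the paper's implementation: the class `ScalerStats` and the helper `lift` are hypothetical names chosen for illustration, and the example assumes a single numeric column whose mean and standard deviation are needed, as in a standard scaler. Each batch is summarized independently, and the summaries are merged with an associative operation, so only one batch has to be in memory at a time.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass(frozen=True)
class ScalerStats:
    """Partial training state for one column (hypothetical, for illustration)."""
    n: int = 0              # number of values seen
    total: float = 0.0      # sum of values
    total_sq: float = 0.0   # sum of squared values

    def combine(self, other: "ScalerStats") -> "ScalerStats":
        # Field-wise addition is associative, and ScalerStats() is its identity,
        # so these summaries form a monoid.
        return ScalerStats(
            self.n + other.n,
            self.total + other.total,
            self.total_sq + other.total_sq,
        )

    def mean(self) -> float:
        return self.total / self.n

    def std(self) -> float:
        # Population standard deviation; clamp tiny negative rounding errors.
        variance = self.total_sq / self.n - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))


def lift(batch: list) -> ScalerStats:
    """Summarize one batch without looking at any other batch."""
    return ScalerStats(len(batch), sum(batch), sum(x * x for x in batch))


# Three batches stand in for a dataset too large to load at once.
batches = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]

# Fold the per-batch summaries together, starting from the identity element.
stats = reduce(ScalerStats.combine, map(lift, batches), ScalerStats())

# The same summary computed over all the data at once, for comparison.
full = lift([x for batch in batches for x in batch])
```

Because the combine step is associative, the summaries can be merged in any grouping, for example pairwise in a tree or incrementally as batches arrive, and the result matches what a single pass over the full data would give.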
