DataSystemsGroupUT/AutoML_Survey

Survey on End-To-End Machine Learning Automation

In this repository, we present the references cited in a comprehensive survey of state-of-the-art efforts to automate machine learning (AutoML), whether by fully automating the role of the data scientist or by providing aiding tools that minimize the role of the human in the loop. First, we focus on the Combined Algorithm Selection and Hyperparameter Tuning (CASH) problem. In addition, we highlight research on automating the other steps of the full machine learning pipeline, from data understanding to model deployment. Furthermore, we provide comprehensive coverage of the various tools and frameworks that have been introduced in this domain.

Table of Contents & Organization:

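This repository is organized into six sections:

  1. Meta-Learning Techniques for AutoML search problem
  2. Neural Architecture Search Problem
  3. Hyper-Parameter Optimization
  4. AutoML Tools and Frameworks
  5. Pre-Modeling and Post-Modeling Aiding Tools
  6. AutoML Challenges
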
Meta-Learning Techniques for AutoML search problem:

Meta-learning can be described as the process of learning from previous experience gained by applying various learning algorithms to different kinds of data, thereby reducing the time needed to learn new tasks.
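
As a rough illustration of reusing previous experience, the sketch below looks up the most similar previously seen datasets by their meta-features and suggests the configurations that worked best on them as starting points. The knowledge base, the chosen meta-features, and the helper names (`KNOWLEDGE_BASE`, `to_vector`, `warm_start`) are hypothetical and are not taken from any of the referenced papers or tools.

```python
import math

# Hypothetical knowledge base: meta-features of previously seen datasets and
# the configuration that performed best on each (illustrative values only).
KNOWLEDGE_BASE = [
    {"meta": {"n_rows": 150, "n_features": 4, "n_classes": 3},
     "best": {"algorithm": "knn", "k": 5}},
    {"meta": {"n_rows": 60000, "n_features": 784, "n_classes": 10},
     "best": {"algorithm": "random_forest", "n_trees": 200}},
    {"meta": {"n_rows": 1000, "n_features": 20, "n_classes": 2},
     "best": {"algorithm": "svm", "C": 1.0}},
]

META_KEYS = ("n_rows", "n_features", "n_classes")


def to_vector(meta):
    """Log-scale the meta-features so that large counts do not dominate."""
    return [math.log10(meta[key]) for key in META_KEYS]


def warm_start(new_meta, knowledge_base, top_k=2):
    """Return the best-known configurations of the top_k most similar datasets."""
    target = to_vector(new_meta)
    ranked = sorted(
        knowledge_base,
        key=lambda entry: math.dist(target, to_vector(entry["meta"])),
    )
    return [entry["best"] for entry in ranked[:top_k]]


suggestions = warm_start(
    {"n_rows": 200, "n_features": 5, "n_classes": 3}, KNOWLEDGE_BASE
)
print(suggestions)
```

In a real system, the suggested configurations would typically seed a search procedure rather than be used as final answers.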

  • 2018 | Meta-Learning: A Survey. | Vanschoren | CoRR | PDF
  • 2008 | Metalearning: Applications to data mining | Brazdil et al. | Springer Science & Business Media | PDF

Learning From Model Evaluation

  • Surrogate Models

    • 2018 | Scalable Gaussian process-based transfer surrogates for hyperparameter optimization. | Wistuba et al. | Journal of ML | PDF
  • Warm-Started Multi-task Learning

    • 2017 | Multiple adaptive Bayesian linear regression for scalable Bayesian optimization with warm start. | Perrone et al. | PDF
  • Relative Landmarks

    • 2001 | An evaluation of landmarking variants. | Furnkranz and Petrak | ECML/PKDD | PDF

Learning From Task Properties

  • Using Meta-Features

    • 2019 | SmartML: A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Machine Learning Algorithms. | Maher and Sakr | EDBT | PDF
    • 2017 | On the predictive power of meta-features in OpenML. | Bilalli et al. | IJAMC | PDF
    • 2013 | Collaborative hyperparameter tuning. | Bardenet et al. | ICML | PDF
  • Using Meta-Models

    • 2018 | Predicting hyperparameters from meta-features in binary classification problems. | Nisioti et al. | ICML | PDF
    • 2014 | Automatic classifier selection for non-experts. Pattern Analysis and Applications. | Reif et al. | PDF
    • 2012 | Imagenet classification with deep convolutional neural networks. | Krizhevsky et al. | NIPS | PDF
    • 2008 | Predicting the performance of learning algorithms using support vector machines as meta-regressors. | Guerra et al. | ICANN | PDF
    • 2008 | Metalearning-a tutorial. | Giraud-Carrier | ICMLA | PDF
    • 2004 | Metalearning: Applications to data mining. | Soares et al. | Springer Science & Business Media | PDF
    • 2004 | Selection of time series forecasting models based on performance information. | dos Santos et al. | HIS | PDF
    • 2003 | Ranking learning algorithms: Using IBL and meta-learning on accuracy and time results. | Brazdil et al. | Journal of ML | PDF
    • 2002 | Combination of task description strategies and case base properties for meta-learning. | Kopf and Iglezakis | PDF

Learning From Prior Models

  • Transfer Learning

    • 2014 | How transferable are features in deep neural networks? | Yosinski et al. | NIPS | PDF
    • 2014 | CNN features off-the-shelf: an astounding baseline for recognition. | Sharif Razavian et al. | IEEE CVPR | PDF
    • 2014 | Decaf: A deep convolutional activation feature for generic visual recognition. | Donahue et al. | ICML | PDF
    • 2012 | Imagenet classification with deep convolutional neural networks. | Krizhevsky et al. | NIPS | PDF
    • 2012 | Deep learning of representations for unsupervised and transfer learning. | Bengio | ICML | PDF
    • 2010 | A survey on transfer learning. | Pan and Yang | IEEE TKDE | PDF
    • 1995 | Learning many related tasks at the same time with backpropagation. | Caruana | NIPS | PDF
    • 1995 | Learning internal representations. | Baxter | PDF
  • Few-Shot Learning

    • 2017 | Prototypical networks for few-shot learning. | Snell et al. | NIPS | PDF
    • 2017 | Meta-Learning: A Survey. | Vanschoren | CoRR | PDF
    • 2016 | Optimization as a model for few-shot learning. | Ravi and Larochelle | PDF

Neural Architecture Search Problem

Neural Architecture Search (NAS) is a fundamental step in automating the machine learning process and has been successfully used to design the model architecture for image and language tasks.
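
To make the idea of searching over architectures concrete, here is a minimal sketch of an aging-evolution loop in the spirit of the evolutionary methods listed in this section. It is only a toy under stated assumptions: an architecture is encoded as a list of layer widths, and `proxy_fitness` is a synthetic placeholder standing in for training a network and measuring its validation accuracy. All names and constants are hypothetical.

```python
import collections
import random

UNIT_CHOICES = [16, 32, 64, 128]  # hypothetical layer widths
MAX_LAYERS = 4


def random_architecture(rng):
    """An architecture is a list of layer widths (a deliberately tiny encoding)."""
    return [rng.choice(UNIT_CHOICES) for _ in range(rng.randint(1, MAX_LAYERS))]


def mutate(architecture, rng):
    """Return a copy with one random edit: resize, add, or remove a layer."""
    child = list(architecture)
    operations = ["resize"]
    if len(child) < MAX_LAYERS:
        operations.append("add")
    if len(child) > 1:
        operations.append("remove")
    operation = rng.choice(operations)
    if operation == "resize":
        child[rng.randrange(len(child))] = rng.choice(UNIT_CHOICES)
    elif operation == "add":
        child.insert(rng.randint(0, len(child)), rng.choice(UNIT_CHOICES))
    else:
        del child[rng.randrange(len(child))]
    return child


def proxy_fitness(architecture):
    """Synthetic placeholder for 'train the network, return validation accuracy'.

    It simply rewards a total width close to 160 units.
    """
    return -abs(sum(architecture) - 160)


def aging_evolution(cycles=200, population_size=20, sample_size=5, seed=0):
    rng = random.Random(seed)
    population = collections.deque()
    history = []
    # Start from a random population.
    while len(population) < population_size:
        arch = random_architecture(rng)
        individual = (proxy_fitness(arch), arch)
        population.append(individual)
        history.append(individual)
    # Each cycle: tournament selection, mutation, and removal of the oldest.
    while len(history) < cycles:
        sample = [rng.choice(population) for _ in range(sample_size)]
        parent = max(sample, key=lambda individual: individual[0])
        child_arch = mutate(parent[1], rng)
        child = (proxy_fitness(child_arch), child_arch)
        population.append(child)
        history.append(child)
        population.popleft()  # drop the oldest individual, not the worst
    return max(history, key=lambda individual: individual[0])


best_fitness, best_architecture = aging_evolution()
print(best_fitness, best_architecture)
```

Removing the oldest individual instead of the worst one is the design choice that distinguishes aging evolution from a plain tournament scheme.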

  • 2018 | Progressive neural architecture search. | Liu et al. | ECCV | PDF
  • 2018 | Efficient architecture search by network transformation. | Cai et al. | AAAI | PDF
  • 2018 | Learning transferable architectures for scalable image recognition. | Zoph et al. | IEEE CVPR | PDF
  • 2017 | Hierarchical representations for efficient architecture search. | Liu et al. | PDF
  • 2016 | Neural architecture search with reinforcement learning. | Zoph and Le | PDF
  • 2009 | Learning deep architectures for AI. | Bengio et al. | PDF
  • Random Search

    • 2019 | Random Search and Reproducibility for Neural Architecture Search. | Li and Talwalkar | PDF
    • 2017 | Train Longer, Generalize Better: Closing the Generalization Gap in Large Batch Training of Neural Networks. | Hoffer et al. | NIPS | PDF
  • Reinforcement Learning

    • 2019 | Neural architecture search with reinforcement learning. | Zoph and Le | PDF
    • 2019 | Designing neural network architectures using reinforcement learning. | Baker et al. | PDF
  • Evolutionary Methods

    • 2019 | Evolutionary Neural AutoML for Deep Learning. | Liang et al. | PDF
    • 2019 | Evolving deep neural networks. | Miikkulainen et al. | PDF
    • 2018 | NSGA-Net: A multi-objective genetic algorithm for neural architecture search. | Lu et al. | PDF
    • 2018 | Efficient multi-objective neural architecture search via lamarckian evolution. | Elsken et al. | PDF
    • 2018 | Regularized evolution for image classifier architecture search. | Real et al. | PDF
    • 2017 | Large-scale evolution of image classifiers | Real et al. | ICML | PDF
    • 2017 | Hierarchical representations for efficient architecture search. | Liu et al. | PDF
    • 2009 | A hypercube-based encoding for evolving large-scale neural networks. | Stanley et al. | Artificial Life | PDF
    • 2002 | Evolving neural networks through augmenting topologies. | Stanley and Miikkulainen | Evolutionary Computation | PDF
  • Gradient Based Methods

    • 2018 | Differentiable neural network architecture search. | Shin et al. | PDF
    • 2018 | Darts: Differentiable architecture search. | Liu et al. | PDF
    • 2018 | MaskConnect: Connectivity Learning by Gradient Descent. | Ahmed and Torresani | PDF
  • Bayesian Optimization

    • 2018 | Towards reproducible neural architecture and hyperparameter search. | Klein et al. | PDF
    • 2018 | Neural Architecture Search with Bayesian Optimisation and Optimal Transport | Kandasamy et al. | NIPS | PDF
    • 2016 | Towards automatically-tuned neural networks. | Mendoza et al. | PMLR | PDF
    • 2015 | Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. | Domhan et al. | IJCAI | PDF
    • 2014 | Raiders of the lost architecture: Kernels for Bayesian optimization in conditional parameter spaces. | Swersky et al. | PDF
    • 2013 | Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. | Bergstra et al. | PDF Github (Hyperopt)
    • 2011 | Algorithms for hyper-parameter optimization. | Bergstra et al. | NIPS | PDF

Hyper-Parameter Optimization

After choosing the model pipeline algorithm(s) with the highest potential for achieving top performance on the input dataset, the next step is to tune the hyper-parameters of that model in order to further optimize its performance. It is worth mentioning that some tools represent the space of learning algorithms as a discrete set of model pipelines. The model selection itself can therefore be treated as a categorical parameter that is tuned first, before the hyper-parameters of the selected model are modified.
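
The idea of treating the algorithm as a categorical parameter can be sketched as a joint search space in which each algorithm carries its own conditional hyper-parameters. The example below uses plain random search as the optimizer. The algorithms, ranges, and the `validation_loss` function are hypothetical placeholders; in practice the loss would come from fitting and validating a real pipeline.

```python
import math
import random

# Joint search space: the algorithm is the top-level categorical choice, and
# each algorithm brings its own conditional hyper-parameters (made-up ranges).
SEARCH_SPACE = {
    "knn": {"k": lambda rng: rng.randint(1, 30)},
    "svm": {"C": lambda rng: 10 ** rng.uniform(-3, 3)},
    "random_forest": {
        "n_trees": lambda rng: rng.randint(10, 500),
        "max_depth": lambda rng: rng.randint(2, 20),
    },
}


def sample_configuration(rng):
    """Draw the algorithm first, then only the hyper-parameters it defines."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {name: draw(rng) for name, draw in SEARCH_SPACE[algorithm].items()}
    return {"algorithm": algorithm, **params}


def validation_loss(config):
    """Synthetic placeholder for 'fit the pipeline, measure validation loss'."""
    if config["algorithm"] == "knn":
        return 0.20 + abs(config["k"] - 7) / 100
    if config["algorithm"] == "svm":
        return 0.15 + abs(math.log10(config["C"])) / 10
    return 0.10 + abs(config["max_depth"] - 10) / 100 + 5 / config["n_trees"]


def random_search(n_trials=100, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    trials = [sample_configuration(rng) for _ in range(n_trials)]
    return min(trials, key=validation_loss)


best = random_search()
print(best, validation_loss(best))
```

Any of the optimizers listed in this section could replace the random sampling step, as long as it can handle the conditional structure of the space.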

Black Box Optimization

  • Grid and Random Search

    • 2017 | Design and analysis of experiments. | Montgomery | PDF
    • 2015 | Adaptive control processes: a guided tour. | Bellman | PDF
    • 2012 | Random search for hyper-parameter optimization. | Bergstra and Bengio | JMLR | PDF
  • Bayesian Optimization

    • 2018 | Bohb: Robust and efficient hyperparameter optimization at scale. | Falkner et al. | JMLR | PDF
    • 2017 | On the state of the art of evaluation in neural language models. | Melis et al. | PDF
    • 2015 | Automating model search for large scale machine learning. | Sparks et al. | ACM-SCC | PDF
    • 2015 | Scalable bayesian optimization using deep neural networks. | Snoek et al. | ICML | PDF
    • 2014 | Bayesopt: A bayesian optimization library for nonlinear optimization, experimental design and bandits. | Martinez-Cantin | JMLR | PDF
    • 2013 | Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. | Bergstra et al. | PDF
    • 2013 | Towards an empirical foundation for assessing bayesian optimization of hyperparameters. | Eggensperger et al. | NIPS | PDF
    • 2013 | Improving deep neural networks for LVCSR using rectified linear units and dropout. | Dahl et al. | IEEE-ICASSP | PDF
    • 2012 | Practical bayesian optimization of machine learning algorithms. | Snoek et al. | NIPS | PDF Github (Spearmint)
    • 2011 | Sequential model-based optimization for general algorithm configuration. | Hutter et al. | LION | PDF Github
    • 2011 | Algorithms for hyper-parameter optimization. | Bergstra et al. | NIPS | PDF
    • 1998 | Efficient global optimization of expensive black-box functions. | Jones et al. | PDF
    • 1978 | The application of Bayesian methods for seeking the extremum. | Mockus et al. | PDF
    • 1975 | Single-step Bayesian search method for an extremum of functions of a single variable. | Zhilinskas | PDF
    • 1964 | A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. | Kushner | PDF
  • Simulated Annealing

    • 1983 | Optimization by simulated annealing. | Kirkpatrick et al. | Science | PDF
  • Genetic Algorithms

    • 1992 | Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. | Holland et al. | PDF

Multi-Fidelity Optimization

  • 2019 | Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning. | Wu et al. | PDF
  • 2019 | Multi-Fidelity Automatic Hyper-Parameter Tuning via Transfer Series Expansion. | Hu et al. | PDF
  • 2016 | Review of multi-fidelity models. | Fernandez-Godino | PDF
  • 2012 | Provably convergent multifidelity optimization algorithm not requiring high-fidelity derivatives. | March and Willcox | AIAA | PDF
  • Modeling Learning Curve

    • 2017 | Learning curve prediction with Bayesian neural networks. | Klein et al. | ICLR | PDF
    • 2015 | Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. | Domhan et al. | IJCAI | PDF
    • 1998 | Efficient global optimization of expensive black-box functions. | Jones et al. | JGO | PDF
  • Bandit Based

    • 2018 | Massively parallel hyperparameter tuning. | Li et al. | AISTATS | PDF
    • 2016 | Non-stochastic Best Arm Identification and Hyperparameter Optimization. | Jamieson and Talwalkar | AISTATS | PDF
    • 2016 | Hyperband: A novel bandit-based approach to hyperparameter optimization. | Li et al. | JMLR | PDF Github Github (Distributed Hyperband - BOHB)
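
The bandit-based entries above build on successive halving, which Hyperband runs over several brackets. A minimal sketch of one bracket follows. It assumes a synthetic `loss_after` function in place of real partial training, and the parameter names (`n_configs`, `min_budget`, `eta`) are illustrative rather than taken from any particular implementation.

```python
import math
import random


def sample_config(rng):
    """Draw a learning rate log-uniformly from [1e-4, 1]."""
    return {"learning_rate": 10 ** rng.uniform(-4, 0)}


def loss_after(config, budget):
    """Synthetic placeholder for 'train for `budget` epochs, return the loss'.

    The toy optimum sits at a learning rate of 1e-2, and a larger budget
    lowers the loss for every configuration.
    """
    distance = abs(math.log10(config["learning_rate"]) + 2)
    return distance + 1.0 / budget


def successive_halving(n_configs=27, min_budget=1, eta=3, seed=0):
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_configs)]
    budget = min_budget
    schedule = []  # (number of configurations, budget per configuration)
    while len(configs) > 1:
        schedule.append((len(configs), budget))
        # Evaluate all survivors at the current budget, keep the best 1/eta.
        ranked = sorted(configs, key=lambda config: loss_after(config, budget))
        configs = ranked[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0], schedule


winner, schedule = successive_halving()
print(winner, schedule)
```

In this toy setup the ranking of configurations does not change with the budget, so early stopping never discards the eventual winner. With real learning curves that can cross, it may.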

AutoML Tools and Frameworks

  • Centralized Frameworks

| Name | Date | Language | Training Framework | Optimization Method | ML Tasks | Meta-Learning / UI / Open Source |
|---|---|---|---|---|---|---|
| AutoWeka | 2013 | Java | Weka | Bayesian Optimization | Single-label classification, regression | × Github 'Tool' |
| HyperOpt-Sklearn | 2014 | Python | Scikit-Learn | Bayesian Optimization, Simulated Annealing, and Random Search | Single-label classification, regression | × × Github |
| AutoSklearn | 2015 | Python | Scikit-Learn | Bayesian Optimization | Single-label classification, regression | × Github 'Tool' |
| TPOT | 2016 | Python | Scikit-Learn | Genetic Algorithm | Single-label classification, regression | × × Github |
| Recipe | 2017 | Python | Scikit-Learn | Grammar-Based Genetic Algorithm | Single-label classification | × Github |
| Auto-Meka | 2018 | Java | Meka | Grammar-Based Genetic Algorithm | Multi-label classification | × Github |
| ML-Plan | 2018 | Java | Weka / Scikit-Learn | Hierarchical Task Planning | Single-label classification | × × Github |
| AutoStacker | 2018 | - | - | Genetic Algorithm | Single-label classification | × × × |
| PMF | 2018 | Python | Scikit-Learn | Collaborative Filtering and Bayesian Optimization | Single-label classification | × Github |
| AlphaD3M | 2018 | - | - | Reinforcement Learning | Single-label classification, regression | × × |
| SmartML | 2019 | R | Different R Packages | Bayesian Optimization | Single-label classification | Github |
| VDS | 2019 | - | - | Cost-Based Multi-Armed Bandits and Bayesian Optimization | Single-label classification, regression, image classification, audio classification, graph matching | × |
| OBOE | 2019 | Python | Scikit-Learn | Collaborative Filtering | Single-label classification | × Github |
| Auptimizer | 2019 | | | Random, Grid, Hyperband, Hyperopt, Spearmint | Single-label classification | × × Github |
| iSmartML | 2019 | Python | Scikit-Learn | Bayesian Optimization | Single-label classification, regression | Github 'Tool' |
  • Distributed Frameworks

| Name | Date | Language | Training Framework | Optimization Method | Meta-Learning / UI / Open Source / PDF |
|---|---|---|---|---|---|
| MLBase | 2013 | Scala | Spark MLlib | Cost-Based Multi-Armed Bandits | × × × Website PDF |
| ATM | 2017 | Python | Scikit-Learn | Hybrid Bayesian and Multi-Armed Bandits Optimization | × Github PDF |
| MLBox | 2017 | Python | Scikit-Learn, Keras | Distributed Random Search and Tree-Parzen Estimators | × × Github × |
| Rafiki | 2018 | Python | Scikit-Learn, TensorFlow | Distributed Random Search, Bayesian Optimization | × Github PDF |
| TransmogrifAI | 2018 | Scala | SparkML | Bayesian Optimization and Random Search | × × Github Website × |
| ATMSeer | 2019 | Python | Scikit-Learn On Top Of ATM | Hybrid Bayesian and Multi-Armed Bandits Optimization | Github PDF |
| D-SmartML | 2019 | Scala | Spark MLlib | Grid Search, Random Search, Hyperband | × Github × |
| Databricks | 2019 | Python | Spark MLlib | Hyperopt | × × Website × |
  • Cloud-Based Frameworks

    • Google AutoML | URL
    • Azure AutoML | URL
    • Amazon SageMaker | URL
  • NAS Frameworks

| Name | Date | Supported Architectures | Optimization Method | Supported Frameworks | UI | Open Source | PDF |
|---|---|---|---|---|---|---|---|
| AutoNet | 2016 | FCN | SMAC | PyTorch | × | Github | PDF |
| Auto-Keras | 2018 | No Restrictions | Network Morphism | Keras | | Github | PDF |
| enas | 2018 | CNN, RNN | Reinforcement Learning | TensorFlow | × | Github | PDF |
| NAO | 2018 | CNN, RNN | Gradient based optimization | TensorFlow, PyTorch | × | Github | PDF |
| DARTS | 2019 | No Restrictions | Gradient based optimization | PyTorch | × | Github | PDF |
| NNI | 2019 | No Restrictions | Random and GridSearch, Different Bayesian Optimizations, Annealing, Network Morphism, Hyper-Band, Naive Evolution | PyTorch, TensorFlow, Keras, Caffe2, CNTK, Chainer, Theano | | Github | × |

Pre-Modeling and Post-Modeling Aiding Tools

While current AutoML tools and frameworks have minimized the role of the data scientist in the modeling part and saved much effort, there are still several aspects that need human intervention and interpretability in order to make the correct decisions that can enhance and affect the modeling steps. These aspects belong to two main building blocks of the machine learning production pipeline: Pre-Modeling and Post-Modeling.

The aspects of these two building blocks can help cover what is missing in current AutoML tools, and help data scientists do their job in a much easier, more organized, and more informative way.

Pre-Modeling

  • Data Understanding

    • Sanity Checking
      • 2017 | Controlling False Discoveries During Interactive Data Exploration. | Zhao et al. | SIGMOD | PDF
      • 2016 | Data Exploration with Zenvisage: An Expressive and Interactive Visual Analytics System. | Siddiqui et al. | VLDB | PDF | TOOL
      • 2015 | SEEDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics. | Vartak et al. | PVLDB | PDF | TOOL
    • Feature Based Analysis
      • 2016 | Visual Exploration of Machine Learning Results Using Data Cube Analysis. | Kahng et al. | HILDA | PDF
      • 2015 | Smart Drill-down: A New Data Exploration Operator. | Joglekar et al. | VLDB | PDF
    • Data Life-Cycle Analysis
      • 2017 | Ground: A Data Context Service | Hellerstein et al. | CIDR | PDF | URL
      • 2016 | ProvDB: A System for Lifecycle Management of Collaborative Analysis Workflows. | Miao et al. | CoRR | PDF | Github
      • 2016 | Goods: Organizing Google’s Datasets. | Halevy et al. | SIGMOD | PDF
  • Data Validation

    • Automatic Correction
      • 2009 | On Approximating Optimum Repairs for Functional Dependency Violations. | Kolahi and Lakshmanan | ICDT | PDF
      • 2005 | A Cost-based Model and Effective Heuristic for Repairing Constraints by Value Modification. | Bohannon et al. | SIGMOD | PDF
    • Automatic Alerting
      • 2017 | MacroBase: Prioritizing Attention in Fast Data. | Bailis et al. | SIGMOD | PDF | Github
      • 2015 | Data X-Ray: A Diagnostic Tool for Data Errors. | Wang et al. | SIGMOD | PDF
  • Data Preparation

    • Feature Addition
      • 2018 | Google Search Engine for Datasets | URL
      • 2014 | DataHub: Collaborative Data Science & Dataset Version Management at Scale. | Bhardwaj et al. | CoRR | PDF | URL
      • 2013 | OpenML: Networked Science in Machine Learning. | Vanschoren et al. | SIGKDD | PDF | URL
      • 2007 | UCI: Machine Learning Repository. | Dua, D. and Graff, C. | URL
    • Feature Synthesis
      • 2015 | Deep feature synthesis: Towards automating data science endeavors. | Kanter and Veeramachaneni | DSAA | PDF | Github

Post-Modeling

  • 2019 | Model Chimp | URL
  • 2018 | ML-Flow | URL
  • 2017 | Datmo | URL

AutoML Challenges

  • 2019 | Third AutoML Challenge | URL
  • 2018 | Second AutoML Challenge | URL
  • 2017 | First AutoML Challenge | URL

Contribute:

To contribute additional references to this repository, follow these steps:

  1. Create a branch in git and make your changes.
  2. Push the branch to GitHub and open a pull request (PR).
  3. Discuss the pull request.
  4. We will review the request and merge it into the repository.

Citation:

For more details, please refer to our Survey Paper PDF

Radwa El-Shawi, Mohamed Maher, Sherif Sakr. Automated Machine Learning: State-of-The-Art and Open Challenges (2019).
