diff --git a/manuscript/01.3-ml-definitions.Rmd b/manuscript/01.3-ml-definitions.Rmd
index 37f5f7c9..7afc8a13 100644
--- a/manuscript/01.3-ml-definitions.Rmd
+++ b/manuscript/01.3-ml-definitions.Rmd
@@ -16,25 +16,26 @@ The algorithm is guided by a score or loss function that is minimized.
 In the house value example, the machine minimizes the difference between the estimated house price and the predicted price.
 A fully trained machine learning model can then be used to make predictions for new instances.
-Estimation of house prices, product recommendations, street sign detection, credit default prediction and fraud detection:
-All these examples have in common that they can be solved by machine learning.
+Estimation of house prices, product recommendations, street sign detection, credit default prediction and fraud detection,
+all these are problems that can be solved by machine learning.
+
 The tasks are different, but the approach is the same:
 Step 1: Data collection.
 The more, the better.
 The data must contain the outcome you want to predict and additional information from which to make the prediction.
 For a street sign detector ("Is there a street sign in the image?"), you would collect street images and label whether a street sign is visible or not.
-For a credit default predictor, you need past data on actual loans, information on whether the customers were in default with their loans, and data that will help you make predictions, such as income, past credit defaults, and so on.
+For a credit default predictor, you need past data on actual loans, information on defaulting and non-defaulting customers that will help you make predictions, such as income, past credit defaults, credit scores and so on.
 For an automatic house value estimator program, you could collect data from past house sales and information about the real estate such as size, location, and so on.
-Step 2: Enter this information into a machine learning algorithm that generates a sign detector model, a credit rating model or a house value estimator.
+Step 2: Enter this information into a machine learning algorithm that generates a street sign detector model, a credit rating model or a house value estimator.
 Step 3: Use model with new data.
 Integrate the model into a product or process, such as a self-driving car, a credit application process or a real estate marketplace website.
 Machines surpass humans in many tasks, such as playing chess (or more recently Go) or predicting the weather.
-Even if the machine is as good as a human or a bit worse at a task, there remain great advantages in terms of speed, reproducibility and scaling.
-A once implemented machine learning model can complete a task much faster than humans, reliably delivers consistent results and can be copied infinitely.
+Even if the machine is as good as a human or slightly worse, there remain great advantages in terms of speed, reproducibility and scaling.
+Once implemented, a machine learning model can complete a task much faster than humans, reliably deliver consistent results and be copied infinitely.
 Replicating a machine learning model on another machine is fast and cheap.
-The training of a human for a task can take decades (especially when they are young) and is very costly.
-A major disadvantage of using machine learning is that insights about the data and the task the machine solves is hidden in increasingly complex models.
+Training a human for the same task could take decades (especially when they are young) and could be very costly.
+A major disadvantage of using machine learning is that insights about the data and the task that the machine solves is hidden in increasingly complex models.
 You need millions of numbers to describe a deep neural network, and there is no way to understand the model in its entirety.
 Other models, such as the random forest, consist of hundreds of decision trees that "vote" for predictions.
 To understand how the decision was made, you would have to look into the votes and structures of each of the hundreds of trees.
@@ -44,7 +45,7 @@ If you focus only on performance, you will automatically get more and more opaqu
-The winning models on machine learning competitions are often ensembles of models or very complex models such as boosted trees or deep neural networks.
+Competition winning machine learning models are often ensembles of models or very complex models such as boosted trees or deep neural networks.
@@ -57,8 +58,8 @@ An algorithm can be considered as a recipe that defines the inputs, the output a
 Cooking recipes are algorithms where the ingredients are the inputs, the cooked food is the output, and the preparation and cooking steps are the algorithm instructions.
-**Machine Learning** is a set of methods that allow computers to learn from data to make and improve predictions (for example cancer, weekly sales, credit default).
-Machine learning is a paradigm shift from "normal programming" where all instructions must be explicitly given to the computer to "indirect programming" that takes place through providing data.
+**Machine Learning** is a set of methods that allow computers to learn from data to make and improve predictions (for example predictions related to: cancer, weekly sales, credit defaults).
+Machine learning is a paradigm shift from "normal programming" where all instructions must be explicitly given to the computer. Rather, Machine Learning "indirectly programs" a system by providing data.
 ```{r programing-vs-ml, echo = FALSE, fig.cap = ""}
 knitr::include_graphics("images/programing-ml.png")
@@ -72,6 +73,7 @@ A **Machine Learning Model** is the learned program that maps inputs to predicti
 This can be a set of weights for a linear model or for a neural network.
 Other names for the rather unspecific word "model" are "predictor" or - depending on the task - "classifier" or "regression model".
 In formulas, the trained machine learning model is called $\hat{f}$ or $\hat{f}(x)$.
+
 ```{r learner-definition, fig.cap = "A learner learns a model from labeled training data. The model is used to make predictions.", echo = FALSE}
 knitr::include_graphics("images/learner.png")
@@ -79,8 +81,8 @@ knitr::include_graphics("images/learner.png")
 A **Black Box Model** is a system that does not reveal its internal mechanisms.
-In machine learning, "black box" describes models that cannot be understood by looking at their parameters (e.g. a neural network).
-The opposite of a black box is sometimes referred to as **White Box**, and is referred to in this book as [interpretable model](#simple).
+In machine learning, the term "black box" describes models that cannot be understood by looking at their parameters (e.g. a neural network).
+The opposite of a black box is sometimes referred to as **White Box**, and is called an [interpretable model](#simple) in this book.
 [Model-agnostic methods](#agnostic) for interpretability treat machine learning models as black boxes, even if they are not.
 ```{r black-box, echo = FALSE, fig.cap = ""}
@@ -111,9 +113,9 @@ The **Target** is the information the machine learns to predict.
 In mathematical formulas, the target is usually called $y$ or $y_i$ for a single instance.
 A **Machine Learning Task** is the combination of a dataset with features and a target.
-Depending on the type of the target, the task can be for example classification, regression, survival analysis, clustering, or outlier detection.
+Depending on the type of the target, the task could be classification, regression, survival analysis, clustering, or outlier detection.
-The **Prediction** is what the machine learning model "guesses" what the target value should be based on the given features.
+The **Prediction** is the target value that the machine learning model "guesses" based on the given features.
 In this book, the model prediction is denoted by $\hat{f}(x^{(i)})$ or $\hat{y}$.