diff --git a/02-intro_R.qmd b/02-intro_R.qmd
new file mode 100644
index 0000000..dc5b642
--- /dev/null
+++ b/02-intro_R.qmd
@@ -0,0 +1,97 @@
+# Introduction to R
+
+
+```{r, echo = F}
+knitr::include_graphics("img/abacus.png")
+```
+
+
+What you'll have learned by the end of the chapter: reading and writing data,
+and exploring (and optionally visualising) it.
+
+## Reading in data with R
+
+Your first job is to get the following datasets into an R session.
+
+First install the `{rio}` package (if you don't have it already), then download
+the following datasets:
+
+- [mtcars.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/mtcars.csv)
+- [mtcars.dta](https://github.com/b-rodrigues/modern_R/raw/master/datasets/mtcars.dta)
+- [mtcars.sas7bdat](https://github.com/b-rodrigues/modern_R/raw/master/datasets/mtcars.sas7bdat)
+- [multi.xlsx](https://github.com/b-rodrigues/modern_R/raw/master/datasets/multi.xlsx)
+
+Also download the following four `csv` files and put them in a directory called
+`unemployment`:
+
+- [unemp_2013.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/unemployment/unemp_2013.csv)
+- [unemp_2014.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/unemployment/unemp_2014.csv)
+- [unemp_2015.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/unemployment/unemp_2015.csv)
+- [unemp_2016.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/unemployment/unemp_2016.csv)
+
+Finally, download this one as well, but put it in a folder called `problem`:
+
+- [mtcars.csv](https://raw.githubusercontent.com/b-rodrigues/modern_R/master/datasets/problems/mtcars.csv)
+
+Then take a look at chapter 3 of my other book, [Modern R with the
+{tidyverse}](https://b-rodrigues.github.io/modern_R/reading-and-writing-data.html),
+and follow along. This will teach you how to import and export data.
+
+`{rio}` is a wrapper around many other packages. You can keep using
+`{rio}`, but it is also a good idea to know which packages `{rio}` uses under
+the hood. For this, you can take a look at this
+[vignette](https://cran.r-project.org/web/packages/rio/vignettes/rio.html).
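As a rough illustration of the import/export workflow described above, here is a minimal round-trip sketch. It assumes `{rio}` is installed, uses R's built-in `mtcars` data frame instead of the downloaded files, and writes to a temporary file; the object names `path` and `cars_back` are made up for this example.

```r
# Assumption: {rio} is installed, e.g. via install.packages("rio")
library(rio)

# Hypothetical temporary file; the extension tells {rio} which format to use
path <- tempfile(fileext = ".csv")

# Write the built-in mtcars data frame to disk
# (row names, i.e. the car names, are not written to the csv file)
export(mtcars, path)

# Read it back; import() also picks the reader from the file extension
cars_back <- import(path)

# Inspect the structure of what was read
str(cars_back)
```

Changing the file extension (for instance to `.dta` or `.xlsx`) should be enough to switch formats, provided the package that `{rio}` relies on for that format is installed.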
+
+If you need to import very large datasets (potentially several GB), you might
+want to look at packages like `{vroom}` ([this
+benchmark](https://vroom.r-lib.org/articles/benchmarks.html#reading-delimited-files)
+shows a 1.5 GB csv file getting imported in seconds by `{vroom}`). For even larger
+files, take a look at [`{arrow}`](https://arrow.apache.org/docs/r/). This
+package is able to efficiently read very large files (`csv`, `json`, `parquet`
+and `feather` formats).
+
+## A little aside on pipes
+
+Since R version 4.1, a forward pipe `|>` is built into the language itself.
+It allows you to do this:
+
+```{r}
+
+4 |>
+  sqrt()
+
+```
+
+Before R version 4.1, there was already a forward pipe, `%>%`, introduced with the
+`{magrittr}` package (and re-exported by many other packages from the
+*tidyverse*, like `{dplyr}`):
+
+```{r}
+library(dplyr)
+
+4 %>%
+  sqrt()
+
+```
+
+Both expressions above are equivalent to `sqrt(4)`. You will see why this is
+useful very soon. For now, just know that it exists and try to get used to it.
+
+## Exploring and cleaning data with R
+
+Take a look at [chapter
+4](https://b-rodrigues.github.io/modern_R/descriptive-statistics-and-data-manipulation.html#a-first-taste-of-data-manipulation-with-dplyr)
+of my other book. Ideally you should study the entirety of the chapter, but for
+our purposes you should really focus on sections 4.3, 4.4, 4.5.3, 4.5.4, 4.8
+and, optionally, 4.7.
+
+
+## Data visualization
+
+We're not going to focus on visualization due to lack of time. If you need to
+create graphs, read [chapter
+5](https://b-rodrigues.github.io/modern_R/graphs.html).
+
+## Further reading
+
+[R for Data Science](https://r4ds.had.co.nz/)
diff --git a/_quarto.yml b/_quarto.yml
index 4f98dfb..0050a03 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -13,6 +13,7 @@ book:
   downloads: [pdf, epub]
   chapters:
     - index.qmd
+    - 02-intro_R.qmd
   page-navigation: true
 
 bibliography: references.bib