---
title: "Week 2 - Loading data"
author: "Nicholas Good ([email protected])"
output:
  rmarkdown::html_document:
    toc: true
    toc_float: true
    theme: yeti
---
```{r global_options, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE, results = 'hide')
```
---
# Set-up
* Let's organize our R session:
```{r}
# 1. create a new R notebook for this exercise
# 2. make sure you have this week's data folder downloaded from Google Drive
# 3. check your working directory matches the project directory
```
---
# Text files
R includes numerous `read.*` functions (such as `read.csv` and `read.table`) that load common file formats. The `readr` package provides its own set of functions for the same job.

---
# Useful functions
Try running these functions. Is there anything you need to change?
```{r}
getwd()
list.files(path = "../data/w2_data", full.names = TRUE)
?read.csv
```
---
## Loading a .csv
```{r}
# store the file path in an object
path <- "../data/w2_data/R for fire data analysis tutorials - Form Responses.csv"
# use the read.csv function to load the file
# what is each function argument doing?
responses <- read.csv(file = path, header = TRUE, sep = ",")
# explore the data you've just loaded using these functions
head(responses)
class(responses)
```
* Now try using the `readr` package.
* Do you need to install the `readr` package? Try running `"readr" %in% rownames(installed.packages())` in your console.
* Do you need to update the `list.files` command?
* After loading the file, type the name of the object into your console. What do you see?
* What class is the loaded object?
```{r}
library(readr)
path <- "../data/w2_data/R for fire data analysis tutorials - Form Responses.csv"
responses <- read_csv(file = path, col_names = TRUE)
class(responses)
```
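Another way to answer the installation question above is `requireNamespace()`, which reports whether a package can be loaded without attaching it. This is only a sketch: the helper name `is_installed` is made up for this example and is not part of any package.

```{r}
# hypothetical helper: TRUE if a package can be loaded, FALSE otherwise
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}
# "stats" ships with R, so it makes a handy sanity check
is_installed("stats")
# if this is FALSE, run install.packages("readr") in your console
is_installed("readr")
```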
---
# Exercise 1
Loading an ARCTAS file
* Load the file `ARCTAS-mrg60-dc8_merge_20080409_R14.ict`.
* Open the file in your favorite text editor.
* What is the delimiter?
* Which load function should you use?
* Which line should you load first?
* Which libraries do you need?
---
# Messy files
You may come across files that don't load nicely. For these you can use `readr::read_lines` or `base::readLines` to read the file line by line and inspect it.

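
As a rough illustration of that idea, the sketch below writes a small made-up file to a temporary location, reads the raw lines with `readLines`, and uses them to decide how many lines to skip. The file contents and the `#` comment marker are assumptions for this example; real files will differ.

```{r}
# build a small made-up file: two comment lines, then a header and data
tmp <- tempfile(fileext = ".txt")
writeLines(c("# instrument: example",
             "# units: ppb",
             "time,co",
             "1,100",
             "2,105"),
           tmp)
# read the raw lines to see what the file looks like
raw <- readLines(tmp)
raw
# count the comment lines (this assumes they all sit at the top of the file)
n_skip <- sum(startsWith(raw, "#"))
# load the table that starts after the comment lines
messy <- read.csv(tmp, skip = n_skip)
messy
```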
---
# Multiple files
Frequently you will want to load multiple files of the same format.
* We'll need to tell the function which files to load. You can use the `pattern =` argument to specify which files to include.
```{r}
files <- list.files("../data/w2_data", full.names = TRUE, pattern = "R14.ict$")
```
* We can now use `lapply` to load each file named in the `files` object
* What class of object is created?
```{r}
library(readr)
multiple_files <- lapply(files, read_csv, skip = 387, col_types = cols())
```
* You can use the `dplyr` function `bind_rows` to combine the list into a single data frame:
```{r}
library(readr)
library(dplyr)
multiple_files <- lapply(files, read_csv, skip = 387, col_types = cols()) %>%
  bind_rows()
```
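Once the files are combined, it is often useful to know which file each row came from. One possible approach, sketched here with two small made-up files in a temporary folder, is to name the list elements and pass `.id` to `bind_rows`. The file names and values are invented, and `read.csv` stands in for `read_csv` to keep the example short.

```{r}
library(dplyr)
# write two small made-up files to a temporary folder
tmp_dir <- tempfile()
dir.create(tmp_dir)
write.csv(data.frame(time = 1:2, co = c(100, 105)),
          file.path(tmp_dir, "flight_a.csv"), row.names = FALSE)
write.csv(data.frame(time = 1:2, co = c(90, 95)),
          file.path(tmp_dir, "flight_b.csv"), row.names = FALSE)
# list the files, then name each path after its file
example_files <- list.files(tmp_dir, full.names = TRUE, pattern = "\\.csv$")
names(example_files) <- basename(example_files)
# lapply keeps the names, so bind_rows can store them in a "file" column
combined <- lapply(example_files, read.csv) %>%
  bind_rows(.id = "file")
combined
```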
---
# Other programs
If you would like to import data from other programming environments, there are numerous R packages to facilitate this.
## Stats programs
The `haven` package has functions such as `read_sas()`, `read_sav()` and `read_dta()` for SAS, SPSS, and Stata file types, respectively.
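
A minimal round trip with `haven` might look like the sketch below. It assumes the `haven` package is installed (the example is skipped otherwise) and uses a made-up data frame written to a temporary file instead of a real Stata data set.

```{r}
# assumes the haven package is installed; the example is skipped otherwise
if (requireNamespace("haven", quietly = TRUE)) {
  # write a small made-up data frame to a temporary Stata file
  tmp <- tempfile(fileext = ".dta")
  haven::write_dta(data.frame(id = 1:3, score = c(2.5, 3.1, 4.0)), tmp)
  # read it back and check what kind of object we get
  stata_data <- haven::read_dta(tmp)
  class(stata_data)
}
```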
## Matlab
The `R.matlab` package has a `readMat()` function for loading .mat files.
---
# Saving data
## As R objects
You can save R objects in a format R will recognize, or as plain text.
```{r}
save(responses, file = "../data/responses.RData")
saveRDS(responses, file = "../data/responses.RDS")
```
You can read R files like so:
```{r}
load(file = "../data/responses.RData")
# readRDS takes only the file path and returns the saved object
responses_reloaded <- readRDS(file = "../data/responses.RDS")
```
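The two formats behave differently when read back in, which the sketch below tries to show with a small made-up object saved to temporary files. The name `toy_data` and its contents are invented for this example.

```{r}
# a small made-up object to save
toy_data <- data.frame(id = 1:3, value = c("a", "b", "c"))
rdata_path <- tempfile(fileext = ".RData")
rds_path <- tempfile(fileext = ".RDS")
save(toy_data, file = rdata_path)
saveRDS(toy_data, file = rds_path)
# remove the object so we can see what each function restores
rm(toy_data)
# load() recreates the object under its original name and returns that name
restored_names <- load(file = rdata_path)
restored_names
exists("toy_data")
# readRDS() returns the object, so you choose the name yourself
toy_copy <- readRDS(file = rds_path)
identical(toy_data, toy_copy)
```

In short, `load` decides the object name for you, while `readRDS` lets you assign the result to any name, which avoids overwriting an existing object by accident.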
## As plain text
```{r}
write.csv(responses, file = "../data/saved_responses_utils.csv")
# readr >= 1.4.0 uses `file` for the output path (older versions used `path`)
write_delim(responses, file = "../data/saved_responses_readr.csv", delim = "&")
```
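One detail worth checking when saving with `write.csv` is that it writes row names by default, which show up as an extra column when the file is read back. The sketch below illustrates this with a small made-up data frame and temporary files; the object names are invented for the example.

```{r}
# a small made-up data frame
toy <- data.frame(id = 1:3, value = c("a", "b", "c"))
with_names <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")
# default: row names are written as an unnamed first column
write.csv(toy, file = with_names)
# row.names = FALSE writes only the data columns
write.csv(toy, file = without_names, row.names = FALSE)
# compare the column names after reading each file back in
names(read.csv(with_names))
names(read.csv(without_names))
```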
---