-
Notifications
You must be signed in to change notification settings - Fork 0
/
w1_first_data_set.Rmd
156 lines (102 loc) · 3.47 KB
/
w1_first_data_set.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
---
title: "Week 1 - Exploring our first data set"
author: "Nicholas Good ([email protected])"
output:
rmarkdown::html_document:
toc: true
toc_float: true
theme: yeti
---
```{r global_options, include=FALSE}
knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE, results = 'hide')
```
# ARCTAS
In our first example we'll look at some data from the [ARCTAS campaign](https://www.nasa.gov/mission_pages/arctas/smoke_plumes.html).
---
# Libraries
We'll be using the some of the `tidyverse` packages for this exercise. Include the `tidyverse` library in your notebook:
```{r}
library(tidyverse)
```
---
# Source files
Often we'll want to access code we've saved elsewhere. We can use the `source()` function to include r scripts. For today's exercise you'll need to download a script called `r_functions`. Navigate to the course's [shared google drive folder](https://tinyurl.com/R-for-fire-data) and download the scripts folder. Make a copy of the scripts folder in your working directory. Then run the following line of code:
```{r}
source("scripts/r_functions.R")
```
R will look for the script `r_functions.R` in the folder `scripts`. Check the console for any error messages.
---
# Loading the data
We're going to look at some data from the [Arctas](https://www-air.larc.nasa.gov/cgi-bin/arcstat-c) project. You'll need to download the [dataset](https://tinyurl.com/R-for-fire-data).
![](images/arctas.PNG)
<br>
* Look at the code below. Where do you think R is going to look for the dataset when loading?
* Let's check out the help documentation for some of the functions used in the code segment below (`lapply`, `bind_rows`).
* Try running the code. An `object` named `arctas_data` should appear in your working environment.
```{r}
arctas_data <- lapply(list.files("../data/arctas",
pattern = "_R14.ict$",
full.names = TRUE),
read_ict_file) %>%
bind_rows()
```
---
# Working with dataframes
* Determine the class of the object `arctas_data`.
```{r}
class(arctas_data)
```
Data frames are a common way to stored data in R. Each column has a name, each column is the same length, each column element has the same class.
* Determine the class of the first column:
```{r}
class(arctas_data$utc)
```
* How many rows and columns are there in the dataset?
```{r}
nrow(arctas_data)
ncol(arctas_data)
```
* extract the names of each colum
```{r}
colnames(arctas_data)
```
* extract the first row
```{r}
first_row <- arctas_data[1,]
```
* extract the first column
```{r}
first_col <- arctas_data[,1]
```
* extract the `no` mixing ratio in the fourth column
```{r}
no_col_4 <- arctas_data$no[4]
```
* extract the `no` mixing ratios for flight number 10
```{r}
no_flight_10 <- arctas_data$no[which(arctas_data$flight == 10)]
```
* Try typing `arctas_data` into your console and then press `enter`. What happens?
---
# Summary statistics
* Calculate the minimum and maximum `no` values. What do you notice?
```{r}
no_min <- min(arctas_data$no)
no_max <- max(arctas_data$no)
```
* Let's clean up the data
```{r}
arctas_data[arctas_data < -1e+8] <- NA
plot(arctas_data$no)
```
* Recalculate the minimum and maximum `no` values. What do you notice?
* Let's try this instead:
```{r}
no_min <- min(arctas_data$no, na.rm = TRUE)
no_max <- max(arctas_data$no, na.rm = TRUE)
```
* And finally this:
```{r}
summary(arctas_data$no)
```
---