-
Notifications
You must be signed in to change notification settings - Fork 0
/
in_WorkedExample_Workings.Rmd
339 lines (224 loc) · 6.76 KB
/
in_WorkedExample_Workings.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
---
title: "Worked Example - Background workings"
subtitle: <h5 style="font-style:normal">GEOG-364 - Spatial Analysis</h4>
output:
html_document:
toc: true
toc_depth: 4
toc_float: true
theme: flatly
---
<style>
p.comment {
background-color: #DBDBDB;
padding: 10px;
border: 1px solid black;
margin-left: 0px;
border-radius: 5px;
font-style: normal;
}
h1.title {
font-weight: bold;
font-family: Arial;
}
h2.title {
font-family: Arial;
}
</style>
<style type="text/css">
TOC {
font-size: 12px;
font-family: Arial;
}
</style>
\
```{r setup, include=FALSE,echo=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message = FALSE,dpi = 300, dev = "svg")
# invisible data read
library(tidyverse)
library(sp)
library(sf)
library(readxl)
library(skimr)
library(tmap)
library(viridis)
library(usmap)
library(rnaturalearth)
library(ggpubr)
library(gridExtra)
library(ceramic)
library(leaflet)
library(leaflet.extras)
library(leaflet.providers)
library(raster)
library(ggmap)
library(dismo)
library(maps)
as.Date_origin <- function(x){
format.Date(as.Date(x, origin = '1970-01-01'), format = '%b-%d')
}
Sys.setenv(MAPBOX_API_KEY="pk.eyJ1IjoiaGdyZWF0cmV4IiwiYSI6ImNrbmdpNDN6djBkZTQydXBhNDc1enJjZnkifQ.xQWYvOFK5BKQoNWoOacDgA")
```
<style>
p.comment {
background-color: #DBDBDB;
padding: 10px;
border: 1px solid black;
margin-left: 0px;
border-radius: 5px;
font-style: normal;
}
h1.title {
font-weight: bold;
font-family: Arial;
}
h2.title {
font-family: Arial;
}
</style>
<style type="text/css">
#TOC {
font-size: 11px;
font-family: Arial;
}
</style>
# What is this?
In part 2 (to be uploaded), I show my actual example report analysing farmers markets in Iowa. So you can see what I would actually submit in the next section.
These are my background workings - all my text are things I might think in my head, you don't need to write all of this. But I hope you can follow this and do something similar for your own work.
<br><br>
# 1. Set up and libraries
My example tutorial is set in Iowa on farmers markets. First, set up libraries
```{r, results=FALSE,warning=FALSE,message=FALSE}
library(sp)
library(sf)
library(tidyverse)
library(tmap)
library(readxl)
library(skimr)
library(spatstat)
```
<br><br>
# 2. Read in the data
Now I will read in my data, on farmers markets in Iowa. Sometimes adding in the na option helps R recognise missing values, so I will try that.
```{r}
IA_market <- readxl::read_excel("./Data/Farmers_Markets_Iowa.xlsx",na="NA")
```
```{r}
names(IA_market)
```
OK, I can see I have the columns above and when I click on the IA_market name in the Environment tab I can see more details. First, I'll take a quick look at the summary.
```{r}
summary(IA_market)
```
<br><br>
# 3. Initial quality control
From the summary above, I can see some weird issues.
## 3.1. Problem with X
Hmm there is a very small longitude value in X that can't exist
```{r}
# get rid of the row with the weird value
# I could also set it to NA
IA_market <- IA_market[which(IA_market$X > -990),]
# and print the summary
summary(IA_market[,c("X","Y")])
```
OK that looks better.
<br><br>
## 2. Remove all the rows with missing locations
I know I don't need any row with missing location data, so let's remove
```{r}
IA_market <- IA_market[complete.cases(IA_market$X), ]
IA_market <- IA_market[complete.cases(IA_market$Y), ]
```
<br><br>
# 4. Choose Columns
I don't need all this data, so let's keep things neat.
I quite like the glimpse function from tidyverse for this. Here are my columns (same as name command)
```{r}
glimpse(IA_market)
```
- I know from google that FID is a federal identifier that might be useful, so I'll keep that.
- Looking at the data, "location" is the full address, so let's remove that.
- All the data is in Iowa, so let's remove State
- Open_Hours and Open Dates feels too complicated, so i'm removing them
```{r}
IA_market <- IA_market[,c("X","Y","FID",
"City","County",
"Market_Name",
"Weekday")]
summary(IA_market)
```
<br><br>
# 5. Create families of grouped data
Some of my data like market name is going to be unique to each market. Others like weekday could be considered data groups/families. In R this is called a factor.
To know the difference, I am thinking about whether it is useful to make a summary table of counts for that value
So
```{r}
IA_market$City <- as.factor(IA_market$City)
IA_market$County <- as.factor(IA_market$County)
IA_market$Weekday <- as.factor(IA_market$Weekday)
```
Now when I summarise, I can immediately see that there are 12 markets on Fridays.
```{r}
summary(IA_market)
```
<br><br>
# 6. Rename columns
I know my X and Y data are really latitude and longitude, so I will rename them. to make sure I don't mess up, lots of printing out
```{r}
names(IA_market)[1]
names(IA_market)[1] <- "Longitude"
names(IA_market)[1]
names(IA_market)[2]
names(IA_market)[2] <- "Latitude"
names(IA_market)[2]
summary(IA_market)
```
<br><br>
# 7. Make a spatial version
OK I know my data is in Lat/Long, so when I make it spatial, I include the crs code 4326 (yours is almost certainly likely to be the same)
```{r}
# I know that it was in lat/long originally, hence the crs
IA_market.sf <- st_as_sf(IA_market,
coords=c("Longitude","Latitude"),
crs=4326)
# make a quick plot
tmap_mode("view")
qtm(st_geometry(IA_market.sf))
```
<br><br>
# 8. More quality control
You can see in the plot above that there appears to be one point that is not in Iowa. So now I will look at the long/lat columns. I can also filter my data in the View tab.
Note i'm doing this on the original data not the sf.
```{r}
summary(IA_market[,1:2])
hist(IA_market$Longitude)
```
I feel that maybe that point at Long \~ -88 is wrong. Let's take a look.
```{r}
IA_market[IA_market$Longitude > -90,]
```
I'm guessing there's a typo - so I could just google this and fix it! but for now, let's remove and try again.
```{r}
# choose all the other ones, note the <= instead of >
IA_market <- IA_market[which(IA_market$Longitude < -90),]
# RECREATE THE SPATIAL DATA
IA_market.sf <- st_as_sf(IA_market,coords=c("Longitude","Latitude"),
crs=4326)
# make a quick plot
tmap_mode("view")
qtm(IA_market.sf)
```
Much better. Now let's change to a local map projection of my choice.I am choosing UTM Iowa from Tutorial 9.
```{r}
# And change to a map projection of your choice.
IA_market.sf.utm <- st_transform(IA_market.sf,3744)
# make a quick plot
tmap_mode("view")
qtm(st_geometry(IA_market.sf.utm))
```
and let's make it a ppp variable using spatstat. Looking good!
```{r}
IA_market.ppp <- as.ppp(IA_market.sf.utm)
plot(IA_market.ppp)
```