README.qmd

---
format: gfm
# format: html
---

```{r}
#| include: false
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
library(tidyverse)
library(tmap)
library(stplanr)
```

The aim of this repo is to showcase ways of generating evidence to support strategic active network planning tools in England and beyond.

What follows is a language agnostic but fully reproducible (with R, see README.qmd for code) description of input datasets, processes and functions for generating estimates of active travel uptake down to the street level.
It will make use of some of the same input datasets that are used in the Propensity to Cycle Tool (PCT).
For full reproducibility, the code in this repo is developed in a Docker container using the [.devcontainer.json](https://containers.dev/implementors/json_reference/) format.

We will cover input datasets, processing steps, and outputs.

# Input datasets

The input datasets were extracted from the Propensity to Cycle Tool and underlying datasets.
Small input datasets are saved in the `input/` folder of this repo.

```{r}
remotes::install_cran("pct")
```

## Zone data

Zone data is available at many geographic levels, including large zones (e.g. MSOAs) and small zones (e.g. Output Areas).
MSOAs representing York are shown below.

```{r}
zones = pct::get_pct_zones(region = "north-yorkshire", geography = "msoa")
zones = zones |>
  filter(lad_name == "York")
zones |>
  transmute(active_trips = bicycle + foot) |>
  plot()
```

```{r}
#| echo: false
#| eval: false
sf::write_sf(zones, "input/zones.geojson")
```


## Origin destination data

OD data has the following structure:

```{r}
#| echo: false
#| eval: false
if(file.exists("od_national.csv")) {
    od_national = readr::read_csv("od_national.csv")
} else {
    od_national = pct::get_od()
    readr::write_csv(od_national, "od_national.csv")
}
od_york = od_national |>
  filter(geo_code1 %in% zones$geo_code) |>
  filter(geo_code2 %in% zones$geo_code)
write_csv(od_york, "input/od.csv")
```


```{r}
od = read_csv("input/od.csv")
od |>
  slice(seq(10)) |>
  knitr::kable()
```

This dataset was extracted from the following open access endpoint: https://s3-eu-west-1.amazonaws.com/statistics.digitalresources.jisc.ac.uk/dkan/files/FLOW/wu03ew_v2/wu03ew_v2.zip

The OD dataset can be visualised in a more policy-relevant way, as illustrated in the next section.

## Data on origins and destinations

A key dataset type for simulating trips not covered by available OD data is data on trip origins (e.g. representing residential areas and population estimates) and 'trip attractors'.
These can be obtained from OSM.
These datasets typically have pont and polygon geometries and numerous features that can feed into trip generation models, a subsection of the enxt section.

There are currently datasets representing origins and destinations in this repo, something that may change in the future.

# Processing steps

## Desire line generation

OD data can be effectively represented as desire lines, as follows (see `output/desire_lines.geojson`):

```{r desirelines}
desire_lines = od::od_to_sf(od, zones)
# sf::write_sf(desire_lines, "output/desire_lines.geojson", delete_dsn = TRUE)
tm_shape(zones) +
  tm_borders() +
  tm_shape(desire_lines) +
  tm_lines(lwd = "all", scale = 9)
```

Clearly this is an oversimplification.
The section on 'jittering' demonstrates how disaggregation and setting weighted random start and end points can lead to more realistic desire lines and route networks.

## Trip generation

Trip generation is the process of estimating the number of trips between origins and destinations.
It can be done using spatial interaction models.

## Jittering

Jittering, sometimes combined with disaggregation of desire lines representing many trips (above a threshold number of trips that can be set by the developer iteratively) distributes start and end points more evenly across origin and destination zones.

## Routing

The outcome of routing the desire lines shown above is shown below.

```{r}
#| eval: false
routes = route(l = desire_lines, route_fun = route_osrm)
sf::write_sf(routes, "routes_full.geojson")
```

```{bash}
#| eval: false
gh release upload v0 routes_full.geojson
```

```{r}
system("gh release download v0 --pattern routes")
```

```{r}
routes = sf::read_sf("routes_full.geojson")
```

The routes illustrated in the figure above and saved in `routes_full.geojson` in the repo's releases took around 4 minutes to calculate for `r nrow(desire_lines)` using OSRM's public facing instance.
That works out at around `r 60*4 / nrow(desire_lines)`, not very fast, we can surely do better!

Another issue with the routes dataset represented below is that there is only a single geometry and set of features for the entirety of each route: segment level outputs from routing engines are more policy relevent.

```{r}
tm_shape(routes) +
  tm_lines(lwd = "all", scale = 9, alpha = 0.5)
```

## Uptake functions

Uptake functions model change in transport behaviour.
They can be combined with scenarios representing changes in travel demand.

## Route network generation

In the plot of routes above there are many overlapping lines.
To overcome this problem the 'overline' function can be used to generate a cohesive route network.
The results are shown below, the output data can be found in the `output` folder.

```{r}
rnet = overline(routes, "foot")
tm_shape(rnet) +
  tm_lines(lwd = "foot", scale = 9)
sf::write_sf(rnet, "output/rnet.geojson", delete_dsn = TRUE)
```

A limitation of the current implementation is that OSM ids are lost.

## Visualisation

# Outputs