diff --git a/practicals/2-od.Rmd b/practicals/2-od.Rmd index 118a38d..0ab4ca7 100644 --- a/practicals/2-od.Rmd +++ b/practicals/2-od.Rmd @@ -1,7 +1,7 @@ --- title: "Origin-destination data" subtitle: '
Practical' -author: "Robin Lovelace" +author: "Malcolm Morgan and Robin Lovelace" date: 'University of Leeds `r # Sys.Date()()`
' output: github_document: @@ -9,119 +9,170 @@ output: bibliography: ../tds.bib --- +# Review Homework + +You should now be familiar with the basics of R and the `tidyverse`. If you have not completed these tasks go back and do them first: + +- Read Chapters 2, 3, and 4 of [Reproducible road safety research with R](https://itsleeds.github.io/rrsrr/basics.html) +- Read Chapters 3 and 5 of [R for Data Science](https://r4ds.had.co.nz/data-visualisation.html) + +# Getting started with GIS in R + +Note that this practical takes sections from Chapters 2 - 8 of [Geocomputation with R](https://r.geocompx.org). You should expand your knowledge by reading these chapters in full. ## Pre-requisites {-} You need to have a number of packages installed and loaded. -Install the packages by typing in the following commands into RStudio (you do not need to add the comments after the `#` symbol):^[ -Note: if you want to install the development version of a package from GitHub, you can do so. -Try, for example, running the following command: `remotes::install_github("ITSLeeds/pct")` -] +Install the packages by typing in the following commands into RStudio (you do not need to add the comments after the `#` symbol) -```{r, eval=FALSE} -install.packages("remotes") -pkgs = c( - "nycflights13",# data package - "pct", # package for getting travel data in the UK - "sf", # spatial data package - "stats19", # downloads and formats open stats19 crash data - "stplanr", # for working with origin-destination and route data - "tidyverse", # a package for user friendly data science - "tmap" # for making maps -) -remotes::install_cran(pkgs) -remotes::install_github("nowosad/spDataLarge") +If you need to install any of these packages use: + +```{r echo = T, results = 'hide', eval = FALSE} +install.packages("sf") # Install a package from CRAN +remotes::install_github("Nowosad/spDataLarge") # install from GitHub using the remotes package +``` + +```{r echo = T, results = 'hide', warning=FALSE, 
message=FALSE} +library(sf) # vector data package +library(tidyverse) # tidyverse packages +``` + +- It relies on **spData**, which loads datasets used in the code examples of this chapter: + +```{r 03-attribute-operations-2, echo = T, results = 'hide',warning=FALSE, message=FALSE} +library(spData) # spatial data package +``` + +1. Check your packages are up-to-date with `update.packages()` +1. Create an RStudio project with an appropriate name for this session (e.g. `practical2`) +1. Create appropriate folders for code, data and anything else (e.g. images) +1. Create a script called `learning-OD.R`, e.g. with the following command: + +```{r, eval = F, echo = T, results = 'hide'} +dir.create("code") # +file.edit("code/learning-OD.R") ``` +## Basic sf operations -Load the tidyverse package as follows: +We will start with a simple map of the world. Load the `world` object from the `spData` package. Notice the use of `::` to say that you want the `world` object from the `spData` package. + +```{r, echo = T, results = 'hide'} +world = spData::world +``` + +Use some basic R functions to explore the `world` object. e.g. `class(world)`, `dim(world)`, `head(world)`, `summary(world)`. Also view the `world` object by clicking on it in the Environment panel. + +`sf` objects can be plotted with `plot()`. + +```{r, warning=FALSE} +plot(world) +``` + +Note that this makes a map of each column in the data frame. Try some other plotting options ```{r} -library(tidyverse) +plot(world[3:6]) +plot(world["pop"]) ``` -# Project set-up and tidyverse testing +## Basic spatial operations -1. Check your packages are up1.to-date with `update.packages()` -1. Create an RStudio project with an appropriate name for this module (e.g. `TDSmodule`) -1. Create appropriate files for code, data and anything else (e.g. images) -1. Create a script called `learning-tidyverse.R`, e.g. with the following command: +Load the `nz` and `nz_height` datasets from the `spData` package. 
-```r -dir.create("code") # -file.edit("code/learning-tidyverse.R") +```{r, echo = T, results = 'hide'} +nz = spData::nz +nz_height = spData::nz_height +``` + +We can use `tidyverse` functions like `filter` and `select` on `sf` objects in the same way you did in Practical 1. + +```{r, echo = T, results = 'hide'} +canterbury = nz %>% filter(Name == "Canterbury") +canterbury_height = nz_height[canterbury, ] ``` -# Getting started with transport data +In this case we filtered the `nz` object to only include the region named `Canterbury` and then did an intersection to find objects in the `nz_height` object that are in Canterbury. + +This syntax is not very clear, but it is equivalent to: -We're going to start by looking at the main types of transport data:^[ -Note: if you want to get zone data for a different region you can do so, e.g. with: -`zones = sf::read_sf("https://github.com/npct/pct-outputs-regional-notR/raw/master/commute/msoa/west-yorkshire/z.geojson")` -] +```{r, echo = T, eval=FALSE} +canterbury_height = nz_height[canterbury, , op = st_intersects] +``` + +There are many different types of relationships you can use with `op`. Try `?st_intersects()` to see more. For example, this would give all the places not in Canterbury: + +```{r, eval=FALSE} +nz_height[canterbury, , op = st_disjoint] +``` + +![Topological relations between vector geometries, inspired by Figures 1 and 2 in Egenhofer and Herring (1990). The relations for which the function(x, y) is true are printed for each geometry pair, with x represented in pink and y represented in blue. The nature of the spatial relationship for each pair is described by the Dimensionally Extended 9-Intersection Model string. ](https://r.geocompx.org/figures/relations-1.png) + + +# Getting started with OD data In this section we will look at basic transport data in the R package **stplanr**. 
-Attach the `tidyverse`, `stplanr` and `sf` packages as follows: +Load the `stplanr` package as follows: -```{r} -library(tidyverse) +```{r, echo = T, results = 'hide'} library(stplanr) -library(sf) ``` The `stplanr` package contains some data that we can use to demonstrate principles in Data Science, illustrated in the Figure below. Source: Chapter 1 of R for Data Science [@grolemund_r_2016] [available online](https://r4ds.had.co.nz/introduction.html). ![](https://d33wubrfki0l68.cloudfront.net/571b056757d68e6df81a3e3853f54d3c76ad6efc/32d37/diagrams/data-science.png) -```{r} -# import + +First we will load some sample data: + +```{r, echo=FALSE} od_data = stplanr::od_data_sample ``` -```{r} -# tidy +You can click on the data in the environment panel to view it or use `head(od_data)`. +Now we will rename one of the columns from `foot` to `walk`. + +```{r, echo=FALSE} od_data = od_data %>% rename(walk = foot) ``` +Next we will make a new dataset `od_data_walk` by taking `od_data` and piping it (`%>%`) to `filter` the data frame to only include rows where `walk > 0`. Then `select` a few of the columns and calculate two new columns `proportion_walk` and `proportion_drive`. -```{r} -# transform +```{r, echo=FALSE} od_data_walk = od_data %>% filter(walk > 0) %>% select(geo_code1, geo_code2, all, car_driver, walk) %>% mutate(proportion_walk = walk / all, proportion_drive = car_driver / all) ``` +We can use the generic `plot` function to view the relationships between variables. + ```{r} -# visualise plot(od_data_walk) ``` -```{r} -# model +R has built-in modelling functions such as `lm`. Let's make a simple model to predict the proportion of people who walk based on the proportion of people who drive. + +```{r, echo=FALSE} model1 = lm(proportion_walk ~ proportion_drive, data = od_data_walk) od_data_walk$proportion_walk_predicted = model1$fitted.values ``` +We can use the `ggplot2` package to graph our model predictions. 
+ ```{r} -# visualise ggplot(od_data_walk) + geom_point(aes(proportion_drive, proportion_walk)) + geom_line(aes(proportion_drive, proportion_walk_predicted)) ``` -```{r} -# transform -# ... -``` - Exercises 1. What is the class of the data in `od_data`? -1. Subset (filter) the data to only include OD pairs in which at least one person (`> 0`) person walks (bonus: on what % of the OD pairs does at least 1 person walk?) -2. Calculate the percentage who cycle in each OD pair in which at least 1 person cycles -3. Is there a positive relationship between walking and cycling in the data? -4. Plot the zones representing the `geo_code` variables in the OD data +2. Subset (filter) the data to only include OD pairs in which at least one person (`> 0`) person walks (bonus: on what % of the OD pairs does at least 1 person walk?) +3. Calculate the percentage who cycle in each OD pair in which at least 1 person cycles +4. Is there a positive relationship between walking and cycling in the data? 5. Bonus: use the function `od2line()` in to convert the OD dataset into geographic desire lines ```{r, echo=FALSE, eval=FALSE} @@ -133,7 +184,7 @@ class(od_data) #2 od_data_walk = od_data %>% filter(walk > 0) -nrow(od_data_walk) / nrow(od_data) +nrow(od_data_walk) / nrow(od_data) * 100 ``` ```{r, echo=FALSE, eval=FALSE} @@ -157,117 +208,85 @@ ggplot(od_data_new) + geom_line(aes(bicycle, walk_predicted)) ``` -```{r, echo=FALSE, eval=FALSE} -#5 -zones = sf::read_sf("https://github.com/npct/pct-outputs-regional-notR/raw/master/commute/msoa/west-yorkshire/z.geojson") -zones_leeds = zones %>% - filter(lad_name == "Leeds") -plot(zones_leeds$geometry) -``` ```{r, echo=FALSE, eval=FALSE} -#6 +#5 desire_lines = od2line(flow = od_data, zones) plot(desire_lines) ``` # Processing origin-destination data in Bristol -This section is based on Chapter 12 of Geocomputation with R: https://geocompr.robinlovelace.net/transport.html +This section is based on [Chapter 12 of Geocomputation with 
R](https://geocompr.robinlovelace.net/transport.html). You should read this chapter in full in your own time. + +We need the `stplanr` package, which provides many useful functions for transport analysis, and the `tmap` package, which enables advanced mapping features. + +```{r, echo = T, results = 'hide', warning=FALSE, message=FALSE} +library(stplanr) +library(tmap) +``` -The task is to reproduce the results shown in that chapter on your own computer. -Some code to get started on a subset of the data is shown below. -Start with a medium-sized dataset: +We will start by loading two datasets: ```{r} -# import od = spDataLarge::bristol_od -head(od) +zones = spDataLarge::bristol_zones ``` -```{r} -# tidy -zones = spDataLarge::bristol_zones -zones = zones %>% - mutate(local_authority = word(string = name, 1)) -plot(zones %>% select(local_authority), key.pos = 1) +Explore these datasets using the functions you have already learnt (e.g. `head`,`nrow`). + +You will notice that the `od` dataset shares ID values with the `zones` dataset. We can use these to make desire lines between zones. But first we must filter out trips that start and end in the same zone. 
+```{r, echo = T, results = 'hide', warning=FALSE, message=FALSE} +od_inter = filter(od, o != d) +desire_lines = od2line(od_inter, zones) ``` +Let's calculate the percentage of trips that are made by active travel. -```{r, eval=FALSE, echo=FALSE} -# Find central data -# bristol_centre = geo_code("bristol") -#> [1] -2.597298 51.453802 +```{r, echo = T, results = 'hide'} +desire_lines$Active = (desire_lines$bicycle + desire_lines$foot) / + desire_lines$all * 100 ``` -```{r} -# transform -bristol_sf = tmaptools::geocode_OSM("bristol", as.sf = TRUE, return.first.only = T, geometry = "point") -mapview::mapview(bristol_sf) -bristol_buffer_10km = geo_buffer(bristol_sf, dist = 10000) -zones_central = zones[bristol_buffer_10km, , op = sf::st_within] -# visualise -mapview::mapview(zones_central) +Now use `tmap` to make a plot showing the number of trips and the percentage of people using active travel. + +```{r, echo = T, results = 'hide', warning=FALSE, message=FALSE} +desire_lines = desire_lines[order(desire_lines$Active),] + +tm_shape(desire_lines) + # Define the data frame used to make the map + tm_lines(col = "Active", # We want to map lines, the colour (col) is based on the "Active" column + palette = "plasma", # Select a colour palette + alpha = 0.7, # Make lines slightly transparent + lwd = "all") + # The line width (lwd) is based on the "all" column + tm_layout(legend.outside = TRUE) + # Move the legend outside the map + tm_scale_bar() # Add a scale bar to the map ``` +Now that we have geometry attached to our data we can calculate other variables of interest. For example, let's calculate the distance travelled and see if it relates to the percentage of people who use active travel. 
```{r} -# transform -od_central = od %>% - filter(o %in% zones_central$geo_code) %>% - filter(d %in% zones_central$geo_code) -nrow(od_central) / nrow(od) -desire_lines = od2line(od_central, zones_central) desire_lines$distance_direct_m = as.numeric(st_length(desire_lines)) -desire_lines = desire_lines %>% - mutate(proportion_active = (bicycle + foot) / all) ``` -```{r, fig.show='hold', out.width="40%"} -# visualise -ggplot(desire_lines) + - geom_point(aes(distance_direct_m, proportion_active)) +Note the use of `as.numeric`: by default, `st_length` and many other functions return a special type of result of class `units`. Here we force the results back into a basic R numeric vector. But be careful! The units you get back depend on the coordinate reference system, so check your data before you assume what values mean. + +```{r, warning=FALSE, message=FALSE} ggplot(desire_lines) + - geom_point(aes(distance_direct_m, proportion_active, size = all), alpha = 0.3) + geom_point(aes(x = distance_direct_m, y = Active, size = all)) + + geom_smooth(aes(x = distance_direct_m, y = Active)) ``` -```{r} -# model/visualise -m1 = lm(proportion_active ~ - distance_direct_m + I(distance_direct_m^2), - data = desire_lines) -desire_lines = desire_lines %>% - mutate( - new_active_travel = m1$fitted.values * car_driver, - new_total_active = new_active_travel + bicycle + foot, - new_proportion_active = new_total_active / all - ) %>% - arrange(proportion_active) -ggplot(desire_lines) + - geom_point(aes(distance_direct_m, proportion_active, size = all), alpha = 0.3) + - geom_point(aes(distance_direct_m, new_proportion_active, size = all), alpha = 0.3, colour = "blue") +The blue line is a smoothed average of the data. It shows a common concept in transport research, the distance decay curve. In this case it shows that the longer the journey the less likely people are to use active travel. But this concept applies to all kinds of travel decisions. 
For example, you are more likely to travel to a nearby coffee shop than a far away coffee shop. Different types of trip have different curves, but most people have a bias towards shorter trips. -``` -```{r} -# visualise -ggplot(desire_lines) + - geom_sf(aes(colour = new_proportion_active, alpha = all)) -``` +# Homework -```{r} -library(tmap) -tm_shape(desire_lines) + - tm_lines(palette = "-viridis", breaks = c(0, 5, 10, 20, 40, 100) / 100, - lwd = "all", - scale = 9, - title.lwd = "Number of trips", - alpha = 0.6, - col = c("proportion_active", "new_proportion_active"), - title = "Active travel (%)" - ) + - tm_scale_bar() +1. Read Chapters 2-5 of [Geocomputation with R](https://r.geocompx.org/transport.html) +2. Work through Sections 13.1 to 13.4 of the Transport Chapter in [Geocomputation with R](https://r.geocompx.org/transport.html) +3. Bonus: Read more about using the [tmap package](https://r-tmap.github.io/tmap/) +4. Bonus: Read more about the [ggplot2 package](https://ggplot2.tidyverse.org/) +5. Bonus: Read Chapters 7 and 8 of [Geocomputation with R](https://r.geocompx.org/transport.html) -``` -4) Try mapping OD data for West Yorkshire in preparation for the next practical on routing +# References diff --git a/practicals/2-od.md b/practicals/2-od.md new file mode 100644 index 0000000..e5cc353 --- /dev/null +++ b/practicals/2-od.md @@ -0,0 +1,322 @@ +Origin-destination data +================ +Malcolm Morgan and Robin Lovelace +University of Leeds +
+ +# 1 Review Homework + +You should now be familiar with the basics of R and the `tidyverse`. If +you have not completed these tasks go back and do them first: + +- Read Chapters 2, 3, and 4 of [Reproducible road safety research with + R](https://itsleeds.github.io/rrsrr/basics.html) +- Read Chapters 3 and 5 of [R for Data + Science](https://r4ds.had.co.nz/data-visualisation.html) + +# 2 Getting started with GIS in R + +Note that this practical takes sections from Chapters 2 - 8 of +[Geocomputation with R](https://r.geocompx.org). You should expand your +knowledge by reading these chapters in full. + +## Pre-requisites + +You need to have a number of packages installed and loaded. Install the +packages by typing in the following commands into RStudio (you do not +need to add the comments after the `#` symbol) + +If you need to install any of these packages use: + +``` r +install.packages("sf") # Install a package from CRAN +remotes::install_github("Nowosad/spDataLarge") # install from GitHub using the remotes package +``` + +``` r +library(sf) # vector data package +library(tidyverse) # tidyverse packages +``` + +- It relies on **spData**, which loads datasets used in the code + examples of this chapter: + +``` r +library(spData) # spatial data package +``` + +1. Check your packages are up-to-date with `update.packages()` +2. Create an RStudio project with an appropriate name for this session + (e.g. `practical2`) +3. Create appropriate folders for code, data and anything else + (e.g. images) +4. Create a script called `learning-OD.R`, e.g. with the following + command: + +``` r +dir.create("code") # +file.edit("code/learning-OD.R") +``` + +## 2.1 Basic sf operations + +We will start with a simple map of the world. Load the `world` object +from the `spData` package. Notice the use of `::` to say that you want +the `world` object from the `spData` package. + +``` r +world = spData::world +``` + +Use some basic R functions to explore the `world` object. +e.g. 
`class(world)`, `dim(world)`, `head(world)`, `summary(world)`. Also +view the `world` object by clicking on it in the Environment panel. + +`sf` objects can be plotted with `plot()`. + +``` r +plot(world) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-5-1.png) + +Note that this makes a map of each column in the data frame. Try some +other plotting options + +``` r +plot(world[3:6]) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-6-1.png) + +``` r +plot(world["pop"]) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-6-2.png) + +## 2.2 Basic spatial operations + +Load the `nz` and `nz_height` datasets from the `spData` package. + +``` r +nz = spData::nz +nz_height = spData::nz_height +``` + +We can use `tidyverse` functions like `filter` and `select` on `sf` +objects in the same way you did in Practical 1. + +``` r +canterbury = nz %>% filter(Name == "Canterbury") +canterbury_height = nz_height[canterbury, ] +``` + +In this case we filtered the `nz` object to only include the region named +`Canterbury` and then did an intersection to find objects in the +`nz_height` object that are in Canterbury. + +This syntax is not very clear, but it is equivalent to: + +``` r +canterbury_height = nz_height[canterbury, , op = st_intersects] +``` + +There are many different types of relationships you can use with `op`. +Try `?st_intersects()` to see more. For example, this would give all the +places not in Canterbury: + +``` r +nz_height[canterbury, , op = st_disjoint] +``` + +![Topological relations between vector geometries, inspired by Figures 1 +and 2 in Egenhofer and Herring (1990). The relations for which the +function(x, y) is true are printed for each geometry pair, with x +represented in pink and y represented in blue. 
The nature of the spatial +relationship for each pair is described by the Dimensionally Extended +9-Intersection Model +string.](https://r.geocompx.org/figures/relations-1.png) + +# 3 Getting started with OD data + +In this section we will look at basic transport data in the R package +**stplanr**. + +Load the `stplanr` package as follows: + +``` r +library(stplanr) +``` + + ## Warning: package 'stplanr' was built under R version 4.2.2 + +The `stplanr` package contains some data that we can use to demonstrate +principles in Data Science, illustrated in the Figure below. Source: +Chapter 1 of R for Data Science (Grolemund and Wickham 2016) [available +online](https://r4ds.had.co.nz/introduction.html). + +![](https://d33wubrfki0l68.cloudfront.net/571b056757d68e6df81a3e3853f54d3c76ad6efc/32d37/diagrams/data-science.png) + +First we will load some sample data: + +You can click on the data in the environment panel to view it or use +`head(od_data)`. Now we will rename one of the columns from `foot` to +`walk`. + +Next we will make a new dataset `od_data_walk` by taking `od_data` and +piping it (`%>%`) to `filter` the data frame to only include rows where +`walk > 0`. Then `select` a few of the columns and calculate two new +columns `proportion_walk` and `proportion_drive`. + +We can use the generic `plot` function to view the relationships between +variables. + +``` r +plot(od_data_walk) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-15-1.png) + +R has built-in modelling functions such as `lm`. Let’s make a simple model +to predict the proportion of people who walk based on the proportion of +people who drive. + +We can use the `ggplot2` package to graph our model predictions. + +``` r +ggplot(od_data_walk) + + geom_point(aes(proportion_drive, proportion_walk)) + + geom_line(aes(proportion_drive, proportion_walk_predicted)) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-17-1.png) + +Exercises + +1. What is the class of the data in `od_data`? +2. 
Subset (filter) the data to only include OD pairs in which at least + one person (`> 0`) person walks (bonus: on what % of the OD pairs + does at least 1 person walk?) +3. Calculate the percentage who cycle in each OD pair in which at least + 1 person cycles +4. Is there a positive relationship between walking and cycling in the + data? +5. Bonus: use the function `od2line()` in to convert the OD dataset + into geographic desire lines + +# 4 Processing origin-destination data in Bristol + +This section is based on [Chapter 12 of Geocomputation with +R](https://geocompr.robinlovelace.net/transport.html). You should read +this chapter in full in your own time. + +We need the `stplanr` package which provides many useful functions for +transport analysis and `tmap` package which enables advanced mapping +features. + +``` r +library(stplanr) +library(tmap) +``` + +We will start by loading two datasets: + +``` r +od = spDataLarge::bristol_od +zones = spDataLarge::bristol_zones +``` + +Explore these datasets using the functions you have already learnt +(e.g. `head`,`nrow`). + +You will notice that the `od` datasets has shared id values with the +`zones` dataset. We can use these to make desire lines between each +zone. But first we must filter out trips that start and end in the same +zone. + +``` r +od_inter = filter(od, o != d) +desire_lines = od2line(od_inter, zones) +``` + +Let’s calculate the percentage of trips that are made by active travel + +``` r +desire_lines$Active = (desire_lines$bicycle + desire_lines$foot) / + desire_lines$all * 100 +``` + +Now use `tmap` to make a plot showing the number of trips and the +percentage of people using active travel. 
+ +``` r +desire_lines = desire_lines[order(desire_lines$Active),] + +tm_shape(desire_lines) + # Define the data frame used to make the map + tm_lines(col = "Active", # We want to map lines, the colour (col) is based on the "Active" column + palette = "plasma", # Select a colour palette + alpha = 0.7, # Make lines slightly transparent + lwd = "all") + # The line width (lwd) is based on the "all" column + tm_layout(legend.outside = TRUE) + # Move the legend outside the map + tm_scale_bar() # Add a scale bar to the map +``` + +![](2-od_files/figure-gfm/unnamed-chunk-27-1.png) + +Now that we have geometry attached to our data we can calculate other +variables of interest. For example, let’s calculate the distance +travelled and see if it relates to the percentage of people who use +active travel. + +``` r +desire_lines$distance_direct_m = as.numeric(st_length(desire_lines)) +``` + +Note the use of `as.numeric`: by default, `st_length` and many other +functions return a special type of result of class `units`. Here we force the +results back into a basic R numeric vector. But be careful! The units +you get back depend on the coordinate reference system, so check your +data before you assume what values mean. + +``` r +ggplot(desire_lines) + + geom_point(aes(x = distance_direct_m, y = Active, size = all)) + + geom_smooth(aes(x = distance_direct_m, y = Active)) +``` + +![](2-od_files/figure-gfm/unnamed-chunk-29-1.png) + +The blue line is a smoothed average of the data. It shows a common +concept in transport research, the distance decay curve. In this case it +shows that the longer the journey the less likely people are to use +active travel. But this concept applies to all kinds of travel +decisions. For example, you are more likely to travel to a nearby coffee +shop than a far away coffee shop. Different types of trip have different +curves, but most people have a bias towards shorter trips. + +# 5 Homework + +1. 
Read Chapters 2-5 of [Geocomputation with + R](https://r.geocompx.org/transport.html) +2. Work through Sections 13.1 to 13.4 of the Transport Chapter in + [Geocomputation with R](https://r.geocompx.org/transport.html) +3. Bonus: Read more about using the [tmap + package](https://r-tmap.github.io/tmap/) +4. Bonus: Read more about the [ggplot2 + package](https://ggplot2.tidyverse.org/) +5. Bonus: Read Chapters 7 and 8 of [Geocomputation with + R](https://r.geocompx.org/transport.html) + +# 6 References + +
+ +
+ +Grolemund, Garrett, and Hadley Wickham. 2016. *R for Data Science*. +O’Reilly Media. + +
+ +
diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-15-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-15-1.png new file mode 100644 index 0000000..64a8610 Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-15-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-17-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-17-1.png new file mode 100644 index 0000000..a8a627a Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-17-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-27-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-27-1.png new file mode 100644 index 0000000..9da2fdb Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-27-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-29-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-29-1.png new file mode 100644 index 0000000..55b7ae6 Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-29-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-5-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-5-1.png new file mode 100644 index 0000000..df380f8 Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-5-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-6-1.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-6-1.png new file mode 100644 index 0000000..2b5bdb7 Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-6-1.png differ diff --git a/practicals/2-od_files/figure-gfm/unnamed-chunk-6-2.png b/practicals/2-od_files/figure-gfm/unnamed-chunk-6-2.png new file mode 100644 index 0000000..e4c35ba Binary files /dev/null and b/practicals/2-od_files/figure-gfm/unnamed-chunk-6-2.png differ diff --git a/practicals/6-routing.Rmd b/practicals/6-routing.Rmd index 9657a75..7d1a019 100644 --- a/practicals/6-routing.Rmd +++ b/practicals/6-routing.Rmd @@ -69,6 
+69,10 @@ piggyback::pb_download_url("otp_TDS.zip") ```{r, eval=FALSE, engine='bash', echo=FALSE} +# Download the zip file +wget https://github.com/ITSLeeds/TDS/releases/download/0.20.1/otp_TDS.zip +# Unzip it: +unzip otp_TDS.zip # Install Java 8: sudo apt-get install openjdk-8-jdk @@ -80,14 +84,30 @@ java ls otp_TDS sudo ufw allow 8080 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx10000M -d64 -jar "otp_TDS/otp-1.5.0-shaded.jar" --router west-yorkshire --graphs "otp_TDS/graphs" --server --port 8080 --securePort 8081 -# ip = "81.2052.102.153:154" +# ip = "Your IP address" +# Same command but with nohup to run in the background: +nohup /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx10000M -d64 -jar "otp_TDS/otp-1.5.0-shaded.jar" --router west-yorkshire --graphs "otp_TDS/graphs" --server --port 8080 --securePort 8081 & +``` + +```{r, eval=FALSE, echo=FALSE} +path_otp = "otp_TDS/otp-1.5.0-shaded.jar" +path_data = "otp_TDS" +list.files(file.path(path_data, "graphs", "west-yorkshire")) +# Download pbf file with osmextract +osmextract::oe_get("West Yorkshire") +list.files(tempdir()) +# pbf_file +pbf_file = list.files(tempdir(), pattern = "osm.pbf", full.names = TRUE) +file.copy(pbf_file, file.path(path_data, "graphs", "west-yorkshire", "osm.pbf"), overwrite = TRUE) +log1 = otp_build_graph(otp = path_otp, dir = path_data, otp_version = 1.5, router = "west-yorkshire") + ``` To allow R to connect to the OpenTripPlanner server, we will use the `opentripplanner` package and the function `otp_connect`. ```{r, eval=FALSE, message=FALSE, warning=FALSE} -ip = "xx.xxx.xxx.xxx" # See the link on Minerva for correct value -port = 0000 # See the link on Minerva for correct value +ip = "192.168.0.11" # See the link on Minerva for correct value +port = 8080 # See the link on Minerva for correct value otpcon = otp_connect(hostname = ip, port = port, router = "west-yorkshire") @@ -135,7 +155,7 @@ head(centroids) 2. 
Plot the `desire_lines` and `centroids` objects using the `tmap` to show the number of travellers on each desire_line and the locations of all centroids. ```{r, message=FALSE, echo=TRUE} -tmap_mode("plot") #Change to view for interactive map +tmap_mode("plot") # Change to view for interactive map tm_shape(desire_lines) + tm_lines(lwd = "all", col = "all", scale = 4, palette = "-viridis") + tm_shape(centroids) + @@ -166,14 +186,14 @@ desire_top = desire_lines %>% ```{r, eval=TRUE, echo=TRUE} # Extract the start and end points -fromPlace <- lwgeom::st_startpoint(desire_top) -toPlace <- lwgeom::st_endpoint(desire_top) +fromPlace = lwgeom::st_startpoint(desire_top) +toPlace = lwgeom::st_endpoint(desire_top) # This returns just the geometry # So make it into an sf data.frame with the ID values from desire_top -fromPlace <- st_sf(data.frame(id = desire_top$from, geometry = fromPlace)) -toPlace <- st_sf(data.frame(id = desire_top$to, geometry = toPlace)) +fromPlace = st_sf(data.frame(id = desire_top$from, geometry = fromPlace)) +toPlace = st_sf(data.frame(id = desire_top$to, geometry = toPlace)) ``` @@ -192,6 +212,17 @@ routes_drive_top = otp_plan(otpcon = otpcon, ``` ```{r, message=FALSE, eval=TRUE, echo=FALSE} +# Save as RDS: +# saveRDS(routes_drive_top, "routes_drive_top.Rds") +# piggyback::pb_upload("routes_drive_top.Rds") +# Download the file. 
+# files = piggyback::pb_list() +# Search for the file: +# files = files[grepl("routes_drive_top", files$file_name),] +# u = piggyback::pb_download_url("routes_drive_top.Rds") +u = "https://github.com/ITSLeeds/TDS/releases/download/23/routes_drive_top.Rds" +f = basename(u) +if (!file.exists(f)) download.file(u, f) routes_drive_top = readRDS("routes_drive_top.Rds") ``` @@ -218,6 +249,17 @@ tm_shape(isochrone) + tm_fill("time", alpha = 0.6) ``` +```{r, message=FALSE, eval=FALSE, echo=FALSE} +# Save the isochrone as RDS +saveRDS(isochrone, "isochrone.Rds") +piggyback::pb_upload("isochrone.Rds") +``` + +```{r, message=FALSE, eval=TRUE, echo=FALSE} +u = "https://github.com/ITSLeeds/TDS/releases/download/23/isochrone.Rds" +if (!file.exists("isochrone.Rds")) download.file(u, "isochrone.Rds") +``` + ```{r, message=FALSE, eval=TRUE, echo=FALSE} isochrone = readRDS("isochrone.Rds") isochrone$time = isochrone$time / 60 diff --git a/practicals/6-routing.md b/practicals/6-routing.md index cfe9c77..a177028 100644 --- a/practicals/6-routing.md +++ b/practicals/6-routing.md @@ -32,8 +32,11 @@ Yorkshire. Try the link on Minerva.
Minerva Link +

+ Minerva Link +

@@ -43,8 +46,11 @@ You should see something like this:
OTP Web GUI +

+ OTP Web GUI +

@@ -63,8 +69,8 @@ To allow R to connect to the OpenTripPlanner server, we will use the `opentripplanner` package and the function `otp_connect`. ``` r -ip = "xx.xxx.xxx.xxx" # See the link on Minerva for correct value -port = 0000 # See the link on Minerva for correct value +ip = "192.168.0.11" # See the link on Minerva for correct value +port = 8080 # See the link on Minerva for correct value otpcon = otp_connect(hostname = ip, port = port, router = "west-yorkshire") @@ -85,11 +91,11 @@ routes_test = otp_plan(otpcon = otpcon, You can use multiple modes and combinations try: -- `mode = "WALK"` -- `mode = c("WALK","TRANSIT")` -- `mode = c("BICYCLE","TRANSIT")` -- `mode = "CAR"` -- `mode = c("CAR_PARK","TRANSIT")` + - `mode = "WALK"` + - `mode = c("WALK","TRANSIT")` + - `mode = c("BICYCLE","TRANSIT")` + - `mode = "CAR"` + - `mode = c("CAR_PARK","TRANSIT")` To get some more routes, we will start by importing some data. The `NTEM_flow.geojson` dataset the contains the top desire lines in West @@ -119,7 +125,7 @@ head(desire_lines) ## 4 E02002442 E02002441 1747 349 168 906 62 0 262 ## 5 E02002447 E02002448 4930 70 36 4162 98 0 564 ## 6 E02006876 E02006875 10314 1854 942 4680 251 0 2587 - ## # … with 1 more variable: geometry + ## # ℹ 1 more variable: geometry We will also download the points that represent the possible start and end point of trips in the model @@ -148,24 +154,26 @@ head(centroids) **Exercise** 2. Plot the `desire_lines` and `centroids` objects using the `tmap` to - show the number of travellers on each desire_line and the locations + show the number of travellers on each desire\_line and the locations of all centroids. 
+ + ``` r -tmap_mode("plot") #Change to view for interactive map +tmap_mode("plot") # Change to view for interactive map tm_shape(desire_lines) + tm_lines(lwd = "all", col = "all", scale = 4, palette = "-viridis") + tm_shape(centroids) + tm_dots(col = "red") ``` -![](6-routing_files/figure-gfm/unnamed-chunk-10-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-11-1.png) 3. Produce some different maps for each mode of travel in the `desire_lines` dataset. How do the numbers of travellers change for walking, driving, and train travel? See example plot below. -![](6-routing_files/figure-gfm/unnamed-chunk-11-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-12-1.png) This dataset has desire lines, but most routing packages need start and endpoints, so we will extract the start and endpoints using the package @@ -179,16 +187,18 @@ endpoints, so we will extract the start and endpoints using the package 5. We need to extract start and end point from those desire lines. We would also like to give each place an ID value + + ``` r # Extract the start and end points -fromPlace <- lwgeom::st_startpoint(desire_top) -toPlace <- lwgeom::st_endpoint(desire_top) +fromPlace = lwgeom::st_startpoint(desire_top) +toPlace = lwgeom::st_endpoint(desire_top) # This returns just the geometry # So make it into an sf data.frame with the ID values from desire_top -fromPlace <- st_sf(data.frame(id = desire_top$from, geometry = fromPlace)) -toPlace <- st_sf(data.frame(id = desire_top$to, geometry = toPlace)) +fromPlace = st_sf(data.frame(id = desire_top$from, geometry = fromPlace)) +toPlace = st_sf(data.frame(id = desire_top$to, geometry = toPlace)) ``` 6. Create a new object called `routes_drive_top`, with driving routes @@ -209,15 +219,15 @@ routes_drive_top = otp_plan(otpcon = otpcon, 7. Plot `routes_drive_top` using the `tmap` package mode. You should see something like the image below. 
- + - ## tmap mode set to plotting + ## tmap mode set to 'plot' ``` r tm_shape(routes_drive_top) + tm_lines() ``` -![](6-routing_files/figure-gfm/unnamed-chunk-17-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-18-1.png) We can also get Isochrones from OTP. @@ -231,7 +241,7 @@ tm_shape(isochrone) + tm_fill("time", alpha = 0.6) ``` -![](6-routing_files/figure-gfm/unnamed-chunk-19-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-22-1.png) To save you time and to prevent overloading the server, we have pre-generated some extra routes. Download these routes and load them @@ -250,14 +260,14 @@ We will now join the number of drivers onto the driving routes. 8. Create a dataset called `n_driver` from `desire_lines` which only have the columns `from` `to` and `drive`. Hint ?dplyr::select and - ?sf::st_drop_geometry + ?sf::st\_drop\_geometry 9. Join the `n_driver` data onto the `routes_drive` data by linking - `fromPlace = from` and `toPlace = to`. Hint ?dplyr::left_join. + `fromPlace = from` and `toPlace = to`. Hint ?dplyr::left\_join. 10. Plot the routes showing the number of drivers on each route. -![](6-routing_files/figure-gfm/unnamed-chunk-24-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-27-1.png) ## Route Networks (also called flow maps) @@ -278,7 +288,7 @@ rnet_drive = overline(routes_drive, "drive") 10. Make a route network for driving and plot it using the `tmap` package. How is is different from just plotting the routes? -![](6-routing_files/figure-gfm/unnamed-chunk-26-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-29-1.png) ## Line Merging @@ -320,7 +330,7 @@ routes_transit_group = rbind(routes_transit_group, routes_transit_group_ml) 12. Plot the transit routes, what do you notice about them? -![](6-routing_files/figure-gfm/unnamed-chunk-30-1.png) +![](6-routing_files/figure-gfm/unnamed-chunk-33-1.png) **Bonus Exercise**: @@ -371,7 +381,7 @@ take. 
estimate_centrality_time(graph) ``` - ## Estimated time to calculate centrality for full graph is 00:00:07 + ## Estimated time to calculate centrality for full graph is 00:00:02 ``` r centrality = dodgr_centrality(graph) @@ -390,7 +400,11 @@ centrality_sf = dodgr_to_sf(centrality) 14. Plot the centrality of the Isle of Wight road network. What can centrality tell you about a road network? -![](6-routing_files/figure-gfm/unnamed-chunk-34-1.png) + + + ## Deprecated tmap v3 code detected. Code translated to v4 + +![](6-routing_files/figure-gfm/unnamed-chunk-37-1.png) 15. Use `dodgr::dodgr_contract_graph` before calculating centrality, how does this affect the computation time and the results? diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-11-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-11-1.png index 63476bd..b65990d 100644 Binary files a/practicals/6-routing_files/figure-gfm/unnamed-chunk-11-1.png and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-11-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-12-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-12-1.png new file mode 100644 index 0000000..0d81e93 Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-12-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-18-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-18-1.png new file mode 100644 index 0000000..b0ca1cc Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-18-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-22-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-22-1.png new file mode 100644 index 0000000..078f79e Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-22-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-27-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-27-1.png new file 
mode 100644 index 0000000..9ccfcca Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-27-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-29-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-29-1.png new file mode 100644 index 0000000..0c783a2 Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-29-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-33-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-33-1.png new file mode 100644 index 0000000..dc42c9a Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-33-1.png differ diff --git a/practicals/6-routing_files/figure-gfm/unnamed-chunk-37-1.png b/practicals/6-routing_files/figure-gfm/unnamed-chunk-37-1.png new file mode 100644 index 0000000..40c8556 Binary files /dev/null and b/practicals/6-routing_files/figure-gfm/unnamed-chunk-37-1.png differ
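
Review note on exercises 8 and 9 in `practicals/6-routing.md`: the join that the hints point to (`?dplyr::select`, `?dplyr::left_join`) can be sketched as below. This is a minimal illustration, not the practical's solution. It assumes plain data frames stand in for the real objects: `desire_lines_df` plays the role of `desire_lines` after `sf::st_drop_geometry()`, `routes_drive_df` plays the role of `routes_drive`, and all values and the `leg` column are made up.

```r
library(dplyr)

# Hypothetical stand-in for desire_lines with geometry already dropped
# (the values are invented for illustration)
desire_lines_df = data.frame(
  from  = c("A", "A", "B"),
  to    = c("B", "C", "C"),
  all   = c(100, 50, 80),
  drive = c(60, 20, 30)
)

# Hypothetical stand-in for routes_drive: one origin-destination pair
# may be spread over several rows (here distinguished by `leg`)
routes_drive_df = data.frame(
  fromPlace = c("A", "A", "B"),
  toPlace   = c("B", "B", "C"),
  leg       = c(1, 2, 1)
)

# Exercise 8: keep only the columns needed for the join
n_driver = desire_lines_df %>%
  select(from, to, drive)

# Exercise 9: link fromPlace = from and toPlace = to
routes_joined = left_join(
  routes_drive_df, n_driver,
  by = c("fromPlace" = "from", "toPlace" = "to")
)

print(routes_joined)
```

Every row of a route receives the `drive` value of its origin-destination pair, so a multi-row route repeats the same number of drivers on each row. With the real `sf` objects, the geometry column of `desire_lines` would need to be dropped first, as the exercise hint suggests.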