knit | title | author | description | url | bibliography | csl | documentclass | classoption | geometry | monofont | monofontoptions | biblio-style | fig_caption | link-citations | github-repo | twitter-handle | cover-image | site |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
bookdown::render_book | Spatial sampling and resampling for Machine Learning | Tom Hengl, Leandro Parente, Abdelkrim Bouasria and Ichsani Wheeler | This R tutorial contains instructions on how to organize spatial sampling using R packages. It is organized in three main parts: (1) planning new surveys, i.e. starting from scratch; (2) implementing resampling: learning from existing point data, focusing on subsampling and cross-validation strategies; (3) planning additional sampling: sampling additional point data based on initial models, then re-running the analysis and gradually improving models until the maximum possible accuracy is reached. We use sample datasets to demonstrate processing steps and provide interpretation and discussion of the results. More chapters will be added in the future. Contributions are welcome. To discuss issues or report a bug please use the repository homepage. | https://opengeohub.github.io/spatial-sampling-ml/ | ./tex/refs.bib | ./tex/apa.csl | svmono | graybox,natbib,nospthms | paperwidth=18.90cm, paperheight=24.58cm, top=2.1cm, bottom=2.1cm, inner=2cm, outer=2cm | Source Code Pro | Scale=0.7 | spbasic | true | true | OpenGeoHub/spatial-sampling-ml/ | opengeohub | cover.png | bookdown::bookdown_site |
This R Markdown tutorial provides practical instructions, illustrated with sample datasets, on how to generate and evaluate sampling plans using your own data. It focuses on preparing sampling designs for predictive mapping, analyzing and interpreting existing point data, and planning second- and third-round sampling based on initial models. A similar tutorial focusing on Spatial and spatiotemporal interpolation using Ensemble Machine Learning is also available.
We use several key R packages and existing tutorials including:
- sp package,
- clhs package,
- mlr package,
- ranger package,
- forestError package.
Other packages of interest for producing spatial sampling:
- SamplingBigData package,
- sf package,
- spatstat package(s).
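The packages listed above provide full-featured sampling designs. To illustrate the basic idea only, here is a minimal base-R sketch comparing simple random sampling with stratified random sampling over a grid of blocks. The 100 x 100 m study area, the sample size and the block size are hypothetical values chosen for illustration; none of the listed packages are used.

```r
# Minimal sketch: two basic spatial sampling designs on a hypothetical
# 100 x 100 m square study area, using base R only.
set.seed(42)  # fixed seed so the designs are reproducible

n <- 100  # total number of sampling locations (illustrative)

# (1) Simple random sampling: coordinates drawn uniformly over the area
srs <- data.frame(x = runif(n, 0, 100), y = runif(n, 0, 100))

# (2) Stratified random sampling: split the area into a 10 x 10 grid of
# square blocks (strata) and draw one random location inside each block
blocks <- expand.grid(col = 0:9, row = 0:9)
strat <- data.frame(
  x = blocks$col * 10 + runif(nrow(blocks), 0, 10),
  y = blocks$row * 10 + runif(nrow(blocks), 0, 10)
)

# Compare spatial coverage: count the occupied 10 x 10 m blocks per design
occupied <- function(pts) {
  length(unique(paste(floor(pts$x / 10), floor(pts$y / 10))))
}
occupied_srs <- occupied(srs)
occupied_strat <- occupied(strat)
```

By construction the stratified design places one point in every block, while simple random sampling usually leaves some blocks empty and others with several points. Plotting both with `plot(srs)` and `plot(strat)` makes the difference in spatial coverage visible.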
For an introduction to Spatial Data Science and Machine Learning with R we recommend first studying:
- Baddeley, A., Rubak, E. and Turner, R.: “Spatial Point Patterns: Methodology and Applications with R”;
- Becker, M. et al.: “mlr3 book”;
- Irizarry, R.A.: “Introduction to Data Science: Data Analysis and Prediction Algorithms with R”;
- Molnar, C.: “Interpretable Machine Learning: A Guide for Making Black Box Models Explainable”;
- Lovelace, R., Nowosad, J. and Muenchow, J.: “Geocomputation with R”;
- Pebesma, E. and Bivand, R.: “Spatial Data Science: with applications in R”.
If you are looking for a gentler introduction to spatial sampling methods in R please refer to @Bivand2013Springer, @baddeley2015spatial, @BRUS2019464 and @Brus2021sampling. The “Spatial sampling with R” book by Dick Brus and the accompanying R code examples are available via https://github.com/DickBrus/SpatialSamplingwithR.
For an introduction to Predictive Soil Mapping using R refer to https://soilmapper.org.
Machine Learning with resampling in Python is best implemented via the scikit-learn library, which offers functionality comparable to that of the mlr package in R.
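As an illustration of what resampling means for spatial data, the following base-R sketch runs a spatially blocked 5-fold cross-validation, where whole blocks of points (rather than individual points) are assigned to folds. This is one common strategy and only a sketch under stated assumptions: the data are simulated, the block size and number of folds are arbitrary, and `lm()` stands in for any learner. It does not use the mlr or scikit-learn APIs.

```r
# Minimal sketch of spatially blocked 5-fold cross-validation in base R.
# All data are simulated; block size and fold count are illustrative.
set.seed(1)
n <- 200
pts <- data.frame(x = runif(n, 0, 100), y = runif(n, 0, 100))
pts$z <- 0.05 * pts$x + rnorm(n, sd = 0.5)  # hypothetical target variable

# Assign each point to a 20 x 20 m block, then whole blocks to folds
pts$block <- paste(floor(pts$x / 20), floor(pts$y / 20), sep = "_")
block_ids <- unique(pts$block)
k <- 5
fold_of_block <- setNames(sample(rep_len(1:k, length(block_ids))), block_ids)
pts$fold <- unname(fold_of_block[pts$block])

# Fit on k-1 folds, predict the held-out fold, collect the RMSE per fold
rmse <- sapply(1:k, function(i) {
  train <- pts[pts$fold != i, ]
  test  <- pts[pts$fold == i, ]
  fit <- lm(z ~ x + y, data = train)  # stand-in for any learner
  sqrt(mean((test$z - predict(fit, newdata = test))^2))
})
cv_rmse <- mean(rmse)  # average cross-validation error
```

Because all points in a block share a fold, nearby points never appear in both the training and the test set of the same split, which tends to give a less optimistic error estimate than purely random folds when the data are spatially clustered.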
To install the most recent versions of the landmap, ranger, forestError and clhs packages from GitHub, use:

```r
# install_github() requires the devtools package: install.packages("devtools")
library(devtools)
devtools::install_github("envirometrix/landmap")
devtools::install_github("imbs-hl/ranger")
devtools::install_github("benjilu/forestError")
devtools::install_github("pierreroudier/clhs")
```
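After installation it can be useful to confirm which of these packages are actually available. The short check below is a sketch using base R's `requireNamespace()`. The variable names `pkgs`, `installed` and `missing_pkgs` are illustrative and not part of the tutorial.

```r
# Check which of the packages installed above are available, without
# attaching them (requireNamespace() returns TRUE or FALSE per package).
pkgs <- c("landmap", "ranger", "forestError", "clhs")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of packages that still need to be installed
missing_pkgs <- names(installed)[!installed]
```

If `missing_pkgs` is not empty, re-run the corresponding `install_github()` call and inspect its output for errors.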
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
This tutorial is based on the “R for Data Science” book by Hadley Wickham and contributors.
OpenLandMap is a collaborative effort and many people have contributed data, software, fixes and improvements via pull requests. OpenGeoHub is an independent not-for-profit research foundation promoting Open Source and Open Data solutions. EnvirometriX Ltd. is the commercial branch of the group, responsible for preparing soil sampling designs for AgriCapture and similar soil monitoring projects.
AgriCaptureCO2 receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101004282.