---
knit: "bookdown::render_book"
title: "Spatial sampling and resampling for Machine Learning"
author: "Tom Hengl, Leandro Parente, Abdelkrim Bouasria and Ichsani Wheeler"
description: "This R tutorial explains how to organize spatial sampling using R packages. It has three main parts: (1) planning new surveys, i.e. starting from scratch; (2) implementing resampling: learning from existing point data, focusing on subsampling and cross-validation strategies; (3) planning additional sampling: collecting additional point data based on initial models, re-running the analysis and gradually improving the models until the maximum possible accuracy is reached. We use sample datasets to demonstrate processing steps and provide interpretation and discussion of the results. More chapters will be added in the future. Contributions are welcome. To discuss issues or report a bug please use the repository homepage."
url: 'https\://opengeohub.github.io/spatial-sampling-ml/'
bibliography: ./tex/refs.bib
csl: ./tex/apa.csl
documentclass: svmono
classoption: graybox,natbib,nospthms
geometry: "paperwidth=18.90cm, paperheight=24.58cm, top=2.1cm, bottom=2.1cm, inner=2cm, outer=2cm"
monofont: "Source Code Pro"
monofontoptions: "Scale=0.7"
biblio-style: spbasic
fig_caption: true
link-citations: true
gihub-repo: OpenGeoHub/spatial-sampling-ml/
twitter-handle: opengeohub
cover-image: cover.png
site: bookdown::bookdown_site
---

# Introduction {.unnumbered}

## Overview {.unnumbered}


This Rmarkdown tutorial provides practical instructions, illustrated with sample datasets, on how to generate and evaluate sampling plans using your own data. The focus is on preparing sampling designs for predictive mapping, analysing and interpreting existing point data, and planning second- and third-round sampling based on initial models. A similar tutorial focusing on Spatial and spatiotemporal interpolation using Ensemble Machine Learning is also available.

We use several key R packages and existing tutorials including:

Other packages of interest for producing spatial sampling:

For an introduction to Spatial Data Science and Machine Learning with R we recommend studying first:

If you are looking for a gentler introduction to spatial sampling methods in R, please refer to @Bivand2013Springer, @baddeley2015spatial, @BRUS2019464 and @Brus2021sampling. The “Spatial sampling with R” book by Dick Brus and the accompanying R code examples are available via https://github.com/DickBrus/SpatialSamplingwithR.

For an introduction to Predictive Soil Mapping using R refer to https://soilmapper.org.

In Python, Machine Learning with resampling is best implemented via the scikit-learn library, which matches in functionality what is available via the mlr package in R.
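To make the idea of resampling concrete, the following is a minimal base-R sketch of spatially blocked k-fold cross-validation on simulated points. It is an illustration under stated assumptions, not the mlr or scikit-learn implementation: the data are random, and the names `pts`, `nb`, `k` and `block_fold` are hypothetical.

```r
# Illustrative sketch only: simulated points, hypothetical names, base R.
set.seed(42)
n <- 200
pts <- data.frame(x = runif(n), y = runif(n))  # points in the unit square

# Overlay an nb x nb grid of square blocks and label each point with its block.
nb <- 5
bx <- pmin(floor(pts$x * nb), nb - 1)
by <- pmin(floor(pts$y * nb), nb - 1)
pts$block <- by * nb + bx + 1                  # block ID in 1..nb^2

# Assign whole blocks (not single points) to k folds at random,
# so that neighbouring points tend to end up in the same fold.
k <- 5
block_fold <- sample(rep(seq_len(k), length.out = nb * nb))
pts$fold <- block_fold[pts$block]

# Each fold serves once as the validation set.
for (i in seq_len(k)) {
  train <- pts[pts$fold != i, ]
  test  <- pts[pts$fold == i, ]
  # A model would be fitted on `train` and evaluated on `test` here.
  cat(sprintf("fold %d: %d training, %d validation points\n",
              i, nrow(train), nrow(test)))
}
```

Keeping whole blocks together is one common way to reduce the optimistic bias that spatially clustered points can introduce into ordinary random cross-validation; the block size used here is arbitrary.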

To install the most recent versions of the landmap, ranger, forestError and clhs packages from GitHub, use:

```r
# install_github() is provided by devtools; install devtools from CRAN first if needed
library(devtools)
devtools::install_github("envirometrix/landmap")
devtools::install_github("imbs-hl/ranger")
devtools::install_github("benjilu/forestError")
devtools::install_github("pierreroudier/clhs")
```
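After installation, a quick check along the following lines can confirm which of the four packages are available and in which version. This is a hypothetical convenience snippet, not part of the tutorial; it uses only base R and `utils` functions, and the names `pkgs` and `installed` are made up for illustration.

```r
# Hypothetical check, not part of the tutorial: report which packages are available.
pkgs <- c("landmap", "ranger", "forestError", "clhs")

# requireNamespace() returns TRUE/FALSE without attaching the package.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

for (p in pkgs) {
  if (installed[[p]]) {
    cat(p, as.character(utils::packageVersion(p)), "\n")
  } else {
    cat(p, "is not installed\n")
  }
}
```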

## License {.unnumbered}


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

## Acknowledgements {.unnumbered}

This tutorial is based on the “R for Data Science” book by Hadley Wickham and contributors.

OpenLandMap is a collaborative effort and many people have contributed data, software, fixes and improvements via pull requests. OpenGeoHub is an independent not-for-profit research foundation promoting Open Source and Open Data solutions. EnvirometriX Ltd. is the commercial branch of the group, responsible for preparing soil sampling designs for AgriCapture and similar soil monitoring projects.


AgriCaptureCO2 receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101004282.