From d98080e8b81f573cfba8cc2726dfbba97c95703a Mon Sep 17 00:00:00 2001 From: asardaes Date: Wed, 18 Jul 2018 22:15:12 +0200 Subject: [PATCH] Update CONTRIBUTING.md --- CONTRIBUTING.md | 63 +++++++++++++++++++++++++++++-------------------- 1 file changed, 37 insertions(+), 26 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index dec45fa2..7ad591ac 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -9,42 +9,56 @@ implementing a C++ distance with support for multi-threading is also possible. The framework used by `dtwclust` will be explained here, assuming familiarity with R's C/C++ interface and C++ itself. +## Foreword + +One of the packages that is linked against is `RcppArmadillo`. +When one works with it, +the `Rcpp` header should *not* be included, +since its functionality is included automatically by `RcppArmadillo`. +However, the `RcppArmadillo` header increases compilation times considerably, +so it should only be included when necessary. +This is why many `dtwclust` source files contain several classes in the same file, +and it will also explain some of the design choices below; +this reduces compilation times significantly. + ## Stand-alone function -By stand-alone we mean that it can be used directly without going through `proxy`. +Stand-alone means that it can be used directly without going through `proxy`. On the R side there are pretty much no restrictions, the function can have any number of parameters, but it's probably better if consistency checks are done in R. -See for example [`dtw_basic`](https://github.com/asardaes/dtwclust/blob/master/R/DISTANCES-dtw-basic.R#L52). +See for example [`dtw_basic`](https://github.com/asardaes/dtwclust/blob/master/R/DISTANCES-dtw-basic.R). -On the C++ side the distance should be declared in the corresponding [header](https://github.com/asardaes/dtwclust/blob/master/src/distances/distances.h), -registered in the [initialization](https://github.com/asardaes/dtwclust/blob/master/src/init.cpp#L8), -and [defined](https://github.com/asardaes/dtwclust/blob/master/src/distances/dtw-basic.cpp). +On the C++ side the distance should be declared in the corresponding [header](https://github.com/asardaes/dtwclust/blob/master/src/distances/R-gateways.h), +registered in the [initialization](https://github.com/asardaes/dtwclust/blob/master/src/init.cpp), +and [defined](https://github.com/asardaes/dtwclust/blob/master/src/distances/R-gateways.cpp). Importantly, if the stand-alone version will serve as a basis for the `proxy` version, -the [core calculations](https://github.com/asardaes/dtwclust/blob/master/src/distances/dtw-basic.cpp#L88) should be done independently of any R/Rcpp API, +the core calculations should be done independently of any R/Rcpp API, depending either on raw pointers, -[custom wrapper classes](https://github.com/asardaes/dtwclust/blob/master/src/utils/SurrogateMatrix.h), -or `RcppParallel`'s wrappers. -These can be declared in the [internal distances header](https://github.com/asardaes/dtwclust/blob/master/src/distances/distances-details.h). +or on some kind of wrapper; +such a wrapper is available in [this template](https://github.com/asardaes/dtwclust/blob/master/src/utils/SurrogateMatrix.h). +These core functions should be declared in the [internal distances header](https://github.com/asardaes/dtwclust/blob/master/src/distances/details.h). +Since they don't include many third-party headers, +they can be implemented in different files; +see for example [`dtw_basic`](https://github.com/asardaes/dtwclust/blob/master/src/distances/dtw-basic.cpp). ## `proxy` function -For this case, we will start with the C++ side. -A [`DistanceCalculator`](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.h) has to be implemented. +A [`DistanceCalculator`](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.h) has to be implemented. The concrete class should be declared there, -and added to the [factory](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L27). +and added to the [factory](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L28). Since all time series are passed from R as a list of vectors, matrices, or complex vectors, there is the `TSTSList` [templated class](https://github.com/asardaes/dtwclust/blob/master/src/utils/TSTSList.h) that works with `Rcpp`'s `NumericVector`, `NumericMatrix` and `ComplexVector`, -saving them Armadillo's `mat` and `cx_mat` so that they are thread-safe. +saving them as Armadillo's `mat` and `cx_mat` so that they are thread-safe. Univariate series are saved as matrices with 1 column. ### Constructor -It is expected that the `DistanceCalculator`'s constructor will take 3 `SEXP` parameters that will contain `Rcpp::List`s with: +It is expected that the `DistanceCalculator`'s constructor will take 3 `SEXP` parameters that will all contain `Rcpp::List`s with: any arguments for the distance, the time series in `x` (from `proxy`), and the time series in `y` (from `proxy`). -See for example the [DtwBasicCalculator](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L53), +See for example the [DtwBasicCalculator](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L50), and note that `x` and `y` are simply given to the `TSTSList` template. **Important**: if the concrete class has any private members that should be unique to each thread, @@ -59,25 +73,22 @@ and each thread will `delete` the clone when it's done. If there are private members that should be unique to each thread (e.g. dynamically allocated memory), they should be set-up during cloning. -See what [DtwBasicCalculator](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L88) does, -and note its [destructor](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L70). +For example, see what [DtwBasicCalculator](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L74) does. ### Calculate The factory method `calculate` takes 2 integers `i` and `j`, and it is expected that it returns the distance between `x[i]` and `y[j]`. -Some calculators [dispatch](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L78) to the appropriate method directly, -but [others](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L324) can pass more parameters as needed. -Also note how in some cases the core calculations are further [delegated](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L100) by calling special wrappers defined in their own [header](https://github.com/asardaes/dtwclust/blob/master/src/distances/distances-details.h), -which don't use R or Rcpp's API. -This avoids duplicating code that is used in stand-alone functions (described [above](#stand-alone-function)), -but is not always necessary; -for instance, the SbdCalculator does the [calculations directly](https://github.com/asardaes/dtwclust/blob/master/src/distance-calculators/distance-calculators.cpp#L330), +Some calculators [dispatch](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L69) to the appropriate method directly, +but [others](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L166) can pass more parameters as needed. +Also note how in most cases the core calculations are further [delegated](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L85) to the core functions described [above](#stand-alone-function), +but it's not always necessary. +For instance, the `SbdCalculator` does the [calculations directly](https://github.com/asardaes/dtwclust/blob/master/src/distances/calculators.cpp#L265), since the stand-alone version is implemented entirely in R. ### R side -All functions have a similar structure: +All functions have a similar structure (`dtw_basic` used as an example): - [Check consistency of inputs](https://github.com/asardaes/dtwclust/blob/master/R/DISTANCES-dtw-basic.R#L111) - [Prepare common parameters by evaluationg a pre-defined expression](https://github.com/asardaes/dtwclust/blob/master/R/DISTANCES-dtw-basic.R#L123) @@ -87,7 +98,7 @@ All functions have a similar structure: - [Final output adjustments](https://github.com/asardaes/dtwclust/blob/master/R/DISTANCES-dtw-basic.R#L160) They are registered with `proxy` during [loading](https://github.com/asardaes/dtwclust/blob/master/R/pkg.R#L77), -and unregistered during [unload](https://github.com/asardaes/dtwclust/blob/master/R/pkg.R#L152). +and unregistered during [unload](https://github.com/asardaes/dtwclust/blob/master/R/pkg.R#L154). ## Final details