Feature Request: Registry of learners #81
Comments
So not what's there currently in memory, but what's available for installation?
OK, I also just read your other issue #82.
Not necessarily. I'd imagine that this would just be a registry of strings, e.g. a data.table that is appended once by whoever adds a new learner, so of the form sketched below.
It would make sense to live in
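For illustration, a minimal sketch of such a registry table; the column names and the appended learner/package ids are made up here, not an agreed format:

library(data.table)

# hypothetical registry table; each new extension appends one row per learner
learner_registry = data.table(
  id      = c("classif.gbm", "surv.ranger"),
  package = c("mlr3learners.gbm", "mlr3proba")
)

# adding a new learner would just be an rbind of one more row
learner_registry = rbind(
  learner_registry,
  data.table(id = "classif.newlearner", package = "mlr3learners.newlearner")
)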
Surely one of you checks this package though and verifies it's up to some level of standard? You could also ask the maintainer of the extension to put in a PR to make sure they are added to the registry. Ultimately, if they don't do it then it's their loss.
Well, no, what you are describing is a different process. a) This is how it is currently: you write a new learner extension package, you put it somewhere on GitHub, and we have given you enough unit-testing tools to demonstrate that it works. Now maybe the whole mlr3 team is on extended leave; you can still publish your package and everything works. b) If we do what you propose, we now have to update package X on CRAN (mlr3learners, or mlr3? rather the first?) each time that table changes. OTOH you COULD argue that we are already maintaining the wiki table on GitHub anyway.
I'm not suggesting you push updates to CRAN for each learner; they can wait until the next release. But yes, that is essentially the issue, because when working in R I don't want to go back and forth between GitHub and R.
Machine-readable format: available.packages(repos = "https://mlr3learners.github.io/mlr3learners.drat")
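For example, assuming the drat repository is reachable, the extension packages can be listed and installed directly from R (the package name in the install call is just one of the packages shown later in this thread):

repos = "https://mlr3learners.github.io/mlr3learners.drat"
# list the extension packages currently published in the drat repository
rownames(available.packages(repos = repos))
# install one of them, resolving its dependencies from CRAN as well
install.packages("mlr3learners.gbm", repos = c(repos, getOption("repos")))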
I'm not saying that this is very convenient; we should really look into making the additional learners easier to discover and install.
Oh, and if you want properties ... yes, we might need to create a JSON file for this.
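A rough sketch of how such a JSON file could be generated from the learners currently registered in mlr_learners; jsonlite, the file name, and the field names are assumptions, not a decided format, and it presumes the relevant extension packages are already loaded:

library(mlr3)
library(jsonlite)

# collect id, required packages, and properties for every registered learner
meta = lapply(mlr_learners$keys(), function(key) {
  learner = lrn(key)
  list(
    id         = learner$id,
    packages   = learner$packages,
    properties = learner$properties
  )
})

# write the metadata as a JSON file that could be deployed next to the drat repo
writeLines(toJSON(meta, auto_unbox = TRUE, pretty = TRUE), "learners.json")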
It would have been nice to have posted the output too.
Which demonstrates that this does not answer Raphael's request? You cannot see the learners provided by the packages; you would have to download and install them.
Please note that the package name does NOT coincide with the learner name, and an extension package can and should contain multiple learners.
I also think this is too much information for the average user; it really doesn't need to be more than id and package (plus properties as a bonus).
OK, then maintaining an extra file seems inevitable.
The reprex below creates meta-information which could be deployed as text files to mlr3learners.drat. This information can be used to auto-load/auto-install packages/learners behind the scenes (a sketch of such a helper follows after the reprex output). The only restriction is that one needs internet access, but we could assert this. So in summary we could have files that store the available learner keys together with their packages and properties, as created in the reprex below.
(This information could also be used to automate the creation of a nice HTML table, similar to the one we have in mlr2.)
library(mlr3)
library(mlr3learners)
library(mlr3proba)
library(magrittr)
extra_learners <- rownames(available.packages(repos = "https://mlr3learners.github.io/mlr3learners.drat"))
lapply(extra_learners, require, character.only = TRUE, quietly = TRUE)
keys <- mlr_learners$keys()
print(extra_learners)
#> [1] "mlr3learners.C50" "mlr3learners.c50"
#> [3] "mlr3learners.extratrees" "mlr3learners.fnn"
#> [5] "mlr3learners.gbm" "mlr3learners.kernlab"
#> [7] "mlr3learners.mboost" "mlr3learners.partykit"
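# the keys could be deployed as a plain text file and read back via dget()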
dput(keys, file = paste0(tempdir(), "/keys.txt"))
dget(file = paste0(tempdir(), "/keys.txt"))
#> [1] "classif.C5.0" "classif.ctree" "classif.debug"
#> [4] "classif.extratrees" "classif.featureless" "classif.fnn"
#> [7] "classif.gamboost" "classif.gbm" "classif.glmboost"
#> [10] "classif.glmnet" "classif.kknn" "classif.ksvm"
#> [13] "classif.lda" "classif.log_reg" "classif.naive_bayes"
#> [16] "classif.qda" "classif.ranger" "classif.rpart"
#> [19] "classif.svm" "classif.xgboost" "dens.hist"
#> [22] "dens.kde" "dens.kdeKD" "dens.kdeKS"
#> [25] "dens.locfit" "dens.logspline" "dens.mixed"
#> [28] "dens.nonpar" "dens.pen" "dens.plug"
#> [31] "dens.spline" "regr.ctree" "regr.extratrees"
#> [34] "regr.featureless" "regr.fnn" "regr.gamboost"
#> [37] "regr.gbm" "regr.glmboost" "regr.glmnet"
#> [40] "regr.kknn" "regr.km" "regr.ksvm"
#> [43] "regr.lm" "regr.ranger" "regr.rpart"
#> [46] "regr.svm" "regr.xgboost" "surv.blackboost"
#> [49] "surv.coxph" "surv.cvglmnet" "surv.flexible"
#> [52] "surv.gamboost" "surv.gbm" "surv.glmboost"
#> [55] "surv.glmnet" "surv.kaplan" "surv.mboost"
#> [58] "surv.nelson" "surv.obliqueRSF" "surv.parametric"
#> [61] "surv.penalized" "surv.randomForestSRC" "surv.ranger"
#> [64] "surv.rpart" "surv.svm"
all_lrns = lrns(keys)
properties = mlr3misc::map(all_lrns, function(.x) .x$properties) %>%
setNames(keys)
package = mlr3misc::map(all_lrns, function(.x) .x$packages)
tibble::tibble(name = keys, package = package, properties = properties)
#> # A tibble: 65 x 3
#> name package properties
#> <chr> <list> <named list>
#> 1 classif.C5.0 <chr [1]> <chr [4]>
#> 2 classif.ctree <chr [1]> <chr [3]>
#> 3 classif.debug <chr [0]> <chr [3]>
#> 4 classif.extratrees <chr [1]> <chr [3]>
#> 5 classif.featureless <chr [0]> <chr [5]>
#> 6 classif.fnn <chr [1]> <chr [2]>
#> 7 classif.gamboost <chr [1]> <chr [2]>
#> 8 classif.gbm <chr [1]> <chr [5]>
#> 9 classif.glmboost <chr [1]> <chr [2]>
#> 10 classif.glmnet <chr [1]> <chr [3]>
#> # … with 55 more rows
Created on 2020-04-04 by the reprex package (v0.3.0)
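To make the auto-install idea concrete, here is a minimal sketch of a helper. The file name learners.csv, its columns, and the helper itself are hypothetical; it assumes a table like the one in the reprex (name, package) has been deployed to the drat repository, and that each extension package registers its learners when its namespace is loaded:

drat_repo = "https://mlr3learners.github.io/mlr3learners.drat"

install_learner = function(key, repos = drat_repo) {
  # assert internet access, as discussed above
  stopifnot(curl::has_internet())
  # hypothetical deployed file mapping learner keys to their extension packages
  registry = read.csv(url(paste0(repos, "/learners.csv")), stringsAsFactors = FALSE)
  pkg = unique(registry$package[registry$name == key])
  if (length(pkg) != 1L) {
    stop("No unique extension package found for learner '", key, "'")
  }
  # install the extension package on demand, then load it so its learners register
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = c(repos, getOption("repos")))
    requireNamespace(pkg)
  }
  mlr3::lrn(key)
}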
I guess we can close this, @mllg @berndbischl?
It would be nice to have a permanent registry specifically for mlr3learners and the mlr3learners.<package> extensions that lists all available learners to install, i.e. like mlr3::mlr_learners, except not a dictionary that gets repopulated but instead a permanent list of all learners that can be installed at any given time. If this was a table like mlr::listLearners() with properties, that would be a bonus!