adds learners table and overloads lrn #142
@@ -56,6 +56,7 @@ Suggests:
    ranger,
    rmarkdown,
    testthat,
    tibble,
    xgboost
RdMacros:
    mlr3misc

@@ -0,0 +1,126 @@
# Ideally this table would be created automatically and all required packages would be installed
# and loaded. Required packages are mlr3, mlr3learners, mlr3proba, and all packages in the
# mlr3learners org, plus, when ready, other packages in mlr3verse that have learners implemented
# in them.
#
library(mlr3)
library(mlr3learners)
library(mlr3proba)
library(data.table)
extra_learners = rownames(
  available.packages(repos = "https://mlr3learners.github.io/mlr3learners.drat")
)

Comment on lines +10 to +12

Suggestion: Wrap in

If this is automated and created during builds then surely there is always internet?

Ah I did not see that this is not part of the function. We could possibly store it in the package though the content then depends on the version which users have installed. I guess querying an online resource (could also be the mlr3 GH repo) while requiring internet access would be better? What about both: Trying to query the online table and fall back to the local one included in the package (with a warning message that this one might not include all learners).

What's/where's the mlr3 GH repo?
I think this would slightly defeat the point because say a user wants to install xgboost but does not know that it lives in mlr3learners and have therefore only installed mlr3. Then the table will not show xgboost nor will it be able to install it if called. Unless you just mean only fall back to local when internet is not available? I guess that would make sense and be more intuitive than just erroring. Assuming the local one is identical to the code above except with no call to

I meant this, yes.
The mlr3 GitHub repo. Outlining the process again:

Sorry completely misread and thought you were saying there's a separate repo just for certain GHaction automations.

That sounds good and just so I understand would the static version just have to be manually typed out and updated. Do you want me to try and set-up the build for mlr3data so we can close the first part (i.e. the online table)?

First I would like to ask @mllg if he agrees that mlr3data would be a good place to store a .csv file containing this information?
Not sure I understand what you mean by this.
Apart from both you can continue to write a .csv containing the table that should be read in later.

Got it, sorry I have no experience with external data structures in R so just trying to visualise it, but understand it now :)
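
The fallback discussed in this thread (query the online table first, fall back to the copy shipped with the package and warn that it might be incomplete) could look roughly like the sketch below. The function name `read_learner_table` and both of its arguments are hypothetical; the PR does not define them, and where the .csv would actually live (mlr3data or elsewhere) is still an open question in the thread.

```r
# Hypothetical helper: try the online learner table first, then fall back to a
# local .csv. `url` and `local_file` are placeholders, not locations from the PR.
read_learner_table = function(url, local_file) {
  online = tryCatch(
    utils::read.csv(url, stringsAsFactors = FALSE),
    # no internet, unreachable host, or missing file: signal "not available"
    error = function(e) NULL,
    warning = function(w) NULL
  )
  if (!is.null(online)) {
    return(online)
  }
  warning(
    "Could not read the online learner table; using the local copy, ",
    "which might not include all learners.",
    call. = FALSE
  )
  utils::read.csv(local_file, stringsAsFactors = FALSE)
}
```

Catching warnings as well as errors is a deliberate simplification here: a failed connection in R typically signals a warning before the error, and for this purpose either one means the online table is unavailable.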

install.packages(extra_learners, repos = "https://mlr3learners.github.io/mlr3learners.drat")
lapply(extra_learners, require, character.only = TRUE, quietly = TRUE)

# construct all learners in attached mlr3verse
keys = mlr_learners$keys()
# potential warnings are either that external package required but not installed or package built
# under different R version
all_lrns = suppressWarnings(mlr3::lrns(keys))

# creates data.table with id split into name and class, as well as original id;
# the mlr3 package that the learner is implemented in; external package it is interfaced from;
# learner properties; feature types; and predict types
#
# may look better as tibble, option to print given in below function
#
# ideally this table is abstracted from the user and they access it through the getter below
learner_table = data.table(t(rbindlist(list(mlr3misc::map(all_lrns, function(.x) {
  idsplt = strsplit(.x$id, ".", TRUE)[[1]]
  list(idsplt[[2]], idsplt[[1]], .x$id, strsplit(.x$man, "::", TRUE)[[1]][1],
    .x$packages[1], .x$properties, .x$feature_types, .x$predict_types)
})))))

colnames(learner_table) = c("name", "class", "id", "mlr3_package", "required_package",
  "properties", "feature_types", "predict_types")
learner_table[, 1:4] = lapply(learner_table[, 1:4], as.character)
rm(all_lrns, extra_learners, keys)

# getter function for the mlr3 learner table, assume it is called `learner_table`
# args:
#   hide_cols `character()`: specify which, if any, columns to hide
#   filter `list()`: named list of conditions to filter on, names correspond to column names
#     in table
#   tibble `logical(1)`: if TRUE returns table as tibble otherwise data.table
#
# examples:
#   list_mlr3learners(hide_cols = c("properties", "feature_types"),
#     filter = list(class = "surv", predict_types = "distr"))
#   list_mlr3learners(tibble = TRUE)
list_mlr3learners = function(hide_cols = NULL, filter = NULL, tibble = FALSE) {

  dt = copy(learner_table)

  if (!is.null(filter)) {
    if (!is.null(filter$class)) {
      dt = subset(dt, class %in% filter$class)
    }
    if (!is.null(filter$mlr3_package)) {
      dt = subset(dt, mlr3_package %in% filter$mlr3_package)
    }
    if (!is.null(filter$required_package)) {
      dt = subset(dt, required_package %in% filter$required_package)
    }
    if (!is.null(filter$properties)) {
      dt = subset(dt, mlr3misc::map_lgl(dt$properties,
        function(.x) any(filter$properties %in% .x)))
    }
    if (!is.null(filter$feature_types)) {
      dt = subset(dt, mlr3misc::map_lgl(dt$feature_types,
        function(.x) any(filter$feature_types %in% .x)))
    }
    if (!is.null(filter$predict_types)) {
      dt = subset(dt, mlr3misc::map_lgl(dt$predict_types,
        function(.x) any(filter$predict_types %in% .x)))
    }
  }

  if (!is.null(hide_cols)) {
    dt = subset(dt, select = !(colnames(dt) %in% hide_cols))
  }

  if (tibble) {
    return(tibble::tibble(dt))
  } else {
    return(dt)
  }
}
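
The filter branches in `list_mlr3learners` all apply the same rule: keep a row if any requested value appears in that row's entry for the column. A minimal base-R sketch of that rule on a toy table is given below. `filter_table` and the two toy rows are illustrative only and not part of the PR; the real `learner_table` is a data.table built from the installed learners.

```r
# Toy stand-in for learner_table: one atomic column and one list column.
# The values are made up for illustration.
toy = data.frame(
  id = c("classif.ranger", "surv.coxboost"),
  class = c("classif", "surv"),
  stringsAsFactors = FALSE
)
toy$predict_types = list(c("response", "prob"), c("crank", "lp", "distr"))

# Hypothetical helper: apply every named condition in `filter` in turn.
# A row is kept if any requested value occurs in its entry for that column,
# which covers both atomic columns and list columns.
filter_table = function(tbl, filter) {
  for (col in names(filter)) {
    keep = vapply(tbl[[col]], function(.x) any(filter[[col]] %in% .x),
      logical(1), USE.NAMES = FALSE)
    tbl = tbl[keep, , drop = FALSE]
  }
  tbl
}

# keeps only the "surv.coxboost" row
filter_table(toy, list(class = "surv", predict_types = "distr"))
```

Looping over `names(filter)` means a new column in the table needs no new branch, at the cost of silently accepting filter names that should probably be validated against the column names first.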

# overloads lrn function to automatically detect and install learners from any packages in
# mlr3verse. uses list_mlr3learners with filtering for the given key.
# this should actually probably be implemented in mlr3misc::dictionary_sugar_get
# however this would create a dependency loop unless the learners table also lives in mlr3misc.
# a vectorised version of this for `lrns` follows naturally.
#
# the function filters the learner_table, searches to see if the required mlr3_package is installed
# and if not asks the user whether to install it; if yes then it is installed and the learner
# loaded, if not then it errors
#
# args:
#   .key `character(1)`: learner key
#
# examples:

lrn("classif.ranger")

unloadNamespace("mlr3learners.coxboost")
utils::remove.packages("mlr3learners.coxboost")
lrn("surv.coxboost")

lrn = function(.key, ...) {
  # look up which mlr3 package implements the requested learner
  pkg = unlist(subset(list_mlr3learners(), id == .key)$mlr3_package)
  # keys missing from the table are left to dictionary_sugar_get(), which raises its own error
  if (length(pkg) == 1L) {
    inst = suppressWarnings(require(pkg, quietly = FALSE, character.only = TRUE))
    if (!inst) {
      mlr3misc::catf("%s is not installed but is required, do you want to install this now?\n", pkg)
      cat("1: Yes\n2: No\n")
      ans = readline() == "1"
      if (ans) {
        install.packages(pkg, repos = "https://mlr3learners.github.io/mlr3learners.drat")
        # attach the freshly installed package so its learners are available in mlr_learners
        library(pkg, character.only = TRUE)
      } else {
        stop(sprintf("%s is not installed but is required.", pkg))
      }
    }
  }
  mlr3misc::dictionary_sugar_get(mlr_learners, .key, ...)
}
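
The check-then-ask-then-install step inside `lrn` could also be factored out so that it behaves predictably in non-interactive sessions (for example during builds), where `readline()` returns an empty string without waiting for input. The sketch below is one possible shape under that assumption. `ensure_package` is a hypothetical name that does not appear in the PR; `utils::menu` stands in for the hand-written "1: Yes / 2: No" prompt, and the drat URL is the one used in the diff.

```r
# Hypothetical helper: make sure `pkg` can be loaded, asking before installing.
ensure_package = function(pkg,
  repos = "https://mlr3learners.github.io/mlr3learners.drat") {

  # already installed: nothing to do
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(TRUE))
  }
  # nobody to ask in a non-interactive session, so fail with the same message
  if (!interactive()) {
    stop(sprintf("%s is not installed but is required.", pkg), call. = FALSE)
  }
  choice = utils::menu(c("Yes", "No"), title = sprintf(
    "%s is not installed but is required, do you want to install this now?", pkg))
  if (choice != 1L) {
    stop(sprintf("%s is not installed but is required.", pkg), call. = FALSE)
  }
  utils::install.packages(pkg, repos = repos)
  # confirm the installation actually succeeded before reporting success
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Installation of %s failed.", pkg), call. = FALSE)
  }
  invisible(TRUE)
}

# a package that ships with R is always found, so no prompt is shown
ensure_package("stats")
```

A vectorised `lrns` could then call such a helper once per distinct `mlr3_package` before constructing the learners.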

We should not pull in tibble as a soft dep.
The following overloads the printer function for data.frame (can be extended for data.tables) and can live in everyone's .Rprofile:

Doesn't CRAN give a warning/note when using triple colon in packages?

This is not for inclusion in a package, this is for a local .Rprofile.