Melanoma task #310

Open
wants to merge 38 commits into
base: main

Conversation

cxzhang4
Collaborator

@cxzhang4 cxzhang4 commented Nov 28, 2024

https://huggingface.co/datasets/carsonzhang/ISIC_2020_small

Should we delete the individual files on Hugging Face?

cxzhang4 and others added 30 commits October 20, 2024 21:35
Bumps [JamesIves/github-pages-deploy-action](https://github.com/jamesives/github-pages-deploy-action) from 4.6.8 to 4.6.9.
- [Release notes](https://github.com/jamesives/github-pages-deploy-action/releases)
- [Commits](JamesIves/github-pages-deploy-action@v4.6.8...v4.6.9)

---
updated-dependencies:
- dependency-name: JamesIves/github-pages-deploy-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Member

@sebffischer sebffischer left a comment

It looks like there are still some small cleanup tasks to be done (maybe I reviewed too early), but I already left some comments. Looking good!

#'
#' @references
#' `r format_bib("melanoma2021")`
#' @examplesIf torch::torch_is_installed()
Member

I don't think this is required here, as printing the task does not create any tensors


compressed_tarball_file_name = "hf_ISIC_2020_small.tar.gz"
compressed_tarball_path = file.path(path, compressed_tarball_file_name)
curl::curl_download(paste0(base_url, compressed_tarball_file_name), compressed_tarball_path)
Member

Because curl is in Suggests, we should call mlr3misc::require_namespaces("curl") first, so users get a good error message when they don't have it installed.

Collaborator Author

@cxzhang4 cxzhang4 Nov 29, 2024

But we should just write require_namespaces() without the mlr3misc:: prefix, right?

Member

yes!
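
For illustration, the pattern discussed in this thread can be sketched in base R as follows. The helper `check_suggested()` is hypothetical and only mirrors the idea; the PR itself would call `require_namespaces()` from mlr3misc rather than define anything like this.

```r
# Hypothetical helper (not part of the PR): fail early with a readable
# message when a package listed under Suggests is not installed.
check_suggested = function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf("Package '%s' is required for this step but is not installed.", pkg),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Would be called right before the download step, e.g.:
# check_suggested("curl")
# curl::curl_download(...)
```

Checking before the first `curl::` call means users see a message that names the missing package instead of a generic namespace-loading error.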

old = c("image", "patient", "anatom_site_general"),
new = c("image_name", "patient_id", "anatom_site_general_challenge")
)[, split := "test"]
metadata = rbind(training_metadata, test_metadata, fill = TRUE)
Member

what is being filled here?

@@ -0,0 +1,84 @@
library(data.table)
Member

The attic is more for code that was discarded for now but might be used again in the future (I haven't cleaned up in a while though :D).

If it is about how to create the data, I would probably rather put it into data-raw. But since we already document on Hugging Face how it was created, I don't think we really need it in this repository.

training_metadata = data.table::fread(here::here(path, training_metadata_file_name))

test_metadata_file_name = "ISIC_2020_Test_Metadata.csv"
test_metadata = data.table::fread(here::here(path, test_metadata_file_name))
Member

Don't use here::here() here.
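
One possible replacement, sketched under the assumption that `path` already holds the directory containing the metadata files, is plain `file.path()`, which the download code in this PR already uses. The `tempdir()` value below is only a placeholder for illustration.

```r
# Placeholder directory for illustration; in the PR, `path` comes from the caller.
path = tempdir()

test_metadata_file_name = "ISIC_2020_Test_Metadata.csv"

# Build the path relative to `path` directly, without here::here()
test_metadata_path = file.path(path, test_metadata_file_name)

# test_metadata = data.table::fread(test_metadata_path)
```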

old = c("image", "patient", "anatom_site_general"),
new = c("image_name", "patient_id", "anatom_site_general_challenge")
)[, split := "test"]
metadata = rbind(training_metadata, test_metadata)
Collaborator Author

I think fill = TRUE was here because we want to fill the response variable of the test data with NA. I will double-check to confirm.

Member

ok makes sense, maybe add a comment
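
A minimal sketch of the behaviour discussed in this thread, using made-up toy tables rather than the real ISIC metadata (the column names follow the snippets above, the values are invented):

```r
library(data.table)

# Toy stand-ins: the training table has a response column, the test table does not
training_metadata = data.table(
  image_name = c("img_1", "img_2"),
  benign_malignant = c("benign", "malignant"),
  split = "train"
)
test_metadata = data.table(image_name = "img_3", split = "test")

# fill = TRUE matches columns by name and fills columns that are
# missing from one of the tables (here: the response) with NA
metadata = rbind(training_metadata, test_metadata, fill = TRUE)
print(metadata)
```

Without `fill = TRUE`, `rbind()` on data.tables with a different number of columns raises an error, which is why a short code comment explaining the argument seems worthwhile.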

@sebffischer
Member

And yes, I would delete the individual files on Hugging Face.

…y_imagenet. Probably need to construct full file paths first
@@ -83,7 +83,7 @@ load_task_melanoma = function(id = "melanoma") {
cached_constructor = function(backend) {
data = cached(constructor_melanoma, "datasets", "melanoma")$data

- data[, benign_malignant := factor(benign_malignant, levels = c("benign", "malignant"))]
+ data[, outcome := factor(outcome, levels = c("benign", "malignant"))]
Member

The static analyzer will not know that outcome is a valid variable here (this is generally an issue with non-standard evaluation, NSE).
You can use outcome := factor(get("outcome"), ...) as a workaround.
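
As a rough sketch of the suggested workaround on a toy table (the data below is invented for illustration, not taken from the PR):

```r
library(data.table)

# Toy data with a character outcome column
data = data.table(outcome = c("benign", "malignant", "benign"))

# get("outcome") looks the column up by name at run time, so the right-hand
# side no longer contains a bare symbol that static checks would flag
data[, outcome := factor(get("outcome"), levels = c("benign", "malignant"))]

str(data$outcome)
```

The result is the same as with the bare column name; only the way the column is referenced changes.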

2 participants