Merge pull request #167 from aim-rsf/rOpensci-editor-comments

R opensci editor comments
aim-rsf · Dec 17, 2024 · 9556a43
2 parents 2c9fcb8 + dbe6e20

Showing 34 changed files with 208 additions and 286 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -3,22 +3,22 @@
 We warmly welcome contributions to the browseMetadata project! 
 This document provides guidelines for contributing to this repository.
 
-## How to Contribute
+## How to contribute
 
-### Reporting Issues
+### Reporting issues
 
 - **Bug Reports**: If you find a bug, please open an issue with a clear description of the problem and steps to reproduce it.
 - **Feature Requests**: Suggestions for new features or improvements are always welcome. Please open an issue to discuss your ideas.
 
-### Making Changes
+### Making changes
 
 1. **Fork the Repository**: Start by forking the repository to your GitHub account.
 2. **Create a Feature Branch**: Create a new branch for your feature or fix.
 3. **Make Your Changes**: Implement your changes, adhering to the coding standards and practices outlined below.
 4. **Test Your Changes**: Ensure your changes do not break any existing functionality.
 5. **Submit a Pull Request**: Open a pull request from your feature branch to the main branch of the original repository.
 
-### Coding Standards
+### Coding standards
 
 - Follow the [tidyverse style guide](https://style.tidyverse.org) for R code.
 - Write clear, readable, and maintainable code.
@@ -39,12 +39,11 @@ If your contribution involves changes to the R package itself (as an author or r
 3. In this **Git** tab, move to the branch you want to make changes in (or review and test the changes of someone else).
 4. Ensure that your current working directory is the R package directory you cloned (`getwd()` to check and `setwd()` to change).
 5. Run `devtools::load_all()` in the R console. You should see `ℹ Loading browseMetadata` returned.
-6. Test the function runs by running `domain_mapping()` in the R console.
-7. Make your changes (or review changes made by others), and commit these changes in the way you choose to interact with git locally!
+6. Make your changes (or review changes made by others), and commit these changes in the way you choose to interact with git locally!
 
 If you run into issues with branches not seeming to be up to date in the R Studio workspace, consider running `remove.packages("browseMetadata")` and trying the above steps again, in case a previously installed package library is getting in the way somehow. 
 
-### Working with Package Data
+### Working with package data
 
 - **Creating .rda Files**: To create `.rda` files in the data directory of the package, use the following command in R:
   ```R
@@ -63,18 +62,24 @@ If you run into issues with branches not seeming to be up to date in the R Studi
   ```
   Again, replace `dataname` with the name of the data you wish to load.
 
-### Building Documentation
+### Building documentation
 
 - **Generating Documentation Files**: To build the documentation files for the package, use the `roxygen2` package:
   ```R
-  library(roxygen2)
-  roxygenise()
+  devtools::document() 
   ```
-  This will generate the necessary documentation based on your roxygen comments in the R code.
+  This will generate the .Rd files from any updated roxygen comments.
 
-### Testing Your Changes
+### Testing your changes and checking your style :sunglasses:
 
-- Ensure that your changes do not break any existing functionality. Run any existing tests, and consider adding new tests to cover your changes.
+Ensure that your changes do not break any existing functionality. Run any existing tests, and consider adding new tests to cover your changes. Here are some helpful functions to consider:
+
+- `codemetar::write_codemeta()` ensures the metadata file is up to date.
+- `lintr::lint_package(path = ".")` checks adherence to a given style and flags syntax errors and possible semantic issues.
+- `desc::desc_normalize()` ensures the DESCRIPTION file follows a standard structure and style.
+- `styler::style_pkg()` ensures consistent code styling that matches the guidelines.
+- `devtools::check()` runs a comprehensive package check. 
+- https://docs.ropensci.org/pkgcheck/ (there is also a GitHub Action that runs this)
 
 ### Submitting Changes
 

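Reviewer note: the helper functions listed under "Testing your changes" in `CONTRIBUTING.md` could be chained in a single call. The following is a minimal sketch and not part of this pull request. `run_pre_pr_checks()` and its `dry_run` argument are hypothetical names, the ordering of the steps is an assumption, and the real run (`dry_run = FALSE`) assumes the listed packages are installed.

```r
# Hypothetical helper: run the checks named in CONTRIBUTING.md in one go.
# With dry_run = TRUE it only reports which steps would run.
run_pre_pr_checks <- function(pkg_path = ".", dry_run = TRUE) {
  # Each step is wrapped in a function so nothing is evaluated until called.
  steps <- list(
    "styler::style_pkg"         = function() styler::style_pkg(pkg_path),
    "lintr::lint_package"       = function() lintr::lint_package(path = pkg_path),
    "devtools::document"        = function() devtools::document(pkg_path),
    "desc::desc_normalize"      = function() desc::desc_normalize(pkg_path),
    "codemetar::write_codemeta" = function() codemetar::write_codemeta(pkg_path),
    "devtools::check"           = function() devtools::check(pkg_path)
  )

  if (dry_run) {
    return(names(steps))
  }

  # Run every step in order and keep the results (e.g. the lints) for review.
  results <- lapply(names(steps), function(step_name) {
    message("Running ", step_name, "()")
    steps[[step_name]]()
  })
  names(results) <- names(steps)
  invisible(results)
}

# List the planned steps without touching the package.
planned_steps <- run_pre_pr_checks(dry_run = TRUE)
print(planned_steps)
```

Wrapping each call in a function keeps the dry run usable even when some of the packages are not installed, because nothing is looked up until a step is actually invoked.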
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -3,23 +3,21 @@ Package: browseMetadata
 Title: Browse and categorise health metadata
 Version: 2.0.2
 Authors@R: c(
-    person("Rachael", "Stickland", , "[email protected]", 
-    role = c("aut", "cre"),
-    comment = c(ORCID = "0000-0003-3398-4272")),
-    person("Batool", "Almarzouq", , , role = "ctb",
-    comment = c(ORCID = "0000-0002-3905-2751")),
-    person("Mahwish", "Mohammad", , , role = "ctb",
-    comment = c(ORCID = "0009-0004-5295-0726")),
-    person("Daniel", "Delbarre", , , role = "ctb",
-    comment = c(ORCID = "0000-0003-4633-4252")),
-    person("Nida", "Ziauddeen", , , role = "ctb",
-    comment = c(ORCID = "0000-0002-8964-5029"))
-    )
-Maintainer: Rachael Stickland <[email protected]>
-Description: Visualise and categorise publicly available metadata from health 
-  datasets. By interacting with metadata prior to gaining full access to health 
-  datasets, researchers can use this tool to browse datasets and categorise 
-  variables.
+    person("Rachael", "Stickland", , "[email protected]", role = c("aut", "cre"),
+           comment = c(ORCID = "0000-0003-3398-4272")),
+    person("Batool", "Almarzouq", role = "ctb",
+           comment = c(ORCID = "0000-0002-3905-2751")),
+    person("Mahwish", "Mohammad", role = "ctb",
+           comment = c(ORCID = "0009-0004-5295-0726")),
+    person("Daniel", "Delbarre", role = "ctb",
+           comment = c(ORCID = "0000-0003-4633-4252")),
+    person("Nida", "Ziauddeen", role = "ctb",
+           comment = c(ORCID = "0000-0002-8964-5029"))
+  )
+Description: Visualise and categorise publicly available metadata from
+    health datasets. By interacting with metadata prior to gaining full
+    access to health datasets, researchers can use this tool to browse
+    datasets and categorise variables.
 License: GPL (>= 3)
 URL: https://aim-rsf.github.io/browseMetadata/
 BugReports: https://github.com/aim-rsf/browseMetadata/issues
@@ -31,15 +29,16 @@ Imports:
     ggplot2,
     gridExtra,
     htmlwidgets,
-    plotly,
     jsonlite,
+    plotly,
     tidyr
 Suggests: 
+    devtools,
     knitr,
+    mockery,
     rmarkdown,
-    devtools,
     testthat (>= 3.0.0),
-    mockery
+    withr
 VignetteBuilder: 
     knitr
 Config/testthat/edition: 3

diff --git a/NAMESPACE b/NAMESPACE
@@ -34,6 +34,7 @@ importFrom(stats,reorder)
 importFrom(tidyr,complete)
 importFrom(tidyr,pivot_longer)
 importFrom(tools,file_path_sans_ext)
+importFrom(utils,browseURL)
 importFrom(utils,packageVersion)
 importFrom(utils,read.csv)
 importFrom(utils,write.csv)
diff --git a/R/browse_metadata.R b/R/browse_metadata.R
@@ -25,8 +25,9 @@
 #' @importFrom plotly plot_ly layout
 #' @importFrom htmlwidgets saveWidget
 #' @importFrom tidyr pivot_longer
+#' @importFrom utils browseURL
 
-browse_metadata <- function(json_file = NULL, output_dir = NULL) {
+browse_metadata <- function(json_file = NULL, output_dir = getwd()) {
   # DEFINE INPUTS AND OUTPUTS ----
 
   ## Read in the json file containing the meta data, if null load the demo file
@@ -40,10 +41,6 @@ browse_metadata <- function(json_file = NULL, output_dir = NULL) {
     meta_json <- fromJSON(json_file)
   }
 
-  ## Set output_dir to current wd if user has not provided it
-  if (is.null(output_dir)) {
-    output_dir <- getwd()
-  }
   ## Extract dataset from json_file
   dataset <- meta_json$dataModel
   dataset_name <- dataset$label
@@ -165,17 +162,15 @@ browse_metadata <- function(json_file = NULL, output_dir = NULL) {
   saveWidget(widget = barplot_html, file = bar_fname, selfcontained = TRUE)
 
   ## Save the data that made the bar plot to a csv file
-  bar_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
-  write.csv(count_empty_long, bar_fname, row.names = FALSE)
+  bar_data_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
+  write.csv(count_empty_long, bar_data_fname, row.names = FALSE)
 
   setwd(original_wd) # saveWidget has a bug with paths & saving
 
   # OUTPUTS ----
   cat("\n")
-  cli_alert_info("Three outputs have been saved to your output directory.")
-  cli_alert_info("The two html files are shown in your Viewer tab. Open in your browser for full screen viewing.")
-  cat("\n")
+  browseURL(table_fname)
+  browseURL(bar_fname)
+  cli_alert_info("Three outputs have been saved to your output directory, and two outputs should have opened in your browser.")
 
-  html_figs <- list(table_html = table_html, barplot_html = barplot_html)
-  return(html_figs)
 } # end of function
diff --git a/R/data_manipulation.R b/R/data_manipulation.R
@@ -40,9 +40,9 @@ json_table_to_df <- function(dataset, n) {
 count_empty_desc <- function(table_df, table_colname) {
   table_df["empty"] <- NA
 
-  for (data_v in 1:nrow(table_df)) {
-    if ((nchar(table_df$description[data_v]) == 1) |
-      (table_df$description[data_v] == "Description to follow") |
+  for (data_v in seq_len(nrow(table_df))) {
+    if ((nchar(table_df$description[data_v]) == 1) ||
+      (table_df$description[data_v] == "Description to follow") ||
       (table_df$description[data_v] == "NA")) {
       table_df$empty[data_v] <- "Yes"
     } else {

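Reviewer note: the `seq_len()` and `||` changes in `R/data_manipulation.R` are more than style fixes. The sketch below, using a made-up empty data frame (`empty_df` is an illustrative name, not from the package), shows the edge cases each change guards against.

```r
# An empty data frame: the case where 1:nrow() misbehaves.
empty_df <- data.frame(description = character(0))

# 1:nrow() counts DOWN from 1 to 0, so a loop over it would run twice.
bad_index <- 1:nrow(empty_df)

# seq_len() returns a zero-length vector, so a loop over it is skipped.
good_index <- seq_len(nrow(empty_df))

# `||` stops at the first TRUE, so the right-hand side is never evaluated.
short_circuit <- TRUE || stop("never reached")

# `|` is vectorised and returns one result per element.
elementwise <- c(TRUE, FALSE) | c(FALSE, FALSE)

print(bad_index)
print(length(good_index))
print(elementwise)
```

Inside an `if ()` or `while ()` condition a single TRUE/FALSE is expected, which is why the scalar operators `||` and `&&` are the better fit there.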
diff --git a/R/map_metadata.R b/R/map_metadata.R
@@ -1,3 +1,5 @@
+readline <- NULL
+
 #' map_metadata
 #'
 #' This function will read in the metadata file for a chosen dataset, loop
@@ -49,18 +51,13 @@ map_metadata <- function(
     json_file = NULL,
     domain_file = NULL,
     look_up_file = NULL,
-    output_dir = NULL,
+    output_dir = getwd(),
     table_copy = TRUE) {
   timestamp_now_fname <- format(Sys.time(), "%Y-%m-%d-%H-%M-%S")
   timestamp_now <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
 
   # DEFINE INPUTS AND OUTPUTS ----
 
-  ## Set output_dir to current wd if user has not provided it
-  if (is.null(output_dir)) {
-    output_dir <- getwd()
-  }
-
   ## Use 'load_data.R' to collect inputs (defaults or user inputs)
   data <- load_data(json_file, domain_file, look_up_file)
 
@@ -169,7 +166,7 @@ map_metadata <- function(
       start_end_v <- 0
       start_v <- 0
       end_v <- 0
-      while (length(start_end_v) != 2 | start_v > end_v) {
+      while (length(start_end_v) != 2 || start_v > end_v) {
         start_end_v <- user_prompt_list(
           prompt_text = "Which data elements do you want to process? 1:[start integer] and 2:[end integer]",
           list_allowed = seq(from = 1, to = nrow(table_df), by = 1),
@@ -228,13 +225,13 @@ map_metadata <- function(
     }
 
     ### Review user categorized data elements (optional)
-    #### Use 'user_prompt.R' to ask the user if they want to review
+    #### Use 'user_prompt.R' to ask the user if they want to review|
     #### Use 'user_prompt_list.R' to ask the user which rows to edit
     review_cats <- user_prompt(
       prompt_text = "Would you like to review your categorisations? (y/n): ",
       any_keys = FALSE
     )
-    if (review_cats == "Y" | review_cats == "y") {
+    if (review_cats == "Y" || review_cats == "y") {
       output_not_auto <- subset(output_df, note != "AUTO CATEGORISED")
       output_not_auto["note (first 12 chars)"] <-
         substring(output_not_auto$note, 1, 11)

diff --git a/R/map_metadata_compare.R b/R/map_metadata_compare.R
@@ -32,7 +32,7 @@
 #'   domain_file = demo_domain_file
 #' )
 #' }
-map_metadata_compare <- function(session_dir, session1_base, session2_base, json_file, domain_file, output_dir = NULL) {
+map_metadata_compare <- function(session_dir, session1_base, session2_base, json_file, domain_file, output_dir = session_dir) {
   timestamp_now_fname <- format(Sys.time(), "%Y-%m-%d-%H-%M-%S")
 
   # DEFINE INPUTS ----
@@ -48,11 +48,6 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json
   dataset <- meta_json$dataModel
   dataset_name <- dataset$label
 
-  ## Set output_dir to current wd if user has not provided it
-  if (is.null(output_dir)) {
-    output_dir <- session_dir
-  }
-
   # VALIDATION CHECKS ----
 
   ## Use 'valid_comparison.R' to check if sessions can be compared to each other and to the json (min requirements):
@@ -131,7 +126,7 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json
 
   ## Use 'json_table_to_df.R' to extract table from meta_json into a df
   table_find <- data.frame(table_n = numeric(length(dataset$childDataClasses)), table_label = character(length(dataset$childDataClasses)))
-  for (t in 1:length(dataset$childDataClasses)) {
+  for (t in seq_along(dataset$childDataClasses)) {
     table_find$table_n[t] <- t
     table_find$table_label[t] <- dataset$childDataClasses$label[t]
   }
@@ -143,7 +138,7 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json
   ses_join <- join_outputs(session_1 = csv_1b, session_2 = csv_2b)
 
   # FIND MISMATCHES AND ASK FOR CONSENSUS DECISION ----
-  for (datavar in 1:nrow(ses_join)) {
+  for (datavar in seq_len(nrow(ses_join))) {
     consensus <- concensus_on_mismatch(ses_join, table_df, datavar, max(df_plots$code$code))
     ses_join$domain_code_join[datavar] <- consensus$domain_code_join
     ses_join$note_join[datavar] <- consensus$note_join

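Reviewer note: replacing the `is.null(output_dir)` block with `output_dir = session_dir` relies on R evaluating default arguments lazily, inside the function, at the point they are first used. A minimal sketch of that behaviour follows. `compare_sketch()` and the file name `"CONSENSUS_output.csv"` are hypothetical and only stand in for the real function and its outputs.

```r
# Hypothetical stand-in for map_metadata_compare(): the default for
# output_dir refers to another argument and is resolved at call time.
compare_sketch <- function(session_dir, output_dir = session_dir) {
  file.path(output_dir, "CONSENSUS_output.csv")
}

# output_dir omitted: it falls back to session_dir.
default_path <- compare_sketch("sessions")

# output_dir supplied: the default is ignored.
custom_path <- compare_sketch("sessions", output_dir = "results")

print(default_path)
print(custom_path)
```

The same mechanism applies to `output_dir = getwd()` in `browse_metadata()` and `map_metadata()`: the working directory is read when the function is called, not when the package is built.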
diff --git a/R/map_metadata_convert.R b/R/map_metadata_convert.R
@@ -19,11 +19,11 @@ map_metadata_convert <- function(output_csv, output_dir) {
   output <- read.csv(paste0(output_dir, "/", output_csv))
   output_long <- output[0, ] # make duplicate
 
-  for (row in 1:(nrow(output))) {
+  for (row in seq_len(nrow(output))) {
     if (grepl(",", output$domain_code[row])) { # Domain_code for this row is a list
       domain_code_list <- output$domain_code[row] # extract Domain_code list
       domain_code_list_split <- unlist(strsplit(domain_code_list, ",")) # split the list
-      for (code in 1:(length(domain_code_list_split))) { # for every domain code in list, create a new row
+      for (code in seq_along(domain_code_list_split)) { # for every domain code in list, create a new row
         row_to_copy <- output[row, ] # extract row
         row_to_copy$domain_code <- domain_code_list_split[code] # change domain code to single
         output_long[nrow(output_long) + 1, ] <- row_to_copy # copy altered row

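Reviewer note: for readers unfamiliar with what this loop in `map_metadata_convert()` does, the sketch below reproduces the idea on a made-up two-row data frame. The column `data_element` and the values are illustrative assumptions. Unlike the function in the diff, the sketch treats rows with and without commas in the same way, which is a simplification.

```r
# Made-up input: the second row has been categorised into two domains.
output <- data.frame(
  data_element = c("age", "bmi"),
  domain_code = c("1", "2,3"),
  stringsAsFactors = FALSE
)

# Zero-row copy that keeps the column names and types.
output_long <- output[0, ]

for (row in seq_len(nrow(output))) {
  # Split "2,3" into c("2", "3"); a single code stays as it is.
  codes <- unlist(strsplit(output$domain_code[row], ","))
  for (code in codes) {
    row_to_copy <- output[row, ]
    row_to_copy$domain_code <- code # one domain code per row
    output_long[nrow(output_long) + 1, ] <- row_to_copy
  }
}

print(output_long)
```

Iterating over the split codes directly avoids index arithmetic altogether, which is an alternative to `seq_along()` when the position of each code is not needed.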
diff --git a/R/user_interactions.R b/R/user_interactions.R
@@ -1,3 +1,6 @@
+readline <- NULL
+scan <- NULL
+
 #' Internal: user_categorisation
 #'
 #' Internal Function: This function is called within the map_metadata function. \cr \cr
@@ -16,7 +19,7 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
   first_run <- TRUE
   go_back <- ""
 
-  while (go_back == "Y" | go_back == "y" | first_run == TRUE) {
+  while (go_back == "Y" || go_back == "y" || first_run == TRUE) {
     go_back <- ""
     # print text to R console
     cat(paste(
@@ -31,15 +34,15 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
     validated <- FALSE
     cat("\n \n")
 
-    while (decision == "" | validated == FALSE) {
+    while (decision == "" || validated == FALSE) {
       decision <- readline("Categorise data element into domain(s). E.g. 3 or 3,4: ")
 
       # validate input given by user
       decision_int <- as.integer(unlist(strsplit(decision, ",")))
       decision_int_NA <- any(is.na((decision_int)))
       suppressWarnings(decision_int_max <- max(decision_int, na.rm = TRUE))
       suppressWarnings(decision_int_min <- min(decision_int, na.rm = TRUE))
-      if (decision_int_NA == TRUE | decision_int_max > domain_code_max | decision_int_min < 0) {
+      if (decision_int_NA == TRUE || decision_int_max > domain_code_max || decision_int_min < 0) {
         cli_alert_warning("Formatting is invalid or integer out of range. Provide one integer or a comma separated list of integers.")
         validated <- FALSE
       } else {
@@ -55,7 +58,7 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
     cat("\n \n")
     decision_note <- readline("Categorisation note (or press enter to continue): ")
 
-    while (go_back != "Y" & go_back != "y" & go_back != "N" & go_back != "n") {
+    while (go_back != "Y" && go_back != "y" && go_back != "N" && go_back != "n") {
       cat("\n \n")
       go_back <- readline(prompt = paste0("Response to be saved is '", decision, "'. Would you like to re-do? (y/n): "))
     }
@@ -116,7 +119,7 @@ user_categorisation_loop <- function(start_v, end_v, table_df, df_prev_exist, df
         domain_code = as.character(lookup_subset$domain_code),
         note = "AUTO CATEGORISED"
       )
-    } else if (df_prev_exist == TRUE &
+    } else if (df_prev_exist == TRUE &&
       nrow(df_prev_subset) == 1) {
       ###### 2 - copy from previous table
       output_df <- output_df %>% add_row(
@@ -154,6 +157,7 @@ user_categorisation_loop <- function(start_v, end_v, table_df, df_prev_exist, df
 #' If FALSE, only these are allowed: Y, y, N and n.
 #' @return It returns variable text, depending on any_keys.
 #' @keywords internal
+
 user_prompt <- function(prompt_text, any_keys) {
   # prompt text
   if (any_keys == TRUE) {
@@ -189,7 +193,7 @@ user_prompt <- function(prompt_text, any_keys) {
 user_prompt_list <- function(prompt_text, list_allowed, empty_allowed) {
   list_to_process_error <- TRUE
   list_to_process_in_range <- TRUE
-  while (list_to_process_error == TRUE | list_to_process_in_range == FALSE) {
+  while (list_to_process_error == TRUE || list_to_process_in_range == FALSE) {
     tryCatch(
       {
         cat("\n \n")

diff --git a/README.md b/README.md
@@ -66,12 +66,6 @@ Load the library:
 library(browseMetadata)
 ```
 
-Set your working directory to an empty folder:
-
-```r         
-setwd("/Users/your-username/test-browseMetadata")
-```
-
 ### Demo (using the `R Studio` IDE)
 
 For a longer, more detailed demo, see the [Getting Started](https://aim-rsf.github.io/browseMetadata/articles/browseMetadata.html) page on the package website. 
@@ -89,11 +83,10 @@ browse_metadata()
 Upon success, you should see:
 
 ```
-ℹ Three outputs have been saved to your output directory.
-ℹ Open the two HTML files in your browser for full-screen viewing.
+ℹ Three outputs have been saved to your output directory, and two outputs should have opened in your browser.
 ```
 
-The output files are saved to your working directory. You can change the save location by adjusting the `output_dir` argument. Examples of outputs are available in [inst/outputs](https://github.com/aim-rsf/browseMetadata/tree/main/inst/outputs).
+The output files are saved to your project directory. You can change the save location by adjusting the `output_dir` argument. Examples of outputs are available in [inst/outputs](https://github.com/aim-rsf/browseMetadata/tree/main/inst/outputs).
 
 #### `map_metadata()`