Merge pull request #167 from aim-rsf/rOpensci-editor-comments
R opensci editor comments
RayStick authored Dec 17, 2024
2 parents 2c9fcb8 + dbe6e20 commit 9556a43
Showing 34 changed files with 208 additions and 286 deletions.
31 changes: 18 additions & 13 deletions CONTRIBUTING.md
@@ -3,22 +3,22 @@
We warmly welcome contributions to the browseMetadata project!
This document provides guidelines for contributing to this repository.

## How to Contribute
## How to contribute

### Reporting Issues
### Reporting issues

- **Bug Reports**: If you find a bug, please open an issue with a clear description of the problem and steps to reproduce it.
- **Feature Requests**: Suggestions for new features or improvements are always welcome. Please open an issue to discuss your ideas.

### Making Changes
### Making changes

1. **Fork the Repository**: Start by forking the repository to your GitHub account.
2. **Create a Feature Branch**: Create a new branch for your feature or fix.
3. **Make Your Changes**: Implement your changes, adhering to the coding standards and practices outlined below.
4. **Test Your Changes**: Ensure your changes do not break any existing functionality.
5. **Submit a Pull Request**: Open a pull request from your feature branch to the main branch of the original repository.

### Coding Standards
### Coding standards

- Follow the [tidyverse style guide](https://style.tidyverse.org) for R code.
- Write clear, readable, and maintainable code.
@@ -39,12 +39,11 @@ If your contribution involves changes to the R package itself (as an author or r
3. In this **Git** tab, move to the branch you want to make changes in (or review and test the changes of someone else).
4. Ensure that your current working directory is the R package directory you cloned (`getwd()` to check and `setwd()` to change).
5. Run `devtools::load_all()` in the R console. You should see `ℹ Loading browseMetadata` returned.
6. Test the function runs by running `domain_mapping()` in the R console.
7. Make your changes (or review changes made by others), and commit these changes in the way you choose to interact with git locally!
6. Make your changes (or review changes made by others), and commit these changes in the way you choose to interact with git locally!

If you run into issues with branches not seeming to be up to date in the R Studio workspace, consider running `remove.packages("browseMetadata")` and trying the above steps again, in case a previously installed package library is getting in the way somehow.

### Working with Package Data
### Working with package data

- **Creating .rda Files**: To create `.rda` files in the data directory of the package, use the following command in R:
```R
@@ -63,18 +62,24 @@ If you run into issues with branches not seeming to be up to date in the R Studi
```
Again, replace `dataname` with the name of the data you wish to load.

### Building Documentation
### Building documentation

- **Generating Documentation Files**: To build the documentation files for the package, use the `roxygen2` package:
```R
library(roxygen2)
roxygenise()
devtools::document()
```
This will generate the necessary documentation based on your roxygen comments in the R code.
This will generate the `.Rd` files from any updated roxygen comments.

### Testing Your Changes
### Testing your changes and checking your style :sunglasses:

- Ensure that your changes do not break any existing functionality. Run any existing tests, and consider adding new tests to cover your changes.
Ensure that your changes do not break any existing functionality. Run any existing tests, and consider adding new tests to cover your changes. Here are some helpful functions to consider:

- `codemetar::write_codemeta()` ensures the metadata file is up to date.
- `lintr::lint_package(path = ".")` checks for adherence to a given style, identifying syntax errors and possible semantic issues.
- `desc::desc_normalize()` ensures the DESCRIPTION file follows a standard structure and style.
- `styler::style_pkg()` ensures consistent code styling that matches the guidelines.
- `devtools::check()` runs a comprehensive package check.
- https://docs.ropensci.org/pkgcheck/ (there is also a GitHub Action that runs this)
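
The helper functions listed above could be chained from a single script. The following is only a sketch, assuming each package is installed and the working directory is the package root. `run_steps()` and `contributor_checks` are hypothetical names used for illustration (they are not part of browseMetadata), and the ordering is a suggestion rather than something this commit prescribes.

```r
# Hypothetical helper: run each step in turn and keep either its return
# value or the error it raised, so one failing step does not stop the rest.
run_steps <- function(steps) {
  lapply(steps, function(step) {
    tryCatch(step(), error = function(e) e)
  })
}

# The calls are wrapped in functions, so nothing runs (and no package is
# loaded) until run_steps() is called.
contributor_checks <- list(
  document = function() devtools::document(),
  normalise_description = function() desc::desc_normalize(),
  style = function() styler::style_pkg(),
  lint = function() lintr::lint_package(path = "."),
  codemeta = function() codemetar::write_codemeta(),
  check = function() devtools::check()
)

# From the package root:
# results <- run_steps(contributor_checks)
# results$lint  # inspect the lints that were found
```

Each element of the returned list holds either the step's result or a condition object, so `inherits(results$check, "error")` would show whether that step failed.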

### Submitting Changes

39 changes: 19 additions & 20 deletions DESCRIPTION
@@ -3,23 +3,21 @@ Package: browseMetadata
Title: Browse and categorise health metadata
Version: 2.0.2
Authors@R: c(
person("Rachael", "Stickland", , "[email protected]",
role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-3398-4272")),
person("Batool", "Almarzouq", , , role = "ctb",
comment = c(ORCID = "0000-0002-3905-2751")),
person("Mahwish", "Mohammad", , , role = "ctb",
comment = c(ORCID = "0009-0004-5295-0726")),
person("Daniel", "Delbarre", , , role = "ctb",
comment = c(ORCID = "0000-0003-4633-4252")),
person("Nida", "Ziauddeen", , , role = "ctb",
comment = c(ORCID = "0000-0002-8964-5029"))
)
Maintainer: Rachael Stickland <[email protected]>
Description: Visualise and categorise publicly available metadata from health
datasets. By interacting with metadata prior to gaining full access to health
datasets, researchers can use this tool to browse datasets and categorise
variables.
person("Rachael", "Stickland", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-3398-4272")),
person("Batool", "Almarzouq", role = "ctb",
comment = c(ORCID = "0000-0002-3905-2751")),
person("Mahwish", "Mohammad", role = "ctb",
comment = c(ORCID = "0009-0004-5295-0726")),
person("Daniel", "Delbarre", role = "ctb",
comment = c(ORCID = "0000-0003-4633-4252")),
person("Nida", "Ziauddeen", role = "ctb",
comment = c(ORCID = "0000-0002-8964-5029"))
)
Description: Visualise and categorise publicly available metadata from
health datasets. By interacting with metadata prior to gaining full
access to health datasets, researchers can use this tool to browse
datasets and categorise variables.
License: GPL (>= 3)
URL: https://aim-rsf.github.io/browseMetadata/
BugReports: https://github.com/aim-rsf/browseMetadata/issues
@@ -31,15 +29,16 @@ Imports:
ggplot2,
gridExtra,
htmlwidgets,
plotly,
jsonlite,
plotly,
tidyr
Suggests:
devtools,
knitr,
mockery,
rmarkdown,
devtools,
testthat (>= 3.0.0),
mockery
withr
VignetteBuilder:
knitr
Config/testthat/edition: 3
1 change: 1 addition & 0 deletions NAMESPACE
@@ -34,6 +34,7 @@ importFrom(stats,reorder)
importFrom(tidyr,complete)
importFrom(tidyr,pivot_longer)
importFrom(tools,file_path_sans_ext)
importFrom(utils,browseURL)
importFrom(utils,packageVersion)
importFrom(utils,read.csv)
importFrom(utils,write.csv)
19 changes: 7 additions & 12 deletions R/browse_metadata.R
@@ -25,8 +25,9 @@
#' @importFrom plotly plot_ly layout
#' @importFrom htmlwidgets saveWidget
#' @importFrom tidyr pivot_longer
#' @importFrom utils browseURL

browse_metadata <- function(json_file = NULL, output_dir = NULL) {
browse_metadata <- function(json_file = NULL, output_dir = getwd()) {
# DEFINE INPUTS AND OUTPUTS ----

## Read in the json file containing the meta data, if null load the demo file
@@ -40,10 +41,6 @@ browse_metadata <- function(json_file = NULL, output_dir = NULL) {
meta_json <- fromJSON(json_file)
}

## Set output_dir to current wd if user has not provided it
if (is.null(output_dir)) {
output_dir <- getwd()
}
## Extract dataset from json_file
dataset <- meta_json$dataModel
dataset_name <- dataset$label
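
The change above moves the fallback from a `NULL` check in the function body into the argument default. A minimal sketch of the two patterns, using the hypothetical function names `resolve_dir_old()` and `resolve_dir_new()` (not functions in the package):

```r
# Old pattern: NULL is a sentinel that the body replaces with getwd().
resolve_dir_old <- function(output_dir = NULL) {
  if (is.null(output_dir)) {
    output_dir <- getwd()
  }
  output_dir
}

# New pattern: the default expression is evaluated lazily, when
# output_dir is first used inside the call.
resolve_dir_new <- function(output_dir = getwd()) {
  output_dir
}

# Both give the same result when the argument is omitted.
identical(resolve_dir_old(), resolve_dir_new())
```

One behavioural difference worth noting: an explicit `output_dir = NULL` no longer falls back to the working directory under the new pattern, because the default only applies when the argument is missing.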
@@ -165,17 +162,15 @@ browse_metadata <- function(json_file = NULL, output_dir = NULL) {
saveWidget(widget = barplot_html, file = bar_fname, selfcontained = TRUE)

## Save the data that made the bar plot to a csv file
bar_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
write.csv(count_empty_long, bar_fname, row.names = FALSE)
bar_data_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
write.csv(count_empty_long, bar_data_fname, row.names = FALSE)

setwd(original_wd) # saveWidget has a bug with paths & saving

# OUTPUTS ----
cat("\n")
cli_alert_info("Three outputs have been saved to your output directory.")
cli_alert_info("The two html files are shown in your Viewer tab. Open in your browser for full screen viewing.")
cat("\n")
browseURL(table_fname)
browseURL(bar_fname)
cli_alert_info("Three outputs have been saved to your output directory, and two outputs should have opened in your browser.")

html_figs <- list(table_html = table_html, barplot_html = barplot_html)
return(html_figs)
} # end of function
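
Renaming the CSV path to `bar_data_fname` matters because `bar_fname` is now passed to `browseURL()` afterwards. A small sketch of the difference, assuming the HTML file name follows the same `BROWSE_bar_` pattern as the CSV (that name is not shown in this diff) and using a made-up `base_fname`:

```r
base_fname <- "demo_dataset" # hypothetical value for illustration

# Before: the CSV path overwrote the HTML path held in bar_fname.
bar_fname <- paste0("BROWSE_bar_", base_fname, ".html") # assumed HTML name
bar_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
overwritten_ext <- tools::file_ext(bar_fname) # the HTML path is lost

# After: separate variables keep both paths available.
bar_fname <- paste0("BROWSE_bar_", base_fname, ".html") # assumed HTML name
bar_data_fname <- paste0("BROWSE_bar_", base_fname, ".csv")
kept_ext <- tools::file_ext(bar_fname)
```

With the old naming, a later `browseURL(bar_fname)` would have pointed at the CSV rather than the HTML widget.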
6 changes: 3 additions & 3 deletions R/data_manipulation.R
@@ -40,9 +40,9 @@ json_table_to_df <- function(dataset, n) {
count_empty_desc <- function(table_df, table_colname) {
table_df["empty"] <- NA

for (data_v in 1:nrow(table_df)) {
if ((nchar(table_df$description[data_v]) == 1) |
(table_df$description[data_v] == "Description to follow") |
for (data_v in seq_len(nrow(table_df))) {
if ((nchar(table_df$description[data_v]) == 1) ||
(table_df$description[data_v] == "Description to follow") ||
(table_df$description[data_v] == "NA")) {
table_df$empty[data_v] <- "Yes"
} else {
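
The switch from `1:nrow(...)` to `seq_len(nrow(...))` guards against zero-row inputs. A minimal sketch with a made-up empty data frame; `count_iterations()` is a hypothetical helper that only counts loop passes:

```r
# An empty table: nrow() is 0.
empty_df <- data.frame(description = character(0))

old_index <- 1:nrow(empty_df)        # counts down: 1, 0
new_index <- seq_len(nrow(empty_df)) # integer(0), so the loop body is skipped

# Count how many times a for loop would run over an index vector.
count_iterations <- function(index) {
  n <- 0L
  for (i in index) {
    n <- n + 1L
  }
  n
}

count_iterations(old_index) # 2 passes over rows that do not exist
count_iterations(new_index) # 0 passes
```

`seq_along(x)` plays the same role for `1:length(x)`, which is the replacement made elsewhere in this commit.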
15 changes: 6 additions & 9 deletions R/map_metadata.R
@@ -1,3 +1,5 @@
readline <- NULL

#' map_metadata
#'
#' This function will read in the metadata file for a chosen dataset, loop
@@ -49,18 +51,13 @@ map_metadata <- function(
json_file = NULL,
domain_file = NULL,
look_up_file = NULL,
output_dir = NULL,
output_dir = getwd(),
table_copy = TRUE) {
timestamp_now_fname <- format(Sys.time(), "%Y-%m-%d-%H-%M-%S")
timestamp_now <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")

# DEFINE INPUTS AND OUTPUTS ----

## Set output_dir to current wd if user has not provided it
if (is.null(output_dir)) {
output_dir <- getwd()
}

## Use 'load_data.R' to collect inputs (defaults or user inputs)
data <- load_data(json_file, domain_file, look_up_file)

@@ -169,7 +166,7 @@ map_metadata <- function(
start_end_v <- 0
start_v <- 0
end_v <- 0
while (length(start_end_v) != 2 | start_v > end_v) {
while (length(start_end_v) != 2 || start_v > end_v) {
start_end_v <- user_prompt_list(
prompt_text = "Which data elements do you want to process? 1:[start integer] and 2:[end integer]",
list_allowed = seq(from = 1, to = nrow(table_df), by = 1),
@@ -228,13 +225,13 @@ map_metadata <- function(
}

### Review user categorized data elements (optional)
#### Use 'user_prompt.R' to ask the user if they want to review
#### Use 'user_prompt.R' to ask the user if they want to review|
#### Use 'user_prompt_list.R' to ask the user which rows to edit
review_cats <- user_prompt(
prompt_text = "Would you like to review your categorisations? (y/n): ",
any_keys = FALSE
)
if (review_cats == "Y" | review_cats == "y") {
if (review_cats == "Y" || review_cats == "y") {
output_not_auto <- subset(output_df, note != "AUTO CATEGORISED")
output_not_auto["note (first 12 chars)"] <-
substring(output_not_auto$note, 1, 11)
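
Several hunks in this commit replace `|` and `&` with `||` and `&&` inside `if` and `while` conditions. A short sketch of why the scalar operators suit control flow, using a made-up variable `x`:

```r
# Element-wise operators work across whole vectors.
elementwise <- c(TRUE, FALSE) | c(FALSE, FALSE)

# With a NULL operand, the element-wise form yields a zero-length result,
# which if () and while () cannot use as a condition.
x <- NULL
vectorised <- is.null(x) | x > 0

# The scalar form stops at the first TRUE and never evaluates x > 0.
short_circuit <- is.null(x) || x > 0
```

`||` and `&&` always produce a single `TRUE`, `FALSE`, or `NA`, and from R 4.3.0 they raise an error when given an operand longer than one, which surfaces accidental vector conditions early.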
11 changes: 3 additions & 8 deletions R/map_metadata_compare.R
@@ -32,7 +32,7 @@
#' domain_file = demo_domain_file
#' )
#' }
map_metadata_compare <- function(session_dir, session1_base, session2_base, json_file, domain_file, output_dir = NULL) {
map_metadata_compare <- function(session_dir, session1_base, session2_base, json_file, domain_file, output_dir = session_dir) {
timestamp_now_fname <- format(Sys.time(), "%Y-%m-%d-%H-%M-%S")

# DEFINE INPUTS ----
@@ -48,11 +48,6 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json
dataset <- meta_json$dataModel
dataset_name <- dataset$label

## Set output_dir to current wd if user has not provided it
if (is.null(output_dir)) {
output_dir <- session_dir
}

# VALIDATION CHECKS ----

## Use 'valid_comparison.R' to check if sessions can be compared to each other and to the json (min requirements):
@@ -131,7 +126,7 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json

## Use 'json_table_to_df.R' to extract table from meta_json into a df
table_find <- data.frame(table_n = numeric(length(dataset$childDataClasses)), table_label = character(length(dataset$childDataClasses)))
for (t in 1:length(dataset$childDataClasses)) {
for (t in seq_along(dataset$childDataClasses)) {
table_find$table_n[t] <- t
table_find$table_label[t] <- dataset$childDataClasses$label[t]
}
@@ -143,7 +138,7 @@ map_metadata_compare <- function(session_dir, session1_base, session2_base, json
ses_join <- join_outputs(session_1 = csv_1b, session_2 = csv_2b)

# FIND MISMATCHES AND ASK FOR CONSENSUS DECISION ----
for (datavar in 1:nrow(ses_join)) {
for (datavar in seq_len(nrow(ses_join))) {
consensus <- concensus_on_mismatch(ses_join, table_df, datavar, max(df_plots$code$code))
ses_join$domain_code_join[datavar] <- consensus$domain_code_join
ses_join$note_join[datavar] <- consensus$note_join
4 changes: 2 additions & 2 deletions R/map_metadata_convert.R
@@ -19,11 +19,11 @@ map_metadata_convert <- function(output_csv, output_dir) {
output <- read.csv(paste0(output_dir, "/", output_csv))
output_long <- output[0, ] # make duplicate

for (row in 1:(nrow(output))) {
for (row in seq_len(nrow(output))) {
if (grepl(",", output$domain_code[row])) { # Domain_code for this row is a list
domain_code_list <- output$domain_code[row] # extract Domain_code list
domain_code_list_split <- unlist(strsplit(domain_code_list, ",")) # split the list
for (code in 1:(length(domain_code_list_split))) { # for every domain code in list, create a new row
for (code in seq_len(length(domain_code_list_split))) { # for every domain code in list, create a new row
row_to_copy <- output[row, ] # extract row
row_to_copy$domain_code <- domain_code_list_split[code] # change domain code to single
output_long[nrow(output_long) + 1, ] <- row_to_copy # copy altered row
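
For comparison, the row-by-row expansion in this hunk can also be written without explicit loops. This is only an illustrative base-R alternative on made-up data, not the package's implementation, and it does not reproduce whatever the rest of the function does with rows that have a single code:

```r
# Made-up output table: the first row carries two domain codes.
output <- data.frame(
  data_element = c("age", "sex"),
  domain_code = c("1,3", "2"),
  stringsAsFactors = FALSE
)

# Split each comma-separated entry into its individual codes.
codes <- strsplit(output$domain_code, ",")

# Repeat each row once per code, then overwrite the code column.
output_long <- output[rep(seq_len(nrow(output)), lengths(codes)), ]
output_long$domain_code <- unlist(codes)
rownames(output_long) <- NULL
```

Because the row index is built with `seq_len()`, an empty table passes through and simply yields an empty result.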
16 changes: 10 additions & 6 deletions R/user_interactions.R
@@ -1,3 +1,6 @@
readline <- NULL
scan <- NULL

#' Internal: user_categorisation
#'
#' Internal Function: This function is called within the map_metadata function. \cr \cr
@@ -16,7 +19,7 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
first_run <- TRUE
go_back <- ""

while (go_back == "Y" | go_back == "y" | first_run == TRUE) {
while (go_back == "Y" || go_back == "y" || first_run == TRUE) {
go_back <- ""
# print text to R console
cat(paste(
@@ -31,15 +34,15 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
validated <- FALSE
cat("\n \n")

while (decision == "" | validated == FALSE) {
while (decision == "" || validated == FALSE) {
decision <- readline("Categorise data element into domain(s). E.g. 3 or 3,4: ")

# validate input given by user
decision_int <- as.integer(unlist(strsplit(decision, ",")))
decision_int_NA <- any(is.na((decision_int)))
suppressWarnings(decision_int_max <- max(decision_int, na.rm = TRUE))
suppressWarnings(decision_int_min <- min(decision_int, na.rm = TRUE))
if (decision_int_NA == TRUE | decision_int_max > domain_code_max | decision_int_min < 0) {
if (decision_int_NA == TRUE || decision_int_max > domain_code_max || decision_int_min < 0) {
cli_alert_warning("Formatting is invalid or integer out of range. Provide one integer or a comma separated list of integers.")
validated <- FALSE
} else {
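
The validation in this hunk can be read as a single yes/no check on the typed response. A minimal sketch under that reading; `validate_decision()` is a hypothetical helper, not a function in the package:

```r
# Return TRUE only when every comma-separated entry parses as an integer
# between 0 and domain_code_max (inclusive).
validate_decision <- function(decision, domain_code_max) {
  decision_int <- suppressWarnings(
    as.integer(unlist(strsplit(decision, ",")))
  )
  # Empty input or any entry that fails to parse is invalid.
  if (length(decision_int) == 0 || anyNA(decision_int)) {
    return(FALSE)
  }
  all(decision_int >= 0 & decision_int <= domain_code_max)
}

validate_decision("3,4", domain_code_max = 5) # valid
validate_decision("3,x", domain_code_max = 5) # non-numeric entry
validate_decision("9", domain_code_max = 5)   # out of range
```

The `||` in the guard means `anyNA()` is only reached when there is at least one parsed value, which mirrors the scalar conditions adopted in this commit.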
@@ -55,7 +58,7 @@ user_categorisation <- function(data_element, data_desc, data_type, domain_code_
cat("\n \n")
decision_note <- readline("Categorisation note (or press enter to continue): ")

while (go_back != "Y" & go_back != "y" & go_back != "N" & go_back != "n") {
while (go_back != "Y" && go_back != "y" && go_back != "N" && go_back != "n") {
cat("\n \n")
go_back <- readline(prompt = paste0("Response to be saved is '", decision, "'. Would you like to re-do? (y/n): "))
}
@@ -116,7 +119,7 @@ user_categorisation_loop <- function(start_v, end_v, table_df, df_prev_exist, df
domain_code = as.character(lookup_subset$domain_code),
note = "AUTO CATEGORISED"
)
} else if (df_prev_exist == TRUE &
} else if (df_prev_exist == TRUE &&
nrow(df_prev_subset) == 1) {
###### 2 - copy from previous table
output_df <- output_df %>% add_row(
@@ -154,6 +157,7 @@ user_categorisation_loop <- function(start_v, end_v, table_df, df_prev_exist, df
#' If FALSE, only these are allowed: Y, y, N and n.
#' @return It returns variable text, depending on any_keys.
#' @keywords internal

user_prompt <- function(prompt_text, any_keys) {
# prompt text
if (any_keys == TRUE) {
@@ -189,7 +193,7 @@ user_prompt <- function(prompt_text, any_keys) {
user_prompt_list <- function(prompt_text, list_allowed, empty_allowed) {
list_to_process_error <- TRUE
list_to_process_in_range <- TRUE
while (list_to_process_error == TRUE | list_to_process_in_range == FALSE) {
while (list_to_process_error == TRUE || list_to_process_in_range == FALSE) {
tryCatch(
{
cat("\n \n")
11 changes: 2 additions & 9 deletions README.md
@@ -66,12 +66,6 @@ Load the library:
library(browseMetadata)
```

Set your working directory to an empty folder:

```r
setwd("/Users/your-username/test-browseMetadata")
```

### Demo (using the `R Studio` IDE)

For a longer, more detailed demo, see the [Getting Started](https://aim-rsf.github.io/browseMetadata/articles/browseMetadata.html) page on the package website.
@@ -89,11 +83,10 @@ browse_metadata()
Upon success, you should see:

```
ℹ Three outputs have been saved to your output directory.
ℹ Open the two HTML files in your browser for full-screen viewing.
ℹ Three outputs have been saved to your output directory, and two outputs should have opened in your browser.
```

The output files are saved to your working directory. You can change the save location by adjusting the `output_dir` argument. Examples of outputs are available in [inst/outputs](https://github.com/aim-rsf/browseMetadata/tree/main/inst/outputs).
The output files are saved to your project directory. You can change the save location by adjusting the `output_dir` argument. Examples of outputs are available in [inst/outputs](https://github.com/aim-rsf/browseMetadata/tree/main/inst/outputs).

#### `map_metadata()`
