Cohort subset to allow custom windowing #197

Merged
merged 4 commits into from
Dec 11, 2024
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: CohortGenerator
Type: Package
Title: Cohort Generation for the OMOP Common Data Model
Version: 0.11.2
Version: 0.12.0
Date: 2024-09-30
Authors@R: c(
person("Anthony", "Sena", email = "[email protected]", role = c("aut", "cre")),
@@ -46,6 +46,6 @@ License: Apache License
VignetteBuilder: knitr
URL: https://ohdsi.github.io/CohortGenerator/, https://github.com/OHDSI/CohortGenerator
BugReports: https://github.com/OHDSI/CohortGenerator/issues
RoxygenNote: 7.3.1
RoxygenNote: 7.3.2
Encoding: UTF-8
Language: en-US
7 changes: 7 additions & 0 deletions NEWS.md
@@ -1,3 +1,10 @@
CohortGenerator 0.12.0
======================

- Backwards-compatible extension to CohortSubsetOperators and cohortSubsetWindows that allows windowing logic of any
length


CohortGenerator 0.11.2
=======================

28 changes: 16 additions & 12 deletions R/SubsetQueryBuilders.R
@@ -49,23 +49,27 @@ CohortSubsetQb <- R6::R6Class(
inherit = QueryBuilder,
private = list(
innerQuery = function(targetTable) {
cohortWindowLogic <- lapply(private$operator$windows, function(window) {
lsql <- " AND (S.@s_cohort_anchor >= DATEADD(d, @window_start_day, T.@window_anchor) AND S.@s_cohort_anchor <= DATEADD(d, @window_end_day, T.@window_anchor))"
SqlRender::render(lsql,
window_anchor = ifelse(window$targetAnchor == "cohortStart",

If you're inclined, in the CI package the choice of start/end is simply 'start' or 'end', as seen here: https://github.com/OHDSI/CohortIncidence/blob/09cf6aa3a6f4cfdbe99d04e38b61a85441fa3925/R/Classes.R#L547. We could adopt those names for choosing a start or end date here, instead of 'cohortStart' or 'cohortEnd'. I originally had CI using the same labels (with 'cohort' in them) but switched to a plain start/end choice that is not cohort-specific. Just pointing out where we might be consistent across packages.

yes = "cohort_start_date",
no = "cohort_end_date"),
s_cohort_anchor = ifelse(window$subsetAnchor == "cohortStart",
yes = "cohort_start_date",
no = "cohort_end_date"),
window_end_day = window$endDay,
window_start_day = window$startDay)
})

cohortWindowLogic <- paste(cohortWindowLogic, collapse = "\n ")

sql <- SqlRender::readSql(system.file("sql", "sql_server", "subsets", "CohortSubsetOperator.sql", package = "CohortGenerator"))
sql <- SqlRender::render(sql,
target_table = targetTable,
output_table = self$getTableObjectId(),
end_window_anchor = ifelse(private$operator$endWindow$targetAnchor == "cohortStart",
yes = "cohort_start_date",
no = "cohort_end_date"
),
end_window_end_day = private$operator$endWindow$endDay,
end_window_start_day = private$operator$endWindow$startDay,
negate = ifelse(private$operator$negate == TRUE, yes = "1", no = "0"),
start_window_anchor = ifelse(private$operator$startWindow$targetAnchor == "cohortStart",
yes = "cohort_start_date",
no = "cohort_end_date"
),
start_window_end_day = private$operator$startWindow$endDay,
start_window_start_day = private$operator$startWindow$startDay,
cohort_window_logic = cohortWindowLogic,
cohort_ids = private$operator$cohortIds,
subset_length = ifelse(private$operator$cohortCombinationOperator == "any",
yes = 1,
157 changes: 104 additions & 53 deletions R/Subsets.R
@@ -44,7 +44,8 @@ SubsetCohortWindow <- R6::R6Class(
private = list(
.startDay = as.integer(0),
.endDay = as.integer(0),
.targetAnchor = "cohortStart"
.targetAnchor = "cohortStart",
.subsetAnchor = "cohortStart"
),
public = list(
#' @description List representation of object
@@ -60,6 +61,10 @@ SubsetCohortWindow <- R6::R6Class(
objRepr$targetAnchor <- jsonlite::unbox(private$.targetAnchor)
}

if (length(private$.subsetAnchor)) {
objRepr$subsetAnchor <- jsonlite::unbox(private$.subsetAnchor)
}

objRepr
},
#' To JSON
@@ -76,7 +81,8 @@ SubsetCohortWindow <- R6::R6Class(
return(all(
self$startDay == criteria$startDay,
self$endDay == criteria$endDay,
self$targetAnchor == criteria$targetAnchor
self$targetAnchor == criteria$targetAnchor,
self$subsetAnchor == criteria$subsetAnchor
))
}
),
@@ -107,22 +113,43 @@ SubsetCohortWindow <- R6::R6Class(
checkmate::assertChoice(x = targetAnchor, choices = c("cohortStart", "cohortEnd"))
private$.targetAnchor <- targetAnchor
return(self)
},
#' @field subsetAnchor Character: 'cohortStart' or 'cohortEnd'
subsetAnchor = function(subsetAnchor) {
if (missing(subsetAnchor)) {
return(private$.subsetAnchor)
}
checkmate::assertChoice(x = subsetAnchor, choices = c("cohortStart", "cohortEnd"))
private$.subsetAnchor <- subsetAnchor
return(self)
}
)
)

# createSubsetCohortWindow ------------------------------
#' A definition of subset functions to be applied to a set of cohorts
#' @title Create a relative time window for cohort subset operations
#' @description
#' This function is used to create a relative time window for
#' cohort subset operations. The cohort window allows you to define an interval
#' of time relative to the target cohort's start/end date and the
#' subset cohort's start/end date.
#' @export
#' @param startDay The start day for the window
#' @param endDay The end day for the window
#' @param targetAnchor To anchor using the target cohort's start date or end date
#' @param startDay The start day for the time window
#' @param endDay The end day for the time window
#' @param targetAnchor To anchor using the target cohort's start date or end date.
#' The parameter is specified as 'cohortStart' or 'cohortEnd'.
#' @param subsetAnchor To anchor using the subset cohort's start date or end date.
#' The parameter is specified as 'cohortStart' or 'cohortEnd'.
#' @returns a SubsetCohortWindow instance
createSubsetCohortWindow <- function(startDay, endDay, targetAnchor) {
createSubsetCohortWindow <- function(startDay, endDay, targetAnchor, subsetAnchor = NULL) {
if (is.null(subsetAnchor))
subsetAnchor <- "cohortStart"

window <- SubsetCohortWindow$new()
window$startDay <- startDay
window$endDay <- endDay
window$targetAnchor <- targetAnchor
window$subsetAnchor <- subsetAnchor
window
}

@@ -271,19 +298,35 @@ CohortSubsetOperator <- R6::R6Class(
.cohortIds = integer(0),
.cohortCombinationOperator = "all",
.negate = FALSE,
.startWindow = SubsetCohortWindow$new(),
.endWindow = SubsetCohortWindow$new()
.windows = list()
),
public = list(

#' @param definition json character or list - definition of subset operator
#'
#' @return instance of object
initialize = function(definition = NULL) {
# support backwards compatibility with old style of storing definitions
if (!is.null(definition)) {
oldFormat <- c("startWindow", "endWindow") %in% names(definition)
if (any(oldFormat)) {
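# Old-style definitions anchored the subset cohort's start in startWindow and its end in endWindow
definition$startWindow$subsetAnchor <- "cohortStart"
definition$endWindow$subsetAnchor <- "cohortEnd"
# Assign the whole list of windows; `definition["windows"] <- list(a, b)` would keep only the first element
definition$windows <- list(definition$startWindow, definition$endWindow)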
definition$startWindow <- NULL
definition$endWindow <- NULL
}
}
super$initialize(definition)
},
#' to List
#' @description List representation of object
toList = function() {
objRepr <- super$toList()
objRepr$cohortIds <- private$.cohortIds
objRepr$cohortCombinationOperator <- jsonlite::unbox(private$.cohortCombinationOperator)
objRepr$negate <- jsonlite::unbox(private$.negate)
objRepr$startWindow <- private$.startWindow$toList()
objRepr$endWindow <- private$.endWindow$toList()
objRepr$windows <- lapply(private$.windows, function(x) { x$toList() })

objRepr
},
@@ -306,20 +349,23 @@ CohortSubsetOperator <- R6::R6Class(
cohortIds <- sprintf("cohorts: (%s)", paste(self$cohortIds, collapse = ", "))
nameString <- paste0(nameString, cohortIds)

windowString <- lapply(self$windows, function(window) {
paste(
"subset",
tolower(SqlRender::camelCaseToTitleCase(window$subsetAnchor)),
"is within D:",
window$startDay,
"- D:",
window$endDay,
"of target",
tolower(SqlRender::camelCaseToTitleCase(window$targetAnchor))
)
})

nameString <- paste(
nameString,
"starts within D:",
self$startWindow$startDay,
"- D:",
self$startWindow$endDay,
"of",
tolower(SqlRender::camelCaseToTitleCase(self$startWindow$targetAnchor)),
"and ends D:",
self$endWindow$startDay,
"- D:",
self$endWindow$endDay,
"of",
tolower(SqlRender::camelCaseToTitleCase(self$endWindow$targetAnchor))
"where",
paste(windowString, collapse = " and ")
)

return(paste0(nameString))
@@ -358,34 +404,21 @@ CohortSubsetOperator <- R6::R6Class(
private$.negate <- negate
self
},
#' @field startWindow The time window to use evaluating the subset cohort
#' start relative to the target cohort
startWindow = function(startWindow) {
if (missing(startWindow)) {
return(private$.startWindow)
#' @field windows list of time windows to use when evaluating the subset
#' cohort relative to the target cohort
windows = function(windows) {
if (missing(windows)) {
return(private$.windows)
}

if (is.list(startWindow)) {
startWindow <- do.call(createSubsetCohortWindow, startWindow)
realWindows <- list()
for (window in windows) {
if (is.list(window))
window <- do.call(createSubsetCohortWindow, window)
realWindows[[length(realWindows) + 1]] <- window
}

checkmate::assertClass(x = startWindow, classes = "SubsetCohortWindow")
private$.startWindow <- startWindow
self
},
#' @field endWindow The time window to use evaluating the subset cohort
#' end relative to the target cohort
endWindow = function(endWindow) {
if (missing(endWindow)) {
return(private$.endWindow)
}

if (is.list(endWindow)) {
endWindow <- do.call(createSubsetCohortWindow, endWindow)
}

checkmate::assertClass(x = endWindow, classes = "SubsetCohortWindow")
private$.endWindow <- endWindow
checkmate::assertList(x = realWindows, types = "SubsetCohortWindow")
private$.windows <- realWindows
self
}
)
@@ -400,19 +433,37 @@ CohortSubsetOperator <- R6::R6Class(
#' @param cohortCombinationOperator "any" or "all" if using more than one cohort id allow a subject to be in any cohort
#' or require that they are in all cohorts in specified windows
#'
#' @param startWindow A SubsetCohortWindow that patients must fall inside (see createSubsetCohortWindow)
#' @param endWindow A SubsetCohortWindow that patients must fall inside (see createSubsetCohortWindow)
#' @param startWindow DEPRECATED: Use `windows` instead.
#' @param endWindow DEPRECATED: Use `windows` instead.
#' @param windows A list of time windows to use to evaluate subset cohorts in relation to the
#' target cohorts. The logic is to always apply these windows with logical AND conditions.
#' See [createSubsetCohortWindow()] for more details on how to create
#' these windows.
#' @param negate The opposite of this definition - include patients who do NOT meet the specified criteria
#' @returns a CohortSubsetOperator instance
createCohortSubset <- function(name = NULL, cohortIds, cohortCombinationOperator, negate, startWindow, endWindow) {
createCohortSubset <- function(name = NULL, cohortIds, cohortCombinationOperator, negate, windows = list(), startWindow = NULL, endWindow = NULL) {
subset <- CohortSubsetOperator$new()
subset$name <- name
subset$cohortIds <- cohortIds
subset$cohortCombinationOperator <- cohortCombinationOperator
subset$negate <- negate
subset$startWindow <- startWindow
subset$endWindow <- endWindow

# Start and end windows must always have subset anchor values set to support backwards compatibility
if (!is.null(startWindow) || !is.null(endWindow)) {
warning("Arguments 'startWindow' and 'endWindow' are deprecated. Use 'windows' instead.")
}

if (!is.null(startWindow)){
This section forces the old logic upon the user, which I dislike, but it also preserves the exact logic of the previous implementation. To me there is no perfect solution here...

startWindow$subsetAnchor <- "cohortStart"
windows[[length(windows) + 1]] <- startWindow
}

if (!is.null(endWindow)) {
endWindow$subsetAnchor <- "cohortEnd"
windows[[length(windows) + 1]] <- endWindow
}

subset$windows <- windows
subset
}
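
For readers following the diff, here is a minimal usage sketch of the new `windows` argument. It assumes CohortGenerator 0.12.0 or later is installed; the cohort IDs, the subset name, and the day offsets are made-up values for illustration and do not come from the pull request.

```r
library(CohortGenerator)

# Illustrative values only: cohort IDs 11 and 22 and all day offsets are assumptions.
# Window 1: the subset cohort starts in the 365 days before the target cohort start.
priorStart <- createSubsetCohortWindow(
  startDay = -365,
  endDay = 0,
  targetAnchor = "cohortStart",
  subsetAnchor = "cohortStart"
)

# Window 2: the subset cohort ends between 0 and 9999 days after the target cohort start.
endsAfterStart <- createSubsetCohortWindow(
  startDay = 0,
  endDay = 9999,
  targetAnchor = "cohortStart",
  subsetAnchor = "cohortEnd"
)

# Every window in the list must hold: they are combined with logical AND.
subsetOp <- createCohortSubset(
  name = "Subset overlapping target start",
  cohortIds = c(11, 22),
  cohortCombinationOperator = "any",
  negate = FALSE,
  windows = list(priorStart, endsAfterStart)
)

length(subsetOp$windows)
```

Per the diff, passing `startWindow` or `endWindow` still works but emits a deprecation warning; each is appended to `windows` with `subsetAnchor` fixed to `cohortStart` and `cohortEnd` respectively.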

4 changes: 2 additions & 2 deletions inst/sql/sql_server/subsets/CohortSubsetOperator.sql
@@ -14,8 +14,8 @@ FROM (
FROM @target_table T
JOIN @cohort_database_schema.@cohort_table S ON T.subject_id = S.subject_id
WHERE S.cohort_definition_id in (@cohort_ids)
AND (S.cohort_start_date >= DATEADD(d, @start_window_start_day, T.@start_window_anchor) AND S.cohort_start_date <= DATEADD(d, @start_window_end_day, T.@start_window_anchor))
AND (S.cohort_end_date >= DATEADD(d, @end_window_start_day, T.@end_window_anchor) and S.cohort_end_date <= DATEADD(d, @end_window_end_day, T.@end_window_anchor))
-- AND Cohort lies within window criteria
@cohort_window_logic
GROUP BY T.subject_id, T.cohort_start_date, T.cohort_end_date
HAVING COUNT (DISTINCT S.COHORT_DEFINITION_ID) >= @subset_length
) A
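
To make the `@cohort_window_logic` placeholder concrete, the sketch below builds the same kind of `AND` clauses with base R string formatting. The helpers `anchorColumn()` and `renderWindowClause()` are hypothetical names used only here; the package itself fills the template with `SqlRender::render()`, as shown in the R/SubsetQueryBuilders.R hunk. The window values are invented for illustration.

```r
# Hypothetical helper (not part of CohortGenerator): maps an anchor choice to a column name.
anchorColumn <- function(anchor) {
  if (anchor == "cohortStart") "cohort_start_date" else "cohort_end_date"
}

# Hypothetical helper: one AND clause per window, mirroring the template in the diff.
renderWindowClause <- function(window) {
  sprintf(
    "AND (S.%s >= DATEADD(d, %d, T.%s) AND S.%s <= DATEADD(d, %d, T.%s))",
    anchorColumn(window$subsetAnchor), as.integer(window$startDay), anchorColumn(window$targetAnchor),
    anchorColumn(window$subsetAnchor), as.integer(window$endDay), anchorColumn(window$targetAnchor)
  )
}

# Illustrative windows as plain lists; the real operator holds SubsetCohortWindow objects.
windows <- list(
  list(startDay = -365, endDay = 0, targetAnchor = "cohortStart", subsetAnchor = "cohortStart"),
  list(startDay = 0, endDay = 30, targetAnchor = "cohortEnd", subsetAnchor = "cohortEnd")
)

# Clauses are joined with newlines, so any number of windows can be spliced into the WHERE block.
cohortWindowLogic <- paste(vapply(windows, renderWindowClause, character(1)), collapse = "\n")
cat(cohortWindowLogic, "\n")
```

Because each clause begins with `AND`, an empty list of windows yields an empty string and the query falls back to filtering on `cohort_definition_id` alone.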
32 changes: 24 additions & 8 deletions man/CohortSubsetOperator.Rd

Some generated files are not rendered by default.