Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use RSQLite extended types to support DATE and DATETIME natively #12

Closed
schuemie opened this issue Jun 23, 2021 · 11 comments
Closed

Use RSQLite extended types to support DATE and DATETIME natively #12

schuemie opened this issue Jun 23, 2021 · 11 comments
Labels

Comments

@schuemie
Copy link
Member

RSQlite has added a new extend_types argument to the dbConnect function that adds support for DATE and DATETIME fields. I propose to switch Eunomia to use that. I see two challenges:

  1. This will require all the work-around code to be removed from DatabaseConnector and SqlRender that was emulating DATE and DATETIME fields. These changes will need to be made backwards compatible somehow. I guess we should define a new dialect in SqlRender called 'sqlite_extended_types', and switch to that if the RSQLite connection has extended_types == TRUE.

  2. We'd need to test whether all examples in the Book of OHDSI still work with the updated version.

Also looping in @ablack3 who has been playing with the extended_types for Andromeda.

@gowthamrao
Copy link
Member

@schuemie does this impact Andromeda? I believe Andromeda also uses SqlLite?

@ablack3
Copy link
Collaborator

ablack3 commented Jun 24, 2021

@gowthamrao Yes native date and datetime support has been added to the development branch of Andromeda. I've been hesitant to release it because I'm afraid of breaking code that relies on the workarounds in DatabaseConnector and SqlRender. Maybe we can coordinate new releases of Eunomia, SqlRender, DatabaseConnector, and Andromeda that implement date and datetime support.

@schuemie
Copy link
Member Author

@ablack3: Nobody should be using DatabaseConnector and SqlRender on Andromeda (and I don't think anybody is). We might have some calls to Andromeda::restoreDate(), but if we simply make sure that function returns the unchanged input if the input is already a date that should be sufficient.

We should of course test the new Andromeda with the various reverse dependencies like CohortMethod, but I don't a priori see any reason why switching Andromeda to extended types should break anything.

@ablack3
Copy link
Collaborator

ablack3 commented Jun 24, 2021

@schuemie Ok good to know. I'll plan for a new Andromeda release pretty soon then and run all reverse dependency tests. If I have time I can make sure that all uses of Andromeda in other HADES packages are covered by tests.

@ablack3
Copy link
Collaborator

ablack3 commented Jun 24, 2021

Do we need to recreate the inst/sqlite/cdm.tar.xz file? It seems so.

library(Eunomia)
#> Loading required package: DatabaseConnector
cd <- getEunomiaConnectionDetails()

con <- DBI::dbConnect(RSQLite::SQLite(), cd$server(), extended_types = TRUE)
dbGetQuery(con, "select * from main.drug_era limit 3")
#>   DRUG_ERA_ID PERSON_ID DRUG_CONCEPT_ID DRUG_ERA_START_DATE DRUG_ERA_END_DATE
#> 1        2707       181          738818           464400000         465609600
#> 2       45782      3109         1125315         -1227744000       -1226534400
#> 3        1918       126         1118084           620784000         620784000
#>   DRUG_EXPOSURE_COUNT GAP_DAYS
#> 1                   1     5389
#> 2                   1   -14196
#> 3                   1     7185
dbGetQuery(con, "select typeof(drug_era_start_date) from main.drug_era limit 1")
#>   typeof(drug_era_start_date)
#> 1                        real
dbGetQuery(con, "PRAGMA table_info(drug_era);")
#>   cid                name type notnull dflt_value pk
#> 1   0         DRUG_ERA_ID REAL       0         NA  0
#> 2   1           PERSON_ID REAL       0         NA  0
#> 3   2     DRUG_CONCEPT_ID REAL       0         NA  0
#> 4   3 DRUG_ERA_START_DATE REAL       0         NA  0
#> 5   4   DRUG_ERA_END_DATE REAL       0         NA  0
#> 6   5 DRUG_EXPOSURE_COUNT REAL       0         NA  0
#> 7   6            GAP_DAYS REAL       0         NA  0
DBI::dbDisconnect(con)

Created on 2021-06-24 by the reprex package (v2.0.0)

@msuchard
Copy link
Member

Just an FYI, a small # of the unit-tests in CohortMethod are calling DatabaseConnector / SqlRender on Andromeda. We are doing this to generate a competing risk event to test the Fine-Gray model. But the proposed changes above should not break anything.

@ablack3
Copy link
Collaborator

ablack3 commented Jun 25, 2021

Thanks for the heads up @msuchard. I'll check those tests.

@ablack3
Copy link
Collaborator

ablack3 commented Jul 2, 2021

I'll start an outline the changes required to complete this issue:

Changes to DatabaseConnector

Changes to Eunomia

Changes to SQLRender

  • ???

@schuemie
Copy link
Member Author

schuemie commented Jul 2, 2021

Yes, DatabaseConnector can, based on the extended_types flag, correct the dates or just pass them on.

SqlRender will need a separate target dialect (e.g. 'sqlite_extended_types') that has a subset of the translation rules for 'sqlite', removing all rules related to emulating dates.

@schuemie
Copy link
Member Author

I've taken the first steps:

  • DatabaseConnector (develop branch) now recognizes 'sqlite extended' as a new DBMS. Dates and datetimes are just passed through.
  • SqlRender (develop branch) now recognizes the 'sqlite extended' target dialect. This has all the translation rules of 'sqlite' minus the ones for dates and datetimes.
  • Eunomia (extended_types branch) now has the Eunomia data stored using extended types. The getEunomiaConnectionDetails() function returns a connection with dbms = 'sqlite extended'.

So far so good, but we still need to make sure we support all functions and structures expected by SqlRender. For example, RSQLite with extended_types = TRUE still does not support DATEFROMPARTS() as a SQL statement. In fact, it is very unclear what exactly the extended types adds to SQLite. It adds the DATE and DATETIME types, but as far as I can tell does not change any of the SQL functions. For example the query `SELECT DATE('now');' still returns an object of type character in R.

@schuemie
Copy link
Member Author

It seems there aren't any operations supported for DATE and DATETIME in SQLite. It therefore might be that the extended types makes it harder, not easier, to handle dates, since instead of working around dates all of the time we'll now have to work around them some of the times.

I added an issue to RSQLite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants