Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate etl enrollments #154

Closed
wants to merge 4 commits into from
Closed

Conversation

kairstenfay
Copy link
Contributor

@kairstenfay kairstenfay commented Jun 26, 2020

Delete the enrollments ETL.

A few questions:
both

  • should these command still exist but only print out that the ETL has been deprecated? if no, is there a good place to document the deprecated ETLs? Note: I did mention the deprecated /receiving/enrollment endpoint in the static API HTML document.

enrollments etl

  • should we remove all mentions of the enrollments ETL across ID3C? e.g.
    • README.md
    • kit ETL
    • manifest ETL
  • should we delete the enrollment-related db role(s)?

kits etl

  • should --sample-type be removed as an option from id3c manifest?
  • should mentions of 'kits' be removed from id3c identifier?
  • should we delete the the kit-processor db role?

Also, take a look at DEVELOPMENT.md. My how our processes have changed since I first wrote that 😄

The enrollments ETL (`id3c etl enrollments`) was used in Y1 for Audere
data. We no longer partner with Audere, so deprecate this ETL.
Not only is this document is fairly outdated at this point, but also,
the more logical place for it to live is seattleflu/documentation.
@kairstenfay kairstenfay requested a review from a team as a code owner June 26, 2020 22:53
Deprecate the kit ETL(`id3c etl kit`). It was the original swab-n-send
ETL, but we have REDCap for that now.
@kairstenfay
Copy link
Contributor Author

While we're at it, I also deleted the kit ETL.

We no longer receive enrollment data from Audere. Delete this endpoint.
@kairstenfay
Copy link
Contributor Author

I just remembered to deprecate the API endpoint for enrollments.

Copy link
Member

@tsibley tsibley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of questions to consider here! Thanks for getting this discussion
started. Detailed answers below, but overall I think we want to strike a
balance between removing dead, unused code and not creating unnecessary work.

should these command still exist but only print out that the ETL has been
deprecated? if no, is there a good place to document the deprecated ETLs?
Note: I did mention the deprecated /receiving/enrollment endpoint in the
static API HTML document.

If we were not the only users, then a true deprecation cycle would definitely
be warranted. As it is though, I think we should just remove things we're not
using.

Note that as used in this PR, "deprecation" is a misnomer. Deprecating
something means warning not to use it, but not actually removing it yet, thus
giving time for people to adjust their usage of it. This PR removes things.

should we remove all mentions of the enrollments ETL across ID3C? e.g.

I'd say yes, remove or adjust them to include that enrollments are now gone,
otherwise these mentions make no sense.

should we delete the enrollment-related db role(s)?
should we delete the the kit-processor db role?

I want to say yes, and I think for roles that's ok. But other database objects are more complicated.

For example, off the top of my head, there are:

enrollments
    receiving.enrollment
    enroller role
    enrollment-processor role

kits (more complicated, since they're in the warehouse?)
    warehouse.kit
    kit-processor role

More things might turn up with a search/audit of the codebase.

We definitely want to keep our production data tables (in both receiving and
warehouse)! But ideally new databases wouldn't have them (like an ID3C
instance for Nextstrain).

Maybe we write the Sqitch changes but deploy them (carefully!) with
--log-only? Or maybe we don't write the Sqitch changes and just let the
database objects be for now? Not causing much harm, although may cause future
confusion.

should --sample-type be removed as an option from id3c manifest?

Hmm, aren't we talking about formalizing sample types (swab types) further?
Might we re-use or re-purpose this? It seems all the flag does is add a static
value to the manifest record?

should mentions of 'kits' be removed from id3c identifier?

I don't think this is necessary. We've minted identifiers to use for kits, and
that's still a valid use case even if we don't ingest them back into ID3C.

from id3c.db import find_identifier
from id3c.db.session import DatabaseSession
from id3c.db.datatypes import Json
from id3c.db.types import KitRecord, SampleRecord
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KitRecord and SampleRecord can be removed from id3c.db.types too, as this appears to be their sole callsite.

@kairstenfay
Copy link
Contributor Author

If we were not the only users, then a true deprecation cycle would definitely
be warranted. As it is though, I think we should just remove things we're not
using.

👍

Note that as used in this PR, "deprecation" is a misnomer. Deprecating
something means warning not to use it, but not actually removing it yet, thus
giving time for people to adjust their usage of it. This PR removes things.

I'll create a new PR replacing "deprecate" with "remove".

should we remove all mentions of the enrollments ETL across ID3C? e.g.

I'd say yes, remove or adjust them to include that enrollments are now gone,
otherwise these mentions make no sense.

👍

should we delete the enrollment-related db role(s)?
should we delete the the kit-processor db role?

I want to say yes, and I think for roles that's ok. But other database objects are more complicated.

For example, off the top of my head, there are:

enrollments
    receiving.enrollment
    enroller role
    enrollment-processor role

kits (more complicated, since they're in the warehouse?)
    warehouse.kit
    kit-processor role

More things might turn up with a search/audit of the codebase.

We definitely want to keep our production data tables (in both receiving and
warehouse)! But ideally new databases wouldn't have them (like an ID3C
instance for Nextstrain).

Maybe we write the Sqitch changes but deploy them (carefully!) with
--log-only? Or maybe we don't write the Sqitch changes and just let the
database objects be for now? Not causing much harm, although may cause future
confusion.

I agree that we don't want to delete our production data tables, but perhaps their definitions could be moved to id3c-customizations? We could always create an issue to track the task of deleting the database objects if we don't want to deal with handling them now.

Hmm, aren't we talking about formalizing sample types (swab types) further?
Might we re-use or re-purpose this? It seems all the flag does is add a static
value to the manifest record?

Yes, I wrote this before learning that we would want to formalize sample types.

should mentions of 'kits' be removed from id3c identifier?

I don't think this is necessary. We've minted identifiers to use for kits, and
that's still a valid use case even if we don't ingest them back into ID3C.

👍

@kairstenfay kairstenfay mentioned this pull request Jul 10, 2020
@kairstenfay
Copy link
Contributor Author

Closing this for now because I intend to merge #155 over this PR, but I will continue to comment here for our conversation.

@kairstenfay kairstenfay deleted the deprecate-etl-enrollments branch January 15, 2021 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants