ID Triplet Feature #146

Open: wants to merge 26 commits into main
Collaborator

@cbrinson-rise8 cbrinson-rise8 commented Dec 2, 2024

Description

  • Implement the ID Triplet as described in RFC-001
  • Replace MRN, SSN and DRIVERS_LICENSE with the new IDENTIFIER triplet concept introduced in RFC-001.

Related Issues

closes #125

Additional Notes

  • Update the /link endpoint to accept an identifier triplet
  • Create blocking values on IDENTIFIER values
  • Feature match on IDENTIFIER and/or IDENTIFIER:XXX values
  • Add new test cases for blocking and feature matching on the new values
  • Update documentation in references.md regarding the new blocking key and feature
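
As a rough illustration of the notes above, the sketch below shows how matching on IDENTIFIER versus IDENTIFIER:XXX could behave. Everything in it is an assumption made for illustration: the `Identifier` dataclass, its `type`/`authority`/`value` fields, the type codes, and the helper names `identifiers_for` and `shares_identifier` are hypothetical and are not taken from RFC-001 or from this pull request's code.

```python
import dataclasses
import typing


@dataclasses.dataclass(frozen=True)
class Identifier:
    # Hypothetical triplet shape; the real field names may differ.
    type: str
    authority: str
    value: str


def identifiers_for(
    identifiers: typing.Iterable[Identifier], feature: str
) -> typing.Set[Identifier]:
    """Select triplets for "IDENTIFIER" (all) or "IDENTIFIER:<type>" (one type)."""
    _, sep, suffix = feature.partition(":")
    return {i for i in identifiers if not sep or i.type == suffix}


def shares_identifier(
    a: typing.Iterable[Identifier], b: typing.Iterable[Identifier], feature: str
) -> bool:
    """Assumed rule: two records agree if they share at least one full triplet."""
    return bool(identifiers_for(a, feature) & identifiers_for(b, feature))


record_a = [Identifier("MR", "hospital-a", "123"), Identifier("SS", "", "111-22-3333")]
record_b = [Identifier("MR", "hospital-a", "123"), Identifier("SS", "", "999-99-9999")]
```

With these sample records, `shares_identifier(record_a, record_b, "IDENTIFIER:MR")` is true while the same call with `"IDENTIFIER:SS"` is false, which is the kind of distinction the suffixed feature is meant to allow.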

<--------------------- REMOVE THE LINES BELOW BEFORE MERGING --------------------->

Checklist

Please review and complete the following checklist before submitting your pull request:

  • I have ensured that the pull request is of a manageable size, allowing it to be reviewed within a single session.
  • I have reviewed my changes to ensure they are clear, concise, and well-documented.
  • I have updated the documentation, if applicable.
  • I have added or updated test cases to cover my changes, if applicable.
  • I have minimized the number of reviewers to include only those essential for the review.

Checklist for Reviewers

Please review and complete the following checklist during the review process:

  • The code follows best practices and conventions.
  • The changes implement the desired functionality or fix the reported issue.
  • The tests cover the new changes and pass successfully.
  • Any potential edge cases or error scenarios have been considered.

@cbrinson-rise8 cbrinson-rise8 force-pushed the feat/id-triplet branch 3 times, most recently from 9bee5d0 to 5ebabcd December 16, 2024 19:30
@ericbuckley ericbuckley linked an issue Dec 16, 2024 that may be closed by this pull request
5 tasks

codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 96.29630% with 9 lines in your changes missing coverage. Please review.

Project coverage is 96.92%. Comparing base (c07aea2) to head (3c8121e).
Report is 2 commits behind head on main.

Files with missing lines                  Patch %   Lines
src/recordlinker/schemas/pii.py           87.69%    8 Missing ⚠️
src/recordlinker/schemas/identifier.py    99.40%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #146      +/-   ##
==========================================
- Coverage   97.21%   96.92%   -0.29%     
==========================================
  Files          30       31       +1     
  Lines        1327     1496     +169     
==========================================
+ Hits         1290     1450     +160     
- Misses         37       46       +9     


@cbrinson-rise8 cbrinson-rise8 marked this pull request as ready for review December 18, 2024 15:06
Collaborator

@ericbuckley ericbuckley left a comment

Good start @cbrinson-rise8

    def __str__(self):
        """
        Return the value of the enum as a string.
        """
        return self.value


class Feature(pydantic.BaseModel):
Collaborator

Changing Feature from an enum to a pydantic.BaseModel means we can't automatically list all the feature choices in the API docs anymore. Since we already have an enum for FeatureAttribute and another for IdentifierType, did you consider just combining those to create a new Feature enum? For example:

def all_features() -> typing.Iterator[str]:
    """
    Yield every feature name that can be used for comparison.
    """
    for feature in FeatureAttribute:
        yield str(feature)
        if feature == FeatureAttribute.IDENTIFIER:
            # Also expose one IDENTIFIER:<type> feature per identifier type.
            for identifier in IdentifierType:
                yield f"{feature}:{identifier}"


Feature = enum.Enum("Feature", [(f, f) for f in all_features()])

Collaborator Author

Added in an enum for this

Collaborator

Let me share a slightly different implementation. I'm thinking we don't need both Feature and FeatureEnum; we can use one class for both functionality and docs. I'm not married to this implementation, as it requires us to "parse" the value every time we call suffix() or attribute(), so it might be better to split the suffix and attribute in the constructor so we don't have to parse every time the functions are invoked. However, this should give you a better idea of how we can meet both needs with one class.

class Feature(enum.Enum):
    """
    Enum for the different Patient attributes that can be used for comparison.
    """

    # Dynamically populate the enum members
    @classmethod
    def populate(cls):
        """Populate the Feature enum dynamically."""
        features = {}
        for f in FeatureAttribute:
            features[f.value] = f.value
            if f == FeatureAttribute.IDENTIFIER:
                for i in IdentifierType:
                    features[f"{f.value}:{i.value}"] = f"{f.value}:{i.value}"
        # Add members to the class
        cls._member_map_.update(features)
        cls._value2member_map_.update({v: cls(k, v) for k, v in features.items()})

    def attribute(self) -> FeatureAttribute:
        """Get the attribute of the feature."""
        return FeatureAttribute(self.value.split(":")[0])

    def suffix(self) -> typing.Optional[IdentifierType]:
        """Get the suffix of the feature if it is an identifier, otherwise None."""
        if self.attribute() == FeatureAttribute.IDENTIFIER:
            return IdentifierType(self.value.split(":")[1])
        return None

    def __str__(self):
        """Return the value of the enum as a string."""
        return self.value


Feature.populate()
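
One possible way to get the "split once in the constructor" variant mentioned above, sketched under stated assumptions: the `FeatureAttribute` and `IdentifierType` members below are hypothetical stand-ins for the project's real enums, and `_FeatureBase` is a name invented for this example. It uses the standard functional `Enum` API on a member-less base class instead of writing to private attributes such as `_member_map_`, so it is not the code from this pull request.

```python
import enum
import typing


class FeatureAttribute(enum.Enum):
    # Hypothetical subset of attributes, for illustration only.
    BIRTHDATE = "BIRTHDATE"
    IDENTIFIER = "IDENTIFIER"


class IdentifierType(enum.Enum):
    # Hypothetical identifier type codes, for illustration only.
    MR = "MR"
    SS = "SS"
    DL = "DL"


def all_features() -> typing.Iterator[str]:
    """Yield every feature name, including IDENTIFIER:<type> variants."""
    for attribute in FeatureAttribute:
        yield attribute.value
        if attribute is FeatureAttribute.IDENTIFIER:
            for identifier in IdentifierType:
                yield f"{attribute.value}:{identifier.value}"


class _FeatureBase(enum.Enum):
    """Member-less base enum that holds the shared behaviour."""

    def __init__(self, value: str):
        # Split once, when the member is created, instead of on every call.
        attribute, sep, suffix = value.partition(":")
        self._attribute = FeatureAttribute(attribute)
        self._suffix = IdentifierType(suffix) if sep else None

    def attribute(self) -> FeatureAttribute:
        """Get the attribute of the feature."""
        return self._attribute

    def suffix(self) -> typing.Optional[IdentifierType]:
        """Get the identifier type suffix, or None if there is none."""
        return self._suffix

    def __str__(self) -> str:
        return self.value


# Calling a member-less Enum subclass uses the functional API; the new
# enum inherits the methods defined on the base class.
Feature = _FeatureBase("Feature", [(name, name) for name in all_features()])
```

Because `Feature` is still an ordinary enum here, iterating over it lists every choice, and `Feature("IDENTIFIER:MR").suffix()` returns `IdentifierType.MR` without re-parsing the string.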

@@ -83,9 +83,13 @@ linkage evaluation phase. The following features are supported:

: The patient's email address.

`DRIVERS_LICENSE`
Collaborator

MRN and SSN on lines 18 and 22 need to be removed as well
