Update source parsing for query filter parsing #280

Merged 9 commits on Oct 18, 2023

sarahmcdougall
Contributor

Summary

For measures authored using QI-Core, the sources array is unpopulated due to unhandled scenarios related to parsing query sources. Updates parseSources to drill into expressions on a given query that are not immediate retrieves.

New behavior

The sources arrays on the query info should now be populated appropriately. The approach to source parsing is as follows:

  • When parsing sources, handle all sources whose expressions are retrieves (which is the current behavior)
  • If the source expression is not a retrieve, check that it contains a resultTypeSpecifier. For the QI-Core-authored measures, the source is of the following format (for an example of an Encounter datatype and “Union” type):
"source": [
  {
    "localId": 2,
    "alias": "MyAlias",
    "resultTypeSpecifier": {
      "type": "ListTypeSpecifier",
      "elementType": {
        "name": "{http://hl7.org/fhir}Encounter",
        "type": "NamedTypeSpecifier"
      }
    },
    "expression": {
      "localId": "1",
      "type": "Union",
      "resultTypeSpecifier": {
        "type": "ListTypeSpecifier",
        "elementType": {
          "name": "{http://hl7.org/fhir}Encounter",
          "type": "NamedTypeSpecifier"
        }
      }
    }
  }
]

Since there is no datatype on the expression, the “elementType” is parsed to retrieve the associated datatype.

  • The relationship clauses are now also parsed for sources. The relationship clauses allow related sources to be used to restrict the elements included from another source in a query scope.
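The approach described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: the interfaces are simplified stand-ins for the ELM types, and the names `collectSources`, `stripFhirNamespace`, `AliasedSketchSource`, and `ParsedSource` are hypothetical. It assumes that relationship clauses have the same alias/expression shape as sources.

```typescript
// Simplified, hypothetical ELM shapes for illustration only (not the definitions in ELMTypes.ts).
interface NamedTypeSpec {
  type: 'NamedTypeSpecifier';
  name: string;
}

interface ListTypeSpec {
  type: 'ListTypeSpecifier';
  elementType: NamedTypeSpec | ListTypeSpec;
}

interface SketchExpression {
  type: string;
  dataType?: string; // assumed to be present on Retrieve expressions
  resultTypeSpecifier?: NamedTypeSpec | ListTypeSpec;
}

interface AliasedSketchSource {
  alias: string;
  expression: SketchExpression;
}

interface ParsedSource {
  alias: string;
  resourceType: string;
}

// Strips the FHIR namespace prefix, e.g. "{http://hl7.org/fhir}Encounter" becomes "Encounter".
function stripFhirNamespace(name: string): string {
  return name.replace(/^\{http:\/\/hl7\.org\/fhir\}/, '');
}

// Walks sources and relationship clauses in one pass. Retrieves use their dataType;
// other expressions fall back to the elementType of a list-typed resultTypeSpecifier.
function collectSources(
  sources: AliasedSketchSource[],
  relationships: AliasedSketchSource[] = []
): ParsedSource[] {
  const results: ParsedSource[] = [];
  for (const s of [...sources, ...relationships]) {
    const expr = s.expression;
    const spec = expr.resultTypeSpecifier;
    if (expr.type === 'Retrieve' && expr.dataType) {
      results.push({ alias: s.alias, resourceType: stripFhirNamespace(expr.dataType) });
    } else if (spec && spec.type === 'ListTypeSpecifier') {
      const element = spec.elementType;
      if (element.type === 'NamedTypeSpecifier') {
        results.push({ alias: s.alias, resourceType: stripFhirNamespace(element.name) });
      }
    }
    // Anything else is skipped in this sketch rather than guessed at.
  }
  return results;
}
```

With the "Union" source shown above as input, this sketch would report the alias MyAlias with resource type Encounter, since the expression itself carries no dataType.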

A potential side effect of the current query filter parsing implementation is that incorrect attribute paths are assigned to resource types during data requirements calculation. Note that this issue is not directly addressed in this pull request, as there are other issues involved, such as the “query stacking” approach that is implemented and how it handles the “Union” type that is seen in the ELM of QI-Core measures.

More information about the ELM structure is available at https://cql.hl7.org/04-logicalspecification.html.

Code changes

QueryFilterParser

  • Update parseSources function to now extract sources from the resultTypeSpecifier if present (in the case that the type is not a Retrieve)
  • Parse the relationship clauses in addition to parsing the sources
  • Add function parseElementType to parse the elementType on the expression (this is an alternative to parsing the datatype)

ELMTypes

  • Add resultTypeSpecifier as an optional field on an ELM expression

Testing guidance

It is important to test against both QI-Core and non-QI-Core measures.
Some potential steps that can be taken for testing:

  • Run npm run test, npm run test:integration, etc. to ensure that no side effects were introduced. Pay specific attention to the query filter parsing unit tests.
  • Run data requirements against various measures with the debug option enabled.
    • Within debug/gaps.json, pay specific attention to the queryInfo output on the retrieves. Previously the sources array would be empty for some of the QI-Core measures (CMS161 is an example). Check that this is no longer the case, and that the sources array is appropriately populated by looking back at the ELM output for the measure being tested.
    • Note that the data requirements output may not be 100% accurate (i.e. the paths on the filters may be inaccurate due to the current query stacking approach)
    • Check that the gaps.json output has not changed for non-QI-Core measures, except in the case that the relationships array sources are not encompassed in the sources. The data requirements output is not expected to change. This can be done by generating the output on this branch and on the main branch, and using the ‘Select for Compare’ feature in VSCode.
    • Specific measures that can be tested: CMS161, CMS1028, CMS142, CMS177

Open questions to check for when testing

  • Do we want to parse the relationships expressions alongside the source expressions? According to the logical specification, the relationship clauses “allow related sources to restrict the elements included from another source in a query scope…The elements of the related source are not included in the query scope.”
  • Is it “valid” to pull information from the resultTypeSpecifier? The cql-execution library contains types for TypeSpecifiers, but little (if any) handling is done in fqm-execution for parsing the TypeSpecifiers.
  • Right now, if an inner query exists, the inner query is parsed, and the original query’s sources are set to the inner query’s sources. Do we want to keep this logic intact? Using CMS161 as an example, there are instances where the inner query info sources contain a Procedure:
image

However, the original query info does not contain the Procedure and instead contains an Encounter:

image

When we set queryInfo.sources = innerQueryInfo.sources, we lose this Procedure source.

github-actions bot commented Sep 29, 2023

Coverage report

St.  Category    Percentage          Covered / Total
🟢   Statements  86.33% (-0.25% 🔻)  2362/2736
🟡   Branches    73.64% (-0.19% 🔻)  2190/2974
🟢   Functions   88.89% (-0.16% 🔻)  424/477
🟢   Lines       86.67% (-0.27% 🔻)  2282/2633

Files with reduced coverage 🔻

St.  File                        Statements          Branches            Functions           Lines
🟢   ... / GapsReportBuilder.ts  93.66% (-0.31% 🔻)  83.91%              100%                93.63% (-0.34% 🔻)
🟢   ... / QueryFilterParser.ts  85.31% (-1.56% 🔻)  78.57% (-1.86% 🔻)  97.06% (-2.94% 🔻)  85.07% (-1.58% 🔻)

Test suite run success

447 tests passing in 31 suites.

Report generated by 🧪jest coverage report action from bead5ad

Contributor

@elsaperelli elsaperelli left a comment

Things are looking good so far except for some weirdness with existing QueryFilterParser code. Looks like some expressions that should be in the sources array in gaps.json are being overwritten. Functionality looks okay with CMS161 and CMS177 but looks like there are issues with CMS142.

Other than that, the data-requirements outputs are the same for these measures and unit, integration, and regression tests pass 👍

@@ -247,6 +274,17 @@ function parseDataType(retrieve: ELMRetrieve): string {
  return retrieve.dataType.replace(/^(\{http:\/\/hl7.org\/fhir\})?/, '');
}

/**
 * Pulls out the resource type of a result type
Contributor

Maybe add a bit more info to this? Would "Pulls out the resource type from a resultTypeSpecifier on an expression" work, or something like that? Maybe include "expression" in there somewhere.

}
});
// parse relationship clauses
query.relationship.forEach(relationship => {
Contributor

I am a little confused about the addition of this forEach...it is parsing through the relationship array rather than the source array so I think either we can change the name of this function to incorporate that somehow or perhaps put this in its own parseRelationship function? What do you think?

Contributor

We definitely need to parse the 'relationship'. But we will need to parse it when it is something other than a retrieve, similar to above.

Contributor

@hossenlopp hossenlopp left a comment

A few comments that may bring further discussion. This also does not deconflict the sources from their filters when building dataRequirements, so the wrong filters will still be added to the wrong data types, e.g. the Procedure.period issue.

 * @returns FHIR ResourceType name.
 */
function parseElementType(expression: ELMExpression): string {
  const elementType = (expression.resultTypeSpecifier as ListTypeSpecifier).elementType as NamedTypeSpecifier;
Contributor

If this isn't a ListTypeSpecifier of NamedTypeSpecifier, or a NamedTypeSpecifier, then perhaps we should gracefully return nothing. If the specifier is some other type, or does not exist, this could error out.
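One way to handle this gracefully is sketched below. It is only an illustration of the suggestion, not the change that was eventually merged: the specifier interfaces are simplified stand-ins, and `safeParseElementType`, `AnySpecifier`, and `TypedExpression` are hypothetical names. Instead of casting, the sketch checks the specifier's type and returns undefined when the shape is not one it recognizes.

```typescript
// Simplified stand-ins for ELM type specifiers (assumed shapes, not the ELMTypes.ts definitions).
interface NamedSpecifier {
  type: 'NamedTypeSpecifier';
  name: string;
}

interface ListSpecifier {
  type: 'ListTypeSpecifier';
  elementType: AnySpecifier;
}

interface OtherSpecifier {
  type: 'IntervalTypeSpecifier' | 'TupleTypeSpecifier' | 'ChoiceTypeSpecifier';
}

type AnySpecifier = NamedSpecifier | ListSpecifier | OtherSpecifier;

interface TypedExpression {
  type: string;
  resultTypeSpecifier?: AnySpecifier;
}

// Returns the FHIR resource type name, or undefined when the specifier is missing
// or is neither a NamedTypeSpecifier nor a list of NamedTypeSpecifier.
function safeParseElementType(expression: TypedExpression): string | undefined {
  const spec = expression.resultTypeSpecifier;
  if (!spec) {
    return undefined;
  }

  let named: NamedSpecifier | undefined;
  if (spec.type === 'NamedTypeSpecifier') {
    named = spec;
  } else if (spec.type === 'ListTypeSpecifier') {
    const element = spec.elementType;
    if (element.type === 'NamedTypeSpecifier') {
      named = element;
    }
  }

  // Strip the FHIR namespace prefix if present.
  return named?.name.replace(/^\{http:\/\/hl7\.org\/fhir\}/, '');
}
```

A caller would then skip the source when the result is undefined rather than pushing a partially populated entry.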

@sarahmcdougall sarahmcdougall force-pushed the datareq-attribute-paths branch from b0c964d to 7556acc October 16, 2023 19:42

@sarahmcdougall
Contributor Author

Summary of changes made in newest commits:

  1. Before adding filters to data requirement, GapsReportBuilder now checks that a queryInfo source exists whose alias matches the alias of the data type query. If no source exists, filters are not added to the data requirement.
  2. Before adding filters to data requirement, GapsReportBuilder now checks that the alias of the queryInfo source matches the alias of the filter. If the aliases do not match, filters are not added to the data requirement. All the filter types in the function extend from AttributeFilter, which has alias as a required field. I did notice a few filters during testing that do not have alias populated but it seems that this is the case when the type is ‘unknown’ and ‘withError’ is populated.
  3. In the QueryFilterParser, we now only copy over the first index of the inner query info source rather than the entire array of sources. While it usually is the case that there is only one source, it is possible for there to be multiple sources (like when we are working with relationships), and so we want to only replace the first index and copy over the rest of the sources so that we do not lose any information.
  4. parseSources was updated to consolidate the logic between sources and relationships since we are essentially treating them the same. The function now parses sources and relationships in the same loop.
  5. Type handling added for parseElementType.
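Items 1 through 3 above can be illustrated with a small sketch. The shapes and names here (`SourceEntry`, `AliasFilter`, `mergeInnerSources`, `filtersForQueryAlias`) are hypothetical simplifications, not the actual GapsReportBuilder or QueryFilterParser code, and the sketch reflects one reading of the summary: only the first outer source is replaced by the first inner source, and filters are kept only when a source with the query's alias exists and the filter's alias matches it.

```typescript
// Hypothetical, simplified shapes; the real queryInfo and filter types carry more fields.
interface SourceEntry {
  alias: string;
  resourceType: string;
}

interface AliasFilter {
  alias: string;
  attribute: string;
}

// Item 3: replace only the first source with the inner query's first source,
// and keep the remaining outer sources so that no information is lost.
function mergeInnerSources(outer: SourceEntry[], inner: SourceEntry[]): SourceEntry[] {
  const [firstInner] = inner;
  if (firstInner === undefined) {
    return [...outer];
  }
  return [firstInner, ...outer.slice(1)];
}

// Items 1 and 2: filters are only returned when a source exists for the query alias
// and the filter's alias matches that source's alias.
function filtersForQueryAlias(
  sources: SourceEntry[],
  filters: AliasFilter[],
  queryAlias: string
): AliasFilter[] {
  const matchingSource = sources.find(s => s.alias === queryAlias);
  if (!matchingSource) {
    return [];
  }
  return filters.filter(f => f.alias === matchingSource.alias);
}
```

Under this reading, a filter whose alias belongs to a relationship source is no longer attached to the data requirement for a different alias, which is consistent with the "Procedure.period" error disappearing from the output.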

Changes to the output:
We no longer receive the “Procedure.period” error (or similar errors depending on the measure under test). This is due to the additional query alias handling that was added to GapsReportBuilder. The correct filter does not get added to the data requirements because we never actually parse the relationship during the query filter parsing. Is this still in scope of this task? It looks like the relationships do get parsed in RetrievesFinder but the suchThat expressions do not get parsed in the query parser. I believe that significant re-working of the parseQueryInfo function will be required to dig down into the relationship clauses.

Contributor

@elsaperelli elsaperelli left a comment

Good updates! It looks like the sources array is now getting properly populated with everything and not getting overwritten!

The one thing I am confused about, and I am not sure whether it was intentional based on your summary of the changes, is that the dataRequirements output.json no longer contains dateFilter on one of the data requirements (seen for CMS177, CMS161, and CMS142).

Other than that, unit tests, integration tests, and regression still pass!

Comment on lines 628 to 637
    if (cf !== null) {
      dataRequirement.codeFilter?.push(cf);
    }
  } else if (df.type === 'during') {
    const dateFilter = generateDetailedDateFilter(df as DuringFilter);
    if (dataRequirement.dateFilter) {
      dataRequirement.dateFilter.push(dateFilter);
    } else {
      dataRequirement.dateFilter = [dateFilter];
    }
Contributor

I know this code was already there but I am a little confused by it. So for the first if statement, are we only pushing cf to dataRequirement.codeFilter if dataRequirement.codeFilter already exists? If not, then can we consolidate the code in the else if to do the same thing?

Contributor Author

Looks like in DataRequirementHelpers.ts this function gets called after calling generateDataRequirement on the retrieve. The generateDataRequirement function already creates the codeFilter array if retrieve.valueSet or retrieve.code exists. So it seems that in most cases dataRequirement.codeFilter will already exist by the time that we get here, unless the retrieve does not have a code or valueSet. In that case, it seems that the cf here would not get added since codeFilter does not exist. Perhaps this is the expected behavior? @hossenlopp does this reasoning seem right?

Contributor

While this reasoning sounds good, it is possible that the measure author did not use a filter on the retrieve itself and is instead filtering in the where part of the query. So codeFilter could definitely use logic similar to the dateFilter: check whether the list is there first, and create it if it is not.
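A minimal sketch of that suggestion follows. The types are simplified stand-ins for the FHIR DataRequirement structures, and `addCodeFilter`, `SketchCodeFilter`, and `SketchDataRequirement` are hypothetical names chosen for illustration. The point is only that the codeFilter branch mirrors the dateFilter branch, so a code filter found in the where clause is not silently dropped when the array does not exist yet.

```typescript
// Simplified stand-ins for FHIR DataRequirement pieces (assumed shapes for illustration).
interface SketchCodeFilter {
  path: string;
  valueSet?: string;
}

interface SketchDataRequirement {
  type: string;
  codeFilter?: SketchCodeFilter[];
}

// Pushes onto codeFilter when the list exists, and creates the list when it does not.
// With optional chaining alone (codeFilter?.push(cf)), the filter would be lost
// whenever codeFilter is undefined.
function addCodeFilter(dataRequirement: SketchDataRequirement, cf: SketchCodeFilter | null): void {
  if (cf === null) {
    return;
  }
  if (dataRequirement.codeFilter) {
    dataRequirement.codeFilter.push(cf);
  } else {
    dataRequirement.codeFilter = [cf];
  }
}
```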

Contributor

@hossenlopp hossenlopp left a comment

This looks really good and collects the source information correctly now. Just that one suggestion on the codeFilter.

While it would be nice to have this code more unit tested, I think it would be better to revisit unit testing when we re-work dataRequirements.

Contributor

@elsaperelli elsaperelli left a comment

👏

@hossenlopp hossenlopp merged commit d1eb581 into master Oct 18, 2023
5 checks passed
@hossenlopp hossenlopp deleted the datareq-attribute-paths branch October 18, 2023 20:45