Triggers aren't converted to Bio- or Coref-Mentions and don't survive serialization #785

Open
kwalcock opened this issue Mar 2, 2023 · 2 comments

Member

kwalcock commented Mar 2, 2023

It looks very much like any BioEventMention or CorefEventMention that is deserialized is forced to have a BioTextBoundMention or CorefTextBoundMention as a trigger:

toBioMention(mjson \ "trigger", docMap).toBioMention.asInstanceOf[BioTextBoundMention],

toCorefMention(mjson \ "trigger", docMap).toCorefMention.asInstanceOf[CorefTextBoundMention],

However, when the BioEventMention or CorefEventMention is created, the TextBoundMention in the trigger is not converted in the same way.

This might cause many problems, but one is that a Bio- or CorefEventMention that is serialized and then deserialized will have its trigger change type from a simple TextBoundMention to a more specific one, so the round trip does not reproduce the original mention. An old test that was recently re-enabled confirms this.

The lines that pass the trigger to the new event mention should probably be changed to

          m.trigger.toBioMention.asInstanceOf[BioTextBoundMention],

and

          m.trigger.toCorefMention.asInstanceOf[CorefTextBoundMention],
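
The mismatch, and the effect of converting the trigger at creation time, can be sketched with simplified stand-in classes. Everything here is hypothetical: the classes only borrow the names from the issue, and `toBio`, `roundTrip`, and `sameTriggerClass` are illustrative helpers, not the project's API.

```scala
// Simplified, hypothetical stand-ins; NOT the project's real classes.
object TriggerRoundTripSketch {
  class TextBoundMention(val id: String, val text: String)
  class BioTextBoundMention(id: String, text: String)
      extends TextBoundMention(id, text)

  // An event mention whose trigger is only required to be a TextBoundMention.
  class BioEventMention(val trigger: TextBoundMention)

  // Stand-in for the conversion: reuse the mention if it is already converted.
  def toBio(m: TextBoundMention): BioTextBoundMention = m match {
    case b: BioTextBoundMention => b
    case other => new BioTextBoundMention(other.id, other.text)
  }

  // Stand-in for serialize + deserialize: the trigger is rebuilt from its
  // fields and is always forced to the Bio type, as in the deserializer.
  def roundTrip(e: BioEventMention): BioEventMention =
    new BioEventMention(toBio(new TextBoundMention(e.trigger.id, e.trigger.text)))

  def sameTriggerClass(a: BioEventMention, b: BioEventMention): Boolean =
    a.trigger.getClass == b.trigger.getClass

  val plain = new TextBoundMention("T1", "phosphorylates")
  val unconverted = new BioEventMention(plain)      // trigger left as-is
  val converted = new BioEventMention(toBio(plain)) // trigger converted at creation
}
```

In this sketch the trigger class of `unconverted` changes across `roundTrip`, while that of `converted` does not, which is the behavior the proposed change aims for.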

It would be good for others who know more about this project to confirm the intention of the original design and consider whether the change would cause problems. Thanks.

FYI @enoriega, @MihaiSurdeanu

Member Author

kwalcock commented Mar 2, 2023

For instance (and this was a concern in Eidos), the original TextBoundMention may be shared between two EventMentions, and something may depend on that reference equality. That single trigger would then be converted into two separate CorefTextBoundMention copies, which might be manipulated independently, for example by serializing both. The copies will have the same ID, which could lead to duplicate keys in a database.
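
A minimal sketch of that concern, using hypothetical stand-in classes. It assumes the conversion allocates a new object on every call; whether the real `toCorefMention` does so is not confirmed here, and `toCoref` is only an illustration.

```scala
// Hypothetical stand-ins; NOT the project's real classes.
object SharedTriggerSketch {
  class TextBoundMention(val id: String)
  class CorefTextBoundMention(id: String) extends TextBoundMention(id)
  class EventMention(val trigger: TextBoundMention)

  // Assumption: the conversion creates a fresh object each time it is called.
  def toCoref(m: TextBoundMention): CorefTextBoundMention =
    new CorefTextBoundMention(m.id)

  // Two events that share one trigger object.
  val shared = new TextBoundMention("T1")
  val e1 = new EventMention(shared)
  val e2 = new EventMention(shared)

  // Converting each event's trigger separately yields two distinct copies.
  val c1 = new EventMention(toCoref(e1.trigger))
  val c2 = new EventMention(toCoref(e2.trigger))

  val sharedBefore: Boolean = e1.trigger eq e2.trigger      // true
  val sharedAfter: Boolean = c1.trigger eq c2.trigger       // false
  val sameId: Boolean = c1.trigger.id == c2.trigger.id      // true
}
```

The two converted triggers are different objects that still carry the same ID, which is the situation that could produce duplicate keys downstream.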

Member Author

kwalcock commented Mar 3, 2023

Similarly, it doesn't look like the paths are converted. I get something like a CorefEventMention, but the paths it has contain simple BioTextBoundMentions. This can mean that the IDs don't match up. Sometimes the same mention seems to appear both as an argument, as a CorefTextBoundMention, and in the path, as a BioTextBoundMention.
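
The path mismatch can be illustrated in the same spirit. This sketch assumes paths are keyed by mention within each argument role and that the two representations do not compare equal; the class definitions, the `String` path values, and `argumentsWithoutPath` are all hypothetical simplifications.

```scala
// Hypothetical stand-ins; NOT the project's real classes or path types.
object PathMismatchSketch {
  class Mention(val text: String)
  class BioTextBoundMention(text: String) extends Mention(text)
  class CorefTextBoundMention(text: String) extends Mention(text)

  // The same span of text, represented by two different objects.
  val bio = new BioTextBoundMention("MEK")
  val coref = new CorefTextBoundMention("MEK")

  // Arguments were converted to Coref mentions...
  val arguments: Map[String, Seq[Mention]] = Map("theme" -> Seq(coref))
  // ...but the paths are still keyed by the unconverted Bio mention.
  val paths: Map[String, Map[Mention, String]] =
    Map("theme" -> Map[Mention, String](bio -> "dobj"))

  // Arguments for which no path can be found under the same role.
  def argumentsWithoutPath: Seq[Mention] =
    for {
      (role, mentions) <- arguments.toSeq
      m <- mentions
      if !paths.getOrElse(role, Map.empty[Mention, String]).contains(m)
    } yield m
}
```

Here the lookup fails because the argument and the path key are different objects, so the `theme` argument ends up without a path even though one exists for the same text.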
