Is your feature request related to a problem? Please describe.
In NER (named entity recognition) we sometimes already know the segmentation of the entities but still need to classify their type. E.g. in the sentence
'Paxar Corp said it has acquired Thermo-Print GmbH'
We might already know that 'Paxar Corp' and 'Thermo-Print GmbH' are the relevant entities and only need to predict their label, here ORG. Quoting from Wikipedia:
Full named-entity recognition is often broken down, conceptually and possibly also in implementations,[6] as two distinct problems: detection of names, and classification of the names by the type of entity they refer to (e.g. person, organization, location and other[7]). The first phase is typically simplified to a segmentation problem: names are defined to be contiguous spans of tokens, with no nesting, so that "Bank of America" is a single name, disregarding the fact that inside this name, the substring "America" is itself a name. This segmentation problem is formally similar to chunking. The second phase requires choosing an ontology by which to organize categories of things.
Describe the solution you'd like
Perhaps add an optional parameter named spans to SequenceLabeler.predict: a list of dictionaries, one per known entity, each containing the start and end indices of that entity.
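To make the request concrete, here is a rough sketch of the input and output shapes I have in mind. It does not call the library. classify_spans and toy_classifier are hypothetical stand-ins, and I am assuming the indices are character offsets with an exclusive end; token indices would work just as well.

```python
from typing import Callable, Dict, List, Union

Span = Dict[str, int]
Result = Dict[str, Union[int, str]]


def classify_spans(
    text: str,
    spans: List[Span],
    classify: Callable[[str, int, int], str],
) -> List[Result]:
    """Label pre-segmented spans; segmentation is taken as given, not predicted."""
    results: List[Result] = []
    for span in spans:
        start, end = span["start"], span["end"]
        # Assumption: character offsets, end exclusive.
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {span}")
        results.append(
            {
                "start": start,
                "end": end,
                "text": text[start:end],
                "label": classify(text, start, end),
            }
        )
    return results


def toy_classifier(text: str, start: int, end: int) -> str:
    """Hypothetical stand-in for the model: a suffix rule, for illustration only."""
    surface = text[start:end]
    return "ORG" if surface.endswith(("Corp", "GmbH")) else "OTHER"


text = "Paxar Corp said it has acquired Thermo-Print GmbH"
spans = [{"start": 0, "end": 10}, {"start": 32, "end": 49}]
results = classify_spans(text, spans, toy_classifier)
```

In the real feature the model would take the place of toy_classifier, and predict would only have to choose a label for each supplied span instead of also deciding where entities begin and end.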