[Discussion] AnnotationTable.TokenizedAnnotationTable #39
Comments
I am thinking of a complete rework of the parsing. I think we should use ARCtrl's composite column model.
IIRC, ARCtrl parses annotation tables like this:
If that is true, then it should be easy to use for tokenization as well, by filling these composite columns with CvParams in an additional step. Sounds good? @HLWeil
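To make the "additional step" concrete, here is a minimal sketch of what filling parsed columns with CvParams could look like. All types below (`Term`, `CompositeColumn`, `CvParam`) and the function `tokenize_column` are simplified, hypothetical stand-ins written for illustration; they are not ARCtrl's actual API and not necessarily the shape this library ends up with.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Term:
    """Ontology annotation (hypothetical stand-in): name, source ontology, accession."""
    name: str
    source: str = ""
    accession: str = ""


@dataclass(frozen=True)
class CompositeColumn:
    """Simplified stand-in for a parsed column: one header term plus one cell per row."""
    kind: str          # e.g. "Parameter", "Characteristic", "Factor"
    header: Term
    cells: List[Term]


@dataclass(frozen=True)
class CvParam:
    """Token: the header term paired with the cell value of a single row."""
    kind: str
    term: Term
    value: Term


def tokenize_column(column: CompositeColumn) -> List[CvParam]:
    """Additional step after parsing: emit one CvParam per cell of the column."""
    return [CvParam(column.kind, column.header, cell) for cell in column.cells]


# Usage: a two-row "Parameter [temperature]" column becomes two tokens.
column = CompositeColumn(
    kind="Parameter",
    header=Term("temperature"),
    cells=[Term("25"), Term("37")],
)
tokens = tokenize_column(column)
```

Keeping parsing and tokenization as two separate steps would mean the parser stays the single place that has to handle malformed tables, while the tokenizer only maps already-valid columns to tokens.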
Yup, that's pretty much it. It sounds fine with me, provided that it doesn't fail in some specific cases, which should be checked. But as a starting point for getting your tokens for further use it should be good!
Closing this as we use ARCtrl's ARCTable parser now, which we then tokenize. See #48
I think we should reconsider the current design of this type, as it's in kind of an awkward state:

Currently it is split into a list of `IO columns` and a list of `Term Columns`. This is problematic in two ways according to the currently proposed state of the ARC specification 1.2:

- […] `Protocol REF`?
- […] `1 Input` and `1 Output` Column, so a list seems counterintuitive.

As an alternative to trying to design this in some specific way, we could also keep it more naive and just have a list of columns (including terms, IOs and whatever)?
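A rough sketch of the "naive" alternative, to make the trade-off easier to discuss. Everything here is hypothetical: `ColumnKind`, `Column`, the `validate` method, and the field layout of `TokenizedAnnotationTable` are illustrative names only, and the "at most one Input and one Output" check is an assumption based on my reading of the points above, not a quote from the specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ColumnKind(Enum):
    """Hypothetical column categories; a real design may need more."""
    INPUT = "Input"
    OUTPUT = "Output"
    PROTOCOL_REF = "Protocol REF"
    TERM = "Term"


@dataclass
class Column:
    kind: ColumnKind
    header: str
    cells: List[str]


@dataclass
class TokenizedAnnotationTable:
    """Naive design: one flat, ordered list holding every column."""
    columns: List[Column]

    def of_kind(self, kind: ColumnKind) -> List[Column]:
        """Return all columns of the given kind, in table order."""
        return [c for c in self.columns if c.kind is kind]

    def validate(self) -> List[str]:
        """Assumed constraint: at most one Input and one Output column."""
        errors = []
        for kind in (ColumnKind.INPUT, ColumnKind.OUTPUT):
            count = len(self.of_kind(kind))
            if count > 1:
                errors.append(
                    f"expected at most one {kind.value} column, found {count}"
                )
        return errors


table = TokenizedAnnotationTable([
    Column(ColumnKind.INPUT, "Input [Source Name]", ["s1"]),
    Column(ColumnKind.PROTOCOL_REF, "Protocol REF", ["growth"]),
    Column(ColumnKind.TERM, "Parameter [temperature]", ["25"]),
    Column(ColumnKind.OUTPUT, "Output [Sample Name]", ["o1"]),
])
problems = table.validate()
```

With this shape, columns like `Protocol REF` have an obvious home, column order is preserved, and cardinality rules live in a validation step instead of being baked into the type.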
#25