CoNLLRDFUpdater splits off at @prefix #90

Open
chiarcos opened this issue Dec 22, 2023 · 0 comments · May be fixed by #91
chiarcos commented Dec 22, 2023

Observation:
At every @prefix, CoNLLRDFUpdater seems to start a new sentence. (Also see example below.)

Solution A (easier?):
Maintain a stack of previously declared prefixes and insert them into the current code block. (Note that this will screw up line numbers.)
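
A minimal sketch of what Solution A could look like, using only the Java standard library. The class PrefixMemory and its method complete are hypothetical names for illustration and are not part of CoNLL-RDF; the sketch also assumes that every @prefix declaration sits on a line of its own, as in the example below.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of CoNLL-RDF): remembers @prefix declarations across blocks. */
class PrefixMemory {
    // Matches a single-line Turtle directive such as "@prefix conll: <http://...#> ."
    // The prefix name may be empty, as in "@prefix : <...> ."
    private static final Pattern PREFIX =
        Pattern.compile("^\\s*@prefix\\s+([^\\s:]*):\\s*<([^>]*)>\\s*\\.\\s*$");

    // Prefixes seen so far, in order of first declaration; a redeclaration overrides the IRI.
    private final Map<String, String> seen = new LinkedHashMap<>();

    /**
     * Returns the block with all previously seen prefixes prepended,
     * except those the block declares itself, and then records the
     * block's own declarations for later blocks.
     */
    String complete(String block) {
        Map<String, String> own = new LinkedHashMap<>();
        for (String line : block.split("\\R")) {
            Matcher m = PREFIX.matcher(line);
            if (m.matches()) {
                own.put(m.group(1), m.group(2));
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : seen.entrySet()) {
            if (!own.containsKey(e.getKey())) {
                sb.append("@prefix ").append(e.getKey())
                  .append(": <").append(e.getValue()).append("> .\n");
            }
        }
        seen.putAll(own);
        return sb.append(block).toString();
    }
}
```

Every prepended declaration adds one line to the block, so line numbers reported by the parser would be offset by the number of inserted prefixes.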

Solution B (better):
Check that the current code block contains at least one triple before starting a new sentence.

I assume that solution B is already implemented, but that the triple check only tests whether the block contains a triple separator (a dot). However, an @prefix declaration also ends with a dot, so a block consisting of nothing but a prefix declaration would pass the check.
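
A stricter triple check for Solution B might look like the following sketch. It is a line-based heuristic, not a Turtle parser, and the names TripleCheck and containsTriple are hypothetical, not taken from CoNLL-RDF. It assumes that comments and directives each occupy a line of their own.

```java
import java.util.Locale;

/** Hypothetical helper (not part of CoNLL-RDF): decides whether a block holds any triple. */
class TripleCheck {
    /**
     * Returns true if at least one line is neither blank, a '#' comment,
     * nor a directive (@prefix, @base, or the SPARQL-style PREFIX / BASE
     * forms that Turtle 1.1 also allows).
     */
    static boolean containsTriple(String block) {
        for (String raw : block.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or CoNLL-style comment
            }
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("@prefix") || lower.startsWith("@base")
                    || lower.startsWith("prefix ") || lower.startsWith("base ")) {
                continue; // directive: ends with a dot but carries no triple
            }
            return true; // anything else is treated as (part of) a triple
        }
        return false;
    }
}
```

With such a check, a block that holds only comments and @prefix lines would be kept open and merged with the following lines instead of being emitted as a sentence of its own.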

Example
Input:

    # newdoc id = txt/en/bibl.en_web.ACT.txt
    # newpar
    # sent_id = 1
    # text = The first book I wrote, Theophilus, concerned all that Jesus began both to do and to teach, until the day in which he was received up, after he had given commandment through the Holy Spirit to the apostles whom he had chosen.
    @prefix :      <file:///home/christian/Desktop/github/AURIS/#> .
    @prefix conll: <http://ufal.mff.cuni.cz/conll2009-st/task-description.html#> .
    @prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
    @prefix powla: <http://purl.org/powla/powla.owl#> .
    @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix terms: <http://purl.org/acoli/open-ie/> .
    @prefix x:     <http://purl.org/acoli/conll-rdf/xml#> .

    :s1_22  rdf:type      nif:Word ;
            nif:nextWord  :s1_23 ;
            conll:EDGE    "det" ;
            conll:FEATS   "Definite=Def|PronType=Art" ;
            conll:FORM    "the" ;
            conll:HEAD    :s1_23 ;
            conll:ID      "22" ;
            conll:LEMMA   "the" ;
            conll:UPOS    "DET" ;
            conll:XPOS    "DT" .

Output & Stderr:

    # newdoc id = txt/en/bibl.en_web.ACT.txt

    # newpar

    # sent_id = 1

    # text = The first book I wrote, Theophilus, concerned all that Jesus began both to do and to teach, until the day in which he was received up, after he had given commandment through the Holy Spirit to the apostles whom he had chosen.

    @prefix : <file:///home/christian/Desktop/github/AURIS/#> .

    @prefix conll: <http://ufal.mff.cuni.cz/conll2009-st/task-description.html#> .

    @prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .

    @prefix powla: <http://purl.org/powla/powla.owl#> .

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

    @prefix :     <file:///home/christian/Desktop/github/AURIS/#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    @prefix terms: <http://purl.org/acoli/open-ie/> .

    15:01:33.941 [Thread-0] ERROR org.apache.jena.riot - [line: 3, col: 1 ] Undefined prefix: 
    15:01:33.950 [Thread-0] ERROR org.acoli.conll.rdf.CoNLLRDFUpdater - Exception while reading: @prefix x: <http://purl.org/acoli/conll-rdf/xml#> .

    :s1_22 rdf:type nif:Word ;
    nif:nextWord :s1_23 ;
    conll:EDGE "det" ;
    conll:FEATS "Definite=Def|PronType=Art" ;
    conll:FORM "the" ;
    conll:HEAD :s1_23 ;
    conll:ID "22" ;
    conll:LEMMA "the" ;
    conll:UPOS "DET" ;
    conll:XPOS "DT" .

chiarcos added the bug label Dec 22, 2023
chiarcos changed the title from "CoNLLStreamExtractor splits off at @prefix" to "CoNLLRDFUpdater splits off at @prefix" Dec 22, 2023