missing attributes #20

Open
hackmanschorsch opened this issue Sep 21, 2020 · 6 comments

@hackmanschorsch

hackmanschorsch commented Sep 21, 2020

<TextLine id="r1l3" custom="readingOrder {index:2;} datum {offset:4; length:13;datum:1696-02-15;} persoonsnaam {offset:33; length:12; continued:true;}">

results in the following line, in which the property/attribute 'datum' is missing:

<lb facs="#facs_2_r1l3" n="N003"/>die <datum>15 febr. 1696</datum> gehuwd was met <persoonsnaam>Aletta Catha</persoonsnaam>
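
To make the lost information concrete, here is a minimal sketch of how the value of @custom can be split into tag names and their properties. It is written in Python for illustration only and is not how page2tei does it; `parse_custom` and `TAG_PATTERN` are hypothetical names, and the sketch assumes that property values never contain `;`, `:` or `}`.

```python
import re

# Hypothetical helper, not part of page2tei: matches "name {key:value; ...}" groups.
TAG_PATTERN = re.compile(r"(\w+)\s*\{([^}]*)\}")

def parse_custom(custom):
    """Split a PAGE XML @custom value into (tag name, properties) pairs."""
    tags = []
    for name, body in TAG_PATTERN.findall(custom):
        props = {}
        for part in body.split(";"):
            # Split on the first colon only; empty fragments are skipped.
            key, sep, value = part.partition(":")
            if sep:
                props[key.strip()] = value.strip()
        tags.append((name, props))
    return tags

custom = ("readingOrder {index:2;} datum {offset:4; length:13;datum:1696-02-15;} "
          "persoonsnaam {offset:33; length:12; continued:true;}")
for name, props in parse_custom(custom):
    print(name, props)
```

For the line above, the `datum` tag carries `offset`, `length` and `datum`; the last of these is the value that does not show up in the TEI output.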

@hackmanschorsch
Author

This was also observed for other tags, e.g. persoonsnaam.
Examples with source files can be found in Transkribus - CollID 76546

@elespdn

elespdn commented May 24, 2022

Hello, here at RISE we are also using these wonderful stylesheets to export from Transkribus to TEI.

We now have the same issue described here: the attributes in the PAGE XML source are not rendered in the TEI output. For example:

<TextLine id="r1l7" custom="readingOrder {index:6;} hi {offset:0; length:1;rend:ornamentalInitial;}">
    <Coords points="358,852 941,869 962,849 1128,857 1138,811 358,803"/>
    <Baseline points="365,838 409,839 453,840 497,841 541,842 585,842 629,843 673,844 717,845 761,845 805,846 849,846 893,847 937,847 981,847 1025,847 1069,847 1131,850"/>
    <TextEquiv>
        <Unicode>En principio criò ...</Unicode>
    </TextEquiv>
</TextLine>

creates

<lb facs="#facs_1_r1l7" n="N007"/><hi>E</hi>n principio criò ...

where the attribute @rend and its value are missing.

Could someone suggest a strategy for adding this to the transformation? The stylesheet is very complex. I guess additional rules should go where @custom is parsed to produce the text (https://github.com/dariok/page2tei/blob/master/page2tei-0.xsl#L520), but I haven't been able to fix it.
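
As a rough illustration of the behaviour being asked for, the following Python sketch wraps the tagged span and copies every non-positional property onto the element as an attribute. It is not the stylesheet's logic: `wrap_tag` and `POSITIONAL` are hypothetical names, the set of positional properties is an assumption based on the examples in this thread, and only a single, non-overlapping tag per line is handled.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumption: these properties describe where a tag sits, not what it says.
POSITIONAL = {"offset", "length", "continued"}

def wrap_tag(text, name, props):
    """Wrap text[offset:offset+length] in <name>, keeping the other properties as attributes."""
    start = int(props["offset"])
    end = start + int(props["length"])
    attrs = "".join(
        f" {key}={quoteattr(value)}"
        for key, value in props.items()
        if key not in POSITIONAL
    )
    return (
        escape(text[:start])
        + f"<{name}{attrs}>" + escape(text[start:end]) + f"</{name}>"
        + escape(text[end:])
    )

line = "En principio criò ..."
print(wrap_tag(line, "hi", {"offset": "0", "length": "1", "rend": "ornamentalInitial"}))
# prints: <hi rend="ornamentalInitial">E</hi>n principio criò ...
```

In the stylesheet itself, the equivalent step would presumably emit one attribute per remaining property at the point where the element for the tag is created.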

@dariok
Owner

dariok commented Jul 6, 2022

Sorry for the long wait! Too much to do…

I will try to look into this problem in the next 2–3 weeks, as it has come up in other projects, too. Thanks for your examples!

@dariok dariok self-assigned this Jul 6, 2022
@elespdn

elespdn commented Jul 29, 2022

Thanks a lot @dariok !

After all, there are also workarounds: one could create ad hoc tags in Transkribus and then replace them with tag + attribute after the export.

But should you need other examples or contributions, do let me know.
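
The workaround mentioned above could look roughly like this in Python. The ad hoc tag name `hiOrnamentalInitial`, the `REPLACEMENTS` table and `expand_adhoc_tags` are all hypothetical, and the sketch assumes the export writes ad hoc tags as plain elements without attributes, as in the output quoted in this thread.

```python
# Hypothetical mapping: ad hoc Transkribus tag name -> (TEI element, attributes).
REPLACEMENTS = {
    "hiOrnamentalInitial": ("hi", {"rend": "ornamentalInitial"}),
}

def expand_adhoc_tags(tei):
    """Replace attribute-less ad hoc elements with the intended element + attributes."""
    for adhoc, (name, attrs) in REPLACEMENTS.items():
        attr_text = "".join(f' {key}="{value}"' for key, value in attrs.items())
        tei = tei.replace(f"<{adhoc}>", f"<{name}{attr_text}>")
        tei = tei.replace(f"</{adhoc}>", f"</{name}>")
    return tei

exported = ('<lb facs="#facs_1_r1l7" n="N007"/>'
            '<hiOrnamentalInitial>E</hiOrnamentalInitial>n principio criò ...')
print(expand_adhoc_tags(exported))
```

Plain string replacement is enough here because the ad hoc names are chosen to be unique; it needs one ad hoc tag per attribute value, so it does not scale to free-form values such as dates.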

@elespdn

elespdn commented Jul 29, 2022

I've heard that there are developments on the TEI export from Transkribus at the Biblioteca Hertziana, and I think @liladude is the specialist there, I've seen some of her talks! Maybe joining forces is possible? And https://github.com/eeditiones would like to develop better integration from Transkribus to TEI Publisher too, thus going through a TEI export.

@liladude

Thanks @elespdn for the kind words, and sorry for not noticing earlier. Linking Transkribus + PAGE2TEI + TEI Publisher is exactly what we are trying to do. Hopefully we will be able to join forces!
