Available parsers? #39

Open
fjellandermedia opened this issue Jul 29, 2019 · 9 comments

Comments

@fjellandermedia

Excuse me if this is the wrong forum for this issue. If so, please close this thread (but I would in that case much appreciate a referral!)

I’m an independent developer in Sweden working a lot with different mobile and web apps for various church needs. I have encountered the USX format a few times already (developing apps for church needs often involves text from the Bible, who knew?!) and my own parsers of the USX data have been mediocre at best. To be fair, I’m in no way an XML expert and my understanding of how to work with it is unfortunately limited.

So my question is: are there any resources available (parsers, XSLT files or something) to convert the USX format to HTML (preferably) or even plain text? I’ve seen Haiola, but it’s not entirely clear whether it works flawlessly with USX, since it’s built for USFX, and I would also prefer something I can use on the command line (and I think Haiola is a GUI app?). I’ve been duckduckgoing and googling all night but I haven’t found anything else. Any help would be greatly appreciated!

@jonbitgood
Contributor

It's about as basic as it gets, but I've made a usx.xslt that's generally compatible with USX 3.0. It looks nice with the CSS from API.bible.

It'd be nice to get a more functional version put together and pushed here.

@fjellandermedia
Author

Wow, this is great, Jon! Thanks a lot! This goes a long way! I agree that it would be great to have an “official” and more functional version in this repo. Maybe even a couple of them, so you could choose whether you’d like verse numbers, inline comments, etc. I think a lot of implementations of Bible text, in one way or another, include showing it as HTML.

@jonbitgood
Contributor

I think you could do that with just a single source file by altering the bible.css? That way users could choose which variations they like without any loss of data. This open source Bible reader operates in that way.

Anyway, I'll work on learning XSLT and XPath to make a more functional version of the usx.xslt. Once it's more fleshed out, I'll submit a pull request.

@klassenjm
Contributor

@jonbitgood Thanks for replying. I have also inquired with some colleagues familiar with processing USX (including API.Bible). @fjellandermedia There is nothing else I can readily add to the repo here right now. I agree that it will be good to do so, and will welcome your PR.

@shadow-light

Hi, just checking whether anyone has made any progress on this?

We also need a USX -> HTML converter (and will probably go ahead and build one if needed). It looks like the bulk of the work has already been done by api.bible with their open source styles: https://github.com/americanbible/scripture-styles/blob/master/scss/modules/_paragraphs.scss

So it looks like it's just a matter of converting XML tags to HTML approximations and adding classes that match the USX (which the stylesheets also use).
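To make that idea concrete, here is a minimal sketch in Python (standard library only) of mapping USX elements to HTML and reusing each element's `style` attribute as the CSS class. It is not api.bible's converter. The function names `usx_to_html` and `_inline` are made up for illustration, the choice of `<h2>`, `<p>` and `<span>` is an assumption, and whether the resulting class names line up exactly with the scripture-styles selectors would need checking. Only `chapter`, `para`, `verse` and `char` are handled; notes, figures and other elements are dropped.

```python
import xml.etree.ElementTree as ET
from html import escape


def _inline(node):
    """Render the text and inline children of one USX element as HTML."""
    parts = [escape(node.text or "")]
    for child in node:
        style = escape(child.get("style", ""))
        if child.tag == "verse":
            # USX 3.0 ends a verse with an empty <verse eid="..."/> milestone;
            # only the opening milestone carries a number worth printing.
            if child.get("number"):
                parts.append(f'<span class="v">{escape(child.get("number"))}</span>')
        elif child.tag == "char":
            parts.append(f'<span class="{style}">{_inline(child)}</span>')
        # Any other element (note, figure, ref, ...) is skipped in this sketch,
        # but the text that follows it still belongs to the paragraph.
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def usx_to_html(usx_source):
    """Convert a USX document (as a string) to a flat HTML fragment."""
    root = ET.fromstring(usx_source)
    out = []
    for node in root:
        style = escape(node.get("style", ""))
        if node.tag == "chapter" and node.get("number"):
            out.append(f'<h2 class="{style}">{escape(node.get("number"))}</h2>')
        elif node.tag == "para":
            out.append(f'<p class="{style}">{_inline(node)}</p>')
    return "\n".join(out)
```

Because the USX style codes survive as class names, showing or hiding things like verse numbers can then be left to the stylesheet rather than to the converter.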

Is api.bible able to share their own code they used for doing the conversion?

@jonbitgood
Contributor

I've got a few inputs and outputs using XSLT to convert to HTML, EPUB, PDF, etc. They've got some domain-specific information in them, but I'd love to extract that and collaborate on them. I'll get in touch.

@ethanbarry

I see it's been a while, but I have created a very rough parser that outputs LaTeX source. Maybe someone can use this? I needed it for a typesetting project, and wrote it over the course of a week or so...

@danzuep

danzuep commented Nov 23, 2023

Thanks @jonbitgood for the XSLT file example! I used the step-by-step conversion process demonstrated by haiola/BibleFileLib as inspiration to make a more parser-friendly XML file from USX. Here's a link to the folder. I've also added a PowerShell script in there that you can use for the conversion if you're not au fait with C# .NET. I've only tested it with the book of Genesis from a new Creative Commons licensed Bible from The Digital Bible Library, but I'm posting it here anyway and will update it if I find any issues.

@jonbitgood
Contributor

The XSLT file example has been expanded to cover various outputs: HTML, EPUB, SQL, PDF, etc. It got a little custom and convoluted, but maybe it'll be of help. All the code is open source and available here:

https://github.com/digitalbiblesociety/lamedh/tree/main/xslt
