How to improve detection of sections? #65

Open
ldt opened this issue Dec 26, 2023 · 1 comment

ldt commented Dec 26, 2023

Hi,

Congrats on your great work and beautiful API!

I'm especially interested in using it to create a hierarchical document based on the original PDF.
My issue is that some sections are not correctly identified.

For example, in your papermage.pdf file, Section 2 is merged with Section 2.1:
image

And the title of Section 3.3 is only partially identified:
image

I have similar issues on some of my documents.
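
To make the hierarchical-document goal concrete, here is a minimal sketch of the kind of tree I am trying to build. It assumes the detected section titles are available as plain strings that start with a section number; `Node` and `build_outline` are hypothetical helpers written for illustration, not part of the papermage API, and the titles in the usage example are placeholders rather than real output.

```python
import re
from dataclasses import dataclass, field

# Leading section number such as "2", "2.1" or "3.3." followed by the title text.
_NUMBERED = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+(.*\S)\s*$")


@dataclass
class Node:
    """One section in the outline (hypothetical helper type)."""
    number: str
    title: str
    children: list = field(default_factory=list)


def build_outline(titles):
    """Nest numbered section titles by the depth of their number."""
    root = Node(number="", title="ROOT")
    stack = [(0, root)]  # (depth, node) pairs for the current path in the tree
    for raw in titles:
        match = _NUMBERED.match(raw)
        if match is None:
            continue  # unnumbered titles are skipped in this sketch
        number, title = match.group(1), match.group(2)
        depth = number.count(".") + 1  # "2" -> 1, "2.1" -> 2
        node = Node(number, title)
        # Climb back up until the top of the stack is a shallower section.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root


# Placeholder titles, assumed for illustration only.
outline = build_outline(
    ["1 Introduction", "2 Design", "2.1 Layers", "2.2 Entities", "3 Usage"]
)
for section in outline.children:
    print(section.number, section.title, [c.number for c in section.children])
```

With input like this, a title where two sections are merged (for example `"2 Design 2.1 Layers"` as a single string) would produce one node and silently lose the subsection, which is why the detection quality matters for my use case.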

I would like to know how this could be improved. Could the model be trained further if there were a training set of documents with the correct sections already labeled?

Let me know how I can help. The topic is really interesting!

@MpLebron

I have the same problem! I hope the authors can solve this!
