title | datePublished | cuid | slug | cover | tags |
---|---|---|---|---|---|
Unveiling the Coolest Tech Finds (Continue updating) | Wed Jul 19 2023 16:55:43 GMT+0000 (Coordinated Universal Time) | clk9yswla00020ajsa6fmf9y4 | techfind-list | | technology |
In this article, I share the coolest tech finds I come across: awe-inspiring creations that make us gasp in wonder and fuel our imaginations.
1) Paper to HTML by Allen Institute for AI
While diving deeper into the world of cutting-edge tech, I stumbled upon a fascinating research paper titled "VILA: Improving structured content extraction from scientific PDFs using visual layout groups." The paper caught my attention because it tackles a stubborn problem in information extraction: pulling structured content out of scientific PDFs.
As I continued my exploration, I made an accidental discovery that truly impressed me - "Paper to HTML: A Publicly Available Web Tool for Converting Scientific Pdfs into Accessible HTML." Curiosity got the better of me, and I immediately decided to try out their demo at https://papertohtml.org/ by uploading a random research paper.
Surprisingly, the result was excellent!
![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784821696/f87d501e-82b5-45b0-8a3f-8df352aa31f8.png align="center")
One standout feature that caught my eye was the clickable hyperlink for citation placeholders.
![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784837830/e71201b2-c553-42fa-b65d-ce498e475668.png align="center")
This functionality allows users to navigate seamlessly to the corresponding citation with a simple click.
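I don't know how Paper to HTML implements this internally, but the basic idea is easy to sketch in Python. The snippet below assumes citations appear as numeric placeholders such as `[12]`. The `ref-N` anchor ids and both function names are my own hypothetical choices, not anything taken from the tool.

```python
import html
import re

# Matches numeric citation placeholders such as [3] or [12].
CITATION = re.compile(r"\[(\d+)\]")


def link_citations(body_text: str) -> str:
    """Wrap every [N] placeholder in a link that jumps to the anchor #ref-N."""
    return CITATION.sub(r'<a href="#ref-\1">[\1]</a>', body_text)


def render_reference(number: int, entry: str) -> str:
    """Render one bibliography entry carrying the id that the links point at."""
    return f'<li id="ref-{number}">{html.escape(entry)}</li>'


linked = link_citations("as shown in [3] and [12].")
reference = render_reference(3, "Shen et al. <VILA>")
```

Because the link target and the reference id share the same `ref-N` pattern, the browser handles the jump on its own and no JavaScript is needed.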
The tool is also able to extract diagrams from the PDFs.
![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784958049/b3f2697a-28af-4b6f-a37f-f889241630c2.png align="center")
However, extracting mathematical equations appears to be a challenge. I believe it's a common hurdle in text extraction processes.
![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689785003656/6956328a-df77-4c4c-9ebf-6b5e15f0bd08.png align="center")
As I pondered text extraction, I couldn't help but wonder why it remains so hard. Several factors are likely at play, from the complexity of file formats such as PDF to the quirks of character encoding.
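Here is one small example of the encoding side. Text pulled out of a PDF can contain ligature characters (a single "ﬁ" glyph instead of "f" plus "i") and words hyphenated across line breaks. The sketch below is my own minimal cleanup pass using only the Python standard library. The function name is hypothetical, and real extraction pipelines do far more than this.

```python
import re
import unicodedata


def clean_extracted_text(raw: str) -> str:
    """Apply a few common fixes to text extracted from a PDF."""
    # NFKC folds compatibility characters, such as the U+FB01 "fi" ligature,
    # into their plain-letter equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Re-join words split by a hyphen at a line break: "extrac-\ntion" -> "extraction".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Every remaining line break (blank lines included) becomes a single space.
    text = re.sub(r"\s*\n\s*", " ", text)
    return text.strip()


cleaned = clean_extracted_text("ef\ufb01cient extrac-\ntion from\nPDF files")
```

Even this tiny pass has a catch: a genuinely hyphenated word that happens to break at the end of a line ("layout-" / "aware") would be glued together incorrectly. That hints at why extraction is harder than it looks.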
LayoutParser is a Python library that provides a wide range of pre-trained deep learning models to detect the layout of a document image.
GitHub repo: Layout-Parser/layout-parser, "A Unified Toolkit for Deep Learning Based Document Image Analysis" (github.com)
Sample code: https://www.kaggle.com/code/ammarnassanalhajali/layout-parsing-starter
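To get a feel for why layout detection matters, here is a rough sketch of what you might do with its output. This is not LayoutParser's API. The `Block` type, the `reading_order` function, and the sample coordinates are all hypothetical. I assume a detector has already returned labeled bounding boxes for a two-column page and that no block spans both columns.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One detected region: a label plus its bounding box in pixels (hypothetical type)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float


def reading_order(blocks: list[Block], page_width: float) -> list[Block]:
    """Order blocks left column first, then right column, each from top to bottom."""
    middle = page_width / 2

    def sort_key(block: Block) -> tuple[int, float]:
        center_x = (block.x1 + block.x2) / 2
        column = 0 if center_x < middle else 1  # 0 = left column, 1 = right column
        return (column, block.y1)

    return sorted(blocks, key=sort_key)


# Made-up detections for a 600-pixel-wide page, deliberately out of order.
detected = [
    Block("Text", 320, 60, 580, 400),
    Block("Figure", 20, 300, 280, 500),
    Block("Title", 20, 40, 280, 80),
    Block("Text", 20, 100, 280, 280),
]
ordered_labels = [(b.label, b.y1) for b in reading_order(detected, page_width=600)]
```

Without the column information, a naive top-to-bottom sort would interleave the two columns, which is one reason plain text extraction from scientific papers often comes out scrambled.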
I also found a student who has worked extensively on document analysis: lolipopshock (Shannon Shen) (github.com)
Another surprise: research on layout parsing started many years ago. I found a paper published in 2012: "Layout-aware text extraction from full-text PDF of scientific articles" (Source Code for Biology and Medicine, biomedcentral.com)