title	datePublished	cuid	slug	cover	tags
Unveiling the Coolest Tech Finds (Continue updating)	Wed Jul 19 2023 16:55:43 GMT+0000 (Coordinated Universal Time)	clk9yswla00020ajsa6fmf9y4	techfind-list	https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/5K_ijgjiDko/upload/e1e14125813c0bd4f2420674f5d5f76f.jpeg	technology

Through this article, I aim to share my encounters with the coolest tech finds I come across - those awe-inspiring creations that make us gasp in wonder and fuel our imaginations.

1) Paper to HTML by the Allen Institute for AI

While diving deeper into the world of cutting-edge tech, I stumbled upon a fascinating research paper titled "VILA: Improving structured content extraction from scientific PDFs using visual layout groups." The paper caught my attention because it tackles a core problem in extracting information from PDFs: pulling structured content out of scientific papers, which, as the title suggests, the authors approach by using visual layout groups.

As I continued my exploration, I made an accidental discovery that truly impressed me - "Paper to HTML: A Publicly Available Web Tool for Converting Scientific Pdfs into Accessible HTML." Curiosity got the better of me, and I immediately decided to try out their demo at https://papertohtml.org/ by uploading a random research paper.

Surprisingly, the result was excellent!

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784821696/f87d501e-82b5-45b0-8a3f-8df352aa31f8.png align="center")

One standout feature that caught my eye was that citation placeholders are rendered as clickable hyperlinks.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784837830/e71201b2-c553-42fa-b65d-ce498e475668.png align="center")

This functionality allows users to navigate seamlessly to the corresponding citation with a simple click.
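
To make the idea concrete, here is a minimal Python sketch of how bracketed citation markers could be turned into in-page links. This is not how Paper to HTML is actually implemented. The `link_citations` function, the `[N]` marker format, and the `ref-N` anchor ids are all my own assumptions for illustration, and the sketch assumes the input text is already safe to embed in HTML.

```python
import re

# Hypothetical pattern: numeric citation markers such as [3] or [12].
CITATION = re.compile(r"\[(\d+)\]")

def link_citations(text: str) -> str:
    """Wrap each [N] marker in a link to an element with id "ref-N"."""
    return CITATION.sub(r'<a href="#ref-\1">[\1]</a>', text)

print(link_citations("Layout helps extraction [3]."))
# prints: Layout helps extraction <a href="#ref-3">[3]</a>.
```

For the links to work, each entry in the reference list would need a matching id (for example `id="ref-3"`), so that clicking the marker jumps to it.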

The tool can also extract diagrams from the PDF.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689784958049/b3f2697a-28af-4b6f-a37f-f889241630c2.png align="center")

However, extracting mathematical equations appears to be a challenge. I believe this is a common hurdle in text extraction.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1689785003656/6956328a-df77-4c4c-9ebf-6b5e15f0bd08.png align="center")

As I pondered text extraction, I couldn't help but wonder why the process remains so hard. There are likely numerous factors at play, from the quirks of individual file formats to the details of character encoding.
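
One file-format factor is easy to illustrate: a PDF generally stores text as fragments positioned on a page, not as paragraphs in reading order. The Python sketch below uses made-up fragments from a two-column page to show how a naive top-to-bottom sort interleaves the columns. The `Fragment` type, the coordinates, and the hand-picked `split_x` boundary are assumptions for illustration, not taken from any real tool, and `y` is assumed to grow downward for simplicity.

```python
from typing import NamedTuple

class Fragment(NamedTuple):
    x: float   # left edge of the fragment
    y: float   # distance from the top of the page (assumed to grow downward)
    text: str

# Made-up two-column page: each column holds two lines.
page = [
    Fragment(50, 100, "Left column, line 1."),
    Fragment(320, 100, "Right column, line 1."),
    Fragment(50, 120, "Left column, line 2."),
    Fragment(320, 120, "Right column, line 2."),
]

def naive_order(fragments):
    """Read top to bottom, then left to right, ignoring columns."""
    return [f.text for f in sorted(fragments, key=lambda f: (f.y, f.x))]

def column_order(fragments, split_x=300):
    """Read the whole left column first, then the right one.

    split_x is a hand-picked column boundary; a real tool has to infer it.
    """
    return [f.text for f in sorted(fragments, key=lambda f: (f.x >= split_x, f.y))]

print(naive_order(page))   # left and right lines alternate
print(column_order(page))  # left column first, then right column
```

Even this toy case needs layout knowledge (where the column boundary is) to get the order right, which hints at why equations, tables, and footnotes are harder still.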