
DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.


Steinbeck-Lab/DECIMER-Image_Transformer

 
 


🧪 DECIMER Image Transformer 🖼️

Deep Learning for Chemical Image Recognition using EfficientNet-V2 + Transformer



🔬 Abstract

The DECIMER 2.2 project tackles the OCSR (Optical Chemical Structure Recognition) challenge with deep learning. Our goal is to provide an automated, open-source software solution for chemical image recognition.

DECIMER is trained on Google's TPUs (Tensor Processing Units), which makes it practical to handle datasets of over 1 million images.


🧠 Method and Model Changes

🖼️ Image Feature Extraction

An EfficientNet-V2 network extracts features from the input image.

🔮 SMILES Prediction

A transformer model predicts the SMILES string from the extracted image features.
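
As a rough illustration of how an image backbone can feed a transformer, the sketch below builds an EfficientNet-V2 network with Keras and flattens its feature map into a sequence of vectors that a transformer could attend over. The `EfficientNetV2B0` variant, the 128-pixel input size, and the name `build_image_encoder` are assumptions chosen to keep the example small. They are not taken from the DECIMER source, and the SMILES decoder is left out entirely.

```python
import tensorflow as tf


def build_image_encoder(image_size: int = 128) -> tf.keras.Model:
    """Hypothetical encoder: EfficientNet-V2 features reshaped into a token sequence."""
    # weights=None builds a randomly initialised backbone, so nothing is downloaded.
    backbone = tf.keras.applications.EfficientNetV2B0(
        include_top=False,
        weights=None,
        input_shape=(image_size, image_size, 3),
    )
    feature_map = backbone.output  # shape: (batch, height, width, channels)
    _, height, width, channels = feature_map.shape
    # Flatten the spatial grid so each grid cell becomes one "token".
    tokens = tf.keras.layers.Reshape((height * width, channels))(feature_map)
    return tf.keras.Model(backbone.input, tokens, name="image_encoder")


if __name__ == "__main__":
    encoder = build_image_encoder()
    dummy_batch = tf.zeros([2, 128, 128, 3])
    print(encoder(dummy_batch).shape)
```

Each row of the resulting sequence corresponds to one cell of the backbone's spatial grid, which is the usual way to hand convolutional features to an attention-based decoder.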

🚀 Training Enhancements

  1. TFRecord Files: Lightning-fast data reading
  2. Google Cloud Buckets: Efficient cloud storage solution
  3. TensorFlow Data Pipeline: Optimized data loading
  4. TPU Strategy: Harnessing the power of Google's TPUs
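
The first three items can be sketched with standard TensorFlow calls: image and SMILES pairs are serialized into a TFRecord file, then read back through a `tf.data` pipeline that parses, batches, and prefetches them. This is a minimal sketch, not the DECIMER training code. The feature names `image_raw` and `smiles`, the helper names, and the 64-pixel resize are all assumptions made for illustration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical feature layout: one PNG-encoded image and one SMILES string per record.
FEATURE_SPEC = {
    "image_raw": tf.io.FixedLenFeature([], tf.string),
    "smiles": tf.io.FixedLenFeature([], tf.string),
}


def serialize_example(image_bytes: bytes, smiles: str) -> bytes:
    """Pack one (image, SMILES) pair into a serialized tf.train.Example."""
    feature = {
        "image_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])
        ),
        "smiles": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[smiles.encode("utf-8")])
        ),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


def write_tfrecord(path: str, pairs) -> None:
    """Write an iterable of (image_bytes, smiles) pairs to a TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for image_bytes, smiles in pairs:
            writer.write(serialize_example(image_bytes, smiles))


def parse_example(record):
    """Decode one serialized record into a resized image and its SMILES string."""
    parsed = tf.io.parse_single_example(record, FEATURE_SPEC)
    image = tf.io.decode_png(parsed["image_raw"], channels=3)
    image = tf.image.resize(image, (64, 64))
    return image, parsed["smiles"]


def build_dataset(paths, batch_size: int = 2) -> tf.data.Dataset:
    """Build a parse -> batch -> prefetch pipeline over TFRecord files."""
    # The paths may be local files; gs:// bucket paths are also accepted by TensorFlow.
    return (
        tf.data.TFRecordDataset(paths)
        .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )


if __name__ == "__main__":
    record_path = os.path.join(tempfile.mkdtemp(), "demo.tfrecord")
    blank_png = tf.io.encode_png(tf.zeros([8, 8, 3], tf.uint8)).numpy()
    write_tfrecord(record_path, [(blank_png, "CCO"), (blank_png, "c1ccccc1")])
    for images, smiles in build_dataset([record_path]):
        print(images.shape, smiles.numpy())
```

On a TPU, a pipeline like this would typically be consumed by a model created inside a `tf.distribute.TPUStrategy` scope. That part is omitted here because it needs TPU hardware to run.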

💻 Installation

# Create a conda wonderland
conda create --name DECIMER python=3.10.0 -y
conda activate DECIMER

# Equip yourself with DECIMER
pip install decimer

🎮 Usage

from DECIMER import predict_SMILES

# Unleash the power of DECIMER
image_path = "path/to/your/chemical/masterpiece.jpg"
SMILES = predict_SMILES(image_path)
print(f"🎉 Decoded SMILES: {SMILES}")
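
For more than one image, a small wrapper around the predictor can be handy. The sketch below is a hypothetical helper, not part of the DECIMER package: `predict_folder` and `IMAGE_SUFFIXES` are names invented here. It takes the prediction function as an argument, so it works with `predict_SMILES` or with any other callable that maps an image path to a string.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Assumed set of image extensions; adjust to whatever your files use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def predict_folder(
    folder: str, predictor: Callable[[str], str]
) -> Dict[str, Optional[str]]:
    """Run `predictor` on every image in `folder`, keyed by file name.

    Images for which the predictor raises are recorded as None so that one
    bad file does not stop the whole batch.
    """
    results: Dict[str, Optional[str]] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        try:
            results[path.name] = predictor(str(path))
        except Exception:
            results[path.name] = None
    return results


# With DECIMER installed, usage would look like:
#   from DECIMER import predict_SMILES
#   results = predict_folder("path/to/your/images", predict_SMILES)
```

Passing the predictor in as a parameter also makes the helper easy to try out with a stand-in function before the model weights are available.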

✍️ DECIMER - Hand-drawn Model

🌟 New Feature Alert! 🌟

Our latest model extends DECIMER to hand-drawn chemical structures!


📜 Citation

If DECIMER helps your research, please cite:

  1. Rajan, K., et al. "DECIMER.ai - An open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications." Nat Commun 14, 5045 (2023).
  2. Rajan, K., et al. "DECIMER 1.0: deep learning for chemical image recognition using transformers." J Cheminform 13, 61 (2021).
  3. Rajan, K., et al. "Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture." J Cheminform 16, 78 (2024).

🙏 Acknowledgements

  • A big thank you to Charles Tapley Hoyt for his invaluable contributions!
  • Powered by Google's TPU Research Cloud (TRC)


👨‍🔬 Author: Kohulan


🌐 Project Website

Experience DECIMER in action at decimer.ai, brilliantly implemented by Otto Brinkhaus!




