Simple Transformers

Simplified implementations of transformer papers, because you don't really understand a paper until you code it.

Features

Models

  • Implement GPT
  • Implement BERT
  • Implement T5

Training

Inference

  • Implement top-k
  • Implement temperature
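As a rough illustration of what these two inference items could look like together, here is a minimal sketch in plain Python. The function name `sample_top_k` and its parameters are hypothetical and not part of this repository. It assumes the model has already produced a list of logits for the next token.

```python
import math
import random


def sample_top_k(logits, k=50, temperature=1.0, rng=None):
    """Sample one token id using temperature scaling and top-k filtering.

    Hypothetical helper for illustration; not this repository's API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()
    k = max(1, min(k, len(logits)))

    # Top-k: keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [logits[i] / temperature for i in top]

    # Numerically stable softmax weights over the surviving tokens.
    # random.choices normalizes the weights, so no explicit division is needed.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding, since only the highest-scoring token survives the filter.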

Supervised Fine Tuning

  • Implement addition of new tokens
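One possible shape for this step, sketched in plain Python: extend the vocabulary and append one embedding row per new token. The name `add_tokens` is hypothetical, and initializing new rows to the mean of the existing rows is an assumption made here (a common heuristic), not something this repository specifies.

```python
def add_tokens(vocab, embeddings, new_tokens):
    """Return copies of (vocab, embeddings) extended with `new_tokens`.

    vocab: dict mapping token string -> integer id.
    embeddings: list of rows (lists of floats), one row per id.
    Hypothetical helper for illustration; not this repository's API.
    """
    vocab = dict(vocab)
    embeddings = [row[:] for row in embeddings]
    dim = len(embeddings[0])

    # Assumption: new rows start at the mean of the original embeddings.
    mean_row = [
        sum(row[j] for row in embeddings) / len(embeddings) for j in range(dim)
    ]

    for token in new_tokens:
        if token in vocab:
            continue  # skip tokens that already have an id
        vocab[token] = len(embeddings)
        embeddings.append(mean_row[:])

    return vocab, embeddings
```

If the model's output projection is not tied to the input embeddings, it needs the same number of extra rows so that the new ids can be predicted.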

Metrics

  • Implement hallucination metrics

Optimizations

  • Flash attention
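The numerical idea behind flash attention is that softmax attention can be computed one block of keys at a time, keeping only a running maximum, a running normalizer, and a running weighted sum. The sketch below shows that online-softmax recurrence for a single query vector in plain Python. It only illustrates the arithmetic: the actual speedup comes from a fused GPU kernel that avoids materializing the full score matrix, which this code does not attempt. All function names are hypothetical.

```python
import math


def attention_reference(q, keys, values):
    """Standard softmax attention for one query, computed all at once."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)
    ]


def attention_blockwise(q, keys, values, block_size=2):
    """Same result as attention_reference, processing keys block by block."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = -math.inf  # largest score seen so far
    running_sum = 0.0        # softmax normalizer relative to running_max
    acc = [0.0] * dim        # unnormalized weighted sum of values

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]

        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new maximum.
        # On the first block this is exp(-inf) == 0.0.
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]

        running_sum = running_sum * correction + sum(weights)
        acc = [
            a * correction + sum(w * v[j] for w, v in zip(weights, v_blk))
            for j, a in enumerate(acc)
        ]
        running_max = new_max

    return [a / running_sum for a in acc]
```

Both functions should agree up to floating-point rounding for any block size, which makes the reference version a convenient check when writing an optimized one.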