Support Tied Weights in Llama Models #777

Helw150 · 2024-10-25T00:17:31Z

The new, smaller Llama 3.2 1B and 3.2 3B models have tied weights, so Levanter currently throws an error when we try to import them. See `tie_word_embeddings` in the model config: https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/blob/main/config.json

![Screenshot 2024-10-24 8 17 03 PM](https://github.com/user-attachments/assets/08e79ed7-cab5-43f0-9ca6-f90e2fe73249)

This adds HF support for that argument and switches to using embedding.unembed when the embeddings are tied!
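For readers unfamiliar with tied weights, here is a minimal pure-Python sketch of the idea. It is not Levanter's actual code: `Embedding`, `OutputHead`, and `build_head` are hypothetical names made up for illustration, and only the `tie_word_embeddings` key comes from the HF config. When the flag is set, the output logits reuse the token embedding matrix instead of a separate `lm_head` matrix.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Matrix = List[List[float]]  # row-major, shape [rows, cols]


def _project(hidden: Matrix, weight: Matrix) -> Matrix:
    """Compute hidden @ weight.T, where weight has shape [vocab, hidden]."""
    return [[sum(h * w for h, w in zip(row, vec)) for vec in weight] for row in hidden]


@dataclass
class Embedding:
    """Hypothetical embedding table with shape [vocab, hidden]."""
    weight: Matrix

    def embed(self, token_ids: List[int]) -> Matrix:
        # Look up one embedding row per token id.
        return [self.weight[t] for t in token_ids]

    def unembed(self, hidden: Matrix) -> Matrix:
        # Reuse the same matrix to map hidden states back to vocab logits.
        return _project(hidden, self.weight)


@dataclass
class OutputHead:
    """Hypothetical output head; lm_head is None when weights are tied."""
    embeddings: Embedding
    lm_head: Optional[Matrix] = None

    def logits(self, hidden: Matrix) -> Matrix:
        if self.lm_head is None:
            return self.embeddings.unembed(hidden)
        return _project(hidden, self.lm_head)


def build_head(config: Dict, embeddings: Embedding,
               lm_head_weight: Optional[Matrix] = None) -> OutputHead:
    # Assumption: a missing key means the weights are not tied.
    if config.get("tie_word_embeddings", False):
        return OutputHead(embeddings, lm_head=None)
    if lm_head_weight is None:
        raise ValueError("untied model needs a separate lm_head weight")
    return OutputHead(embeddings, lm_head=lm_head_weight)
```

With tied weights the checkpoint carries no separate output matrix, which is why an importer that always expects one fails on these models.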

Support Tied Weights in Llama Models

62d92dd

Helw150 requested review from dlwh and Ivan-Zhou October 25, 2024 00:17

Fix Pre-Commit

d88f1f2

dlwh approved these changes Oct 25, 2024

dlwh merged commit 331c0aa into main Oct 25, 2024
8 checks passed

dlwh deleted the will/tied-llama branch October 25, 2024 17:00

TheQuantumFractal mentioned this pull request Nov 5, 2024

Internal eval fixes #786

Closed
