v2.0.3
Important changes
- Add: Support for the Falcon2 11B architecture by @Nilabhra in #1886
- New speculation method MLPSpeculator. by @JRosenkranz in #1865
- PaliGemma modeling by @drbh in #1895
What's Changed
- Fix: "Fixing" double BOS for mistral too. by @Narsil in #1843
- Adding scripts to prepare load data. by @Narsil in #1841
- Remove misleading warning (not that important nowadays anyway). by @Narsil in #1848
- feat: prefer huggingface_hub in docs and show image api by @drbh in #1844
- Updating Phi3 (long context). by @Narsil in #1849
- Add router name to /info endpoint by @Wauplin in #1854
- Upgrading to Rust 1.78. by @Narsil in #1851
- update xpu docker image and use public ipex wheel by @sywangyi in #1860
- Refactor layers. by @Narsil in #1866
- Granite support? by @Narsil in #1882
- Add: Support for the Falcon2 11B architecture by @Nilabhra in #1886
- MLPSpeculator. by @JRosenkranz in #1865
- Fixing truncation. by @Narsil in #1890
- Correct 'using guidance' link by @brandon-lockaby in #1892
- Add GPT-2 with flash attention by @danieldk in #1889
- Removing accepted ids in the regular info logs, downgrade to debug. by @Narsil in #1898
- feat: add deprecation warning to clients by @drbh in #1855
- [Bug Fix] Update torch import reference in bnb quantization by @DhruvSrikanth in #1902
- PaliGemma modeling by @drbh in #1895
New Contributors
- @Nilabhra made their first contribution in #1886
- @brandon-lockaby made their first contribution in #1892
- @danieldk made their first contribution in #1889
- @DhruvSrikanth made their first contribution in #1902
Full Changelog: v2.0.2...v2.0.3