v0.1.0

github-actions released this 21 Sep 11:51

133dd7a

What's Changed

Support Falcon 180B by @casper-hansen in #35
[NEW] GEMV kernel implementation by @casper-hansen in #40
Allow user to use custom calibration data for quantization by @boehm-e in #27
Safetensors and model sharding by @casper-hansen in #47
2x faster context processing with GEMV by @casper-hansen in #58
Support kv_heads by @casper-hansen in #60
Refactor quantization code by @casper-hansen in #62
Support Windows by @qwopqwop200 in #53
Improve model loading by @casper-hansen in #66

New Contributors

@boehm-e made their first contribution in #27

Full Changelog: v0.0.2...v0.1.0

Contributors

boehm-e, casper-hansen, and qwopqwop200
