Happy Friday! This is the v1.2.0 release of LlamaChat, and the big update this week is support for configuring ✨ model hyperparameters ✨, alongside a bunch of other tweaks and improvements.
### 🔥 New
- You can now configure the model hyperparameters (including context size and sampling parameters like top-p and top-k values) for all of your chat sources in Settings > Sources. These still default to sensible values for each model type, but you can now tweak them as you see fit. (#13)
- You can now configure the number of threads that LlamaChat runs text generation on from Settings > General (#13).
- You can now configure whether the model for each chat source is fully loaded into memory during prediction, which can improve performance for smaller models. If you're familiar with llama.cpp, this controls the `--mlock` parameter (see the sketch below). (#4)
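
To make the new knobs a little more concrete, here's a minimal Swift sketch of how per-source hyperparameters could be modelled and tweaked. The type and property names (`ChatSourceParameters`, `ModelType`, `useMlock`, etc.) and the default values are illustrative assumptions, not LlamaChat's actual API; `n_ctx` and `--mlock` are the corresponding llama.cpp settings.

```swift
import Foundation

// Illustrative model types; LlamaChat's real enum may differ.
enum ModelType {
  case llama, alpaca, gpt4All
}

// A hypothetical bag of per-chat-source hyperparameters.
struct ChatSourceParameters {
  var contextSize: Int   // tokens of context (llama.cpp: n_ctx)
  var topP: Double       // nucleus (top-p) sampling threshold
  var topK: Int          // top-k sampling cutoff
  var numThreads: Int    // threads used for text generation
  var useMlock: Bool     // keep the model resident in RAM (llama.cpp: --mlock)

  // Sensible defaults per model type; the Settings UI would let you override these.
  static func defaults(for modelType: ModelType) -> ChatSourceParameters {
    switch modelType {
    case .llama, .alpaca, .gpt4All:
      // A real implementation would vary these by model type; the
      // values here are placeholders.
      return ChatSourceParameters(
        contextSize: 512,
        topP: 0.95,
        topK: 40,
        numThreads: ProcessInfo.processInfo.activeProcessorCount,
        useMlock: false
      )
    }
  }
}

// Usage: start from the defaults and tweak only what you need.
var params = ChatSourceParameters.defaults(for: .llama)
params.contextSize = 2048
params.useMlock = true
```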
### 🫧 Improved
- You can now import older `.ggml` files directly into LlamaChat without conversion, thanks to some upstream changes made to llama.cpp. (#3)
- The chat view has been revamped with slicker message bubbles and animations. It also automatically scrolls to the bottom when new messages are added. (#18)
- The File menu has been improved: you can now add new sources with ⌘N.
- The Add Chat Source flow has been improved to make it (almost) pixel-perfect, and a dedicated Cancel button has been added to make it clearer how to exit (previously this could be done with Esc). (#7)
### 🐞 Bug Fixes
- Previously, when converting PyTorch checkpoints directly in LlamaChat, an intermediary converted (but un-quantized) `.ggml` artefact was left on the filesystem. This has now been fixed, and any of these artefacts left by previous versions of LlamaChat are automatically cleaned up on launch (sketched below). (#10)
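
For the curious, the fix boils down to a small filesystem sweep at launch. The sketch below is only an illustration of the idea: the function name, directory handling, and the "f16"-based filename check are assumptions, not the exact logic LlamaChat ships.

```swift
import Foundation

// A rough sketch of a launch-time cleanup pass for stale conversion
// artefacts. The directory layout and the "f16" naming convention are
// assumptions; LlamaChat's real file names may differ.
func cleanUpStaleConversionArtefacts(in directory: URL) throws {
  let fileManager = FileManager.default
  let contents = try fileManager.contentsOfDirectory(
    at: directory,
    includingPropertiesForKeys: nil
  )
  for file in contents where file.pathExtension == "bin" {
    // llama.cpp's conversion script typically writes an un-quantized
    // intermediate such as "ggml-model-f16.bin" before quantization.
    if file.lastPathComponent.contains("f16") {
      try fileManager.removeItem(at: file)
    }
  }
}
```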
### ❤️ Sponsorships
- Sponsorship positions have been opened to help support the continued development of LlamaChat. Any support is much appreciated, and more info can be found on the sponsorship page.