Try llama.cpp/ggml #1

maxbbraun · 2023-11-15T03:00:05Z

The main reason I chose karpathy/llama2.c over ggerganov/llama.cpp initially was that the former comes out of the box with very small (15M) models.

llama.cpp, and ggml more generally, is a more mature system with a number of optimizations, including 4-bit quantization. Seems worth a try! Might have to train a right-sized model from scratch, though.
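For a rough sense of what 4-bit quantization buys, here is a minimal sketch of block-wise symmetric quantization in plain Python. This is an illustration only and not ggml's actual storage format or API: the names `quantize_4bit` and `dequantize_4bit`, the block size of 32, and the sample weights are all assumptions made up for this example, and the 4-bit values are kept as ordinary Python ints rather than packed two per byte.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block size for this sketch


def quantize_4bit(
    weights: List[float], block_size: int = BLOCK_SIZE
) -> Tuple[List[int], List[float]]:
    """Map floats to signed 4-bit integers in [-8, 7], one scale per block."""
    if len(weights) % block_size != 0:
        raise ValueError("length must be a multiple of block_size")
    quants: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        max_abs = max(abs(w) for w in block)
        # Choose the scale so the largest magnitude in the block maps to 7.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the 4-bit signed range.
        quants.extend(max(-8, min(7, round(w / scale))) for w in block)
    return quants, scales


def dequantize_4bit(
    quants: List[int], scales: List[float], block_size: int = BLOCK_SIZE
) -> List[float]:
    """Recover approximate floats by multiplying each value by its block scale."""
    return [q * scales[i // block_size] for i, q in enumerate(quants)]


# Made-up sample weights in [-1, 1], two blocks of 32 values.
weights = [((i * 37) % 101 - 50) / 50.0 for i in range(64)]
quants, scales = quantize_4bit(weights)
restored = dequantize_4bit(quants, scales)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"blocks: {len(scales)}, max round-trip error: {max_error:.4f}")
```

Each weight costs 4 bits plus a share of its block's scale, and the round-trip error per weight is bounded by half the block's scale. That trade-off is the appeal for a very small model, though the real savings would depend on the format ggml actually uses.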

Potentially relevant examples:


maxbbraun added the enhancement New feature or request label Nov 15, 2023

maxbbraun changed the title ~~Try llama.cpp~~ Try llama.cpp/ggml Nov 15, 2023
