FastServe v0.0.1
Machine learning serving focused on GenAI and LLMs, with simplicity as the top priority.
Installation
git clone https://github.com/aniketmaurya/fastserve.git
cd fastserve
pip install .
Run locally
python -m fastserve
Usage/Examples
Serve Mistral-7B with Llama-cpp
from fastserve.models import ServeLlamaCpp
model_path = "openhermes-2-mistral-7b.Q5_K_M.gguf"
serve = ServeLlamaCpp(model_path=model_path)
serve.run_server()
Or run python -m fastserve.models --model llama-cpp --model_path openhermes-2-mistral-7b.Q5_K_M.gguf from the terminal.
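Once the server is running you can query it over HTTP. A minimal client sketch, assuming the app listens on localhost:8000 and exposes a POST endpoint accepting a JSON body with a prompt field; the exact route, port, and schema are assumptions, so verify them against the server's generated API docs:

```python
import json
from urllib import request

# Assumed endpoint and payload schema -- check the running server's docs page.
URL = "http://localhost:8000/endpoint"

def build_request(url: str, data: dict) -> request.Request:
    """Build a JSON POST request for the serving endpoint."""
    body = json.dumps(data).encode("utf-8")
    return request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

req = build_request(URL, {"prompt": "List three uses of FastAPI.", "max_tokens": 100})
# With the server running, send it and read the JSON response:
# response = request.urlopen(req)
# print(json.loads(response.read()))
```

The request is built separately from being sent, so you can inspect the payload before pointing it at a live server.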
Serve SDXL Turbo
from fastserve.models import ServeSDXLTurbo
serve = ServeSDXLTurbo(device="cuda", batch_size=2, timeout=1)
serve.run_server()
Or run python -m fastserve.models --model sdxl-turbo --batch_size 2 --timeout 1 from the terminal.
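The batch_size and timeout arguments suggest request batching: incoming requests are grouped until the batch is full or the timeout elapses, then processed in one forward pass. A minimal sketch of that accumulate-or-flush logic, for illustration only (this is not FastServe's actual implementation):

```python
import time
from queue import Queue, Empty

def collect_batch(q: Queue, batch_size: int, timeout: float) -> list:
    """Gather up to batch_size items from q, flushing early once timeout elapses."""
    batch = []
    deadline = time.monotonic() + timeout
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout hit: flush whatever we have
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break  # queue drained for the full remaining window
    return batch

q = Queue()
for prompt in ["a red fox", "a snowy peak"]:
    q.put(prompt)
batch = collect_batch(q, batch_size=2, timeout=1.0)
# batch -> ["a red fox", "a snowy peak"]
```

A larger batch_size improves GPU utilization under load, while the timeout bounds the latency a lone request can suffer while waiting for peers.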