FastServe v0.0.1
Machine learning serving focused on GenAI and LLMs, with simplicity as the top priority.
Installation
git clone https://github.com/aniketmaurya/fastserve.git
cd fastserve
pip install .
Run locally
python -m fastserve
Usage/Examples
Serve Mistral-7B with Llama-cpp
from fastserve.models import ServeLlamaCpp
model_path = "openhermes-2-mistral-7b.Q5_K_M.gguf"
serve = ServeLlamaCpp(model_path=model_path)
serve.run_server()
Or run python -m fastserve.models --model llama-cpp --model_path openhermes-2-mistral-7b.Q5_K_M.gguf from the terminal.
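Once the server is running you can query it over HTTP. A minimal client sketch, assuming the app listens on localhost:8000 and exposes a POST endpoint accepting a JSON body with a prompt field; the exact route, port, and schema are assumptions, so verify them against the server's generated API docs:

```python
import json
from urllib import request

# Assumed endpoint and payload schema -- check the running server's docs page.
URL = "http://localhost:8000/endpoint"

def build_request(url: str, data: dict) -> request.Request:
    """Build a JSON POST request for the serving endpoint."""
    body = json.dumps(data).encode("utf-8")
    return request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )

req = build_request(URL, {"prompt": "List three uses of FastAPI.", "max_tokens": 100})
# With the server running, send it and read the JSON response:
# response = request.urlopen(req)
# print(json.loads(response.read()))
```

The request is built separately from being sent, so you can inspect the payload before pointing it at a live server.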
Serve SDXL Turbo
from fastserve.models import ServeSDXLTurbo
serve = ServeSDXLTurbo(device="cuda", batch_size=2, timeout=1)
serve.run_server()
Or run python -m fastserve.models --model sdxl-turbo --batch_size 2 --timeout 1 from the terminal.
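The batch_size and timeout arguments suggest request batching: incoming requests are grouped until the batch is full or the timeout elapses, then processed in one forward pass. A minimal sketch of that accumulate-or-flush logic, for illustration only (this is not FastServe's actual implementation):

```python
import time
from queue import Queue, Empty

def collect_batch(q: Queue, batch_size: int, timeout: float) -> list:
    """Gather up to batch_size items from q, flushing early once timeout elapses."""
    batch = []
    deadline = time.monotonic() + timeout
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout hit: flush whatever we have
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break  # queue drained for the full remaining window
    return batch

q = Queue()
for prompt in ["a red fox", "a snowy peak"]:
    q.put(prompt)
batch = collect_batch(q, batch_size=2, timeout=1.0)
# batch -> ["a red fox", "a snowy peak"]
```

A larger batch_size improves GPU utilization under load, while the timeout bounds the latency a lone request can suffer while waiting for peers.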