Releases: aniketmaurya/fastserve-ai
v0.0.3
What's Changed
- Feat/refactor api by @aniketmaurya in #8
- deploy lightning by @aniketmaurya in #9
- add face recognition by @aniketmaurya in #10
- add image classification by @aniketmaurya in #11
- document youtube video by @aniketmaurya in #12
- [pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #13
- Serve UI by @aniketmaurya in #14
- Redirect /ui to web app by @aniketmaurya in #15
- add vLLM by @aniketmaurya in #21
New Contributors
- @pre-commit-ci made their first contribution in #13
Full Changelog: v0.0.2...v0.0.3
v0.0.2
What's Changed
- fix ci by @aniketmaurya in #2
- improved exception handling by @aniketmaurya in #4
- improve testing by @aniketmaurya in #5
- refactor WaitedObject by @aniketmaurya in #6
- Implement handler design by @aniketmaurya in #7
New Contributors
- @aniketmaurya made their first contribution in #2
Full Changelog: v0.0.1...v0.0.2
v0.0.1
FastServe
Machine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.
Installation
git clone https://github.com/aniketmaurya/fastserve.git
cd fastserve
pip install .
Run locally
python -m fastserve
Usage/Examples
Serve Mistral-7B with Llama-cpp
from fastserve.models import ServeLlamaCpp
model_path = "openhermes-2-mistral-7b.Q5_K_M.gguf"
serve = ServeLlamaCpp(model_path=model_path)
serve.run_server()
Or run python -m fastserve.models --model llama-cpp --model_path openhermes-2-mistral-7b.Q5_K_M.gguf
from the terminal.
Serve SDXL Turbo
from fastserve.models import ServeSDXLTurbo
serve = ServeSDXLTurbo(device="cuda", batch_size=2, timeout=1)
serve.run_server()
Or run python -m fastserve.models --model sdxl-turbo --batch_size 2 --timeout 1
from the terminal.
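The batch_size and timeout parameters suggest dynamic batching: requests are grouped until the batch is full or the timeout expires, whichever comes first. The sketch below illustrates that general idea only; it is not FastServe's actual implementation, and collect_batch and the queue setup are hypothetical names invented for this example.

```python
import time
from queue import Queue, Empty


def collect_batch(q, batch_size, timeout):
    """Collect up to batch_size items from the queue, waiting at most
    timeout seconds after the first item arrives (the dynamic-batching idea
    behind parameters like batch_size=2, timeout=1)."""
    batch = [q.get()]  # block until at least one request is available
    deadline = time.monotonic() + timeout
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout expired: serve a partial batch
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break
    return batch


# Simulate two requests arriving back to back: both fit in one batch.
q = Queue()
q.put("request-1")
q.put("request-2")
print(collect_batch(q, batch_size=2, timeout=1))  # ['request-1', 'request-2']
```

Batching this way trades a small amount of latency (bounded by timeout) for higher GPU throughput, since models like SDXL Turbo process a batch of prompts nearly as fast as a single one.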