v0.3.0
What's Changed
Improvements
- Add the ingestion of Astronomer's Cosmos documentation. #290
- Vastly improve the HTML chunking logic, content fetching logic for URLs, and the ingestion of Astronomer documentation from Astronomer's website. #293
- Vastly improve the Python and Markdown chunking logic, enforce a strict token limit per chunk globally, and improve multiple extraction and scraping processes. #307
- Upgrade the Weaviate schema to utilize the new OpenAI embedding model text-embedding-3-small. #297
- Implement preprocessing steps for request prompts to handle invalid cases and parse invalid symbols. #279
- Add a backend message and a frontend banner to notify users of service maintenance. #294
- Update system prompts to discourage discussions on unwanted topics and to caution users about potential hallucinations. #308
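
As an illustration of the strict per-chunk token limit mentioned above (#307), here is a minimal Python sketch. It is not the project's chunking code: the function name `split_by_token_limit` is hypothetical, and whitespace splitting stands in for a real model tokenizer, which would be passed in through the `tokenize` and `detokenize` parameters.

```python
from typing import Callable, List, Sequence


def split_by_token_limit(
    text: str,
    max_tokens: int,
    tokenize: Callable[[str], Sequence[str]] = str.split,
    detokenize: Callable[[Sequence[str]], str] = " ".join,
) -> List[str]:
    """Split text into chunks that each hold at most max_tokens tokens.

    Assumption: whitespace-separated words approximate tokens. Swap in a
    real tokenizer pair to enforce a model-specific limit.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    tokens = tokenize(text)
    # Walk the token list in fixed-size windows so no chunk exceeds the limit.
    return [
        detokenize(tokens[i : i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


chunks = split_by_token_limit("one two three four five", max_tokens=2)
```

A fixed-size window is the simplest way to guarantee the limit; a structure-aware splitter (by heading or function) would typically apply a cap like this as a final pass.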
Bug Fixes
- Resolve inconsistent output parsing and formatting in the MultiQueryRetriever prompt rewording. #276
- Fix typing errors in the Airflow documentation ingestion task introduced in the 0.2.0 release. #283
- Introduce a delay between web scraping requests to prevent rate limiting issues. #269
- Address issues with the rate limiter by adding sveltekit-rate-limiter with automated exponential backoff and retry. #280
- Fix invalid parameter parsing bugs and improve the semi-automated quality evaluation DAG. #303
- Fix incorrect hyperlink formatting in the website UI by amending the system prompt based on client type. #306
- Correct wrong default feedback scoring in Snowflake metrics ingestion DAG and track client type as an additional metric. #304
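
For readers unfamiliar with the exponential backoff and retry pattern referenced above (#280), the following is a generic Python sketch. It does not reproduce the sveltekit-rate-limiter API or the project's implementation; `retry_with_backoff` and all of its parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on any exception with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt (base, 2x, 4x, ...), capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the delay schedule can be checked without waiting. In practice, a small random jitter is often added to each delay so that many clients do not retry in lockstep.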
Misc
- Consolidate the Azure US East configuration and credentials. #288
- Modify the ingestion schedule interval to a weekly frequency. #284
Full Changelog: v0.2.0...v0.3.0