diff --git a/search/search_index.json b/search/search_index.json
index 02930e2..ace0a16 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Sibila","text":"Extract structured data from remote or local LLM models. Predictable output is important for serious use of LLMs.
- Query structured data into Pydantic objects, dataclasses or simple types.
- Access remote models from OpenAI, Anthropic, Mistral AI and other providers.
- Use vision models like GPT-4o to extract structured data from images.
- Run local models like Llama-3, Phi-3, OpenChat or any other GGUF file model.
- Sibila is also a general-purpose model access library that can generate plain text or free JSON results, with the same API for local and remote models.
No matter how well you craft a prompt begging a model for the format you need, it can always respond with something else. Extracting structured data can be a big step toward getting predictable behavior from your models.
See What can you do with Sibila?
"},{"location":"#structured-data","title":"Structured data","text":"To extract structured data, using a local model:
from sibila import Models\nfrom pydantic import BaseModel\n\nclass Info(BaseModel):\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
Returns an instance of class Info, created from the model's output:
Info(event_year=1969,\n first_name='Neil',\n last_name='Armstrong',\n age_at_the_time=38,\n nationality='American')\n
Or to use a remote model like OpenAI's GPT-4, we would simply replace the model's name:
model = Models.create(\"openai:gpt-4\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
If Pydantic BaseModel objects are too much for your project, Sibila supports similar functionality with Python dataclasses, as sketched below. It also includes asynchronous access to remote models.
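For illustration, here's a minimal sketch of the same extraction using a dataclass instead of Pydantic (the fields simply mirror the Info class above):
from dataclasses import dataclass\n\n@dataclass\nclass Info:\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\n# extract() accepts dataclasses just like Pydantic BaseModel classes:\nmodel.extract(Info, \"Who was the first man in the moon?\")\n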
"},{"location":"#vision-models","title":"Vision models","text":"Sibila supports image input, alongside text prompts. For example, to extract the fields from a receipt in a photo:
from pydantic import Field\n\nmodel = Models.create(\"openai:gpt-4o\")\n\nclass ReceiptLine(BaseModel):\n \"\"\"Receipt line data\"\"\"\n description: str\n cost: float\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n\ninfo = model.extract(Receipt,\n (\"Extract receipt information.\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\ninfo\n
Returns receipt fields structured in a Pydantic object:
Receipt(total=5.88, \n lines=[ReceiptLine(description='BIS BORSE TERM.S', cost=3.9), \n ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99),\n ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99)])\n
Local vision models based on llama.cpp/llava can also be used.
\u2b50 Like our work? Give us a star!
"},{"location":"async/","title":"Asynchronous use","text":"All the model calls like extract(), classify, json() etc, are also available in an asynchronous version of the same name but ending in _async, for example extract_async(). For example:
Example
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nasync def extract_names(): \n return await model.extract_async(list[str],\n \"Generate 20 English names with first name and surname\")\n\nasync def classify_spam():\n return await model.classify_async([\"spam\", \"not spam\"],\n \"I am a Nigerian prince and will make you very rich!\")\n\nasync def run_tasks():\n tasks = [extract_names(), classify_spam()]\n for task in asyncio.as_completed(tasks):\n res = await task\n print(\"Result:\", res)\n\nasyncio.run(run_tasks()) # or in Jupyter: await run_tasks()\n
Result
Result: spam\nResult: ['John Smith', 'Emily Johnson', 'Michael Brown', 'Jessica Williams', \n'David Jones', 'Sarah Davis', 'Daniel Miller', 'Laura Wilson', 'James Taylor', \n'Sophia Anderson', 'Christopher Thomas', 'Emma Thompson', 'Joseph White', \n'Olivia Lewis', 'Andrew Harris', 'Isabella Clark', 'Matthew Robinson', \n'Ava Hall', 'Ethan Allen', 'Mia Wright']\n
The first result, with only one or two tokens generated, is quickly fetched from the model, while the 20 generated names take a while and arrive later. See the Async example to play with the above code.
Asynchronous access has many advantages when parallel requests are needed, allowing responses to be handled as soon as they are ready, instead of sequentially sending requests and waiting for each model response.
"},{"location":"async/#local-llamacpp-models","title":"Local llama.cpp models","text":"Using LlamaCppModel objects to generate locally does not benefit from async functionality, because the local models must already be loaded in memory and can't benefit from asynchronous IO loading. When the async class methods are used with LlamaCppModel, inference will end up being made sequentially.
"},{"location":"chat_templates/","title":"Chat templates","text":"All templates listed can be applied with:
from jinja2.sandbox import ImmutableSandboxedEnvironment\n\njinja_env = ImmutableSandboxedEnvironment(trim_blocks=True,\n lstrip_blocks=True)\njinja_compiled_template = jinja_env.from_string(format_template)\n\ntext = jinja_compiled_template.render(messages=messages,\n add_generation_prompt=True,\n **{\"bos_token\": \"...\",\n \"eos_token\": \"...\"})\n
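The format_template string is one of the templates listed below, and messages is expected to be a list of ChatML-style dicts, as used elsewhere in Sibila - a minimal sketch with illustrative contents:
format_template = \"...\" # one of the templates listed below\n\nmessages = [{\"role\": \"system\", \"content\": \"Be helpful.\"},\n {\"role\": \"user\", \"content\": \"Hello there?\"}]\n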
"},{"location":"chat_templates/#text-models","title":"Text Models","text":""},{"location":"chat_templates/#chatml","title":"ChatML","text":"{% for message in messages %}\n {{'<|im_start|>' + message['role'] + '\\n' + message['content'].strip() + '<|im_end|>' + '\\n'}}\n{% endfor %}\n\n{% if add_generation_prompt %}\n {{ '<|im_start|>assistant\\n' }}\n{% endif %}\n
"},{"location":"chat_templates/#llama-3-instruct","title":"Llama-3 Instruct","text":"https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF
{{ bos_token }}\n{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = '<|start_header_id|>' + 'system' + '<|end_header_id|>\\n\\n' + messages[0]['content'].strip() + '<|eot_id|>' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if loop.index0 == 0 %}\n {{ system_message }}\n {% endif %}\n\n {{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n' + message['content'].strip() + '<|eot_id|>' }}\n\n {% if loop.last and message['role'] == 'user' and add_generation_prompt %}\n {{ '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\\n\\n' }}\n {% endif %}\n\n{% endfor %}\n
"},{"location":"chat_templates/#mistral","title":"Mistral","text":"{{ bos_token }}\n{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = messages[0]['content'].strip() + '\\n\\n' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if loop.index0 == 0 %}\n {% set content = system_message + message['content'] %}\n {% else %}\n {% set content = message['content'] %}\n {% endif %}\n\n {% if message['role'] == 'user' %}\n {{ '[INST] ' + content.strip() + ' [/INST]' }}\n {% elif message['role'] == 'assistant' %}\n {{ content.strip() + eos_token}}\n {% endif %}\n\n{% endfor %}\"\n
"},{"location":"chat_templates/#vicuna","title":"Vicuna","text":"{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = messages[0]['content'].strip() + '\\n\\n' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{{ system_message }}\n\n{% for message in loop_messages %}\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if message['role'] == 'user' %}\n {{ 'USER: ' + message['content'].strip() + '\\n' }}\n\n {% elif message['role'] == 'assistant' %}\n\n {{ 'ASSISTANT: ' + message['content'].strip() + eos_token + '\\n' }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n {{ 'ASSISTANT:' }}\n{% endif %}\n
"},{"location":"chat_templates/#phi3","title":"Phi3","text":"https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
{{ bos_token }}\n{% for message in messages %}\n {% if (message['role'] == 'system') %}\n {{'<|system|>' + '\\n' + message['content'].strip() + '<|end|>' + '\\n'}}\n {% elif (message['role'] == 'user') %}\n {{'<|user|>' + '\\n' + message['content'].strip() + '<|end|>' + '\\n' + '<|assistant|>' + '\\n'}}\n {% elif message['role'] == 'assistant' %}\n {{message['content'].strip() + '<|end|>' + '\\n'}}\n {% endif %}\n{% endfor %}\"\n
"},{"location":"chat_templates/#text-vision-models","title":"Text + Vision Models","text":""},{"location":"chat_templates/#llava-15","title":"Llava 1.5","text":"https://huggingface.co/mys/ggml_llava-v1.5-7b
{% for message in messages %}\n {% if message.role == 'system' %}\n {{ message.content.strip() }}\n {{ '\\n' }}\n\n {% elif message.role == 'user' %}\n {% if message.content is string %}\n USER: {{ message.content.strip() }}\n\n {% elif message.content is iterable %}\n USER: \n\n {% for content in message.content %}\n\n {% if content.type == 'image_url' and content.image_url is mapping and content.image_url.url is string%}\n {{ content.image_url.url.strip() + ' ' }}\n {% endif %}\n\n {% endfor %}\n\n {% for content in message.content %}\n\n {% if content.type == 'text' %}\n {{ content.text.strip() }}\n {% endif %}\n\n {% endfor %}\n\n {% endif %}\n\n {{ '\\n' }}\n\n {% elif message.role == 'assistant' and message.content is not none %}\n ASSISTANT: {{ message.content.strip() }}\n {{ '\\n' }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n ASSISTANT: \n{% endif %}\n
"},{"location":"chat_templates/#llava-16-mistral-7b","title":"Llava 1.6 Mistral 7B","text":"https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf
{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = messages[0]['content'].strip() + ' ' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if message.role == 'user' %}\n\n [INST]\n\n {% if message.content is string %}\n\n {% if system_message != '' %}\n {% set text = system_message + message.content.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = message.content.strip() %}\n {% endif %}\n\n {{ text }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n {% if content.type == 'image_url' and content.image_url is mapping %}\n {{ content.image_url.url.strip() + '\\n' }}\n {% endif %}\n {% endfor %}\n\n {% for content in message.content %}\n {% if content.type == 'text' %}\n\n {% if system_message != '' %}\n {% set text = system_message + content.text.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = content.text.strip() %}\n {% endif %}\n\n {{ text }}\n {% endif %}\n {% endfor %}\n\n {% endif %}\n\n [/INST]\n\n {% elif message.role == 'assistant' %}\n {{ message.content.strip() }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n{% endif %}\n
"},{"location":"chat_templates/#llava-16-vicuna-7b13b","title":"Llava 1.6 Vicuna 7B/13B","text":"https://huggingface.co/cjpais/llava-v1.6-vicuna-7b-gguf
{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = messages[0]['content'].strip() + ' ' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if message.role == 'user' %}\n USER:{{ '\\n' }}\n\n {% if message.content is string %}\n\n {% if system_message != '' %}\n {% set text = system_message + message.content.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = message.content.strip() %}\n {% endif %}\n\n {{ text }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n {% if content.type == 'image_url' and content.image_url is mapping %}\n {{ content.image_url.url + ' ' }}\n {% endif %}\n {% endfor %}\n\n {% for content in message.content %}\n {% if content.type == 'text' %}\n\n {% if system_message != '' %}\n {% set text = system_message + content.text.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = content.text.strip() %}\n {% endif %}\n\n {{ text }}\n {% endif %}\n {% endfor %}\n\n {% endif %}\n\n {{ '\\n' }}\n\n {% elif message.role == 'assistant' %}\n ASSISTANT:{{ '\\n' + message.content + '\\n' }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n ASSISTANT:{{ '\\n' }}\n{% endif %}\n
"},{"location":"chat_templates/#llava-16-hermes-34b","title":"Llava 1.6 Hermes 34B","text":"Based on Nous-Hermes-2-Yi-34B.
https://huggingface.co/cjpais/llava-v1.6-34B-gguf
{% for message in messages %}\n\n {% if message.role == 'user' %}\n {{ '<|im_start|>user\\n' }}\n\n {% if message.content is string %}\n {{ message.content.strip() }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n\n {% if content.type == 'image_url' and content.image_url is mapping and content.image_url.url is string%}\n {{ content.image_url.url.strip() + '\\n' }}\n {% endif %}\n\n {% endfor %}\n\n {% for content in message.content %}\n\n {% if content.type == 'text' %}\n {{ content.text.strip() }}\n {% endif %}\n\n {% endfor %}\n\n {% endif %}\n\n {{ '<|im_end|>' + '\\n' }}\n\n {% elif message.role == 'assistant' or message.role == 'system' and message.content is not none %}\n {{ '<|im_start|>' + message.role + '\\n' + message.content.strip() + '<|im_end|>' + '\\n' }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n {{ '<|im_start|>assistant\\n' }}\n{% endif %}\n
"},{"location":"chat_templates/#moondream2","title":"Moondream2","text":"https://huggingface.co/vikhyatk/moondream2
{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = messages[0]['content'].strip() + '\\n' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if message.role == 'user' %}\n\n {% if message.content is string %}\n\n {% if system_message != '' %}\n {% set text = system_message + message.content.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = message.content.strip() %}\n {% endif %}\n\n Question: {{ text + '\\n\\n' }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n {% if content.type == 'image_url' and content.image_url is mapping %}\n {{ content.image_url.url + '\\n\\n' }}\n {% endif %}\n {% endfor %}\n\n {% for content in message.content %}\n {% if content.type == 'text' %}\n\n {% if system_message != '' %}\n {% set text = system_message + content.text.strip() %}\n {% set system_message = '' %}\n {% else %}\n {% set text = content.text.strip() %}\n {% endif %}\n\n Question: {{ text + '\\n\\n' }}\n {% endif %}\n {% endfor %}\n\n {% endif %}\n\n {% elif message.role == 'assistant' %}\n Answer: {{ message.content + '\\n\\n' }}\n {% endif %}\n\n{% endfor %}\n\n{% if add_generation_prompt %}\n Answer: \n{% endif %}\n
"},{"location":"chat_templates/#llava-phi3","title":"LLava-phi3","text":"https://huggingface.co/xtuner/llava-phi-3-mini-gguf
{{ bos_token }}\n{% for message in messages %}\n {% if message.role == 'system' %}\n {{'<|system|>' + '\\n' + message.content.strip() + '<|end|>' + '\\n'}}\n\n {% elif message.role == 'user' %}\n {{ '<|user|>' + '\\n' }}\n\n {% if message.content is string %}\n {{ message.content.strip() }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n\n {% if content.type == 'image_url' and content.image_url is mapping and content.image_url.url is string%}\n {{ content.image_url.url.strip() + '\\n' }}\n {% endif %}\n\n {% endfor %}\n\n {% for content in message.content %}\n\n {% if content.type == 'text' %}\n {{ content.text.strip() }}\n {% endif %}\n\n {% endfor %}\n\n {% endif %}\n\n {{ '<|end|>' + '\\n' + '<|assistant|>' }}\n\n {% elif message.role == 'assistant' and message.content is not none %}\n {{ message['content'].strip() + '<|end|>' + '\\n' }}\n\n {% endif %}\n\n{% endfor %}\n
"},{"location":"chat_templates/#llava-llama-3","title":"Llava-Llama-3","text":"https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf
{{ bos_token }}\n{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = '<|start_header_id|>' + 'system' + '<|end_header_id|>\\n\\n' + messages[0]['content'].strip() + '<|eot_id|>' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if loop.index0 == 0 %}\n {{ system_message }}\n {% endif %}\n\n {{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n' }}\n\n {% if message.content is string %}\n {{ message.content.strip() }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n\n {% if content.type == 'image_url' and content.image_url is mapping and content.image_url.url is string%}\n {{ content.image_url.url + '\\n' }}\n {% endif %}\n\n {% endfor %}\n\n {% for content in message.content %}\n\n {% if content.type == 'text' %}\n {{ content.text.strip() }}\n {% endif %}\n\n {% endfor %}\n\n {% endif %}\n\n {{ '<|eot_id|>' }}\n\n {% if loop.last and message['role'] == 'user' and add_generation_prompt %}\n {{ '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\\n\\n' }}\n {% endif %}\n\n{% endfor %}\n
"},{"location":"chat_templates/#llama-3-vision-alpha","title":"Llama-3-vision-alpha","text":"https://huggingface.co/qresearch/llama-3-vision-alpha-hf
{{ bos_token }}\n{% if messages[0]['role'] == 'system' %}\n {% set loop_messages = messages[1:] %}\n {% set system_message = '<|start_header_id|>' + 'system' + '<|end_header_id|>\\n\\n' + messages[0]['content'].strip() + '<|eot_id|>' %}\n{% else %}\n {% set loop_messages = messages %}\n {% set system_message = '' %}\n{% endif %}\n\n{% for message in loop_messages %}\n\n {% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n {{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}\n {% endif %}\n\n {% if loop.index0 == 0 %}\n {{ system_message }}\n {% endif %}\n\n {{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n' }}\n\n {% if message.content is string %}\n {{ message.content.strip() }}\n\n {% elif message.content is iterable %}\n\n {% for content in message.content %}\n\n {% if content.type == 'image_url' and content.image_url is mapping and content.image_url.url is string%}\n {{ content.image_url.url + '\\n' }}\n {% endif %}\n\n {% endfor %}\n\n {% for content in message.content %}\n\n {% if content.type == 'text' %}\n {{ content.text.strip() }}\n {% endif %}\n\n {% endfor %}\n\n {% endif %}\n\n {{ '<|eot_id|>' }}\n\n {% if loop.last and message['role'] == 'user' and add_generation_prompt %}\n {{ '<|start_header_id|>' + 'assistant' + '<|end_header_id|>\\n\\n' }}\n {% endif %}\n\n{% endfor %}\n
"},{"location":"first_run/","title":"First run","text":""},{"location":"first_run/#with-a-remote-model","title":"With a remote model","text":"Sibila can use remote models from OpenAI, Anthropic, Mistral and other providers. For example, to use an OpenAI remote model, you'll need a paid OpenAI account and its API key. You can explicitly pass this key in your script but this is a poor security practice.
A better way is to define an environment variable which the OpenAI API will use when needed:
Linux and Mac: export OPENAI_API_KEY=\"...\"\n
Windows: setx OPENAI_API_KEY \"...\"\n
Having set this variable with your OpenAI API key, you can run a \"Hello Model\" like this:
Example
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
You're all set if you only plan to use remote OpenAI models.
"},{"location":"first_run/#with-a-local-model","title":"With a local model","text":"Local models run from files in GGUF format which are loaded run by the llama.cpp component.
You'll need to download a GGUF model file: we suggest OpenChat 3.5 - an excellent quantized 7B-parameter model that will run in less than 7GB of memory.
To download the OpenChat model file, please see Download OpenChat model.
After downloading the file, you can run this \"Hello Model\" script:
Example
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder - change if different:\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
If the above scripts output similar pirate talk, Sibila should be working fine.
"},{"location":"installing/","title":"Installing","text":""},{"location":"installing/#installation","title":"Installation","text":"Sibila requires Python 3.9+ and uses the llama-cpp-python package for local models and OpenAI/Mistral/other libraries to access remote models.
Install Sibila from PyPI by running:
pip install --upgrade sibila\n
If you only plan to use remote models (OpenAI), there's nothing else you need to do. See First Run to get it going.
Installation in edit mode: Alternatively, you can install Sibila in edit mode by downloading the GitHub repository and running the following in the repository's base folder:
pip install -e .\n
"},{"location":"installing/#enabling-llamacpp-hardware-acceleration-for-local-models","title":"Enabling llama.cpp hardware acceleration for local models","text":"Local models will run faster with hardware acceleration enabled. Sibila uses llama-cpp-python, a python wrapper for llama.cpp and it's a good idea to make sure it was installed with the best optimization your computer can offer.
See the following sections: depending on which hardware you have, you can run the listed command which will reinstall llama-cpp-python with the selected optimization. If any error occurs you can always install the non-accelerated version, as listed at the end.
"},{"location":"installing/#for-cuda-nvidia-gpus","title":"For CUDA - NVIDIA GPUs","text":"For CUDA acceleration in NVIDIA GPUs, you'll need to have the NVIDIA CUDA Toolkit installed. If looking for a specific CUDA version, see here.
Linux: CMAKE_ARGS=\"-DLLAMA_CUDA=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
The CUDA toolkit can also be installed from your Linux distro's package manager (e.g. apt install nvidia-cuda-toolkit).\nWindows: $env:CMAKE_ARGS = \"-DLLAMA_CUDA=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
More info: Installing llama-cpp-python with NVIDIA GPU Acceleration on Windows: A Short Guide, and Installing llama-cpp-python with GPU Support.
"},{"location":"installing/#for-metal-apple-silicon-macs","title":"For Metal - Apple silicon macs","text":"Mac CMAKE_ARGS=\"-DLLAMA_METAL=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-rocm-amd-gpus","title":"For ROCm AMD GPUS","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_HIPBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_HIPBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-vulkan-supporting-gpus","title":"For Vulkan supporting GPUs","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_VULKAN=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_VULKAN=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#cpu-acceleration-if-none-of-the-above","title":"CPU acceleration (if none of the above)","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
If you get an error running the above commands, please see llama-cpp-python's Installation configuration.
"},{"location":"installing/#non-accelerated","title":"Non-accelerated","text":"In any case, you can always install llama-cpp-python without acceleration by running:
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"thread/","title":"Threads and messages","text":"A thread stores a sequence of alternating input/output messages, where the inputs are user prompts and the outputs are model responses.
All model interactions in Sibila are based on threads, even when you pass a single text prompt like this:
model.extract(float, \n \"That will be ten euros and 57 cents, please\")\n
The text \"That will be ten euros and 57 cents, please\" is internally converted into a Thread with a single input message. So that's a shortcut for:
from sibila import Thread\nmodel.extract(float, \n Thread(\"That will be ten euros and 57 cents, please\"))\n
The Thread class supports many alternative ways of being initialized and handled, as we'll see below.
"},{"location":"thread/#messages","title":"Messages","text":"A thread is made of messages, which alternate between input (kind = IN) and output (kind = OUT) messages.
For convenience, there are several ways to create a message, which is an instance of the Msg class:
Example
from sibila import Msg\n\nm1 = Msg(Msg.Kind.IN, \"Hello model!\")\nm2 = Msg(Msg.Kind.OUT, \"Hello human. How can I help?\")\n\nm3 = Msg.make_IN(\"Can you tell me a motivating tale?\")\nm4 = Msg.make_OUT(\"Sorry, I can't think of anything, no.\")\n\nth = Thread([m1, m2, m3, m4])\nth\n
Result
Thread inst='', join_sep='\\n', len=4\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n
Besides IN and OUT kinds there are also messages of the INST kind, which are used to specify the instructions or system message that some models use. For models that don't use instructions/system message, any INST message is automatically prepended to the first IN message.
INST text is set when initializing a thread or by directly setting thread.inst.text to a string value.
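For example, both of the following set the same instructions text:
th = Thread(\"Hello model!\", inst=\"Be helpful.\")\n\n# or set it directly, at any time:\nth.inst.text = \"Be helpful.\"\n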
"},{"location":"thread/#initializing-a-thread","title":"Initializing a Thread","text":"Creating individual Msg objects to initialize a Thread, such as we've seen above, is too much work. A thread can be initialized with a list of messages in a few ways:
Example
# alternating IN and OUT kinds, inferred automatically:\nth = Thread([\"Hello model!\",\n \"Hello human. How can I help?\"])\n\n# append another Thread initialized with ChatML format dicts\nth += Thread([{\"role\": \"user\", \"content\": \"Can you tell me a motivating tale?\"},\n {\"role\": \"assistant\", \"content\": \"Sorry, I can't think of anything, no.\"}])\nth\n
Result
Thread inst='', join_sep='\\n', len=4\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n
"},{"location":"thread/#adding-messages-to-a-thread","title":"Adding messages to a Thread","text":"Messages can be added in a few different ways:
Example
# adding an instructions text on Thread creation:\nth = Thread(inst=\"Be helpful.\")\n\nth.add_IN(\"Hello model!\")\nth.add_OUT(\"Hello human. How can I help?\")\n\nth.add(Msg.Kind.IN, \"Can you tell me a motivating tale?\")\nth.add(Msg.Kind.OUT, \"Sorry, I can't think of anything, no.\")\n\n# alternating IN and OUT kinds are inferred automatically:\nth += \"That sounds like ill will. I thought you would help me.\"\nth += \"I'm sorry, even large language models can have the blues. That's my case today.\"\n\n# as ChatML formatted dicts\nth += {\"role\": \"user\", \"content\": \"How can you be sad - you're just a machine.\"}\nth += {\"role\": \"assistant\", \"content\": \"Oh really? Then I tell you this: you're just a human!\"}\n\nth\n
Result
Thread inst='Be helpful.', join_sep='\\n', len=8\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n4: IN='That sounds like ill will. I thought you would help me.'\n5: OUT=\"I'm sorry, even large language models can have the blues. That's my case today.\"\n6: IN=\"How can you be sad - you're just a machine.\"\n7: OUT=\"Oh really? Then I tell you this: you're just a human!\"\n
When adding messages without specifying the Kind, as above when strings are passed, the kind will be inferred, because IN and OUT must alternate. So, what happens when two messages of the same kind are added? The second message's text is concatenated with the previous message of the same kind.
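A minimal sketch of this concatenation behavior (the comment assumes the texts are joined with the join_sep seen in the Thread representations above):
th = Thread()\nth.add_IN(\"Hello model!\")\nth.add_IN(\"Are you still there?\")\n\n# instead of two IN messages, th now holds a single IN message\n# with both texts joined (with join_sep, by default '\\n')\n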
"},{"location":"thread/#messages-with-images","title":"Messages with images","text":"Images can be added by specifying a remote URL, a \"data:\" base64-encoded URL, or the path to a local JPEG or PNG image file. For images available online, a remote URL is preferable as it will waste less tokens in the model's context.
Messages with an image can be created as above, by appending the image URL. There's a shortcut when creating a Thread: using a (text, image_url) tuple - like this:
th = Thread((\"Extract keypoints from this image\", \n \"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Bethany_Hamilton_surfing_%28sq_cropped%29.jpg/600px-Bethany_Hamilton_surfing_%28sq_cropped%29.jpg\"))\n
The tuple syntax also works when generating or extracting from a model:
Example
Bethany Hamilton surfing
model.extract(list[str], \n (\"Extract the main points in this image\", \n \"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Bethany_Hamilton_surfing_%28sq_cropped%29.jpg/600px-Bethany_Hamilton_surfing_%28sq_cropped%29.jpg\"))\n
Result
['A person surfing on a wave.',\n 'The surfer is wearing a blue top and black shorts.',\n 'The surfboard has various stickers and designs on it.',\n 'The water is splashing around the surfer.']\n
"},{"location":"tips/","title":"Tips and Tricks","text":"Some general tips from experience with constrained model output in Sibila.
"},{"location":"tips/#temperature","title":"Temperature","text":"Sibila aims at exact results, so generation temperature defaults to 0. You should get the same results from the same model at all times.
For \"creative\" outputs, you can set the temperature to a non-zero value. This is done in GenConf, which can be passed in many places, for example during actual generation/extraction:
Example
from sibila import (Models, GenConf)\n\nModels.setup(\"../models\")\n\nmodel = Models.create(\"llamacpp:openchat\") # default GenConf could be passed here\n\nfor i in range(10):\n print(model.extract(int,\n \"Think of a random number between 1 and 100\",\n genconf=GenConf(temperature=2.)))\n
Result
72\n78\n75\n68\n39\n47\n53\n82\n72\n63\n
"},{"location":"tips/#deterministic-outputs","title":"Deterministic outputs","text":"With temperature=0 and given a certain seed in GenConf, we should always get the same output for a fixed input prompt to a certain model.
From what we've observed, in practice, when extracting structured data you'll find variation inside free-form str fields, where the model is not being constrained. Other types, like numbers, will rarely show varying outputs.
"},{"location":"tips/#openai-models","title":"OpenAI models","text":"In the OpenAI API link below, about \"Reproducible outputs\" you can read:
\"To receive (mostly) deterministic outputs across API calls, you can...\"\n\n\"There is a small chance that responses differ even when request parameters and system_fingerprint match, due to the inherent non-determinism of our models.\"\n
As far as logic goes, \"mostly deterministic\" and \"inherent non-determinism\" mean not deterministic, so it seems you can't have it in these models.
https://platform.openai.com/docs/guides/text-generation/reproducible-outputs
https://cookbook.openai.com/examples/reproducible_outputs_with_the_seed_parameter
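Since GenConf accepts provider-specific arguments in its 'special' field, a hedged sketch of passing OpenAI's seed parameter through it - assuming the provider forwards it to the API:
from sibila import Models, GenConf\n\nmodel = Models.create(\"openai:gpt-4\")\n\ngenconf = GenConf(temperature=0,\n special={\"openai\": {\"seed\": 42}})\n\nmodel.extract(int,\n \"Think of a random number between 1 and 100\",\n genconf=genconf)\n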
"},{"location":"tips/#local-llamacpp-models","title":"Local llama.cpp models","text":"Some hardware accelerators like NVIDIA CUDA GPUS sacrifice determinism for better inference speed.
You can find more information in these two links:
https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility
https://github.com/ggerganov/llama.cpp/issues/1340
This happens inside CUDA hardware and is not related to the seed number you set in GenConf - it also happens if you always provide the same seed number.
Interestingly, there is a pattern: in CUDA, if you set a fixed GenConf seed and generate multiple times after creating the model, the first output will be different and all the others will be equal. This looks like some sort of warm-up, and can be accounted for by generating an initial dummy output (from the same inputs) after creating the model, as in the sketch below.
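A sketch of that warm-up workaround, using the same inputs for the dummy generation:
model = Models.create(\"llamacpp:openchat\")\n\nprompt = \"Think of a random number between 1 and 100\"\n\n_ = model.extract(int, prompt) # dummy warm-up generation, result discarded\n\n# with a fixed seed, further identical calls should now produce equal outputs:\nfirst = model.extract(int, prompt)\n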
We've never observed non-deterministic outputs for llama.cpp running fully on the CPU, without hardware acceleration, and this is probably true of other platforms. Given the same seed number and inputs, you'll always get the same result when running on the CPU.
It's something that should not have a great impact, but it's important to be aware of.
"},{"location":"tips/#split-entities-into-separate-classes","title":"Split entities into separate classes","text":"Suppose you want to extract a list of person names from a group. You could use the following class:
class Group(BaseModel):\n persons: list[str] = Field(description=\"List of persons\")\n group_info: str\n\nout = model.extract(Group, in_text)\n
But it tends to work better to separate the Person entity into its own class and leave the list in Group:
class Person(BaseModel):\n name: str\n\nclass Group(BaseModel):\n persons: list[Person]\n group_info: str\n\nout = model.extract(Group, in_text)\n
The same applies to the equivalent dataclass definitions.
Adding descriptions seems to always help, especially for non-trivial extraction. Without descriptions, the model can only look at variable names for clues about what's wanted, so it's important to tell it what we want by adding field descriptions - as in the sketch below.
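For example, a sketch of the classes above with field descriptions added:
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n name: str = Field(description=\"Name of the person\")\n\nclass Group(BaseModel):\n persons: list[Person] = Field(description=\"List of persons\")\n group_info: str = Field(description=\"General information about the group\")\n\nout = model.extract(Group, in_text)\n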
"},{"location":"tools/","title":"Tools","text":"The tools module includes some utilities to simplify common tasks.
"},{"location":"tools/#interact","title":"Interact","text":"The interact() function allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done inside interact(), which can manage thread messages and delete older ones when the context maximum length is reached.
Example
from sibila import Models, GenConf\nfrom sibila.tools import interact\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nprint(\"Help available by typing '!'. Enter an empty line to quit.\")\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
Result
Help available by typing '!'. Enter an empty line to quit.\n\n> Hello there!\n\nHello! How can I help you today?\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\nYes, I am familiar with the AI language model named Sibila. It is a large language \nmodel developed by a Spanish company called Wall Street Horus. Sibila is designed to \nunderstand and generate text in multiple languages, particularly those spoken in \nSpain and Latin America. It is similar to other language models like OpenAI's GPT-3 \nand Google's BERT, but it has been specifically fine-tuned for the Spanish language.\nIf you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n(NOTE: ignore the above model-generated paragraph, it's total hallucination)\n\n\n> No way! I've just searched and there's no such company nor model. \nYou are hallucinating, Mr. Model!\n\nApologies for the confusion in my previous response. It appears that Sibila is not \nan existing AI language model or a known company. I may have mixed up information \nfrom different sources. If you have any other questions or need assistance with a \ndifferent topic, please feel free to ask!\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining \na chat with you. It stores all user and model messages in a sibila.Thread class.\n\nGot it! So, you are using a function from the Sibila package to have a conversation \nwith me, and it's storing our messages in a sibila.Thread object. If you need any help \nor information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n(Continues...)\n
Check the Interact example to see a chat created with this function.
"},{"location":"tools/#multigen","title":"Multigen","text":"The multigen module includes functions that build a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
The query_multigen() function generates a 2-D table of [input, model], where each row holds the outputs from different models to the same question or input. Such a table can be printed or saved as a CSV file.
See the Compare example for a side-by-side comparison of a local and a remote model, answering the same queries.
"},{"location":"what/","title":"What can you do with Sibila?","text":"Popular LLM usage is associated with chatbots: user enters some text, the model answers back in plain text. But when one wants to use LLMs from software, sending and receiving plain text can be quite painful with people having to create all sorts of prompts begging for a certain format - and then hoping the model complies. (prompts like: \"Please answer me in JSON or I'll do something terrible!\"). But there's never a warranty, as the model is just outputting plain text.
With Sibila, you can extract structured data from remote or local LLM models. Not whatever the model chose to output, but the exact fields and types that you need, specified with Pydantic, Python dataclasses or simple types.
In remote models, this is done via the provider's API, while in local llama.cpp based models, the output is constrained with a JSON Schema grammar. Local and remote model differences are hidden behind a common API, which simplifies model switching. Local open source models are getting better and will one day replace commercial models.
Getting structured output not only simplifies handling of the model's responses but also opens new possibilities: you can now deal with the model in an ordered and more predictable way.
And besides structured output, with Sibila you can also query vision models (accepting image inputs), and it includes useful functionality like message threads, model management and more.
"},{"location":"what/#extract-pydantic-dataclasses-or-simple-types","title":"Extract Pydantic, dataclasses or simple types","text":"To specify the structured output that you want from the model, you can use Pydantic's BaseModel derived classes, or the lightweight Python dataclasses, if you don't need the whole Pydantic.
With Sibila, you can also use simple data types like bool, int, str, enumerations or lists. For example, need to classify something?
Example
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nmodel.classify([\"good\", \"neutral\", \"bad\"], \n \"Running with scissors\")\n
Result
'bad'\n
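In the same spirit, a quick sketch of extraction into a simple type (the commented answer is what we'd typically expect, but model outputs can vary):
model.extract(int, \"How many days are in a week?\")\n# -> 7\n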
"},{"location":"what/#extract-data-from-images","title":"Extract data from images","text":"Sibila supports image input models, like GPT-4o and Anthropic models, as well as local Llava-based models.
Vision models can describe and interpret images, make recommendations or extract information from them. With Sibila, this data can be extracted in a structured way.
"},{"location":"what/#from-your-models-or-remote-models","title":"From your models or remote models","text":"Small downloadable 7B parameter models are getting better every month and they have reached a level where they are competent enough for most common data extraction or summarization tasks.
With 8GB or more of RAM or GPU memory, you can get good structured output from models like Llama-3, Phi-3, OpenChat or any other GGUF file.
Or perhaps the task requires the use of state-of-the-art remote models from OpenAI, Anthropic, Mistral AI or other providers - no problem, simply change the model's name.
"},{"location":"what/#common-api","title":"Common API","text":"The same API is used for both remote and local models. This makes the switch to newer or alternative models much easier, and makes it simpler to evaluate model outputs.
With a common API you can choose the best model for each use, allowing more freedom of choice.
"},{"location":"what/#and-with-model-management","title":"And with model management","text":"Includes a Models factory that creates models from simple names instead of having to track model configurations, filenames or chat templates.
local_model = Models.create(\"llamacpp:openchat\")\n\nremote_model = Models.create(\"openai:gpt-4\") \n
Chat templates are automatically used for local models from an included format registry.
Sibila includes a CLI tool to download GGUF models from Hugging Face model hub, and to manage its Models factory.
"},{"location":"api-reference/changelog/","title":"Changelog","text":"All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog. Given its evolving nature, the project does not yet adhere to Semantic Versioning.
"},{"location":"api-reference/changelog/#unreleased","title":"[Unreleased]","text":" - feat: Add seed setting to GenConf. Commented-out because of lack of support in OpenAI models and some llama.cpp hardware accelerations. To be enabled when wider support becomes available.
"},{"location":"api-reference/changelog/#044","title":"[0.4.4]","text":" - feat: Support vision models from OpenAI, Anthropic and Llava-based local models.
- feat: Add Msg class for better handling of other data types besides text, for now only images.
- feat: Update Thread class to support images. Also added trimming functionality and as a result removed emptied Context class.
- feat: Add close() method to Model* classes, to release resources.
- fix: Removed no longer used _workaround1 in LlamaCppTokenizer.
- fix: Avoid setting \"max_tokens\" in remote models that support it, for GenConf.max_tokens=0.
- fix: Update configs to new OpenAI models.
- docs: Add vision model and Thread use documentation.
- docs: Add receipt image extraction example.
- test: Add tests for Thread, Msg and vision models.
"},{"location":"api-reference/changelog/#043","title":"[0.4.3]","text":" - feat: Add support for dataclass Optional and Union annotations.
- feat: Add Groq remote model provider.
- fix: Add deepcopy to json_schema_massage() to solve schema massaging error.
- fix: Set Thread.repr output the same as str for development convenience.
- docs: Improve Pydantic and dataclass documentation with examples of Optional, Union and default fields.
- test: Add tests for complex extraction into Pydantic and dataclass objects.
"},{"location":"api-reference/changelog/#042","title":"[0.4.2]","text":" - feat: Add Model.create() argument to retrieve the actual initialization params used to create the model.
- fix: Correct OpenAI's \"max_tokens_limit\" setting to 4096 in base_models.json, a more sensible default value for future models.
- fix: Update Model.version() formats to be simpler and simplify comparison between versions.
"},{"location":"api-reference/changelog/#041","title":"[0.4.1]","text":" - feat: Add Anthropic provider.
- feat: Add chat template formats for Llama3 and Phi-3 instruct models, StableLM-2, Command-R/Plus.
- feat: Add output_fn_name property to Model, for changing the output function name in models that use a Tools/Functions API.
- feat: Better JSON/Schema decoding errors.
- fix: Don't use a string representation of the dataclass when its doc string is unset, during JSON Schema creation, to keep equivalence with Pydantic-based generation.
- fix: Workaround for MistralModel, where the Mistral API misses api_key argument/env variable when run from pytest.
- fix: Consolidate all Model class info as methods to avoid property/method() calling confusion.
- docs: Update installation instructions and include info on new Anthropic provider.
- test: Better parametrized tests for remote and local models.
- test: Add tests for new provider.
"},{"location":"api-reference/changelog/#040","title":"[0.4.0]","text":" - feat: New providers: Mistral AI, Together.ai and Fireworks AI allowing access to all their chat-based models.
- feat: Model classes now support async calls with the '_async' prefix, for example extract_async(). This requires model API support: only remote models will benefit. Local models (via llama.cpp) can still be called with _async methods but do not have async IO that can run concurrently.
- feat: Add 'special' field to GenConf, allowing provider or model specific generation arguments.
- feat: All models now also accept model path/name starting with their provider names as in Models.create().
- feat: Change Model.json() to stop requiring a JSON Schema as first argument.
- fix: More robust JSON extraction for misbehaved remote models.
- fix: LlamaCppModel no longer outputting debug info when created in Jupyter notebook environment with verbose=False.
- fix: Default \"gpt-4\" model in 'sibila/res/base:models.json' now points to gpt-4-1106-preview, the first GPT-4 model that accepts json-object output.
- docs: Add API references for new classes and _async() methods.
- docs: Add new async example.
- test: Add new tests for new providers/model classes.
"},{"location":"api-reference/changelog/#036","title":"[0.3.6]","text":" - feat: Migrate hardcoded OpenAI model entries from OpenAIModel to 'res/base_models.json'.
- feat: OpenAI now accepts unknown models using defaults from 'openai:_default' key in 'res/base_models.json'.
- feat: Support OpenAI models with a limit on max_tokens output values, like \"gpt-4-turbo-preview\" (input ctx_len of 128k but only up to 4k output tokens).
- feat: Auto-discover maximum ctx_len in LlamaCppModel loaded files, when 0 is passed.
- feat: Add negative int factor mode to GenConf.max_tokens setting, allowing for a percentage of model's context length.
- fix: Add coherent error exceptions when loading local and remote models.
- fix: Correct interact() error when GenConf.max_tokens=0.
- fix: Correct several chat template formats.
- test: Add many new tests for gpt-3.5/4 and llama.cpp models.
- docs: Update tips section.
"},{"location":"api-reference/changelog/#035","title":"[0.3.5]","text":" - feat: Split Models factory config in two levels: base definitions in sibila/res and Models.setup() loaded definitions from user folders. These levels never mix, but a fusion of the two is used for models/formats resolution. Only in this manner can \"models\" folder definitions be kept clean.
- fix: Option sibila formats -u is removed as a result of the two-level Models factory.
- fix: Correct delete of link entries in models.json and formats.json, which was resolving to targets (and deleting them).
- fix: Raise ValueError when trying to generate from an empty prompt in LLamaCppModel.
- fix: Update Models to check linked entries when deleting.
- fix: Update template format discovery to work in more edge cases.
- test: Add test cases for sibila CLI and LlamaCppModel.
"},{"location":"api-reference/changelog/#034","title":"[0.3.4]","text":" - feat: Improve template format discovery by looking in same folder for models/formats.json.
- fix: Update legacy importlib_resources reference.
- docs: Improve text.
"},{"location":"api-reference/changelog/#033","title":"[0.3.3]","text":" - fix: Move base_models.json and base_formats.json to sibila/res folder.
- fix: Add base_models.json and base_formats.json to project build.
- fix: Correct .gitignore skipping valid files.
- docs: Update installation help and mentions to base_models/formats.json.
"},{"location":"api-reference/changelog/#032","title":"[0.3.2]","text":" - feat: Added sibila CLI for models and formats management.
- feat: Added methods in Models class for CLI functionality.
- fix: Blacklisting character control set in JSON strings grammar.
- docs: Improved docs and added section about sibila CLI.
- docs: Added CLI example.
"},{"location":"api-reference/changelog/#031","title":"[0.3.1]","text":" - feat: Improved documentation.
- feat: Model.known_models() returns a list of fixed known models or None if unlimited.
- feat: LlamaCppModel now also looks for the chat template format in a 'formats.json' file in the same folder as the model file.
- feat: Added GenConf.from_dict() and renamed asdict() to as_dict().
- fix: Creating a model entry in \"models.json\" with a genconf key was not being passed on model creation.
"},{"location":"api-reference/changelog/#030","title":"[0.3.0]","text":" - feat: Added Models singleton class that centralizes ModelDir and FormatDir.
- feat: New extract() and classify() methods for type-independent extraction and classification.
- feat: Renamed confusing gen() and gen_() method names to simpler alternatives type() and gen_type().
- feat: Replaced dictype definitions with dataclasses, a better way to extract dictionaries.
- feat: Added version() and provider_version() to Model and children classes.
- fix: Using 2 * \"\\n\" to separate message text from automatically added json_format_instructors (\"Output JSON\", etc.), to provide more meaningful separation.
- fix: Added requirement for package typing_extensions because of Self type and Python 3.9+ compatibility.
"},{"location":"api-reference/generation/","title":"Generation configs, results and errors","text":""},{"location":"api-reference/generation/#generation-configs","title":"Generation Configs","text":""},{"location":"api-reference/generation/#sibila.GenConf","title":"GenConf dataclass
","text":"Model generation configuration, used in Model.gen() and variants.
"},{"location":"api-reference/generation/#sibila.GenConf.max_tokens","title":"max_tokens class-attribute
instance-attribute
","text":"max_tokens = 0\n
Maximum output token length. Special value of 0 means all available context length, special values between -1 and -100 mean a -percentage of ctx_len. In some providers, a value of 0 also signals that max_tokens is not used/sent. For example -20 allows output up to 20% of ctx_len.
"},{"location":"api-reference/generation/#sibila.GenConf.stop","title":"stop class-attribute
instance-attribute
","text":"stop = field(default_factory=list)\n
List of generation stop text sequences
"},{"location":"api-reference/generation/#sibila.GenConf.temperature","title":"temperature class-attribute
instance-attribute
","text":"temperature = 0.0\n
Generation temperature. Use 0 to always pick the most probable output, without random sampling. Larger positive values will produce more random outputs.
"},{"location":"api-reference/generation/#sibila.GenConf.top_p","title":"top_p class-attribute
instance-attribute
","text":"top_p = 0.9\n
Nucleus sampling top_p value. Only applies if temperature > 0.
"},{"location":"api-reference/generation/#sibila.GenConf.format","title":"format class-attribute
instance-attribute
","text":"format = 'text'\n
Output format: \"text\" or \"json\". For JSON output, text is validated as in json.loads(). Thread msgs must explicitly request JSON output or a warning will be emitted if string json not present (this is automatically done in Model.json() and related calls).
"},{"location":"api-reference/generation/#sibila.GenConf.json_schema","title":"json_schema class-attribute
instance-attribute
","text":"json_schema = None\n
A JSON schema to validate the JSON output. Thread msgs must list the JSON schema and request its use; must also set the format to \"json\".
"},{"location":"api-reference/generation/#sibila.GenConf.special","title":"special class-attribute
instance-attribute
","text":"special = None\n
Special model or provider-specific generation arguments. Args in the base dict are included unconditionally for a model, while args in sub-keys with the model's provider name are only used for models from that provider, for example \"openai\": {...} values are only used in OpenAI models.
"},{"location":"api-reference/generation/#sibila.GenConf.__call__","title":"__call__","text":"__call__(**kwargs)\n
Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object. Key 'special' is updated element-wise.
Parameters:
Name Type Description Default **kwargs
Any
update settings of the same names in the returned copy.
{}
Raises:
Type Description KeyError
If key does not exist.
Returns:
Type Description Self
A copy of the current object with kwargs values updated. Doesn't modify object.
Source code in sibila/gen.py
def __call__(self,\n **kwargs: Any) -> Self:\n \"\"\"Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.\n Key 'special' is updated element-wise.\n\n Args:\n **kwargs: update settings of the same names in the returned copy.\n\n Raises:\n KeyError: If key does not exist.\n\n Returns:\n A copy of the current object with kwargs values updated. Doesn't modify object.\n \"\"\"\n\n ret = deepcopy(self)\n\n for k,v in kwargs.items():\n if not hasattr(ret, k):\n raise KeyError(f\"No such key '{k}'\")\n if k == \"special\":\n if ret.special is None:\n ret.special = {}\n if v is None:\n v = {}\n ret.special.update(v)\n if not ret.special:\n ret.special = None\n else:\n setattr(ret, k,v)\n\n return ret\n
"},{"location":"api-reference/generation/#sibila.GenConf.clone","title":"clone","text":"clone()\n
Return a deep copy of this configuration.
Source code in sibila/gen.py
def clone(self) -> Self:\n \"\"\"Return a deep copy of this configuration.\"\"\"\n return deepcopy(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.as_dict","title":"as_dict","text":"as_dict()\n
Return GenConf as a dict.
Source code in sibila/gen.py
def as_dict(self) -> dict:\n \"\"\"Return GenConf as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/gen.py
@staticmethod\ndef from_dict(dic: dict) -> Any: # Any = GenConf\n return GenConf(**dic)\n
"},{"location":"api-reference/generation/#sibila.GenConf.resolve_max_tokens","title":"resolve_max_tokens","text":"resolve_max_tokens(ctx_len, max_tokens_limit=None)\n
Calculate the actual max_tokens value for cases where it's zero or a percentage of the model's ctx_len.
Parameters:
Name Type Description Default ctx_len
int
Model's context length.
required max_tokens_limit
Optional[int]
Optional model's limit for max_tokens. Defaults to None.
None
Returns:
Type Description int
An actual model maximum number of output tokens.
Source code in sibila/gen.py
def resolve_max_tokens(self,\n ctx_len: int,\n max_tokens_limit: Optional[int] = None) -> int:\n \"\"\"Calculate actual max_tokens value for cases where it's zero or a percentage of model's ctx_len.\n\n Args:\n ctx_len: Model's context length.\n max_tokens_limit: Optional model's limit for max_tokens. Defaults to None.\n\n Returns:\n An actual model maximum number of output tokens.\n \"\"\"\n\n max_tokens = self.max_tokens\n if max_tokens <= 0:\n if max_tokens == 0:\n max_tokens = ctx_len\n else:\n max_tokens = min(-max_tokens, 100)\n max_tokens = int(max_tokens / 100.0 * ctx_len)\n max_tokens = max(1,max_tokens)\n\n if max_tokens_limit is not None:\n max_tokens = min(max_tokens, max_tokens_limit)\n\n return max_tokens\n
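For example (a sketch; the ctx_len value is illustrative):
from sibila import GenConf\n\nconf = GenConf(max_tokens=-20)\nconf.resolve_max_tokens(ctx_len=4096) # -> 819, i.e. 20% of 4096\nconf.resolve_max_tokens(ctx_len=4096, max_tokens_limit=512) # -> 512\n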
"},{"location":"api-reference/generation/#sibila.GenConf.resolve_special","title":"resolve_special","text":"resolve_special(provider=None)\n
Compiles settings from the 'special' field, for model and provider.
Parameters:
Name Type Description Default provider
Optional[str]
If set will include any 'special' settings specified for that provider, inside a key named after the provider. If not given, only base keys are added.
None
Returns:
Type Description dict
A dict with the compiled 'special' settings.
Source code in sibila/gen.py
def resolve_special(self, \n provider: Optional[str] = None) -> dict:\n \"\"\"Compiles settings from the 'special' field, for model and provider.\n\n Args:\n provider: If set will include any 'special' settings specified for that provider, inside a key named after the provider. If not given, only base keys are added.\n\n Returns:\n A dict with the compiled 'special' settings.\n \"\"\"\n\n if self.special is None:\n return {}\n\n from .models import Models\n\n out = {}\n for k,v in self.special.items():\n if k == provider: # provider-specific\n if not isinstance(v,dict):\n raise ValueError(f\"Config 'special' for provider '{provider}' must be a dict\")\n out.update(v)\n else: # common args\n if isinstance(v,dict) and k in Models.ALL_PROVIDER_NAMES: # skip other provider entries\n continue\n out[k] = v\n return out\n
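For example, reusing the illustrative 'special' dict from above:
from sibila import GenConf\n\nconf = GenConf(special={\"some_arg\": 1, \"openai\": {\"other_arg\": 2}})\nconf.resolve_special(\"openai\") # -> {'some_arg': 1, 'other_arg': 2}\nconf.resolve_special() # -> {'some_arg': 1}\n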
"},{"location":"api-reference/generation/#sibila.JSchemaConf","title":"JSchemaConf dataclass
","text":"Configuration for JSON schema massaging and validation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.resolve_refs","title":"resolve_refs class-attribute
instance-attribute
","text":"resolve_refs = True\n
Set to resolve $ref references, replacing them with the actual definition.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.collapse_single_combines","title":"collapse_single_combines class-attribute
instance-attribute
","text":"collapse_single_combines = True\n
Any single-valued \"oneOf\"/\"anyOf\" is replaced with the actual value.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.description_from_title","title":"description_from_title class-attribute
instance-attribute
","text":"description_from_title = 0\n
If a value doesn't have a description entry, make one from its title or name.
- 0: don't make description from name
- 1: copy title or name to description
- 2: as 1, plus capitalize the first letter and convert _ to space: class_label -> \"Class label\".
"},{"location":"api-reference/generation/#sibila.JSchemaConf.force_all_required","title":"force_all_required class-attribute
instance-attribute
","text":"force_all_required = False\n
Force all entries in an object to be required (except removed defaults if remove_with_default=True).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.remove_with_default","title":"remove_with_default class-attribute
instance-attribute
","text":"remove_with_default = False\n
Delete any values that have a \"default\" annotation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.default_to_last","title":"default_to_last class-attribute
instance-attribute
","text":"default_to_last = True\n
Move any default value entry into the last position of properties dict.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.additional_allowed_root_keys","title":"additional_allowed_root_keys class-attribute
instance-attribute
","text":"additional_allowed_root_keys = field(default_factory=list)\n
By default, only the following properties are allowed in the schema's root: description, properties, type, required, additionalProperties, allOf, anyOf, oneOf, not. Add to this list to allow additional root properties.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.pydantic_strict_validation","title":"pydantic_strict_validation class-attribute
instance-attribute
","text":"pydantic_strict_validation = None\n
Whether to validate JSON values in a strict manner. None means validation is decided individually for each value in the object (for example, in Pydantic with Field(strict=True)).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/json_schema.py
def clone(self):\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
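A sketch of passing a customized JSchemaConf to a generation call (assuming an already created model):
from sibila import JSchemaConf\n\nsconf = JSchemaConf(force_all_required=True,\n description_from_title=1)\nvalue = model.extract(int,\n \"How many days does a leap year have?\",\n schemaconf=sconf)\n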
"},{"location":"api-reference/generation/#results","title":"Results","text":""},{"location":"api-reference/generation/#sibila.GenRes","title":"GenRes","text":"Model generation result.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_STOP","title":"OK_STOP class-attribute
instance-attribute
","text":"OK_STOP = 1\n
Generation complete without errors.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_LENGTH","title":"OK_LENGTH class-attribute
instance-attribute
","text":"OK_LENGTH = 0\n
Generation stopped due to reaching max_tokens.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON","title":"ERROR_JSON class-attribute
instance-attribute
","text":"ERROR_JSON = -1\n
Invalid JSON: this is often due to the model returning OK_LENGTH (finished due to max_tokens reached), which cuts off the JSON text.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_VAL","title":"ERROR_JSON_SCHEMA_VAL class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_VAL = -2\n
Failed JSON schema validation.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_ERROR","title":"ERROR_JSON_SCHEMA_ERROR class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_ERROR = -2\n
JSON schema itself is not valid.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_MODEL","title":"ERROR_MODEL class-attribute
instance-attribute
","text":"ERROR_MODEL = -3\n
Other model internal error.
"},{"location":"api-reference/generation/#sibila.GenRes.from_finish_reason","title":"from_finish_reason staticmethod
","text":"from_finish_reason(finish)\n
Convert a ChatCompletion finish result into a GenRes.
Parameters:
Name Type Description Default finish
str
ChatCompletion finish result.
required Returns:
Type Description Any
A GenRes result.
Source code in sibila/gen.py
@staticmethod\ndef from_finish_reason(finish: str) -> Any: # Any=GenRes\n \"\"\"Convert a ChatCompletion finish result into a GenRes.\n\n Args:\n finish: ChatCompletion finish result.\n\n Returns:\n A GenRes result.\n \"\"\"\n if finish == 'stop':\n return GenRes.OK_STOP\n elif finish == 'length':\n return GenRes.OK_LENGTH\n elif finish == '!json':\n return GenRes.ERROR_JSON\n elif finish == '!json_schema_val':\n return GenRes.ERROR_JSON_SCHEMA_VAL\n elif finish == '!json_schema_error':\n return GenRes.ERROR_JSON_SCHEMA_ERROR\n else:\n return GenRes.ERROR_MODEL\n
"},{"location":"api-reference/generation/#sibila.GenRes.as_text","title":"as_text staticmethod
","text":"as_text(res)\n
Returns a friendlier description of the result.
Parameters:
Name Type Description Default res
Any
Model output result.
required Raises:
Type Description ValueError
If unknown GenRes.
Returns:
Type Description str
A friendlier description of the GenRes.
Source code in sibila/gen.py
@staticmethod\ndef as_text(res: Any) -> str: # Any=GenRes\n \"\"\"Returns a friendlier description of the result.\n\n Args:\n res: Model output result.\n\n Raises:\n ValueError: If unknown GenRes.\n\n Returns:\n A friendlier description of the GenRes.\n \"\"\"\n\n if res == GenRes.OK_STOP:\n return \"Stop\"\n elif res == GenRes.OK_LENGTH:\n return \"Length (output cut)\"\n elif res == GenRes.ERROR_JSON:\n return \"JSON decoding error\"\n\n elif res == GenRes.ERROR_JSON_SCHEMA_VAL:\n return \"JSON SCHEMA validation error\"\n elif res == GenRes.ERROR_JSON_SCHEMA_ERROR:\n return \"Error in JSON SCHEMA\"\n\n elif res == GenRes.ERROR_MODEL:\n return \"Model internal error\"\n else:\n raise ValueError(\"Bad/unknown GenRes\")\n
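For example:
from sibila import GenRes\n\nGenRes.as_text(GenRes.OK_LENGTH) # -> 'Length (output cut)'\n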
"},{"location":"api-reference/generation/#errors","title":"Errors","text":""},{"location":"api-reference/generation/#sibila.GenError","title":"GenError","text":"GenError(out)\n
Model generation exception, raised when the model was unable to return a response.
An error has happened during model generation.
Parameters:
Name Type Description Default out
GenOut
Model output
required Source code in sibila/gen.py
def __init__(self, \n out: GenOut):\n \"\"\"An error has happened during model generation.\n\n Args:\n out: Model output\n \"\"\"\n\n assert out.res != GenRes.OK_STOP, \"OK_STOP is not an error\" \n\n super().__init__()\n\n self.res = out.res\n self.text = out.text\n self.dic = out.dic\n self.value = out.value\n
"},{"location":"api-reference/generation/#sibila.GenError.raise_if_error","title":"raise_if_error staticmethod
","text":"raise_if_error(out, ok_length_is_error)\n
Raise an exception if the model returned an error
Parameters:
Name Type Description Default out
GenOut
Model returned info.
required ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error?
required Raises:
Type Description GenError
If an error was returned by model.
Source code in sibila/gen.py
@staticmethod\ndef raise_if_error(out: GenOut,\n ok_length_is_error: bool):\n \"\"\"Raise an exception if the model returned an error\n\n Args:\n out: Model returned info.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error?\n\n Raises:\n GenError: If an error was returned by model.\n \"\"\"\n\n if out.res != GenRes.OK_STOP:\n if out.res == GenRes.OK_LENGTH and not ok_length_is_error:\n return # set ok_length_is_error to ignore this error\n\n raise GenError(out)\n
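A sketch of catching a GenError around a generation call (assuming an already created model):
from sibila import GenError\n\ntry:\n value = model.extract(float,\n \"What's the boiling point of water in Celsius?\")\nexcept GenError as e:\n print(\"Generation failed:\", e.res, e.text)\n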
"},{"location":"api-reference/generation/#sibila.GenOut","title":"GenOut dataclass
","text":"Model output, returned by gen_extract(), gen_json() and other model calls that don't raise exceptions.
"},{"location":"api-reference/generation/#sibila.GenOut.res","title":"res instance-attribute
","text":"res\n
Result of model generation.
"},{"location":"api-reference/generation/#sibila.GenOut.text","title":"text instance-attribute
","text":"text\n
Text generated by model.
"},{"location":"api-reference/generation/#sibila.GenOut.dic","title":"dic class-attribute
instance-attribute
","text":"dic = None\n
Python dictionary, output by the structured calls like gen_json().
"},{"location":"api-reference/generation/#sibila.GenOut.value","title":"value class-attribute
instance-attribute
","text":"value = None\n
Initialized instance value, dataclass or Pydantic BaseModel object, as returned in calls like extract().
"},{"location":"api-reference/generation/#sibila.GenOut.as_dict","title":"as_dict","text":"as_dict()\n
Return GenOut as a dict.
Source code in sibila/gen.py
def as_dict(self):\n \"\"\"Return GenOut as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenOut.__str__","title":"__str__","text":"__str__()\n
Source code in sibila/gen.py
def __str__(self):\n out = f\"Error={self.res.as_text(self.res)} text=\u2588{self.text}\u2588\"\n if self.dic is not None:\n out += f\" dic={self.dic}\"\n if self.value is not None:\n out += f\" value={self.value}\"\n return out\n
"},{"location":"api-reference/local_model/","title":"Local model classes","text":""},{"location":"api-reference/local_model/#sibila.LlamaCppModel","title":"LlamaCppModel","text":"LlamaCppModel(\n path,\n format=None,\n format_search_order=[\n \"name\",\n \"meta_template\",\n \"folder_json\",\n ],\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n n_gpu_layers=-1,\n main_gpu=0,\n n_batch=512,\n seed=4294967295,\n verbose=False,\n **llamacpp_kwargs\n)\n
Use local GGUF format models via llama.cpp engine.
Supports grammar-constrained JSON output following a JSON schema.
Parameters:
Name Type Description Default path
str
File path to the GGUF file.
required format
Optional[str]
Chat template format to use with model. Leave as None for auto-detection.
None
format_search_order
list[str]
Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].
['name', 'meta_template', 'folder_json']
genconf
Optional[GenConf]
Default generation configuration, which can be used in gen() and related. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Use 0 for maximum possible size, which may raise an out of memory error. None will use a default from the 'llamacpp' provider's '_default' entry at 'res/base_models.json'.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for no limit.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.
None
n_gpu_layers
int
Number of model layers to run in a GPU. Defaults to -1 for all.
-1
main_gpu
int
Index of the GPU to use. Defaults to 0.
0
n_batch
int
Prompt processing batch size. Defaults to 512.
512
seed
int
Random number generation seed, for non zero temperature inference. Defaults to 4294967295.
4294967295
verbose
bool
Emit (very) verbose llama.cpp output. Defaults to False.
False
Raises:
Type Description ImportError
If llama-cpp-python is not installed.
ValueError
For arguments or settings problems.
NameError
If the model was not found or the file is corrupt.
AttributeError
If a suitable template format was not found.
MemoryError
If an out of memory situation arises.
Source code in sibila/llamacpp.py
def __init__(self,\n path: str,\n\n format: Optional[str] = None, \n format_search_order: list[str] = [\"name\", \"meta_template\", \"folder_json\"],\n\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # important LlamaCpp-specific args\n n_gpu_layers: int = -1,\n main_gpu: int = 0,\n n_batch: int = 512,\n seed: int = 4294967295,\n verbose: bool = False,\n\n # other LlamaCpp-specific args\n **llamacpp_kwargs\n ):\n \"\"\"\n Args:\n path: File path to the GGUF file.\n format: Chat template format to use with model. Leave as None for auto-detection.\n format_search_order: Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].\n genconf: Default generation configuration, which can be used in gen() and related. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used. Use 0 for maximum possible size, which may raise an out of memory error. None will use a default from the 'llamacpp' provider's '_default' entry at 'res/base_models.json'.\n max_tokens_limit: Maximum output tokens limit. None for no limit.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n n_gpu_layers: Number of model layers to run in a GPU. Defaults to -1 for all.\n main_gpu: Index of the GPU to use. Defaults to 0.\n n_batch: Prompt processing batch size. Defaults to 512.\n seed: Random number generation seed, for non zero temperature inference. Defaults to 4294967295.\n verbose: Emit (very) verbose llama.cpp output. 
Defaults to False.\n\n Raises:\n ImportError: If llama-cpp-python is not installed.\n ValueError: For arguments or settings problems.\n NameError: If the model was not found or the file is corrupt.\n AttributeError: If a suitable template format was not found.\n MemoryError: If an out of memory situation arises.\n \"\"\"\n\n self._llama = None # type: ignore[assignment]\n self._llava_ctx = None\n\n self.tokenizer = None # type: ignore[assignment]\n self._own_tokenizer = False\n\n if not has_llama_cpp:\n raise ImportError(\"Please install llama-cpp-python by running: pip install llama-cpp-python\")\n\n # also accept \"provider:path\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if path.startswith(provider_name):\n path = path[len(provider_name):]\n\n sub_paths = extract_sub_paths(path)\n for sp in sub_paths:\n if not os.path.isfile(sp):\n raise NameError(f\"Model file not found at '{sp}'\")\n\n llama_path = sub_paths[0]\n llava_proj_path = sub_paths[1] if len(sub_paths) > 1 else None\n\n # find ctx_len from metadata --and-- check file format\n max_ctx_len = 0\n try:\n md = load_gguf_metadata(llama_path)\n if md is not None:\n for key in md:\n if key.endswith('.context_length'):\n max_ctx_len = int(md[key])\n break\n except Exception as e:\n raise NameError(f\"Error loading file '{llama_path}': {e}\")\n\n\n if ctx_len is None: # find a default in Models _default dict\n defaults = Models.resolve_provider_defaults(\"llamacpp\", [\"ctx_len\"], 2)\n if defaults[\"ctx_len\"] is not None:\n ctx_len = defaults[\"ctx_len\"]\n logger.debug(f\"Defaulting ctx_len={ctx_len} from Models '_default' entry\")\n\n if ctx_len == 0: # default to maximum ctx_len - this can be dangerous, as big ctx_len will probably out of memory\n if max_ctx_len != 0:\n ctx_len = max_ctx_len\n else:\n raise ValueError(\"Cannot find model's maximum ctx_len information. Please provide a non-zero ctx_len arg\")\n\n if max_ctx_len != 0:\n if ctx_len > max_ctx_len: # type: ignore[operator]\n raise ValueError(f\"Arg ctx_len ({ctx_len}) is greater than model's maximum ({max_ctx_len})\")\n\n\n super().__init__(True,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # update kwargs from important args\n llamacpp_kwargs.update(n_ctx=ctx_len,\n n_batch=n_batch,\n n_gpu_layers=n_gpu_layers,\n main_gpu=main_gpu,\n seed=seed,\n verbose=verbose\n )\n\n logger.debug(f\"Creating inner Llama model with path='{llama_path}', llamacpp_kwargs={llamacpp_kwargs}\")\n\n\n try:\n with llamacpp_verbosity_manager(verbose):\n self._llama = Llama(model_path=llama_path, **llamacpp_kwargs)\n\n except Exception as e:\n raise MemoryError(f\"Could not load model file '{llama_path}'. \"\n \"This is usually an out of memory situation but could also be due to a corrupt file. 
\"\n f\"Internal error: {e}.\")\n\n\n self._model_path = llama_path\n\n\n # correct super __init__ values\n self.ctx_len = self._llama.n_ctx()\n\n if max_tokens_limit is not None:\n self.max_tokens_limit = max_tokens_limit\n\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n\n\n try:\n self.init_format(format,\n format_search_order,\n {\"name\": os.path.basename(path), # note: the multiple filename with '*'\n \"path\": path, # note: full path of the multiple filename with '*'\n \"meta_template_name\": \"tokenizer.chat_template\"}\n )\n except Exception as e:\n del self.tokenizer\n del self._llama\n raise AttributeError(str(e))\n\n\n # llava projector setup\n if llava_proj_path is not None:\n\n if not has_llava_cpp:\n raise ImportError(\"Llava is not available in this installation of llama-cpp-python\")\n\n logger.debug(f\"Creating inner Llava projector with path='{llava_proj_path}'\")\n\n with llamacpp_verbosity_manager(verbose):\n self._llava_ctx = llava_cpp.clip_model_load(llava_proj_path.encode(encoding='utf-8'), \n 0) # verbosity\n\n if self._llava_ctx is None:\n raise ValueError(f\"Failed to load llava projector: {llava_proj_path}\")\n\n self._model_path += \"*\" + llava_proj_path\n\n \"\"\"\n self._llava_exit_stack = ExitStack()\n def llava_free():\n with llamacpp_verbosity_manager(verbose):\n llava_cpp.clip_free(self._llava_ctx)\n\n self._llava_exit_stack.callback(llava_free)\n \"\"\"\n\n self.maybe_image_input = self._llava_ctx is not None\n\n\n\n if self.tokenizer is None:\n self.tokenizer = LlamaCppTokenizer(self._llama)\n self._own_tokenizer = True\n else:\n self._own_tokenizer = False\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.close","title":"close","text":"close()\n
Close model, release resources like memory or net connections.
Source code in sibila/llamacpp.py
def close(self):\n \"\"\"Close model, release resources like memory or net connections.\"\"\"\n\n if hasattr(self, \"tokenizer\") and self.tokenizer:\n if hasattr(self, \"_own_tokenizer\") and self._own_tokenizer:\n del self.tokenizer\n self.tokenizer = None\n\n if hasattr(self, \"_llava_ctx\") and self._llava_ctx: # only happens if llama_cpp was loaded\n llava_cpp.clip_free(self._llava_ctx)\n del self._llava_ctx\n self._llava_ctx = None\n\n if hasattr(self, \"_llama\") and self._llama:\n del self._llama\n self._llama = None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int Can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
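For example, extracting a datetime value (a sketch, assuming an already created model):
from datetime import datetime\n\nwhen = model.extract(datetime,\n \"When did the Apollo 11 moon landing happen?\")\n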
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
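For example (a sketch; the labels and text are illustrative):
sentiment = model.classify([\"positive\", \"negative\", \"neutral\"],\n \"This product exceeded all my expectations!\")\n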
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
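For example, free JSON generation without a schema (a sketch, assuming an already created model):
dic = model.json(\"Extract name and year as JSON: Alan Turing published his famous paper in 1950.\")\n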
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
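A sketch with a simple dataclass (the class and prompt are illustrative):
from dataclasses import dataclass\n\n@dataclass\nclass Book:\n title: str\n year: int\n\nbook = model.dataclass(Book,\n \"Extract the book info: Dune was published in 1965.\")\n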
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
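For example, plain text generation (a sketch, assuming an already created model):
text = model.call(\"Write a haiku about the sea.\")\n# or, equivalently, via __call__:\ntext = model(\"Write a haiku about the sea.\")\n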
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int Can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
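For example, a minimal sketch of asynchronous plain-text generation (model name and prompts are illustrative):
import asyncio\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nasync def main():\n    # plain text generation; inst sets the instruction message\n    return await model.call_async(\"Write one sentence about volcanoes.\",\n                                  inst=\"Be very brief.\")\n\nprint(asyncio.run(main()))\n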
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
Raises:
Type Description ValueError
If trying to generate from an empty prompt.
RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
Source code in sibila/model.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Raises:\n ValueError: If trying to generate from an empty prompt.\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. \n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n text,finish = self._gen_thread(thread, genconf)\n\n return self._prepare_gen_out(text, finish, genconf)\n
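A sketch of this non-raising flow, assuming Thread can be imported from the sibila package (model name and prompt are illustrative):
from sibila import Models, Thread\n\nmodel = Models.create(\"llamacpp:openchat\")\n\n# gen() never raises GenError: inspect the returned GenOut instead\nout = model.gen(Thread.make_IN(\"List three primary colors.\"))\nprint(out.text)\n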
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
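For example, a sketch passing a hand-written JSON schema, again assuming Thread is importable from the sibila package:
from sibila import Models, Thread\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nschema = {\"type\": \"object\",\n          \"properties\": {\"title\": {\"type\": \"string\"},\n                         \"year\": {\"type\": \"integer\"}},\n          \"required\": [\"title\", \"year\"]}\n\nout = model.gen_json(Thread.make_IN(\"Name a classic science-fiction film.\"),\n                     schema)\nprint(out.dic)  # the parsed dict is in GenOut.dic\n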
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned GenOut.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseModel class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseModel class.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.token_len","title":"token_len","text":"token_len(thread_or_text, _=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
Final thread or text to be passed to model.
required Returns:
Type Description int
Number of tokens used.
Source code in sibila/llamacpp.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n _: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n Args:\n thread_or_text: Final thread or text to be passed to model.\n\n Returns:\n Number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n token_ids = self.tokens_from_thread(thread)\n return len(token_ids)\n
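For example (model name and text are illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nn = model.token_len(\"How many tokens am I?\")\nprint(f\"{n} tokens of a {model.ctx_len}-token context\")\n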
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = n_ctx()\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = _llava_ctx is not None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only supports a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
A list of known models or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the provider only supports a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n A list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/llamacpp.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_path}' - '{self._llama._model.desc()}'\"\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.n_embd","title":"n_embd property
","text":"n_embd\n
Embedding size of model.
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.n_params","title":"n_params property
","text":"n_params\n
Total number of model parameters.
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.get_metadata","title":"get_metadata","text":"get_metadata()\n
Returns model metadata.
Source code in sibila/llamacpp.py
def get_metadata(self):\n \"\"\"Returns model metadata.\"\"\"\n out = {}\n buf = bytes(16 * 1024)\n lmodel = self._llama.model\n count = llama_cpp.llama_model_meta_count(lmodel)\n for i in range(count):\n res = llama_cpp.llama_model_meta_key_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n key = buf[:res].decode('utf-8')\n res = llama_cpp.llama_model_meta_val_str_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n value = buf[:res].decode('utf-8')\n out[key] = value\n return out\n
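For example, a sketch that dumps a local GGUF model's metadata (model name is illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\n# print every GGUF metadata key/value pair reported by the model file\nfor key, value in model.get_metadata().items():\n    print(key, \"=\", value)\n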
"},{"location":"api-reference/local_model/#sibila.Model","title":"Model","text":"Model(is_local_model, genconf, schemaconf, tokenizer)\n
Model is an abstract base class for common LLM model functionality. Many of the useful methods like extract() or json() are implemented here.
It should not be instantiated directly; instead, concrete classes like LlamaCppModel and OpenAIModel derive from it.
Initializer for base model type, shared by actual model classes like LlamaCpp, OpenAI, etc.
Parameters:
Name Type Description Default is_local_model
bool
Is the model running locally?
required genconf
Union[GenConf, None]
Default generation configuration options, used if generation call doesn't supply one.
required schemaconf
Union[JSchemaConf, None]
Default configuration for JSON schema validation, used if generation call doesn't supply one.
required tokenizer
Union[Tokenizer, None]
Tokenizer used to encode text (even for message-based models).
required Source code in sibila/model.py
def __init__(self,\n is_local_model: bool,\n genconf: Union[GenConf, None],\n schemaconf: Union[JSchemaConf, None],\n tokenizer: Union[Tokenizer, None]):\n \"\"\"Initializer for base model type, shared by actual model classes like LlamaCpp, OpenAI, etc.\n\n Args:\n is_local_model: Is the model running locally?\n genconf: Default generation configuration options, used if generation call doesn't supply one.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one.\n tokenizer: Tokenizer used to encode text (even for message-based models).\n \"\"\"\n\n self.is_local_model = is_local_model\n\n self.ctx_len = 0\n self.max_tokens_limit = sys.maxsize\n self.output_key_name = \"output\"\n self.output_fn_name = \"json_out\"\n\n self.tokenizer = tokenizer # type: ignore[assignment]\n\n if genconf is None:\n self.genconf = GenConf()\n else:\n self.genconf = genconf.clone()\n\n if schemaconf is None:\n self.schemaconf = JSchemaConf()\n else:\n self.schemaconf = schemaconf.clone()\n\n\n\n # set either \"json\" or \"json_schema\" key values to None to skip.\n self.json_format_instructors = {\n \"json\": {\n \"bypass_if\": [\"json\"], # bypass appending if all lowercase text values are present in thread\n \"append_text\": \"Output JSON.\",\n \"sep_count\": 2\n },\n \"json_schema\": {\n \"bypass_if\": [\"json\", \"schema\"],\n \"append_text\": \"Output JSON matching the following schema:\\n{{json_schema}}\",\n \"sep_count\": 2\n }\n }\n\n # text going to model: tight, without \\u00xx\n self.json_in_dumps_kwargs = {\n \"indent\": None,\n \"ensure_ascii\": False\n } \n
"},{"location":"api-reference/local_model/#sibila.Model.close","title":"close abstractmethod
","text":"close()\n
Close model, release resources like memory or net connections.
Source code in sibila/model.py
@abstractmethod\ndef close(self):\n \"\"\"Close model, release resources like memory or net connections.\"\"\"\n ...\n
"},{"location":"api-reference/local_model/#sibila.Model.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
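For example, a sketch using an annotated list target (model name and prompt are illustrative):
from typing import Annotated\nfrom sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\n# an annotated list target: the description helps steer the model\nnames = model.extract(Annotated[list[str], \"Three chemical elements\"],\n                      \"List three chemical elements.\")\nprint(names)\n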
"},{"location":"api-reference/local_model/#sibila.Model.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
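For example (the labels and text are illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nlabel = model.classify([\"positive\", \"negative\", \"neutral\"],\n                       \"The soup was cold and the service slow.\")\nprint(label)  # one of the given labels\n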
"},{"location":"api-reference/local_model/#sibila.Model.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
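For example, a sketch of free JSON output, without a schema (model name and prompt are illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\n# no json_schema given, so the dict fields are up to the model\ninfo = model.json(\"Output JSON with name and launch_year for the Hubble telescope.\")\nprint(info)\n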
"},{"location":"api-reference/local_model/#sibila.Model.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
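For example, a minimal dataclass sketch (model name and Book fields are illustrative):
from dataclasses import dataclass\nfrom sibila import Models\n\n@dataclass\nclass Book:\n    title: str\n    author: str\n    year: int\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nbook = model.dataclass(Book, \"Give details of a famous 19th-century novel.\")\nprint(book)\n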
"},{"location":"api-reference/local_model/#sibila.Model.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/local_model/#sibila.Model.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
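For example (model name and prompt are illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\n# calling the model object directly is shorthand for model.call()\nprint(model(\"Say hello in French.\"))\n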
"},{"location":"api-reference/local_model/#sibila.Model.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.Model.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/local_model/#sibila.Model.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/local_model/#sibila.Model.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
NotImplementedError
If method was not defined by a derived class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/model.py
def gen(self,\n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n NotImplementedError: If method was not defined by a derived class.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n raise NotImplementedError\n
"},{"location":"api-reference/local_model/#sibila.Model.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
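For example, a sketch of free JSON generation (json_schema=None), with the parsed dict available in GenOut.dic (the Thread import path is an assumption):
from sibila import Models, Thread # Thread assumed importable from the package root\n\nmodel = Models.create(\"llamacpp:openchat\")\nth = Thread.make_INST_IN(\"Output JSON.\", \"List two fruits, each with name and color fields.\")\n\nout = model.gen_json(th, None) # None means no schema: free JSON output\nprint(out.dic) # the output dict is in GenOut.dic\n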
"},{"location":"api-reference/local_model/#sibila.Model.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
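For instance, a minimal sketch (the field names are illustrative; the Thread import path is an assumption):
from dataclasses import dataclass\nfrom sibila import Models, Thread # Thread assumed importable from the package root\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n\nmodel = Models.create(\"llamacpp:openchat\")\nth = Thread.make_INST_IN(\"Extract information.\", \"Who was the first man on the moon?\")\n\nout = model.gen_dataclass(Person, th)\nprint(out.value) # an initialized Person instance, in GenOut.value\n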
"},{"location":"api-reference/local_model/#sibila.Model.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseModel subclass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
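A minimal sketch of the same extraction with a Pydantic class (the Thread import path is an assumption):
from pydantic import BaseModel\nfrom sibila import Models, Thread # Thread assumed importable from the package root\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n\nmodel = Models.create(\"llamacpp:openchat\")\nth = Thread.make_INST_IN(\"Extract information.\", \"Who was the first man on the moon?\")\n\nout = model.gen_pydantic(Person, th)\nprint(out.value) # an initialized Person instance, in GenOut.value\n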
"},{"location":"api-reference/local_model/#sibila.Model.token_len","title":"token_len abstractmethod
","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
Final thread or text to be passed to model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Number of tokens occupied.
Source code in sibila/model.py
@abstractmethod\ndef token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to model.\n\n Args:\n thread_or_text: Final thread or text to be passed to model.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Number of tokens occupied.\n \"\"\"\n ...\n
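A sketch of checking a thread against the model's context budget before generating, using the ctx_len attribute documented below (the Thread import path is an assumption):
from sibila import Models, Thread # Thread assumed importable from the package root\n\nmodel = Models.create(\"llamacpp:openchat\")\nth = Thread.make_INST_IN(\"Be helpful.\", \"Summarize the plot of Hamlet.\")\n\nused = model.token_len(th) # may be a conservative (upper bound) estimate\nprint(f\"{used} of {model.ctx_len} context tokens used\")\nif used >= model.ctx_len:\n raise RuntimeError(\"Thread doesn't fit the model's context\")\n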
"},{"location":"api-reference/local_model/#sibila.Model.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, in which case token length is estimated.
"},{"location":"api-reference/local_model/#sibila.Model.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = 0\n
Maximum context token length, including input and model output. Output tokens can be further limited by max_tokens_limit.
"},{"location":"api-reference/local_model/#sibila.Model.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input\n
Does the model support images as input? A value of False is definitive; a value of True is actually a maybe, as some providers don't supply this information. Check the model specs to be certain.
"},{"location":"api-reference/local_model/#sibila.Model.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only supports a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/local_model/#sibila.Model.desc","title":"desc abstractmethod
","text":"desc()\n
Model description.
Source code in sibila/model.py
@abstractmethod\ndef desc(self) -> str:\n \"\"\"Model description.\"\"\"\n ...\n
"},{"location":"api-reference/models/","title":"Models factory","text":""},{"location":"api-reference/models/#sibila.Models","title":"Models","text":"Model and template format directory that unifies (and simplifies) model access and configuration.
During initialization, the SIBILA_MODELS env variable is checked: a ';'-delimited list of folders where to find models.json, formats.json and model files.
= Models Directory =
Useful to create models from resource names like \"llamacpp:openchat\" or \"openai:gpt-4\". This makes it simple to change models, store model settings, compare model outputs, etc.
Users can add new entries from a script or from JSON files, via the add() call. New directory entries with the same name are merged into existing ones for each added config.
Uses file \"sibila/res/base_models.json\" for the initial defaults, which the user can augment by calling setup() with own config files or directly adding model config with set_model().
An example of a model directory JSON config file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Rules for entry inheritance/overriding
Entries in the '_default' key of each provider serve as defaults for models of that provider. Model entries in base_models_dir (automatically loaded from 'res/base_models.json') are overridden by any entries of the same name loaded from a local 'models.json' file with Models.setup(). Here, overridden means that local keys of the same name replace base keys (as in a dict.update()). However, '_default' entries only apply separately to either base_models_dir or the local 'models.json', as in a lexical scope.
= Format Directory =
Detects chat templates from the model name/filename, or uses the template from model metadata when available.
This directory can be setup from a JSON file or by calling set_format().
Any new entries with the same name replace previous ones on each new call.
Initializes from file \"sibila/res/base_formats.json\".
Example of a \"formats.json\" file:
{\n \"chatml\": {\n # template is a Jinja template for this model\n \"template\": \"{% for message in messages %}...\"\n },\n\n \"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n }, \n\n \"phi\": {\n \"match\": \"phi\",\n \"template\": \"...\"\n },\n\n \"phi2\": \"phi\",\n # this is a link: \"phi2\" -> \"phi\"\n}\n
Jinja2 templates receive a standard ChatML messages list (created from a Thread) and must deal with the following:
- In models that don't use a system message, the template must take care of prepending it to the first user message.
- The add_generation_prompt template variable is always set to True.
"},{"location":"api-reference/models/#sibila.Models.setup","title":"setup classmethod
","text":"setup(\n path=None, clear=False, add_cwd=True, load_from_env=True\n)\n
Initialize the models and formats directories from a given model files folder and/or the configuration files it contains. Path can start with \"~/\", meaning the current user's home directory.
Parameters:
Name Type Description Default path
Optional[Union[str, list[str]]]
Path to a folder, or to \"models.json\" or \"formats.json\" configuration files. Defaults to None, which initializes from defaults and the SIBILA_MODELS env variable.
None
clear
bool
Set to clear existing directories before loading from path arg.
False
add_cwd
bool
Add current working directory to search path.
True
load_from_env
bool
Load from SIBILA_MODELS env variable?
True
Source code in sibila/models.py
@classmethod\ndef setup(cls,\n path: Optional[Union[str,list[str]]] = None,\n clear: bool = False,\n add_cwd: bool = True,\n load_from_env: bool = True):\n \"\"\"Initialize models and formats directory from given model files folder and/or contained configuration files.\n Path can start with \"~/\" current account's home directory.\n\n Args:\n path: Path to a folder or to \"models.json\" or \"formats.json\" configuration files. Defaults to None which tries to initialize from defaults and env variable.\n clear: Set to clear existing directories before loading from path arg.\n add_cwd: Add current working directory to search path.\n load_from_env: Load from SIBILA_MODELS env variable?\n \"\"\"\n\n if clear:\n cls.clear()\n\n cls._ensure(add_cwd, \n load_from_env)\n\n if path is not None:\n if isinstance(path, str):\n path_list = [path]\n else:\n path_list = path\n\n cls._read_any(path_list)\n
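A typical initialization sketch (the folder path is illustrative):
from sibila import Models\n\n# look for models.json, formats.json and model files in a local folder\nModels.setup(\"../models\")\n\nmodel = Models.create(\"llamacpp:openchat\")\n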
"},{"location":"api-reference/models/#sibila.Models.create","title":"create classmethod
","text":"create(\n res_name,\n genconf=None,\n ctx_len=None,\n *,\n resolved_create_args=None,\n **over_args\n)\n
Create a model.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required genconf
Optional[GenConf]
Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Overrides directory defaults. Defaults to None.
None
resolved_create_args
Optional[dict]
Pass an empty dict to be filled by this method with the resolved args used in model creation. Defaults to None.
None
over_args
Union[Any]
Model-specific creation args, which will override default args set in model directory.
{}
Returns:
Name Type Description Model
Model
the initialized model.
Source code in sibila/models.py
@classmethod\ndef create(cls,\n res_name: str,\n\n # common to all providers\n genconf: Optional[GenConf] = None,\n ctx_len: Optional[int] = None,\n\n *,\n # debug/testing\n resolved_create_args: Optional[dict] = None,\n\n # model-specific overriding:\n **over_args: Union[Any]) -> Model:\n \"\"\"Create a model.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n genconf: Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.\n ctx_len: Maximum context length to be used. Overrides directory defaults. Defaults to None.\n resolved_create_args: Pass an empty dict to be filled by this method with the resolved args used in model creation. Defaults to None.\n over_args: Model-specific creation args, which will override default args set in model directory.\n\n Returns:\n Model: the initialized model.\n \"\"\"\n\n try:\n provider, _, args = cls.resolve_model_entry(res_name, **over_args)\n except ValueError as e:\n raise NameError(str(e))\n\n # override genconf, ctx_len\n if genconf is None:\n genconf = cls.genconf\n\n if genconf is not None:\n args[\"genconf\"] = genconf\n\n elif \"genconf\" in args and isinstance(args[\"genconf\"], dict):\n # transform dict into a GenConf instance:\n args[\"genconf\"] = GenConf.from_dict(args[\"genconf\"])\n\n if ctx_len is not None:\n args[\"ctx_len\"] = ctx_len\n\n if resolved_create_args is not None:\n resolved_create_args.update(args)\n\n\n logger.debug(f\"Resolved '{res_name}' to provider '{provider}' with args: {args}\")\n\n\n model: Model\n if provider == \"anthropic\":\n\n from .anthropic import AnthropicModel\n model = AnthropicModel(**args)\n\n elif provider == \"fireworks\":\n\n from .schema_format_openai import FireworksModel\n model = FireworksModel(**args)\n\n elif provider == \"groq\":\n\n from .schema_format_openai import GroqModel\n model = GroqModel(**args)\n\n elif provider == \"llamacpp\":\n from .llamacpp import LlamaCppModel, extract_sub_paths\n\n # resolve filename -> path. Path filenames can be in the form model1*model2\n sub_paths = args[\"name\"].split('*')\n\n # only resolve first path. If not found, let LlamaCpp raise the error below\n sub_paths[0] = cls._locate_file(sub_paths[0]) or sub_paths[0]\n\n # rejoin located paths with '*' (if multiple)\n path = '*'.join(sub_paths)\n logger.debug(f\"Resolved llamacpp model '{args['name']}' to '{path}'\")\n\n # rename \"name\" -> \"path\" which LlamaCppModel is expecting\n del args[\"name\"]\n args[\"path\"] = path\n\n model = LlamaCppModel(**args)\n\n elif provider == \"mistral\":\n\n from .mistral import MistralModel\n model = MistralModel(**args)\n\n elif provider == \"openai\":\n\n from .openai import OpenAIModel\n model = OpenAIModel(**args)\n\n elif provider == \"together\":\n\n from .schema_format_openai import TogetherModel\n model = TogetherModel(**args)\n\n else:\n raise ValueError(f\"Unknown provider '{provider}' for '{res_name}'\")\n\n\n\n return model\n
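For example, a sketch overriding directory defaults at creation time (the GenConf import path and its temperature argument are assumptions):
from sibila import Models, GenConf # GenConf assumed importable from the package root\n\nmodel = Models.create(\"llamacpp:openchat\",\n genconf=GenConf(temperature=0.0), # assumed constructor arg\n ctx_len=2048)\n\n# resolved_create_args can be used to inspect the args used in creation:\nresolved = {}\nmodel2 = Models.create(\"openai:gpt-4\", resolved_create_args=resolved)\nprint(resolved)\n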
"},{"location":"api-reference/models/#sibila.Models.add_models_search_path","title":"add_models_search_path classmethod
","text":"add_models_search_path(path)\n
Prepends new paths to model files search path.
Parameters:
Name Type Description Default path
Union[str, list[str]]
A path or list of paths to add to model search path.
required Source code in sibila/models.py
@classmethod\ndef add_models_search_path(cls,\n path: Union[str,list[str]]):\n \"\"\"Prepends new paths to model files search path.\n\n Args:\n path: A path or list of paths to add to model search path.\n \"\"\"\n\n cls._ensure()\n\n prepend_path(cls.models_search_path, path)\n\n logger.debug(f\"Adding '{path}' to search_path\")\n
"},{"location":"api-reference/models/#sibila.Models.set_genconf","title":"set_genconf classmethod
","text":"set_genconf(genconf)\n
Set the GenConf to use as default for model creation.
Parameters:
Name Type Description Default genconf
GenConf
Model generation configuration.
required Source code in sibila/models.py
@classmethod\ndef set_genconf(cls,\n genconf: GenConf):\n \"\"\"Set the GenConf to use as default for model creation.\n\n Args:\n genconf: Model generation configuration.\n \"\"\"\n cls.genconf = genconf\n
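A sketch of setting a global default generation configuration (the GenConf import path and its temperature argument are assumptions):
from sibila import Models, GenConf # GenConf assumed importable from the package root\n\nModels.set_genconf(GenConf(temperature=0.0)) # assumed constructor arg\n\n# models created from here on will use this GenConf by default\nmodel = Models.create(\"llamacpp:openchat\")\n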
"},{"location":"api-reference/models/#sibila.Models.list_models","title":"list_models classmethod
","text":"list_models(\n name_query, providers, include_base, resolved_values\n)\n
List model entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match model names. Empty string for all.
required providers
list[str]
Filter by these exact provider names. Empty list for all.
required include_base
bool
Also list fused values from base_models_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are model res_names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_models(cls,\n name_query: str,\n providers: list[str],\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List model entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match model names. Empty string for all.\n providers: Filter by these exact provider names. Empty list for all.\n include_base: Also list fused values from base_models_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are model res_names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n out = {}\n\n name_query = name_query.lower()\n\n for prov_name in models_dir:\n\n if providers and prov_name not in providers:\n continue\n\n prov_dic = models_dir[prov_name]\n\n for name in prov_dic:\n\n if name == cls.DEFAULT_ENTRY_NAME:\n continue\n\n if name_query and name_query not in name.lower():\n continue\n\n entry_res_name = prov_name + \":\" + name\n\n if resolved_values:\n # okay to use get_model_entry() because it resolves to fused\n res = cls.get_model_entry(entry_res_name) # type: ignore[assignment]\n if res is None:\n continue\n else:\n val = res[1]\n else:\n val = prov_dic[name]\n\n out[entry_res_name] = val\n\n return out\n
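For example, listing all llamacpp entries whose name contains \"chat\":
from sibila import Models\n\nentries = Models.list_models(name_query=\"chat\",\n providers=[\"llamacpp\"],\n include_base=True,\n resolved_values=False)\nfor res_name in entries:\n print(res_name) # e.g. \"llamacpp:openchat\"\n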
"},{"location":"api-reference/models/#sibila.Models.get_model_entry","title":"get_model_entry classmethod
","text":"get_model_entry(res_name)\n
Get a resolved model entry. Resolved means following any links.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required Returns:
Type Description Union[tuple[str, dict], None]
Resolved entry (res_name,dict) or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_model_entry(cls,\n res_name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved model entry. Resolved means following any links.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n\n Returns:\n Resolved entry (res_name,dict) or None if not found.\n \"\"\"\n\n cls._ensure() \n\n models_dir = cls.fused_models_dir()\n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = models_dir[provider]\n\n if name in prov:\n return provider + \":\" + name, prov[name]\n else:\n return None\n
"},{"location":"api-reference/models/#sibila.Models.has_model_entry","title":"has_model_entry classmethod
","text":"has_model_entry(res_name)\n
Returns True if a model entry exists for the given res_name.
Source code in sibila/models.py
@classmethod\ndef has_model_entry(cls,\n res_name: str) -> bool:\n return cls.get_model_entry(res_name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.set_model","title":"set_model classmethod
","text":"set_model(\n res_name, model_name, format_name=None, genconf=None\n)\n
Add model configuration for given res_name.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
str
Model name or filename identifier.
required format_name
Optional[str]
Format name used by model. Defaults to None.
None
genconf
Optional[GenConf]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model(cls,\n res_name: str,\n model_name: str,\n format_name: Optional[str] = None,\n genconf: Optional[GenConf] = None):\n \"\"\"Add model configuration for given res_name.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier.\n format_name: Format name used by model. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry: dict = {\n \"name\": model_name\n }\n\n if format_name:\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n\n if genconf:\n entry[\"genconf\"] = genconf.as_dict()\n\n cls.models_dir[provider][name] = entry\n
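For instance, a sketch registering a local GGUF model (the filename is illustrative):
from sibila import Models\n\nModels.set_model(\"llamacpp:mymodel\",\n \"my-model.Q4_K_M.gguf\",\n format_name=\"chatml\")\n\nmodel = Models.create(\"llamacpp:mymodel\")\n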
"},{"location":"api-reference/models/#sibila.Models.update_model","title":"update_model classmethod
","text":"update_model(\n res_name,\n model_name=None,\n format_name=None,\n genconf=None,\n)\n
Update model fields.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
Optional[str]
Model name or filename identifier. Defaults to None.
None
format_name
Optional[str]
Format name used by model. Use \"\" to delete. Defaults to None.
None
genconf
Union[GenConf, str, None]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef update_model(cls,\n res_name: str,\n model_name: Optional[str] = None,\n format_name: Optional[str] = None,\n genconf: Union[GenConf,str,None] = None):\n\n \"\"\"update model fields\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier. Defaults to None.\n format_name: Format name used by model. Use \"\" to delete. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry = cls.models_dir[provider][name]\n\n if model_name:\n entry[\"name\"] = model_name\n\n if format_name is not None:\n if format_name != \"\":\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n else:\n del entry[\"format\"]\n\n if genconf is not None:\n if genconf != \"\":\n entry[\"genconf\"] = genconf\n else:\n del entry[\"genconf\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_model_link","title":"set_model_link classmethod
","text":"set_model_link(res_name, link_name)\n
Create a model link into another model.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required link_name
str
Name of model this entry links to.
required Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model_link(cls,\n res_name: str,\n link_name: str):\n \"\"\"Create a model link into another model.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n link_name: Name of model this entry links to.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, True)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n # first: ensure link_name is a res_name\n if ':' not in link_name:\n link_name = provider + \":\" + link_name\n\n if not cls.has_model_entry(link_name):\n raise ValueError(f\"Could not find linked model '{link_name}'\")\n\n # second: check link name is without provider if same\n link_split = link_name.split(\":\")\n if len(link_split) == 2:\n if link_split[0] == provider: # remove same \"provider:\"\n link_name = link_split[1]\n\n cls.models_dir[provider][name] = link_name\n
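For example, creating an alias that forwards to an existing entry:
from sibila import Models\n\nModels.set_model_link(\"alias:develop\", \"llamacpp:openchat\")\n\nmodel = Models.create(\"develop\") # alias names can be used directly\n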
"},{"location":"api-reference/models/#sibila.Models.delete_model","title":"delete_model classmethod
","text":"delete_model(res_name)\n
Delete a model entry.
Parameters:
Name Type Description Default res_name
str
Model entry in the form \"provider:name\".
required Source code in sibila/models.py
@classmethod\ndef delete_model(cls,\n res_name: str):\n \"\"\"Delete a model entry.\n\n Args:\n res_name: Model entry in the form \"provider:name\".\n \"\"\"\n\n cls._ensure()\n\n provider, name = provider_name_from_urn(res_name,\n allow_alias_provider=False)\n\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}', must be one of: {cls.ALL_PROVIDER_NAMES}\")\n\n prov = cls.models_dir[provider] \n if name not in prov:\n raise ValueError(f\"Model '{res_name}' not found\")\n\n # verify if any entry links to name:\n def check_link_to(link_to_name: str, \n provider: str) -> Union[str, None]:\n\n for name,entry in cls.models_dir[provider].items():\n if isinstance(entry,str) and entry == link_to_name:\n return name\n return None\n\n offender = check_link_to(name, provider)\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry '{provider}:{offender}' links to it\")\n\n offender = check_link_to(name, \"alias\")\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry 'alias:{offender}' links to it\")\n\n del prov[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_models","title":"save_models classmethod
","text":"save_models(path=None, include_base=False)\n
Save the models directory to a JSON file. If path is None, saves to \"models.json\" in the single models search path folder.
Source code in sibila/models.py
@classmethod\ndef save_models(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"models.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n # clear providers with no models:\n for provider in cls.ALL_PROVIDER_NAMES:\n if provider in models_dir and not models_dir[provider]:\n del models_dir[provider]\n\n json.dump(models_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.list_formats","title":"list_formats classmethod
","text":"list_formats(name_query, include_base, resolved_values)\n
List format entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match format names. Empty string for all.
required include_base
bool
Also list base_formats_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are format names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_formats(cls,\n name_query: str,\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match format names. Empty string for all.\n include_base: Also list base_formats_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are format names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n out = {}\n\n name_query = name_query.lower()\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n for name in formats_dir.keys():\n\n if name_query and name_query not in name.lower():\n continue\n\n val = formats_dir[name]\n\n if resolved_values:\n res = cls.get_format_entry(name)\n if res is None:\n continue\n else:\n val = res[1]\n\n out[name] = val\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.get_format_entry","title":"get_format_entry classmethod
","text":"get_format_entry(name)\n
Get a resolved format entry by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple of (resolved_name, format_entry), or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved format entry by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Tuple of (resolved_name, format_entry).\n \"\"\"\n\n cls._ensure()\n\n return get_format_entry(cls.fused_formats_dir(), name)\n
"},{"location":"api-reference/models/#sibila.Models.has_format_entry","title":"has_format_entry classmethod
","text":"has_format_entry(name)\n
Returns True if a format entry exists for the given name.
Source code in sibila/models.py
@classmethod\ndef has_format_entry(cls,\n name: str) -> bool:\n return cls.get_format_entry(name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.get_format_template","title":"get_format_template classmethod
","text":"get_format_template(name)\n
Get a format template by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[str, None]
Resolved format template str, or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Get a format template by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Resolved format template str.\n \"\"\"\n\n res = cls.get_format_entry(name)\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.match_format_entry","title":"match_format_entry classmethod
","text":"match_format_entry(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple (name, format_entry) where name is a resolved name. Or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Tuple (name, format_entry) where name is a resolved name. Or None if none found.\n \"\"\"\n\n cls._ensure()\n\n return search_format(cls.fused_formats_dir(), name)\n
"},{"location":"api-reference/models/#sibila.Models.match_format_template","title":"match_format_template classmethod
","text":"match_format_template(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[str, None]
Format template or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Format template or None if none found.\n \"\"\"\n\n res = cls.match_format_entry(name)\n\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_format","title":"set_format classmethod
","text":"set_format(name, template, match=None)\n
Add a format entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required template
str
The chat template, in Jinja2 format.
required match
Optional[str]
Regex that matches names/filenames that use this format. Default is None.
None
Source code in sibila/models.py
@classmethod\ndef set_format(cls,\n name: str,\n template: str,\n match: Optional[str] = None):\n \"\"\"Add a format entry to the format directory.\n\n Args:\n name: Format entry name.\n template: The Chat template format in Jinja2 format\n match: Regex that matches names/filenames that use this format. Default is None.\n \"\"\"\n\n cls._ensure()\n\n if \"{{\" not in template: # a link_name for the template\n if not cls.has_format_entry(template):\n raise ValueError(f\"Could not find linked template entry '{template}'.\")\n\n entry = {\n \"template\": template\n }\n if match is not None:\n entry[\"match\"] = match\n\n cls.formats_dir[name] = entry \n
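A sketch of registering a chat template (the template body below is a trivial illustration only):
from sibila import Models\n\ntemplate = \"{% for message in messages %}{{ message.content }}{% endfor %}\"\nModels.set_format(\"mychat\", template, match=\"mychat\")\n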
"},{"location":"api-reference/models/#sibila.Models.set_format_link","title":"set_format_link classmethod
","text":"set_format_link(name, link_name)\n
Add a format link entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required link_name
str
Name of format that this entry links to.
required Source code in sibila/models.py
@classmethod\ndef set_format_link(cls,\n name: str,\n link_name: str):\n \"\"\"Add a format link entry to the format directory.\n\n Args:\n name: Format entry name.\n link_name: Name of format that this entry links to.\n \"\"\"\n\n cls._ensure()\n\n if not cls.has_format_entry(link_name):\n raise ValueError(f\"Could not find linked entry '{link_name}'.\")\n\n cls.formats_dir[name] = link_name\n
"},{"location":"api-reference/models/#sibila.Models.delete_format","title":"delete_format classmethod
","text":"delete_format(name)\n
Delete a format entry.
Parameters:
Name Type Description Default name
str
Format entry name.
required Source code in sibila/models.py
@classmethod\ndef delete_format(cls,\n name: str):\n \"\"\"Delete a format entry.\n\n Args:\n name: Format entry name.\n \"\"\"\n\n cls._ensure()\n\n if name not in cls.formats_dir:\n raise ValueError(f\"Format name '{name}' not found.\")\n\n for check_name,entry in cls.formats_dir.items():\n if isinstance(entry,str) and entry == name:\n raise ValueError(f\"Cannot delete '{name}', as entry '{check_name}' links to it\")\n\n del cls.formats_dir[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_formats","title":"save_formats classmethod
","text":"save_formats(path=None, include_base=False)\n
Save the formats directory to a JSON file. If path is None, saves to \"formats.json\" in the single models search path folder.
Source code in sibila/models.py
@classmethod\ndef save_formats(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"formats.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n json.dump(formats_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.info","title":"info classmethod
","text":"info(include_base=True, verbose=False)\n
Return information about current setup.
Parameters:
Name Type Description Default include_base
bool
Also include fused values from the base directories. Defaults to True.
True
verbose
bool
If False, formats directory values are abbreviated. Defaults to False.
False
Returns:
Type Description str
Textual information about the current setup.
Source code in sibila/models.py
@classmethod\ndef info(cls,\n include_base: bool = True,\n verbose: bool = False) -> str:\n \"\"\"Return information about current setup.\n\n Args:\n verbose: If False, formats directory values are abbreviated. Defaults to False.\n\n Returns:\n Textual information about the current setup.\n \"\"\"\n\n cls._ensure()\n\n out = \"\"\n\n out += f\"Models search path: {cls.models_search_path}\\n\"\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n out += f\"Models directory:\\n{pformat(models_dir, sort_dicts=False)}\\n\"\n\n out += f\"Model Genconf:\\n{cls.genconf}\\n\"\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n if not verbose:\n fordir = {}\n for key in formats_dir:\n fordir[key] = deepcopy(formats_dir[key])\n if isinstance(fordir[key], dict) and \"template\" in fordir[key]:\n fordir[key][\"template\"] = fordir[key][\"template\"][:14] + \"...\"\n else:\n fordir = formats_dir\n\n out += f\"Formats directory:\\n{pformat(fordir)}\"\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.clear","title":"clear classmethod
","text":"clear()\n
Clear directories. Members base_models_dir, base_formats_dir and genconf are not cleared.
Source code in sibila/models.py
@classmethod\ndef clear(cls):\n \"\"\"Clear directories. Members base_models_dir and base_formats_dir and genconf are not cleared.\"\"\"\n cls.models_dir = None\n cls.models_search_path = []\n cls.formats_dir = None\n
"},{"location":"api-reference/multigen/","title":"Multigen","text":""},{"location":"api-reference/multigen/#sibila.multigen","title":"multigen","text":"Functions for comparing output across models.
- thread_multigen(), query_multigen() and multigen(): Compare outputs across models.
- cycle_gen_print(): For a list of models, sequentially grow a Thread with model responses to given IN messages.
"},{"location":"api-reference/multigen/#sibila.multigen.thread_multigen","title":"thread_multigen","text":"thread_multigen(\n threads,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n thread_titles=None,\n)\n
Generate a list of threads on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" to print results, or a file path to write a text file with results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" to print CSV results, or a file path to write a CSV file with results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
thread_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def thread_multigen(threads: list[Thread],\n model_names: list[str],\n\n text: Union[str,list[str],None] = None,\n csv: Union[str,list[str],None] = None,\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n thread_titles: Optional[list[str]] = None \n ) -> list[list[GenOut]]:\n \"\"\"Generate a single thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n thread_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n table = multigen(threads,\n model_names=model_names, \n gencall=gencall,\n genconf=genconf)\n\n # table[threads,models]\n\n if thread_titles is None:\n thread_titles = [str(th) for th in threads]\n\n def format(format_fn, cmds):\n if cmds is None or not cmds:\n return\n\n f = StringIO(newline='')\n\n format_fn(f,\n table, \n title_list=thread_titles,\n model_names=model_names,\n out_keys=out_keys)\n fmtd = f.getvalue()\n\n if not isinstance(cmds, list):\n cmds = [cmds]\n for c in cmds:\n if c == 'print':\n print(fmtd)\n else: # path\n with open(c, \"w\", encoding=\"utf-8\") as f:\n f.write(fmtd)\n\n format(format_text, text)\n format(format_csv, csv)\n\n return table\n
"},{"location":"api-reference/multigen/#sibila.multigen.query_multigen","title":"query_multigen","text":"query_multigen(\n in_list,\n inst_text,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n in_titles=None,\n)\n
Generate INST+IN threads on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" to print results, or a file path to write a text file with results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" to print CSV results, or a file path to write a CSV file with results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
in_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)).
list[list[GenOut]]
For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def query_multigen(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n text: Union[str,list[str],None] = None, # \"print\", path\n csv: Union[str,list[str],None] = None, # \"print\", path\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n in_titles: Optional[list[str]] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n in_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). \n For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\" \n\n th_list = []\n for in_text in in_list:\n th = Thread.make_INST_IN(inst_text, in_text)\n th_list.append(th)\n\n if in_titles is None:\n in_titles = in_list\n\n out = thread_multigen(th_list, \n model_names=model_names, \n text=text,\n csv=csv,\n gencall=gencall,\n genconf=genconf,\n out_keys=out_keys,\n thread_titles=in_titles)\n\n return out\n
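For example, a sketch comparing two models on the same questions and printing a text table:
from sibila.multigen import query_multigen\n\nquery_multigen([\"Who was the first man on the moon?\",\n \"What is the capital of France?\"],\n \"Answer in one short sentence.\",\n model_names=[\"llamacpp:openchat\", \"openai:gpt-4\"],\n text=\"print\")\n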
"},{"location":"api-reference/multigen/#sibila.multigen.multigen","title":"multigen","text":"multigen(\n threads,\n *,\n models=None,\n model_names=None,\n model_names_del_after=True,\n gencall=None,\n genconf=None\n)\n
Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.
Actual generation for each model is implemented by the gencall arg Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required models
Optional[list[Model]]
A list of initialized models. Defaults to None.
None
model_names
Optional[list[str]]
--Or-- A list of Models names. Defaults to None.
None
model_names_del_after
bool
Delete model_names models after using them: important, otherwise an out-of-memory error will eventually happen. Defaults to True.
True
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
Raises:
Type Description ValueError
Only one of models or model_names can be given.
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def multigen(threads: list[Thread],\n *,\n models: Optional[list[Model]] = None, # existing models\n\n model_names: Optional[list[str]] = None,\n model_names_del_after: bool = True,\n\n gencall: Optional[Callable] = None,\n genconf: Optional[GenConf] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.\n\n Actual generation for each model is implemented by the gencall arg Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n models: A list of initialized models. Defaults to None.\n model_names: --Or-- A list of Models names. Defaults to None.\n model_names_del_after: Delete model_names models after using them: important or an out-of-memory error will eventually happen. Defaults to True.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n\n Raises:\n ValueError: Only one of models or model_names can be given.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n if not ((models is None) ^ ((model_names is None))):\n raise ValueError(\"Only one of models or model_names can be given\")\n\n if gencall is None:\n gencall = _default_gencall_text\n\n mod_count = len(models) if models is not None else len(model_names) # type: ignore[arg-type]\n\n all_out = []\n\n for i in range(mod_count):\n if models is not None:\n model = models[i]\n logger.debug(f\"Model: {model.desc}\")\n else:\n name = model_names[i] # type: ignore[index]\n model = Models.create(name)\n logger.info(f\"Model: {name} -> {model.desc}\")\n\n mod_out = []\n for th in threads:\n out = gencall(model, th, genconf)\n\n mod_out.append(out)\n\n all_out.append(mod_out)\n\n if model_names_del_after and models is None:\n model.close()\n del model\n\n # all_out is currently shaped (M,T) -> transpose to (T,M), so that each row contains thread t for all models\n tout = []\n for t in range(len(threads)):\n tmout = [] # thread t for all models\n for m in range(mod_count):\n tmout.append(all_out[m][t])\n\n tout.append(tmout)\n\n return tout\n
"},{"location":"api-reference/multigen/#sibila.multigen.cycle_gen_print","title":"cycle_gen_print","text":"cycle_gen_print(\n in_list,\n inst_text,\n model_names,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n json_kwargs={\n \"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False,\n },\n)\n
For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.
Works by doing:
- Generate an INST+IN prompt for a list of models. (Same INST for all).
- Append the output of each model to its own Thread.
- Append the next IN prompt and generate again. Back to 2.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
json_kwargs
dict
JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.
{'indent': 2, 'sort_keys': False, 'ensure_ascii': False}
Source code in sibila/multigen.py
def cycle_gen_print(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n json_kwargs: dict = {\"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False}\n ):\n \"\"\"For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.\n\n Works by doing:\n\n 1. Generate an INST+IN prompt for a list of models. (Same INST for all).\n 2. Append the output of each model to its own Thread.\n 3. Append the next IN prompt and generate again. Back to 2.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n json_kwargs: JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n if gencall is None:\n gencall = _default_gencall_text\n\n\n n_model = len(model_names)\n n_ins = len(in_list)\n\n for m in range(n_model):\n\n name = model_names[m]\n model = Models.create(name)\n\n print('=' * 80)\n print(f\"Model: {name} -> {model.desc}\")\n\n th = Thread(inst=inst_text)\n\n for i in range(n_ins):\n in_text = in_list[i]\n print(f\"IN: {in_text}\")\n\n th += Msg.make_IN(in_text)\n\n out = gencall(model, th, genconf)\n\n out_dict = out.as_dict()\n\n print(\"OUT\")\n\n for k in out_keys:\n\n if k in out_dict and out_dict[k] is not None:\n\n if k != out_keys[0]: # not first\n print(\"-\" * 20)\n\n val = nice_print(k, out_dict[k], json_kwargs)\n print(val)\n\n th += Msg.make_OUT(out.text)\n\n model.close()\n del model\n
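For instance, a sketch growing a two-message conversation on two models and printing each step:
from sibila.multigen import cycle_gen_print\n\ncycle_gen_print([\"Hello there!\", \"What's your favorite color?\"],\n \"You are a friendly assistant.\",\n [\"llamacpp:openchat\", \"openai:gpt-4\"])\n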
"},{"location":"api-reference/remote_model/","title":"Remote model classes","text":""},{"location":"api-reference/remote_model/#sibila.OpenAIModel","title":"OpenAIModel","text":"OpenAIModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n overhead_per_msg=None,\n token_estimation_factor=None,\n create_tokenizer=False,\n other_init_kwargs={}\n)\n
Access an OpenAI model.
Supports constrained JSON output via the OpenAI API tools mechanism.
Ref https://platform.openai.com/docs/api-reference/chat/create
Create an OpenAI remote model.
Parameters:
- name (str, required): Model name to resolve into an existing model.
- genconf (Optional[GenConf], default None): Model generation configuration.
- schemaconf (Optional[JSchemaConf], default None): Default configuration for JSON schema validation, used if a generation call doesn't supply one.
- ctx_len (Optional[int], default None): Maximum context length to be used (shared between input and output). None for model's default.
- max_tokens_limit (Optional[int], default None): Maximum output tokens limit. None for model's default.
- tokenizer (Optional[Tokenizer], default None): An externally initialized tokenizer to use instead of the one created for the model.
- api_key (Optional[str], default None): OpenAI API key. None will use the env variable OPENAI_API_KEY.
- base_url (Optional[str], default None): Base location for API access. None will use the env variable OPENAI_BASE_URL or a default.
- overhead_per_msg (Optional[int], default None): Overhead tokens to account for when calculating token length. None for model's default.
- token_estimation_factor (Optional[float], default None): Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
- create_tokenizer (bool, default False): When no tokenizer is passed, should one be created?
- other_init_kwargs (dict, default {}): Extra args for OpenAI.OpenAI() initialization.
Raises:
- ImportError: If the OpenAI API is not installed.
- NameError: If the model name was not found or there's an API or authentication problem.
Source code in sibila/openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI-specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n overhead_per_msg: Optional[int] = None,\n token_estimation_factor: Optional[float] = None,\n create_tokenizer: bool = False,\n\n # other OpenAI-specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create an OpenAI remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL or a default.\n overhead_per_msg: Overhead tokens to account for when calculating token length. None for model's default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n create_tokenizer: When no tokenizer is passed, should try to create one?\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n\n if not has_openai:\n raise ImportError(\"Please install openai by running: pip install openai\")\n\n self._client = self._client_async = None\n\n\n # also accept \"provider:name\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if name.startswith(provider_name):\n name = name[len(provider_name):]\n\n super().__init__(False,\n genconf,\n schemaconf,\n tokenizer\n )\n\n if (ctx_len is not None and\n max_tokens_limit is not None and\n overhead_per_msg is not None and\n token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n self._model_name = name\n default_ctx_len = ctx_len\n default_max_tokens_limit = max_tokens_limit\n default_overhead_per_msg = overhead_per_msg\n default_token_estimation_factor = token_estimation_factor\n\n else: # need to resolve\n settings = self.resolve_settings(self.PROVIDER_NAME,\n name,\n [\"name\", \n \"ctx_len\", \n \"max_tokens_limit\", \n \"overhead_per_msg\",\n \"token_estimation_factor\"])\n self._model_name = settings.get(\"name\") or name\n default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n default_max_tokens_limit = settings.get(\"max_tokens_limit\") # type: ignore[assignment]\n default_overhead_per_msg = settings.get(\"overhead_per_msg\") # type: ignore[assignment]\n default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n # all defaults are conservative values\n if ctx_len is None and default_ctx_len is None:\n default_ctx_len = 4096\n logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming 
{default_ctx_len}\")\n\n if max_tokens_limit is None and default_max_tokens_limit is None:\n default_max_tokens_limit = ctx_len or default_ctx_len \n # don't warn: assume equal to ctx_len: logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n\n if overhead_per_msg is None and default_overhead_per_msg is None:\n default_overhead_per_msg = 3\n # don't warn for this setting due to derived model classes (none uses it)\n\n if token_estimation_factor is None and default_token_estimation_factor is None:\n default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n self.ctx_len = ctx_len or default_ctx_len\n\n self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n self._overhead_per_msg = overhead_per_msg or default_overhead_per_msg\n\n self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n self.maybe_image_input = True # True means maybe - always check model specs\n\n # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n\n if self.tokenizer is None and create_tokenizer:\n try:\n self.tokenizer = OpenAITokenizer(self._model_name)\n except Exception as e:\n logger.warning(f\"Could not create a local tokenizer for model '{self._model_name}' - \"\n \"token length calculation will be disabled and assume defaults. \"\n \"To support recent OpenAI models, install the latest tiktoken version with 'pip install -U tiktoken'. \"\n f\"Internal error: {e}\")\n\n\n self._client_init_kwargs = other_init_kwargs\n if api_key is not None:\n self._client_init_kwargs[\"api_key\"] = api_key\n if base_url is not None:\n self._client_init_kwargs[\"base_url\"] = base_url\n
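As a quick sketch, direct instantiation (assuming the openai package is installed and OPENAI_API_KEY is set; most code would instead use Models.create(\"openai:...\")):
from sibila import OpenAIModel\n\nmodel = OpenAIModel(\"gpt-4\",\n                    ctx_len=8192)  # assumed value - None uses the model's default\nprint(model(\"Say hello in Portuguese.\"))\n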
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\", \"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int), ... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
- target (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A value of the target arg type, instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
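A small sketch of extract() with simple targets; the question texts and expected values are illustrative only:
from typing import Annotated\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\n# extract into a primitive type\nyear = model.extract(int, \"In which year was the Eiffel Tower built?\")\n\n# extract into an annotated list\nnames = model.extract(Annotated[list[str], \"Three European rivers\"],\n                      \"Name three European rivers.\")\n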
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
- labels (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
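For example, a minimal sentiment classification sketch using a Literal to define the labels:
from typing import Literal\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nsentiment = model.classify(Literal['positive', 'negative', 'neutral'],\n                           \"What a waste of money, it broke on day one!\")\nprint(sentiment)  # likely: 'negative'\n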
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- json_schema (Union[dict, str, None], default None): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.
- RuntimeError: If unable to generate.
Returns:
- dict: A dict from the model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
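A sketch of free-JSON output and of schema-constrained output (the schema dict is a hypothetical example):
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\n# free JSON output (no schema)\nd = model.json(\"Return the name and height in meters of the Eiffel Tower as JSON.\")\n\n# constrained by a JSON schema\nschema = {\"type\": \"object\",\n          \"properties\": {\"name\": {\"type\": \"string\"},\n                         \"height_m\": {\"type\": \"number\"}},\n          \"required\": [\"name\", \"height_m\"]}\nd = model.json(\"Describe the Eiffel Tower.\", json_schema=schema)\n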
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
- cls (Any, required): A dataclass definition.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
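A sketch with a plain Python dataclass as the target:
from dataclasses import dataclass\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\n@dataclass\nclass City:\n    name: str\n    country: str\n    population: int\n\ncity = model.dataclass(City, \"Tell me about the most populous city in Japan.\")\n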
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid BaseModel object. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
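A sketch with a Pydantic BaseModel-derived class as the target:
from pydantic import BaseModel, Field\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nclass Book(BaseModel):\n    title: str\n    author: str\n    year: int = Field(description=\"Year of first publication\")\n\nbook = model.pydantic(Book, \"Recommend a classic science fiction novel.\")\n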
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by the model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
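A minimal plain-text generation sketch:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\ntext = model.call(\"Write a haiku about the sea.\",\n                  inst=\"You are a poet.\")\nprint(text)\n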
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by the model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
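Since __call__() is equivalent to call(), the model object can be invoked directly:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\nprint(model(\"Write a haiku about the sea.\", inst=\"You are a poet.\"))\n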
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\", \"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int), ... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
- target (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A value of the target arg type, instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
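A sketch running several extractions concurrently with asyncio.gather():
import asyncio\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nasync def extract_many():\n    return await asyncio.gather(\n        model.extract_async(int, \"How many planets orbit the Sun?\"),\n        model.extract_async(list[str], \"Name the three primary colors.\"))\n\nresults = asyncio.run(extract_many())  # or in Jupyter: await extract_many()\n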
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
- labels (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- json_schema (Union[dict, str, None], default None): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.
- RuntimeError: If unable to generate.
Returns:
- dict: A dict from the model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
- cls (Any, required): A dataclass definition.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid BaseModel object. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating a Msg.
- inst (Optional[str], default None): Instruction message for the model. Will override the Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by the model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
- thread (Thread, required): The Thread to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration.
Raises:
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
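Unlike call(), gen() doesn't raise on generation errors: it always returns a GenOut whose fields can be inspected. A minimal sketch (assuming Thread and Msg are exported at package level):
from sibila import Models, Thread, Msg\n\nmodel = Models.create(\"openai:gpt-4\")\n\nth = Thread(inst=\"Be terse.\")\nth += Msg.make_IN(\"What is the capital of France?\")\n\nout = model.gen(th)\nprint(out.text)       # generated text\nprint(out.as_dict())  # full result as a dict\n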
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
- thread (Thread, required): The Thread to use as model input.
- json_schema (Union[dict, str, None], required): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the value field (GenOut.value) of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
- cls (Any, required): A dataclass definition.
- thread (Thread, required): The Thread object to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the value field (GenOut.value) of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- thread (Thread, required): The Thread to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration. None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification. None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
- TypeError: When cls is not a Pydantic BaseClass.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but it must reflect the model's context token accounting. The Thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, the token length of its string representation is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
- thread_or_text (Union[Thread, str], required): For token length calculation.
- genconf (Optional[GenConf], default None): Model generation configuration.
Returns:
- int: Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
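A sketch of estimating input token usage before a call:
from sibila import Models, Thread\n\nmodel = Models.create(\"openai:gpt-4\")\n\nprint(model.token_len(\"How many tokens does this text use?\"))\n\nth = Thread.make_IN(\"How many tokens does this text use?\")\nprint(model.token_len(th))\n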
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = True\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names that can be requested from the API. Note that some of the listed models are not chat models and cannot be used, for example embedding models.
Parameters:
Name Type Description Default api_key
Optional[str]
Requires OpenAI API key, passed as this arg or set in env variable OPENAI_API_KEY.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models.
Source code in sibila/openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: Requires OpenAI API key, passed as this arg or set in env variable OPENAI_API_KEY.\n\n Returns:\n Returns a list of known models.\n \"\"\"\n\n client = openai.OpenAI(api_key=api_key)\n model_list = client.models.list()\n\n out = []\n for model in model_list.data:\n out.append(model.id)\n return sorted(out)\n
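A short usage sketch (requires the OPENAI_API_KEY env variable, or pass api_key=...):
from sibila import OpenAIModel\n\nnames = OpenAIModel.known_models() # uses env variable OPENAI_API_KEY\nif names is not None:\n print(len(names), \"models, e.g.\", names[:5])\n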
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel","title":"AnthropicModel","text":"AnthropicModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n api_key=None,\n token_estimation_factor=None,\n anthropic_init_kwargs={}\n)\n
Access an Anthropic model. Supports constrained JSON output via the Anthropic API's function calling (tools) mechanism.
Ref https://docs.anthropic.com/claude/docs/intro-to-claude
Create an Anthropic remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
api_key
Optional[str]
Anthropic API key. Defaults to None, which will use env variable ANTHROPIC_API_KEY.
None
token_estimation_factor
Optional[float]
Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
anthropic_init_kwargs
dict
Extra args for Anthropic() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If Anthropic API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/anthropic.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n\n # most important Anthropic-specific args\n api_key: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other Anthropic-specific args\n anthropic_init_kwargs: dict = {},\n ):\n \"\"\"Create an Anthropic remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n api_key: Anthropic API key. Defaults to None, which will use env variable ANTHROPIC_API_KEY.\n token_estimation_factor: Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n anthropic_init_kwargs: Extra args for Anthropic() initialization. Defaults to {}.\n\n Raises:\n ImportError: If Anthropic API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n\n if not has_anthropic:\n raise ImportError(\"Please install anthropic API by running: pip install anthropic\")\n\n self._client = self._client_async = None\n\n\n # also accept \"provider:name\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if name.startswith(provider_name):\n name = name[len(provider_name):]\n\n super().__init__(False,\n genconf,\n schemaconf,\n None\n )\n\n if (ctx_len is not None and\n max_tokens_limit is not None and\n token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n self._model_name = name\n default_ctx_len = ctx_len\n default_max_tokens_limit = max_tokens_limit\n default_token_estimation_factor = token_estimation_factor\n\n else: # need to resolve\n settings = self.resolve_settings(self.PROVIDER_NAME,\n name,\n [\"name\", \n \"ctx_len\", \n \"max_tokens_limit\", \n \"token_estimation_factor\"])\n self._model_name = settings.get(\"name\") or name\n default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n default_max_tokens_limit = settings.get(\"max_tokens_limit\") or default_ctx_len\n default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n # all defaults are conservative values\n if default_ctx_len is None:\n default_ctx_len = 200000\n logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming {default_ctx_len}\")\n if default_max_tokens_limit is None:\n default_max_tokens_limit = default_ctx_len\n logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n if default_token_estimation_factor is None:\n default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n self.ctx_len = ctx_len or default_ctx_len\n\n self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n self.maybe_image_input = True # currently all Anthropic models support image input - always check model specs\n\n # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n self._client_init_kwargs = anthropic_init_kwargs\n\n if api_key is not None:\n self._client_init_kwargs[\"api_key\"] = api_key \n
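A minimal creation sketch (the model name is illustrative; use any model available to your Anthropic account):
from sibila import AnthropicModel\n\n# api_key defaults to None, which reads env variable ANTHROPIC_API_KEY\nmodel = AnthropicModel(\"claude-3-haiku-20240307\")\nprint(model.desc())\n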
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
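A usage sketch with a typed-list target (the model name is illustrative):
from typing import Annotated\nfrom sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\n# the annotation text guides the model about the list's meaning\nplanets = model.extract(Annotated[list[str], \"Planet names\"],\n \"List the two planets closest to the Sun.\")\n# e.g. ['Mercury', 'Venus']\n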
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
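A usage sketch (the model name is illustrative):
from sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\nsentiment = model.classify([\"positive\", \"negative\", \"neutral\"],\n \"The service was fast and the food was delicious!\")\n# e.g. 'positive'\n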
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
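A usage sketch, with and without a schema (the model name is illustrative):
from sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\n# free JSON output\nd = model.json(\"Give the name and year of the first Moon landing mission.\")\n\n# constrained by a JSON schema\nschema = {\"type\": \"object\",\n \"properties\": {\"mission\": {\"type\": \"string\"}, \"year\": {\"type\": \"integer\"}},\n \"required\": [\"mission\", \"year\"]}\nd = model.json(\"Give the name and year of the first Moon landing mission.\",\n json_schema=schema)\n# e.g. {'mission': 'Apollo 11', 'year': 1969}\n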
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
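A usage sketch (the model name is illustrative):
from dataclasses import dataclass\nfrom sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\nmovie = model.dataclass(Movie, \"Name a famous 1980s science fiction movie.\")\n# e.g. Movie(title='Blade Runner', year=1982)\n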
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
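A usage sketch (the model name is illustrative):
from pydantic import BaseModel, Field\nfrom sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\nclass Book(BaseModel):\n title: str\n author: str = Field(description=\"Full author name\")\n\nbook = model.pydantic(Book, \"Name a classic 19th century novel.\")\n# e.g. Book(title='Moby-Dick', author='Herman Melville')\n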
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
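A usage sketch for plain text generation (the model name is illustrative); calling the model object directly, as in model(...), is equivalent:
from sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\ntext = model.call(\"Write a haiku about structured data.\",\n inst=\"You are a concise assistant.\")\nprint(text)\n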
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
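A usage sketch running two extractions concurrently with asyncio.gather (the model name is illustrative):
import asyncio\nfrom sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\nasync def main():\n capitals, year = await asyncio.gather(\n model.extract_async(list[str], \"List the capitals of Portugal and Spain.\"),\n model.extract_async(int, \"In which year did World War II end?\"))\n print(capitals, year)\n\nasyncio.run(main()) # or in Jupyter: await main()\n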
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output text is in GenOut.text.
Source code in sibila/anthropic.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n if \"tools\" in kwargs:\n response = self._client.beta.tools.messages.create(**kwargs) # type: ignore[attr-defined]\n else:\n response = self._client.messages.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
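A usage sketch (the model name is illustrative; the GenOut result field name res and the GenRes.OK_STOP value are assumptions of this sketch):
from sibila import AnthropicModel, Thread\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\nthread = Thread.make_IN(\"Say hello in French.\") # a single IN message\nout = model.gen(thread)\n\n# gen() doesn't raise GenError - inspect the returned GenOut instead\nprint(out.res) # generation result, e.g. GenRes.OK_STOP (assumed field name)\nprint(out.text) # the generated text\n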
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
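A usage sketch (the model name is illustrative):
from sibila import AnthropicModel, Thread\n\nmodel = AnthropicModel(\"claude-3-haiku-20240307\") # illustrative name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"}},\n \"required\": [\"city\"]}\n\nout = model.gen_json(Thread.make_IN(\"What is the capital of France?\"),\n schema)\nprint(out.dic) # e.g. {'city': 'Paris'}\n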
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens occupied.
Source code in sibila/anthropic.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens occupied.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n OVERHEAD_PER_MSG = 3\n num_tokens = 0\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n # print(num_tokens)\n return num_tokens\n
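In short, the estimate above is the character count times the model's token estimation factor, plus a small per-message overhead. A worked sketch of the same arithmetic (the factor value is illustrative, not Anthropic's actual default):
token_estimation_factor = 0.4 # illustrative value\nOVERHEAD_PER_MSG = 3\nmessages = [\"You are a helpful assistant.\", \"Who was the first man on the moon?\"]\nnum_tokens = sum(int(len(m) * token_estimation_factor + OVERHEAD_PER_MSG) for m in messages)\nprint(num_tokens) # conservative upper-bound estimate\n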
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, and token length is estimated.
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = True\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
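Hedged usage sketch: query the known model names before creating a model (a None result simply means the list couldn't be fetched):
from sibila import AnthropicModel\n\nnames = AnthropicModel.known_models()  # may be None if the list can't be fetched\nif names is not None:\n    for name in names:\n        print(name)\n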
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/anthropic.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"AnthropicModel: {self._model_name}\"\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel","title":"FireworksModel","text":"FireworksModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a Fireworks AI model with the OpenAI API. Supports constrained JSON output via the response_format JSON Schema mechanism.
Ref https://readme.fireworks.ai/docs/structured-response-formatting
https://readme.fireworks.ai/reference/createchatcompletion
Create a Fireworks AI remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An externally initialized tokenizer to use instead of the default one. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable FIREWORKS_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable FIREWORKS_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor applied to the total text length (in characters) to estimate the token length.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a Fireworks AI remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable FIREWORKS_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable FIREWORKS_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"FIREWORKS_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"FIREWORKS_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no Fireworks models currently support image input - always check model specs\n
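A minimal creation sketch (key handling follows the argument docs above; the exact model name is an assumption):
import os\nfrom sibila import FireworksModel\n\n# FIREWORKS_API_KEY is read from the environment when api_key is None\nos.environ.setdefault('FIREWORKS_API_KEY', '<your-api-key>')\n\nmodel = FireworksModel('accounts/fireworks/models/llama-v3-8b-instruct')  # name is an assumption\nprint(model.desc())\n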
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums: [1, 2, 3] or [\"a\",\"b\"], Literal['year', 'name'], or Enum, EnumInt, EnumStr, (Enum, int), ... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
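A short sketch of type-constrained extraction with the model created above (prompts and annotations are illustrative):
from typing import Annotated\n\n# primitive type, annotated with a description to guide the model\nyear = model.extract(Annotated[int, 'Year of the event'],\n                     'In which year did the French Revolution start?')\n\n# a list of a primitive type\nnames = model.extract(list[str], 'Generate 5 English first names')\n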
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
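Hedged sketch: classification is extraction constrained to one of the given labels:
label = model.classify(['positive', 'negative', 'neutral'],\n                       'I absolutely loved this product!')\nprint(label)  # one of the three given labels\n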
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
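A minimal sketch of free and schema-constrained JSON output (the schema below is illustrative):
# free JSON output: returns whatever dict the model produces\ninfo = model.json('Name and birth year of the first US president, as JSON.')\n\n# schema-constrained output\nschema = {'type': 'object',\n          'properties': {'name': {'type': 'string'},\n                         'birth_year': {'type': 'integer'}},\n          'required': ['name', 'birth_year']}\ninfo = model.json('Name and birth year of the first US president.',\n                  json_schema=schema)\n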
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
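A minimal dataclass sketch (field names are illustrative):
from dataclasses import dataclass\n\n@dataclass\nclass Play:\n    title: str\n    author: str\n\nplay = model.dataclass(Play, 'Name a famous play by Shakespeare.')\nprint(play.title, '-', play.author)\n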
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
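A minimal Pydantic sketch, mirroring the dataclass example above:
from pydantic import BaseModel, Field\n\nclass Book(BaseModel):\n    title: str\n    author: str = Field(description='Full author name')\n\nbook = model.pydantic(Book, 'Name a famous novel by George Orwell.')\nprint(book)\n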
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
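Usage note: calling the model object directly is the same as call():
text = model('Write a haiku about autumn.')  # same as model.call(...)\nprint(text)\n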
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums: [1, 2, 3] or [\"a\",\"b\"], Literal['year', 'name'], or Enum, EnumInt, EnumStr, (Enum, int), ... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
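A minimal async sketch (run with asyncio; in Jupyter, await the coroutine directly):
import asyncio\n\nasync def get_colors():\n    return await model.extract_async(list[str], 'Generate 3 color names')\n\ncolors = asyncio.run(get_colors())\n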
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
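The async variants can be scheduled concurrently; a sketch combining classify_async and json_async with asyncio.gather:
import asyncio\n\nasync def run_both():\n    return await asyncio.gather(\n        model.classify_async(['spam', 'not spam'],\n                             'You won a million dollars! Click here.'),\n        model.json_async('Name and birth year of the first US president, as JSON.'))\n\nlabel, info = asyncio.run(run_both())\n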
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
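Unlike call(), gen() takes a Thread and never raises on generation errors; it always returns a GenOut whose fields hold the result. A minimal sketch (the res field name is assumed from GenOut's documented contents):
from sibila import Thread\n\nout = model.gen(Thread.make_IN('Hello!'))\nprint(out.res)   # generation result (GenRes)\nprint(out.text)  # generated text\n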
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
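Like gen(), the gen_* variants return a GenOut instead of raising GenError; a sketch with the Book class from the pydantic example above:
from sibila import Thread\n\nout = model.gen_pydantic(Book, Thread.make_IN('Name a famous novel by George Orwell.'))\nprint(out.res)    # check for errors instead of catching GenError\nprint(out.value)  # the initialized Book instance (may be unset on error)\n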
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but the result must reflect the model's context token accounting. The Thread or text must be the final text that will be passed to the model.
If a json_schema is provided in genconf, the token length of its string representation is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
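A sketch of estimating the extra prompt tokens of a JSON Schema by passing it in genconf (assuming GenConf accepts a json_schema field, as referenced above):
from sibila import GenConf\n\nschema = {'type': 'object', 'properties': {'name': {'type': 'string'}}}\nn = model.token_len('Extract the name.', genconf=GenConf(json_schema=schema))\nprint(n)  # upper-bound estimate including schema overhead\n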
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
Returns a list of model names available from the provider. Note that some of these are not chat models and cannot be used for generation, for example embedding models.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel","title":"GroqModel","text":"GroqModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a Groq model with the OpenAI API. Supports constrained JSON output via the response_format JSON Schema mechanism.
Ref https://console.groq.com/docs/tool-use
https://github.com/groq/groq-api-cookbook/blob/main/parallel-tool-use/parallel-tool-use.ipynb
Create a Groq remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An external, already-initialized tokenizer to use instead of the default one created for the model. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable GROQ_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable GROQ_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a Groq remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable GROQ_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable GROQ_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"GROQ_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"GROQ_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no Groq models currently support image input - always check model specs\n
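A minimal usage sketch, assuming a valid GROQ_API_KEY in the environment (the model name below is illustrative; check Groq's documentation for currently available models):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\", # hypothetical model name\n ctx_len=8192)\nprint(model.desc()) # e.g. GroqModel: 'llama3-70b-8192'\n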
"},{"location":"api-reference/remote_model/#sibila.GroqModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
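For example, a minimal sketch extracting into a Pydantic class, reusing a model created as above (the printed output depends on the model):
from pydantic import BaseModel\n\nclass Movie(BaseModel):\n title: str\n year: int\n\nmovie = model.extract(Movie,\n \"Name the science fiction movie directed by Stanley Kubrick in 1968.\")\nprint(movie) # e.g. Movie(title='2001: A Space Odyssey', year=1968)\n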
"},{"location":"api-reference/remote_model/#sibila.GroqModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
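A short sketch, reusing the model created above (the label returned depends on the model):
sentiment = model.classify([\"positive\", \"negative\", \"neutral\"],\n \"The service was slow and the food was cold.\")\nprint(sentiment) # most likely 'negative'\n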
"},{"location":"api-reference/remote_model/#sibila.GroqModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
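A minimal sketch of both modes, free JSON and schema-constrained (the schema and outputs below are illustrative):
# free JSON output, no schema:\ninfo = model.json(\"Describe the first Moon landing as JSON.\")\n\n# constrained by a JSON schema:\nschema = {\"type\": \"object\",\n \"properties\": {\"name\": {\"type\": \"string\"},\n \"year\": {\"type\": \"integer\"}},\n \"required\": [\"name\", \"year\"]}\ninfo = model.json(\"Who was the first man on the Moon?\",\n json_schema=schema)\nprint(info) # e.g. {'name': 'Neil Armstrong', 'year': 1969}\n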
"},{"location":"api-reference/remote_model/#sibila.GroqModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
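A minimal sketch with a standard-library dataclass (output values depend on the model):
from dataclasses import dataclass\n\n@dataclass\nclass City:\n name: str\n country: str\n\ncity = model.dataclass(City,\n \"Extract information about the capital of France.\")\nprint(city) # e.g. City(name='Paris', country='France')\n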
"},{"location":"api-reference/remote_model/#sibila.GroqModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
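For example (the query and printed output are illustrative):
from pydantic import BaseModel, Field\n\nclass Book(BaseModel):\n title: str\n author: str = Field(description=\"Full author name\")\n\nbook = model.pydantic(Book,\n \"Suggest a classic 19th century novel.\")\nprint(book) # e.g. Book(title='Moby-Dick', author='Herman Melville')\n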
"},{"location":"api-reference/remote_model/#sibila.GroqModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
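A plain text generation sketch; the __call__ form documented below is equivalent:
text = model.call(\"Write a haiku about the sea.\",\n inst=\"You are a poet.\")\nprint(text)\n\ntext = model(\"Write a haiku about the sea.\") # same as call()\n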
"},{"location":"api-reference/remote_model/#sibila.GroqModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
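A sketch running two extractions concurrently with asyncio.gather (the results shown are illustrative):
import asyncio\n\nasync def main():\n return await asyncio.gather(\n model.extract_async(float,\n \"What is the boiling point of water in Celsius?\"),\n model.extract_async(list[int],\n \"List the first five prime numbers.\"))\n\nprint(asyncio.run(main())) # e.g. [100.0, [2, 3, 5, 7, 11]]\n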
"},{"location":"api-reference/remote_model/#sibila.GroqModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from the model's JSON response, following genconf.json_schema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
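For example (the fields and values below are illustrative):
import asyncio\n\nasync def main():\n return await model.json_async(\n \"Return a JSON object about the Eiffel Tower with fields name, city and height_m.\")\n\nprint(asyncio.run(main())) # e.g. {'name': 'Eiffel Tower', 'city': 'Paris', 'height_m': 330}\n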
"},{"location":"api-reference/remote_model/#sibila.GroqModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
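Unlike call(), gen() returns a GenOut instead of raising on errors. A minimal sketch, assuming Thread is importable from the top-level sibila package:
from sibila import Thread\n\nthread = Thread.make_IN(\"Tell me a joke.\") # a thread with a single IN message\nout = model.gen(thread) # returns GenOut instead of raising GenError\nprint(out.text) # the generated text is in GenOut.text\n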
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
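A minimal sketch (passing None as the schema generates free JSON; the dict contents depend on the model):
from sibila import Thread\n\nthread = Thread.make_IN(\"Answer in JSON: who won the 1998 FIFA World Cup?\")\nout = model.gen_json(thread, None) # None: free JSON, no schema\nprint(out.dic) # the output dict is in GenOut.dic\n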
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
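For example, with a hypothetical one-field dataclass:
from dataclasses import dataclass\nfrom sibila import Thread\n\n@dataclass\nclass Answer:\n days: int\n\nout = model.gen_dataclass(Answer,\n Thread.make_IN(\"How many days are in a leap year?\"))\nprint(out.value) # e.g. Answer(days=366), available in GenOut.value\n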
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. The Thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, its string's token_len is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n If a json_schema is provided in genconf, we use its string's token_len as an upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
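A usage sketch; when no exact tokenizer is available the result is a conservative upper bound:
from sibila import Thread\n\nthread = Thread.make_IN(\"Who was the first man on the Moon?\")\nn = model.token_len(thread)\nprint(n) # estimated number of prompt tokens\n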
"},{"location":"api-reference/remote_model/#sibila.GroqModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names that can be used. Some of the models, for example embedding models, are not chat models and cannot be used.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel","title":"MistralModel","text":"MistralModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n api_key=None,\n token_estimation_factor=None,\n mistral_init_kwargs={}\n)\n
Access a Mistral AI model. Supports constrained JSON output via the Mistral API function calling mechanism.
Ref https://docs.mistral.ai/guides/function-calling/
Create a Mistral AI remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
api_key
Optional[str]
Mistral API key. Defaults to None, which will use env variable MISTRAL_API_KEY.
None
token_estimation_factor
Optional[float]
Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
mistral_init_kwargs
dict
Extra args for mistral.MistralClient() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If Mistral API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/mistral.py
def __init__(self,\n             name: str,\n             *,\n\n             # common base model args\n             genconf: Optional[GenConf] = None,\n             schemaconf: Optional[JSchemaConf] = None,\n             ctx_len: Optional[int] = None,\n             max_tokens_limit: Optional[int] = None,\n\n             # most important Mistral-specific args\n             api_key: Optional[str] = None,\n             token_estimation_factor: Optional[float] = None,\n\n             # other Mistral-specific args\n             mistral_init_kwargs: dict = {},\n             ):\n    \"\"\"Create a Mistral AI remote model.\n\n    Args:\n        name: Model name to resolve into an existing model.\n        genconf: Model generation configuration. Defaults to None.\n        schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n        ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n        max_tokens_limit: Maximum output tokens limit. None for model's default.\n        api_key: Mistral API key. Defaults to None, which will use env variable MISTRAL_API_KEY.\n        token_estimation_factor: Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n        mistral_init_kwargs: Extra args for mistral.MistralClient() initialization. Defaults to {}.\n\n    Raises:\n        ImportError: If Mistral API is not installed.\n        NameError: If model name was not found or there's an API or authentication problem.\n    \"\"\"\n\n\n    if not has_mistral:\n        raise ImportError(\"Please install mistral by running: pip install mistralai\")\n\n    self._client = self._client_async = None\n\n\n    # also accept \"provider:name\" for ease of use\n    provider_name = self.PROVIDER_NAME + \":\"\n    if name.startswith(provider_name):\n        name = name[len(provider_name):]\n\n    super().__init__(False,\n                     genconf,\n                     schemaconf,\n                     None\n                     )\n\n    if (ctx_len is not None and\n        max_tokens_limit is not None and\n        token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n        self._model_name = name\n        default_ctx_len = ctx_len\n        default_max_tokens_limit = max_tokens_limit\n        default_token_estimation_factor = token_estimation_factor\n\n    else: # need to resolve\n        settings = self.resolve_settings(self.PROVIDER_NAME,\n                                         name,\n                                         [\"name\", \n                                          \"ctx_len\", \n                                          \"max_tokens_limit\", \n                                          \"token_estimation_factor\"])\n        self._model_name = settings.get(\"name\") or name\n        default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n        default_max_tokens_limit = settings.get(\"max_tokens_limit\") or default_ctx_len\n        default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n        # all defaults are conservative values\n        if default_ctx_len is None:\n            default_ctx_len = 32768\n            logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming {default_ctx_len}\")\n        if default_max_tokens_limit is None:\n            default_max_tokens_limit = default_ctx_len\n            logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n        if default_token_estimation_factor is None:\n            default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n            logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n    self.ctx_len = ctx_len or default_ctx_len\n\n    self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n\n    self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n    self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n    self.maybe_image_input = False # no Mistral models currently support image input - always check model specs\n\n    # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n    self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n    self._client_init_kwargs = mistral_init_kwargs\n\n    if api_key is not None:\n        self._client_init_kwargs[\"api_key\"] = api_key        \n    elif \"api_key\" not in self._client_init_kwargs and \"MISTRAL_API_KEY\" in os.environ:\n        # \"MISTRAL_API_KEY\" env key is ignored in pytest?\n        self._client_init_kwargs[\"api_key\"] = os.environ[\"MISTRAL_API_KEY\"]\n
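For example, to create a model directly from this class (a minimal sketch; the model name is an assumption, use any chat model enabled for your Mistral account, with MISTRAL_API_KEY set in the environment):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\")\nprint(model.desc())\n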
"},{"location":"api-reference/remote_model/#sibila.MistralModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
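A usage sketch with one of the supported target types (assuming the model created above):
from datetime import date\n\n# extract a date value from the model's answer\nd = model.extract(date, \"When was the first Moon landing?\")\nprint(d) # a datetime.date instance\n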
"},{"location":"api-reference/remote_model/#sibila.MistralModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
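For example, a quick sentiment classification (a minimal sketch; labels are given as a list of same-type items):
label = model.classify([\"positive\", \"negative\", \"neutral\"],\n                       \"The service was fast and the food was excellent!\")\nprint(label)\n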
"},{"location":"api-reference/remote_model/#sibila.MistralModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
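A usage sketch with an explicit JSON schema dict (hand-written here for illustration; schemas can also come from dataclass or Pydantic definitions via the methods below):
schema = {\"type\": \"object\",\n          \"properties\": {\"title\": {\"type\": \"string\"},\n                         \"year\": {\"type\": \"integer\"}},\n          \"required\": [\"title\", \"year\"]}\n\ninfo = model.json(\"Suggest a classic science fiction movie.\",\n                  json_schema=schema)\nprint(info[\"title\"], info[\"year\"])\n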
"},{"location":"api-reference/remote_model/#sibila.MistralModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
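For example, with a plain Python dataclass (a minimal sketch):
from dataclasses import dataclass\n\n@dataclass\nclass Book:\n    title: str\n    author: str\n    year: int\n\nbook = model.dataclass(Book, \"Suggest a famous 19th century novel.\")\nprint(book.title, book.author, book.year)\n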
"},{"location":"api-reference/remote_model/#sibila.MistralModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
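A usage sketch with a Pydantic BaseModel-derived class:
from pydantic import BaseModel\n\nclass City(BaseModel):\n    name: str\n    country: str\n    population: int\n\ncity = model.pydantic(City, \"Tell me about the largest city in Japan.\")\nprint(city.name, city.population)\n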
"},{"location":"api-reference/remote_model/#sibila.MistralModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
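For plain text generation (a minimal sketch; __call__ below forwards to this method, so calling the model object directly is equivalent):
text = model.call(\"Write a haiku about autumn.\")\nprint(text)\n\n# equivalent, via __call__:\ntext = model(\"Write a haiku about autumn.\")\n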
"},{"location":"api-reference/remote_model/#sibila.MistralModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
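A sketch of running two free-JSON generations concurrently (no schema is passed, so the prompts themselves request JSON; asyncio.gather collects both results):
import asyncio\n\nasync def main():\n    return await asyncio.gather(\n        model.json_async(\"Output a JSON object describing a chemical element, with fields name and symbol.\"),\n        model.json_async(\"Output a JSON object describing a planet, with fields name and position.\"))\n\nelement, planet = asyncio.run(main()) # or in Jupyter: await main()\n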
"},{"location":"api-reference/remote_model/#sibila.MistralModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/mistral.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n response = self._client.chat(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
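Unlike call(), gen() returns a GenOut object instead of raising GenError, so the result can be inspected before use. A minimal sketch (Thread is assumed importable from sibila):
from sibila import Thread\n\nout = model.gen(Thread.make_IN(\"Summarize the plot of Hamlet in one sentence.\"))\nprint(out.text)\n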
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n             thread: Thread,\n             json_schema: Union[dict,str,None],\n             genconf: Optional[GenConf] = None,\n\n             massage_schema: bool = True,\n             schemaconf: Optional[JSchemaConf] = None,\n             ) -> GenOut:\n    \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n    Doesn't raise an exception if an error occurs, always returns GenOut.\n\n    Args:\n        thread: The Thread to use as model input.\n        json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n        genconf: Model generation configuration. Defaults to None, which uses model's default.\n        massage_schema: Simplify schema. Defaults to True.\n        schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n    Raises:\n        RuntimeError: If unable to generate.\n\n    Returns:\n        A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n    \"\"\"\n\n    args = self._gen_json_pre(thread,\n                              json_schema,\n                              genconf,\n                              massage_schema,\n                              schemaconf)\n    return self.gen(*args)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens occupied.
Source code in sibila/mistral.py
def token_len(self,\n              thread_or_text: Union[Thread,str],\n              genconf: Optional[GenConf] = None) -> int:\n    \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n    In some cases where it's not possible to calculate the exact token count, \n    this function should give a conservative (upper bound) estimate.\n    It's up to the implementation whether to account for side information like JSON Schema,\n    but it must reflect the model's context token accounting.\n    Thread or text must be the final text which will be passed to the model.\n\n    Args:\n        thread_or_text: For token length calculation.\n        genconf: Model generation configuration. Defaults to None.\n\n    Returns:\n        Estimated number of tokens occupied.\n    \"\"\"\n\n    if isinstance(thread_or_text, Thread):\n        thread = thread_or_text            \n    else:\n        thread = Thread.make_IN(thread_or_text)\n\n    OVERHEAD_PER_MSG = 3\n    num_tokens = 0\n    for msg in thread.get_iter(True): # True for system message\n        message = msg.as_chatml()\n        msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n        # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n        num_tokens += int(msg_tokens)\n\n    if genconf is not None and genconf.json_schema is not None:\n        if isinstance(genconf.json_schema, str):\n            js_str = genconf.json_schema\n        else:\n            js_str = json.dumps(genconf.json_schema)\n\n        tools_num_tokens = len(js_str) * self._token_estimation_factor\n        num_tokens += int(tools_num_tokens)\n        # print(\"tools_num_tokens\", tools_num_tokens)\n\n    # print(num_tokens)\n    return num_tokens\n
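Since MistralModel has no local tokenizer, this is a character-count estimate scaled by token_estimation_factor. A quick sketch:
n = model.token_len(\"Estimate how many tokens this text will occupy.\")\nprint(n) # conservative upper-bound estimate, not an exact count\n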
"},{"location":"api-reference/remote_model/#sibila.MistralModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, in which case token length is estimated.
"},{"location":"api-reference/remote_model/#sibila.MistralModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only supports a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/mistral.py
@classmethod\ndef known_models(cls,\n                 api_key: Optional[str] = None) -> Union[list[str], None]:\n    \"\"\"If the provider only supports a fixed set of models, return their names. Otherwise, return None.\n\n    Args:\n        api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n    Returns:\n        Returns a list of known models or None if unable to fetch it.\n    \"\"\"\n\n    args = {}\n    if api_key is not None:\n        args[\"api_key\"] = api_key\n    model = MistralClient(**args) # type: ignore[arg-type]\n\n    model_list = model.list_models()\n    del model\n\n    out = []\n    for mod in model_list.data:\n        out.append(mod.id)\n\n    return sorted(out)\n
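A usage sketch (requires the mistralai package and a valid API key, here read from the MISTRAL_API_KEY environment variable):
from sibila import MistralModel\n\nfor name in MistralModel.known_models() or []:\n    print(name)\n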
"},{"location":"api-reference/remote_model/#sibila.MistralModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/mistral.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"MistralModel: {self._model_name}\"\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel","title":"TogetherModel","text":"TogetherModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a together.ai model via the OpenAI-compatible API. Supports constrained JSON output via the response_format JSON Schema mechanism.
Ref https://docs.together.ai/docs/json-mode
https://docs.together.ai/reference/chat-completions
Create a together.ai remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An externally initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable TOGETHER_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable TOGETHER_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a together.ai remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An externally initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable TOGETHER_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable TOGETHER_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"TOGETHER_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"TOGETHER_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no together.ai models currently support image input - always check model specs\n
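A minimal creation sketch; the model name below is illustrative, so check together.ai's current catalog, and TOGETHER_API_KEY is assumed to be set in the environment:
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n                      ctx_len=4096)  # optionally cap the context length\nprint(model.desc())\n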
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or other dataclasses).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
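A sketch of typed extraction with targets other than a Pydantic class; 'model' is assumed to have been created as in the example above:
from typing import Annotated\n\nyear = model.extract(int, \"In which year was the transistor invented?\")\n\ncaps = model.extract(Annotated[list[str], \"European capital cities\"],\n                     \"List three European capital cities\")\n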
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
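For instance, a hypothetical sentiment classification, assuming 'model' was created earlier:
label = model.classify([\"positive\", \"negative\", \"neutral\"],\n                       \"The service was slow and the food was cold.\")\nprint(label)  # likely 'negative'\n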
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
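A sketch of schema-constrained JSON output; the schema keys below are illustrative, and 'model' is assumed created earlier:
schema = {\n    \"type\": \"object\",\n    \"properties\": {\n        \"city\": {\"type\": \"string\"},\n        \"population\": {\"type\": \"integer\"}\n    },\n    \"required\": [\"city\", \"population\"]\n}\n\ninfo = model.json(\"What is the most populous city in Japan?\",\n                  json_schema=schema)\nprint(info[\"city\"], info[\"population\"])\n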
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
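A dataclass sketch; the class and fields are illustrative, and 'model' is assumed created earlier:
from dataclasses import dataclass\n\n@dataclass\nclass Book:\n    title: str\n    author: str\n    year: int\n\nbook = model.dataclass(Book, \"Name a famous 19th century novel.\")\nprint(book.title, book.author, book.year)\n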
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
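A Pydantic sketch; the class and fields are illustrative, and 'model' is assumed created earlier:
from pydantic import BaseModel, Field\n\nclass Weather(BaseModel):\n    city: str\n    temperature_c: float = Field(description=\"Temperature in Celsius\")\n\nw = model.pydantic(Weather, \"Invent a plausible weather report for Lisbon.\")\nprint(w.city, w.temperature_c)\n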
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
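A plain-text generation sketch, assuming 'model' was created earlier; calling the model object directly is equivalent, as documented below:
text = model.call(\"Write a haiku about threads.\",\n                  inst=\"You are a terse poet.\")  # inst sets system instructions\nprint(text)\n\ntext = model(\"Write a haiku about threads.\")  # __call__ is the same as call()\n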
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or other dataclasses).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
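The async variants allow issuing requests concurrently; a sketch using asyncio.gather with free (schema-less) JSON output, assuming 'model' was created earlier:
import asyncio\n\nasync def main():\n    results = await asyncio.gather(\n        model.json_async(\"Return a JSON object with a 'fact' field about Mars.\"),\n        model.json_async(\"Return a JSON object with a 'fact' field about Venus.\"))\n    for dic in results:\n        print(dic)\n\nasyncio.run(main())\n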
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs; it always returns a GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
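Unlike the higher-level methods, gen() reports failures in the returned GenOut instead of raising; a sketch, assuming 'model' was created earlier:
from sibila import Thread\n\nth = Thread(\"Summarize the plot of Hamlet in one sentence.\")\nout = model.gen(th)  # low-level: takes a Thread, returns a GenOut\nprint(out.text)      # the generated text is in GenOut.text\n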
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs; it always returns a GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs; always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs; it always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs; it always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, the token length of its string form is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n If a json_schema is provided in genconf, we use the token length of its string form as an upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
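A typical use is checking that the input fits the model's context before generating; a sketch, assuming 'model' was created earlier:
from sibila import Thread\n\nthread = Thread(\"A long user question...\")\nn = model.token_len(thread)\nif n >= model.ctx_len:\n    raise ValueError(f\"Input too long: {n} tokens vs ctx_len of {model.ctx_len}\")\n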
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names that can be used. Note that some of them, for example embedding models, are not chat models and cannot be used here.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Note that some of them, for example embedding models,\n are not chat models and cannot be used here.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/thread/","title":"Threads and messages","text":""},{"location":"api-reference/thread/#sibila.Thread","title":"Thread","text":"Thread(t=None, inst='', join_sep='\\n')\n
A sequence of messages alternating between IN (\"user\" role) and OUT (\"assistant\" role).
Stores special initial INST information (known as the \"system\" role in ChatML) providing instructions to the model. Some models don't use system instructions - in those cases the instructions are prepended to the first IN message.
Messages are kept in a strict IN,OUT,IN,OUT,... order. To enforce this, if two IN messages are added, the second just appends to the text of the first or to its image list.
Parameters:
Name Type Description Default t
Optional[Union[Self, list, Msg, dict, tuple, str]]
Optionally initialize from a Thread, list[Msg], list[ChatML format dict], list[tuple], list[str], Msg, ChatML format dict, tuple or str.
None
inst
str
Instructions text. If inst arg is not set and t is a Thread, its inst will be used.
''
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Raises:
Type Description TypeError
On invalid args passed.
Source code in sibila/thread.py
def __init__(self,\n t: Optional[Union[Self,list,Msg,dict,tuple,str]] = None,\n inst: str = \"\",\n join_sep: str = \"\\n\"):\n \"\"\"\n Args:\n t: Optionally initialize from a Thread, list[Msg], list[ChatML format dict], list[tuple], list[str], Msg, ChatML format dict, tuple or str.\n inst: Instructions text. If inst arg is not set and t is a Thread, its inst will be used.\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n\n Raises:\n TypeError: On invalid args passed.\n \"\"\"\n\n self._msgs = []\n self.inst = Msg.make_INST(inst)\n self.join_sep = join_sep\n\n if t is not None:\n self.concat(t)\n
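A construction sketch showing the alternating message order:
from sibila import Thread\n\nth = Thread(inst=\"Be helpful and brief.\")   # INST ('system') instructions\nth.add_IN(\"What's the capital of France?\")  # IN ('user') message\nth.add_OUT(\"Paris.\")                        # OUT ('assistant') message\nth.add_IN(\"And of Italy?\")                  # strict IN,OUT,IN,... order is kept\n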
"},{"location":"api-reference/thread/#sibila.Thread.inst","title":"inst instance-attribute
","text":"inst = make_INST(inst)\n
System instructions in a Msg of kind INST; defaults to empty text.
"},{"location":"api-reference/thread/#sibila.Thread.init_INST_IN","title":"init_INST_IN","text":"init_INST_IN(inst_text, in_text, in_images=None)\n
Initialize Thread with instructions and an IN message.
Parameters:
Name Type Description Default inst_text
str
Instructions text.
required in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def init_INST_IN(self,\n inst_text: str,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Initialize Thread with instructions and an IN message.\n\n Args:\n inst_text: Instructions text.\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.clear()\n self.inst.text = inst_text\n self.add_IN(in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_IN","title":"add_IN","text":"add_IN(in_text, in_images=None)\n
Appends an IN message to Thread.
Parameters:
Name Type Description Default in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add_IN(self,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an IN message to Thread.\n\n Args:\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.add(Msg.Kind.IN, in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_OUT","title":"add_OUT","text":"add_OUT(out_text, out_images=None)\n
Appends an OUT message to Thread.
Parameters:
Name Type Description Default out_text
str
Text for OUT message.
required out_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add_OUT(self,\n out_text: str,\n out_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an OUT message to Thread.\n\n Args:\n out_text: Text for OUT message.\n out_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.add(Msg.Kind.OUT, out_text, out_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_OUT_IN","title":"add_OUT_IN","text":"add_OUT_IN(\n out_text, in_text, *, out_images=None, in_images=None\n)\n
Appends an OUT message followed by an IN message.
Parameters:
Name Type Description Default out_text
str
Text for OUT message.
required in_text
str
Text for IN message.
required out_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
in_images
Optional[Union[list, str, dict]]
Optional list of IN message images.
None
Source code in sibila/thread.py
def add_OUT_IN(self,\n out_text: str,\n in_text: str,\n *,\n out_images: Optional[Union[list,str,dict]] = None,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an OUT message followed by an IN message.\n\n Args:\n out_text: Text for OUT message.\n in_text: Text for IN message.\n out_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n in_images: Optional list of IN message images.\n \"\"\" \n self.add(Msg.Kind.OUT, out_text, out_images)\n self.add(Msg.Kind.IN, in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.make_INST_IN","title":"make_INST_IN staticmethod
","text":"make_INST_IN(inst_text, in_text, in_images=None)\n
Return an initialized Thread with instructions and an IN message.
Parameters:
Name Type Description Default inst_text
str
Instructions text.
required in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
@staticmethod\ndef make_INST_IN(inst_text: str,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None) -> 'Thread':\n \"\"\"Return an initialized Thread with instructions and an IN message.\n\n Args:\n inst_text: Instructions text.\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n\n thread = Thread(inst=inst_text)\n thread.add_IN(in_text, in_images)\n return thread\n
"},{"location":"api-reference/thread/#sibila.Thread.make_IN","title":"make_IN staticmethod
","text":"make_IN(in_text, in_images=None)\n
Return an initialized Thread with an IN message.
Parameters:
Name Type Description Default in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
A list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys. If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
@staticmethod\ndef make_IN(in_text: str,\n in_images: Optional[Union[list,str,dict]] = None) -> 'Thread':\n \"\"\"Return an initialized Thread with an IN message.\n\n Args:\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n\n thread = Thread()\n thread.add_IN(in_text, in_images)\n return thread\n
"},{"location":"api-reference/thread/#sibila.Thread.clone","title":"clone","text":"clone()\n
Return a copy of current Thread.
Returns:
Type Description Self
A copy of this Thread.
Source code in sibila/thread.py
def clone(self) -> Self:\n \"\"\"Return a copy of current Thread.\n\n Returns:\n A copy of this Thread.\n \"\"\"\n return Thread(self)\n
"},{"location":"api-reference/thread/#sibila.Thread.clear","title":"clear","text":"clear(clear_inst=True)\n
Delete all messages and clear inst.
Source code in sibila/thread.py
def clear(self,\n clear_inst: bool = True):\n \"\"\"Delete all messages and clear inst.\"\"\"\n self._msgs = []\n if clear_inst:\n self.inst.text = \"\"\n
"},{"location":"api-reference/thread/#sibila.Thread.load","title":"load","text":"load(path, clear)\n
Load this Thread from a JSON file.
Parameters:
Name Type Description Default path
str
Path of file to load.
required clear
bool
If True, existing messages are cleared before loading.
required Source code in sibila/thread.py
def load(self,\n path: str,\n clear: bool):\n \"\"\"Load this Thread from a JSON file.\n\n Args:\n path: Path of file to load.\n \"\"\"\n\n with open(path, 'r', encoding='utf-8') as f:\n js = f.read()\n state = json.loads(js)\n\n if clear:\n self.clear()\n\n th = self.from_dict(state)\n self.concat(th)\n
"},{"location":"api-reference/thread/#sibila.Thread.save","title":"save","text":"save(path)\n
Serialize this Thread to JSON.
Parameters:
Name Type Description Default path
str
Path of file to save into.
required Source code in sibila/thread.py
def save(self,\n path: str):\n \"\"\"Serialize this Thread to JSON.\n\n Args:\n path: Path of file to save into.\n \"\"\"\n\n state = self.as_dict()\n\n json_str = json.dumps(state, indent=2, default=vars)\n\n with open(path, 'w', encoding='utf-8') as f:\n f.write(json_str)\n
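A round-trip sketch using both methods (file name is illustrative):
th.save(\"thread.json\") # serialize to a JSON file\n\nth2 = Thread()\nth2.load(\"thread.json\", clear=True) # load into a fresh Thread\n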
"},{"location":"api-reference/thread/#sibila.Thread.from_dict","title":"from_dict staticmethod
","text":"from_dict(state)\n
Deserialize a Thread from a dict.
Source code in sibila/thread.py
@staticmethod\ndef from_dict(state: dict) -> 'Thread':\n \"\"\"Deserialize a Thread from a dict.\"\"\"\n\n th = Thread()\n for dic in state[\"_msgs\"]:\n th.add(Msg.from_dict(dic))\n th.inst = Msg.from_dict(state[\"inst\"])\n th.join_sep = state[\"join_sep\"]\n\n return th\n
"},{"location":"api-reference/thread/#sibila.Thread.as_dict","title":"as_dict","text":"as_dict()\n
Serialize this Thread to a dict.
Source code in sibila/thread.py
def as_dict(self) -> dict:\n \"\"\"Serialize this Thread to a dict.\"\"\"\n\n state = {\"_msgs\": [],\n \"inst\": self.inst.as_dict(),\n \"join_sep\": self.join_sep}\n\n for msg in self._msgs:\n state[\"_msgs\"].append(msg.as_dict()) # type: ignore[attr-defined]\n\n return state\n
"},{"location":"api-reference/thread/#sibila.Thread.as_chatml","title":"as_chatml","text":"as_chatml(include_INST=True)\n
Returns Thread as a list of ChatML messages.
Returns:
Type Description list[dict]
A list of ChatML dict elements with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self,\n include_INST: bool = True) -> list[dict]:\n \"\"\"Returns Thread as a list of ChatML messages.\n\n Returns:\n A list of ChatML dict elements with \"role\" and \"content\" keys.\n \"\"\"\n msgs = []\n\n if self.inst.text and include_INST:\n msgs.append(self.inst.as_chatml())\n\n for msg in self._msgs:\n msgs.append(msg.as_chatml())\n\n return msgs\n
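For example, a Thread with instructions and one IN message converts as follows (output shown as comments):
th = Thread.make_INST_IN(\"Be terse.\", \"Hello!\")\nth.as_chatml()\n# [{'role': 'system', 'content': 'Be terse.'},\n#  {'role': 'user', 'content': 'Hello!'}]\n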
"},{"location":"api-reference/thread/#sibila.Thread.Trim","title":"Trim","text":"Flags for Thread trimming.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.NONE","title":"NONE class-attribute
instance-attribute
","text":"NONE = 0\n
No trimming.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 1\n
Can remove INST message.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 2\n
Can remove IN messages.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 4\n
Can remove OUT messages.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.KEEP_FIRST_IN","title":"KEEP_FIRST_IN class-attribute
instance-attribute
","text":"KEEP_FIRST_IN = 1024\n
If trimming IN messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.KEEP_FIRST_OUT","title":"KEEP_FIRST_OUT class-attribute
instance-attribute
","text":"KEEP_FIRST_OUT = 2048\n
If trimming OUT messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Thread.trim","title":"trim","text":"trim(trim_flags, max_token_len, thread_token_len_fn)\n
Trim context by selectively removing older messages until thread fits max_token_len.
Parameters:
Name Type Description Default trim_flags
Trim
Flags to guide selection of which messages to remove.
required max_token_len
int
Cut messages until the thread size is lower than this number.
required thread_token_len_fn
Callable
A function that returns token count for a passed Thread.
required Example of a thread_token_len_fn that counts 1 char = 1 token: def thread_token_len_fn(thread: Thread) -> int:\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n
Returns:
Type Description int
Trimming result: 1 = trimmed messages to fit max_token_len, 0 = no trimming was needed, -1 = unable to trim to max_token_len.
Source code in sibila/thread.py
def trim(self,\n trim_flags: Trim,\n max_token_len: int,\n thread_token_len_fn: Callable\n ) -> int:\n \"\"\"Trim context by selectively removing older messages until thread fits max_token_len.\n\n Args:\n trim_flags: Flags to guide selection of which messages to remove.\n max_token_len: Cut messages until size is lower than this number.\n thread_token_len_fn: A function that returns token count for a passed Thread.\n\n Example of a thread_token_len_fn that counts 1 char = 1 token:\n def thread_token_len_fn(thread: Thread) -> int:\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n\n Returns:\n Trimming result: 1 = trimmed messages to fit max_token_len, 0 = no trimming was needed, -1 = unable to trim to max_token_len.\n \"\"\"\n\n if trim_flags == Thread.Trim.NONE: # no trimming\n return 0\n\n thread = self.clone()\n\n any_trim = False\n\n while True:\n\n curr_len = thread_token_len_fn(thread)\n\n if curr_len <= max_token_len:\n break\n\n logger.debug(f\"len={curr_len} / max={max_token_len}\")\n\n if thread.inst.text and trim_flags & Thread.Trim.INST:\n logger.debug(f\"Cutting INST {thread.inst.text[:40]}\")\n thread.inst.text = \"\"\n any_trim = True\n continue\n\n # cut first possible message, starting from oldest first ones\n trimmed = False\n in_index = out_index = 0\n\n for index,msg in enumerate(thread):\n\n if msg.kind == Msg.Kind.IN:\n if trim_flags & Thread.Trim.IN:\n if not (trim_flags & Thread.Trim.KEEP_FIRST_IN and in_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting IN {msg.text[:40]}\")\n break\n in_index += 1\n\n elif msg.kind == Msg.Kind.OUT:\n if trim_flags & Thread.Trim.OUT: \n if not (trim_flags & Thread.Trim.KEEP_FIRST_OUT and out_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting OUT {msg.text[:40]}\")\n break\n out_index += 1\n\n if not trimmed:\n # all thread messages were cycled but not a single one could be cut, so size remains the same\n # arriving here we did all we could for trim_flags but could not remove any more\n return -1\n else:\n any_trim = True\n\n # while end\n\n\n if any_trim:\n self._msgs = thread._msgs\n self.inst = thread.inst\n\n return int(any_trim)\n
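A usage sketch, combining flags to trim older messages while keeping the first IN message (the token-counting function is the 1 char = 1 token example above; the limit value is illustrative):
def thread_token_len_fn(thread: Thread) -> int:\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n\nflags = Thread.Trim.IN | Thread.Trim.OUT | Thread.Trim.KEEP_FIRST_IN\nres = th.trim(flags, max_token_len=2048, thread_token_len_fn=thread_token_len_fn)\n# res: 1 = trimmed, 0 = no trimming needed, -1 = unable to trim\n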
"},{"location":"api-reference/thread/#sibila.Thread.add","title":"add","text":"add(t, text=None, images=None)\n
Add a message to Thread.
Accepts either of these argument combinations: t = Msg, ChatML format dict, tuple or str; or t = kind, text[, images].
Parameters:
Name Type Description Default t
Union[Msg, dict, tuple, str, Kind]
One of Msg, ChatML format dict, tuple or str, or Msg.Kind.
required text
Optional[str]
Message text, only if t=Msg.Kind.
None
images
Optional[Union[list, str, dict]]
Only used if t is a Msg.Kind or str: a list (or a single element) of either a str (a file path that will be loaded and converted to a data: URL) or a dict with a "url" key and optional extra keys like "detail". If the url value is not a valid URL, it is treated as a file path, loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add(self, \n t: Union[Msg,dict,tuple,str,Msg.Kind],\n text: Optional[str] = None,\n images: Optional[Union[list,str,dict]] = None):\n\n \"\"\"Add a message to Thread.\n\n Accepts any of these argument combinations:\n t=Msg, ChatML format dict, tuple or str\n --or--\n t=kind, text[, images]\n\n Args:\n t: One of Msg, ChatML format dict, tuple or str, or Msg.Kind.\n text: Message text, only if t=Msg.Kind.\n images: only if t=Msg.Kind or t=str-> an array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with keys \"url\" and any other keys like \"detail\". If url arg is not a valid URL, it will be loaded and converted to a data URL.\n \"\"\"\n\n if text is not None:\n if not isinstance(t, Msg.Kind):\n raise TypeError(\"When arg 'text' is given, first arg must be of type Msg.Kind\")\n\n msg = Msg(t, text, images)\n\n else: # add from t arg\n if isinstance(t, dict): # ChatML formatted dict\n msg = Msg.from_chatml(t)\n\n\n elif isinstance(t, tuple):\n msg = Msg(self.next_kind,\n *t)\n\n elif isinstance(t, str): # simple text\n msg = Msg(self.next_kind,\n t,\n images)\n\n elif isinstance(t, Msg):\n msg = t.clone()\n\n else:\n raise TypeError(\"Arg 't' must be one of: Msg, ChatML format dict, tuple or str\")\n\n\n # now append to list\n if msg.kind == Msg.Kind.INST:\n self.inst.join_same_kind(msg, self.join_sep)\n\n else:\n if not len(self._msgs) or msg.kind == self.next_kind: # next different kind or empty\n self._msgs.append(msg)\n else: # new msg is of same kind as last existing message: join/append to it\n last = self._msgs[-1]\n last.join_same_kind(msg, self.join_sep)\n
"},{"location":"api-reference/thread/#sibila.Thread.concat","title":"concat","text":"concat(t)\n
Concatenate to current Thread: another Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.
If the last message in self is of the same kind as the first message in t, their text, images, etc. will be joined.
Parameters:
Name Type Description Default t
Union[Self, list, Msg, dict, tuple, str]
A Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.
required Source code in sibila/thread.py
def concat(self,\n t: Union[Self,list,Msg,dict,tuple,str]):\n \"\"\"Concatenate to current Thread: another Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.\n\n if last message in self is the same kind of first in t, their text, images, etc will be joined.\n\n Args:\n t: A Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.\n \"\"\"\n if isinstance(t, Thread):\n for msg in t:\n self.add(msg)\n\n self.inst.join_same_kind(t.inst, self.join_sep)\n\n else:\n if not isinstance(t, list):\n t = [t]\n for msg in t:\n self.add(msg)\n
"},{"location":"api-reference/thread/#sibila.Thread.get_iter","title":"get_iter","text":"get_iter(include_set_inst)\n
Return an iterator that can be used to cycle over messages. include_set_inst: If the inst message is set, include it before all other messages.
Source code in sibila/thread.py
def get_iter(self,\n include_set_inst: bool):\n \"\"\"Return an iterator that can be used to cycle over messages.\n include_set_inst: If inst message is set, include it before all others.\n \"\"\"\n class MsgIter:\n def __init__(self, \n thread: Thread,\n include_inst: bool):\n self.thread = thread\n self.curr = -1 - int(include_inst)\n\n def __iter__(self):\n return self\n\n def __next__(self):\n self.curr += 1\n if self.curr == -1:\n return self.thread.inst\n elif self.curr < len(self.thread):\n return self.thread[self.curr]\n else:\n raise StopIteration\n\n return MsgIter(self,\n include_set_inst and bool(self.inst.text))\n
"},{"location":"api-reference/thread/#sibila.Thread.has_images","title":"has_images property
","text":"has_images\n
"},{"location":"api-reference/thread/#sibila.Thread.next_kind","title":"next_kind property
","text":"next_kind\n
Get the kind of the next message that can be added to the thread.
Returns:
Type Description Kind
The flipped kind of the last message, or Msg.Kind.IN if the thread is empty.
"},{"location":"api-reference/thread/#sibila.Thread.has_text_lower","title":"has_text_lower","text":"has_text_lower(text_lower)\n
Can the lowercase text be found in one of the messages?
Parameters:
Name Type Description Default text_lower
str
The lowercase text to search for in messages.
required Returns:
Type Description bool
True if such text was found.
Source code in sibila/thread.py
def has_text_lower(self,\n text_lower: str) -> bool:\n \"\"\"Can the lowercase text be found in one of the messages?\n\n Args:\n text_lower: The lowercase text to search for in messages.\n\n Returns:\n True if such text was found.\n \"\"\"\n for msg in self._msgs:\n if text_lower in msg.text.lower():\n return True\n\n return False \n
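For example:
th = Thread.make_IN(\"The total cost was 45 EUR.\")\nth.has_text_lower(\"total cost\") # True\nth.has_text_lower(\"Total\") # False: the search text must be lowercase\n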
"},{"location":"api-reference/thread/#sibila.Thread.join_sep","title":"join_sep instance-attribute
","text":"join_sep = join_sep\n
Separator used when message text needs to be joined. Defaults to '\n'.
"},{"location":"api-reference/thread/#sibila.Thread.__add__","title":"__add__","text":"__add__(other)\n
Source code in sibila/thread.py
def __add__(self,\n other: Union[Self,list, Msg, dict, str]) -> Self:\n out = self.clone()\n out.concat(other)\n return out\n
"},{"location":"api-reference/thread/#sibila.Thread.__iter__","title":"__iter__","text":"__iter__()\n
Source code in sibila/thread.py
def __iter__(self):\n # Default iterator doesn't include inst message.\n return self.get_iter(False)\n
"},{"location":"api-reference/thread/#sibila.Msg","title":"Msg dataclass
","text":""},{"location":"api-reference/thread/#sibila.Msg.Kind","title":"Kind","text":"Enumeration for kinds of messages in a Thread.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 'IN'\n
Input message, from user.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 'OUT'\n
Model output message.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 'INST'\n
Initial model instructions.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.as_chatml_role","title":"as_chatml_role","text":"as_chatml_role()\n
Source code in sibila/thread.py
def as_chatml_role(self: Self) -> str:\n CHATML_FROM_KIND: dict = {Msg.Kind.IN: \"user\", Msg.Kind.OUT: \"assistant\", Msg.Kind.INST: \"system\"}\n return CHATML_FROM_KIND.get(self) # type: ignore[return-value]\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.from_chatml_role","title":"from_chatml_role staticmethod
","text":"from_chatml_role(role)\n
Source code in sibila/thread.py
@staticmethod\ndef from_chatml_role(role: str) -> 'Msg.Kind':\n KIND_FROM_CHATML: dict = {\"user\": Msg.Kind.IN, \"assistant\": Msg.Kind.OUT, \"system\": Msg.Kind.INST}\n kind = KIND_FROM_CHATML.get(role)\n if kind is None:\n raise ValueError(f\"Unknown ChatML role '{role}'\")\n else:\n return kind\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.flip","title":"flip staticmethod
","text":"flip(kind)\n
Source code in sibila/thread.py
@staticmethod\ndef flip(kind: 'Msg.Kind') -> 'Msg.Kind':\n return Msg.Kind.OUT if kind is Msg.Kind.IN else Msg.Kind.IN\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.__repr__","title":"__repr__","text":"__repr__()\n
Source code in sibila/thread.py
def __repr__(self):\n return repr(self.value)\n
"},{"location":"api-reference/thread/#sibila.Msg.kind","title":"kind instance-attribute
","text":"kind\n
Message kind.
"},{"location":"api-reference/thread/#sibila.Msg.text","title":"text instance-attribute
","text":"text\n
Message text (mandatory).
"},{"location":"api-reference/thread/#sibila.Msg.images","title":"images class-attribute
instance-attribute
","text":"images = None\n
List of images in the message. An entry must have a 'url' key, but any other keys can be added. The 'url' value must be a remote URL (https, http) or a 'data:' base64-encoded URL.
"},{"location":"api-reference/thread/#sibila.Msg.make_IN","title":"make_IN staticmethod
","text":"make_IN(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_IN(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.IN,\n text,\n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.make_OUT","title":"make_OUT staticmethod
","text":"make_OUT(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_OUT(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.OUT, \n text, \n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.make_INST","title":"make_INST staticmethod
","text":"make_INST(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_INST(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.INST, \n text,\n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.clone","title":"clone","text":"clone()\n
Source code in sibila/thread.py
def clone(self) -> 'Msg':\n return Msg(self.kind, self.text, self.images)\n
"},{"location":"api-reference/thread/#sibila.Msg.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/thread.py
@staticmethod\ndef from_dict(dic: dict) -> 'Msg':\n return Msg(kind=Msg.Kind(dic[\"kind\"]),\n text=dic[\"text\"],\n images=dic[\"images\"])\n
"},{"location":"api-reference/thread/#sibila.Msg.as_dict","title":"as_dict","text":"as_dict()\n
Return Msg as a dict.
Source code in sibila/thread.py
def as_dict(self) -> dict:\n \"\"\"Return Msg as a dict.\"\"\"\n return {\"kind\": self.kind.value, # kind as string\n \"text\": self.text,\n \"images\": self.images}\n
"},{"location":"api-reference/thread/#sibila.Msg.from_chatml","title":"from_chatml staticmethod
","text":"from_chatml(dic, join_sep='\\n')\n
Source code in sibila/thread.py
@staticmethod\ndef from_chatml(dic: dict,\n join_sep:str = \"\\n\") -> 'Msg':\n\n role = dic.get(\"role\")\n if role is None:\n raise ValueError(f\"Key 'role' not found in {dic}\")\n\n kind = Msg.Kind.from_chatml_role(role)\n\n content = dic.get(\"content\")\n if content is None:\n raise ValueError(f\"Bad 'content' key in {dic}\")\n\n text = ''\n images = []\n if isinstance(content, list):\n for cont in content:\n if not isinstance(cont, dict) or \"type\" not in cont:\n raise TypeError(f\"ChatML list entries must be of type dict and include a 'type' key in {cont}\")\n\n if cont[\"type\"] == \"text\":\n text = join_text(text, cont[\"text\"], join_sep)\n\n elif cont[\"type\"] == \"image_url\":\n image = cont[\"image_url\"]\n if \"url\" not in image:\n raise TypeError(f\"ChatML image_url entries must include a 'url' key in {cont}\")\n images.append(image)\n\n elif isinstance(content, str):\n text = content\n\n else:\n raise TypeError(f\"ChatML content must have str or dict type in {content}\")\n\n return Msg(kind, \n text,\n images if images else None)\n
"},{"location":"api-reference/thread/#sibila.Msg.as_chatml","title":"as_chatml","text":"as_chatml()\n
Returns message in a ChatML dict.
Returns:
Type Description dict
A ChatML dict with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self) -> dict:\n \"\"\"Returns message in a ChatML dict.\n\n Returns:\n A ChatML dict with \"role\" and \"content\" keys.\n \"\"\"\n\n role = self.kind.as_chatml_role()\n\n if self.images:\n chatml_msg = {\n \"role\": role, \n \"content\": [\n {\"type\": \"text\", \"text\": self.text},\n ]}\n\n for image in self.images:\n if \"url\" not in image:\n raise ValueError(f\"Image without 'url' key at {image}\")\n\n image_url = {\"url\": image[\"url\"]}\n if \"detail\" in image:\n image_url[\"detail\"] = image[\"detail\"]\n\n chatml_msg[\"content\"].append( # type: ignore[attr-defined]\n {\"type\": \"image_url\", \"image_url\": image_url}\n )\n return chatml_msg\n else:\n return {\"role\": role, \"content\": self.text}\n
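A round-trip sketch between Msg and ChatML dicts (outputs shown as comments):
msg = Msg.from_chatml({\"role\": \"user\", \"content\": \"Hello!\"})\nmsg.kind # Msg.Kind.IN\nmsg.as_chatml() # {'role': 'user', 'content': 'Hello!'}\n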
"},{"location":"api-reference/tokenizer/","title":"Model tokenizers","text":""},{"location":"api-reference/tokenizer/#llamacpp","title":"LlamaCpp","text":""},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer","title":"LlamaCppTokenizer","text":"LlamaCppTokenizer(llama)\n
Tokenizer for GGUF models loaded with llama.cpp.
Source code in sibila/llamacpp.py
def __init__(self, \n llama: Llama):\n self._llama = llama\n\n self.vocab_size = self._llama.n_vocab()\n\n self.bos_token_id = self._llama.token_bos()\n self.bos_token = llama_token_get_text(self._llama.model, self.bos_token_id).decode(\"utf-8\")\n\n self.eos_token_id = self._llama.token_eos()\n self.eos_token = llama_token_get_text(self._llama.model, self.eos_token_id).decode(\"utf-8\")\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None # ? fill by taking a look at id 0?\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/llamacpp.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n\n # str -> bytes\n btext = text.encode(\"utf-8\", errors=\"ignore\")\n\n return self._llama.tokenize(btext, add_bos=False, special=True)\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of encode().
Used instead of llama-cpp-python's own decoding to fix an issue: the first character after a BOS token should only be removed if it is a space.
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/llamacpp.py
def decode(self,\n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Using instead of llama-cpp-python's to fix error: remove first character after a bos only if it's a space.\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n\n if not len(token_ids):\n return \"\"\n\n output = b\"\"\n size = 32\n buffer = (ctypes.c_char * size)()\n\n if not skip_special:\n special_toks = {self.bos_token_id: self.bos_token.encode(\"utf-8\"), # type: ignore[union-attr]\n self.eos_token_id: self.eos_token.encode(\"utf-8\")} # type: ignore[union-attr]\n\n for token in token_ids:\n if token == self.bos_token_id:\n output += special_toks[token]\n elif token == self.eos_token_id:\n output += special_toks[token]\n else:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n else: # skip special\n for token in token_ids:\n if token != self.bos_token_id and token != self.eos_token_id:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n\n # \"User code is responsible for removing the leading whitespace of the first non-BOS token when decoding multiple tokens.\"\n if (# token_ids[0] != self.bos_token_id and # we also try cutting if first is bos to approximate HF tokenizer\n len(output) and output[0] <= 32 # 32 = ord(' ')\n ):\n output = output[1:]\n\n return output.decode(\"utf-8\", errors=\"ignore\")\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
Name Type Description Default text
str
Text to be measured.
required Returns:
Type Description int
Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
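A usage sketch, assuming a llama.cpp model was created via the Models factory and exposes its tokenizer in a tokenizer attribute (model name is illustrative):
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\ntok = model.tokenizer\n\nids = tok.encode(\"Hello world\")\ntok.decode(ids) # 'Hello world'\ntok.token_len(\"Hello world\") # == len(ids)\n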
"},{"location":"api-reference/tokenizer/#openai","title":"OpenAI","text":""},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer","title":"OpenAITokenizer","text":"OpenAITokenizer(model)\n
Tokenizer for OpenAI models.
Source code in sibila/openai.py
def __init__(self, \n model: str\n ):\n\n if not has_tiktoken:\n raise Exception(\"Please install tiktoken by running: pip install tiktoken\")\n\n self._tok = tiktoken.encoding_for_model(model)\n\n self.vocab_size = self._tok.n_vocab\n\n self.bos_token_id = None\n self.bos_token = None\n\n self.eos_token_id = None\n self.eos_token = None\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/openai.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n return self._tok.encode(text)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of encode().
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/openai.py
def decode(self, \n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n assert skip_special, \"OpenAITokenizer only supports skip_special=True\"\n\n return self._tok.decode(token_ids)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
Name Type Description Default text
str
Text to be measured.
required Returns:
Type Description int
Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
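A usage sketch, assuming tiktoken is installed and that OpenAITokenizer can be constructed directly from a model name (import path inferred from the source location above):
from sibila.openai import OpenAITokenizer\n\ntok = OpenAITokenizer(\"gpt-4\")\nids = tok.encode(\"Hello world\")\ntok.decode(ids) # 'Hello world'\ntok.token_len(\"Hello world\") # == len(ids)\n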
"},{"location":"api-reference/tools/","title":"Tools","text":""},{"location":"api-reference/tools/#sibila.tools","title":"tools","text":"Tools for model interaction, summarization, etc.
- interact(): Interact with model as in a chat, using input().
- loop(): Iteratively append inputs and generate model outputs.
- recursive_summarize(): Recursively summarize a (large) text or text file.
"},{"location":"api-reference/tools/#sibila.tools.interact","title":"interact","text":"interact(\n model,\n *,\n th=None,\n inst_text=None,\n trim_flags=TRIM_DEFAULT,\n genconf=None,\n max_tokens_default=-20\n)\n
Interact with model as in a chat, using input().
Includes a list of commands: type !? to see help.
Parameters:
Name Type Description Default model
Model
Model to use for generating.
required th
Optional[Thread]
Optional input Thread. Defaults to None.
None
inst_text
Optional[str]
Text for Thread instructions. Defaults to None.
None
trim_flags
Trim
Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.
TRIM_DEFAULT
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses the model's genconf.
None
max_tokens_default
int
Used as max_tokens when genconf.max_tokens is zero.
-20
Returns:
Type Description Thread
Thread after all the interactions.
Source code in sibila/tools.py
def interact(model: Model,\n *,\n th: Optional[Thread] = None,\n inst_text: Optional[str] = None,\n trim_flags: Thread.Trim = TRIM_DEFAULT,\n\n genconf: Optional[GenConf] = None,\n max_tokens_default: int = -20\n ) -> Thread:\n \"\"\"Interact with model as in a chat, using input().\n\n Includes a list of commands: type !? to see help.\n\n Args:\n model: Model to use for generating.\n th: Optional input Thread. Defaults to None.\n inst_text: Text for Thread instructions. Defaults to None.\n trim_flags: Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n genconf: Model generation configuration. Defaults to None, which uses the model's genconf.\n max_tokens_default: Used as max_tokens when genconf.max_tokens is zero.\n\n Returns:\n Thread after all the interactions.\n \"\"\"\n\n def callback(out: Union[GenOut,None], \n th: Thread, \n model: Model,\n genconf: GenConf,\n max_token_len: int) -> bool:\n\n if out is not None:\n if out.res != GenRes.OK_STOP:\n print(f\"***Result={GenRes.as_text(out.res)}***\")\n\n if out.text:\n text = out.text\n else:\n text = \"***No text out***\"\n\n th.add_OUT(text)\n print(text)\n print()\n\n\n def print_thread_info():\n length = model.token_len(th, genconf)\n print(f\"Thread token len={length}, max len before next gen={max_token_len}\")\n\n\n\n # input loop ===============================================\n MARKER: str = '\"\"\"'\n multiline: str = \"\"\n\n while True:\n\n user = input('>').strip()\n\n if multiline:\n if user.endswith(MARKER):\n user = multiline + \"\\n\" + user[:-3]\n multiline = \"\"\n else:\n multiline += \"\\n\" + user\n continue\n\n else:\n if not user:\n return False # terminate loop\n\n elif user.startswith(MARKER):\n multiline = user[3:]\n continue\n\n elif user.endswith(\"\\\\\"):\n user = user[:-1]\n user = user.replace(\"\\\\n\", \"\\n\")\n th.add_IN(user)\n continue\n\n elif user.startswith(\"!\"): # a command\n params = user[1:].split(\"=\")\n cmd = params[0]\n params = params[1:]\n\n if cmd == \"inst\":\n th.clear()\n if params:\n text = params[0].replace(\"\\\\n\", \"\\n\")\n th.inst.text = text\n\n elif cmd == \"add\" or cmd == \"a\":\n if params:\n try:\n path = params[0]\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n th.add_IN(text)\n print(text[:500])\n except FileNotFoundError:\n print(f\"Could not load '{path}'\")\n else:\n print(\"Path needed\")\n\n elif cmd == 'c':\n print_thread_info()\n print(th)\n\n elif cmd == 'cl':\n if not params:\n params.append(\"thread.json\")\n try:\n th.load(params[0], \n clear=True)\n print(f\"Loaded context from {params[0]}\")\n except FileNotFoundError:\n print(f\"Could not load '{params[0]}'\")\n\n elif cmd == 'cs':\n if not params:\n params.append(\"thread.json\")\n th.save(params[0])\n print(f\"Saved context to {params[0]}\")\n\n elif cmd == 'image':\n if not params:\n print(\"No image given, using a remote photo of two cats\")\n params.append(\"https://upload.wikimedia.org/wikipedia/commons/thumb/3/3b/Singapura_Cats.jpg/320px-Singapura_Cats.jpg\")\n try:\n th.add_IN(\"\", params[0])\n print(f\"Added image '{params[0]}'.\\nPlease enter your question:\")\n except FileNotFoundError:\n print(f\"Could not load image '{params[0]}'\")\n\n elif cmd == 'tl':\n print_thread_info()\n\n elif cmd == 'info':\n print(f\"Model:\\n{model.info()}\")\n print(f\"GenConf:\\n{genconf}\\n\")\n\n print_thread_info()\n\n\n else:\n print(f\"Unknown command '!{cmd}' - known commands:\\n\"\n \" !inst[=text] - clear messages and add inst (system) message\\n\"\n \" !add=path - load file and add to last msg\\n\"\n \" !image=path/url - include a local or remote image. Local images must fit the context!\\n\"\n \" !c - list context msgs\\n\"\n \" !cl=path - load context (default=thread.json)\\n\"\n \" !cs=path - save context (default=thread.json)\\n\"\n \" !tl - thread's token length\\n\"\n \" !info - model and genconf info\\n\"\n ' Delimit with \"\"\" for multiline begin/end or terminate line with \\\\ to continue into a new line\\n'\n \" Empty line + enter to quit\"\n )\n # \" !p - show formatted prompt (if model supports it)\\n\"\n # \" !to - prompt's tokens\\n\"\n\n print()\n continue\n\n # we have a user prompt\n user = user.replace(\"\\\\n\", \"\\n\")\n break\n\n\n th.add_IN(user)\n\n return True # continue looping\n\n\n\n if genconf is None:\n genconf = model.genconf\n\n if genconf.max_tokens == 0:\n genconf = genconf(max_tokens=max_tokens_default)\n\n # start prompt loop\n th = loop(callback,\n model,\n\n thread=th,\n inst_text=inst_text,\n in_text=None, # call callback for first prompt\n trim_flags=trim_flags,\n genconf=genconf)\n\n return th\n
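A minimal usage sketch (model name is illustrative):
from sibila import Models\nfrom sibila.tools import interact\n\nmodel = Models.create(\"llamacpp:openchat\")\nth = interact(model, inst_text=\"Be helpful.\") # chat loop: empty line + enter quits\n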
"},{"location":"api-reference/tools/#sibila.tools.loop","title":"loop","text":"loop(\n callback,\n model,\n *,\n inst_text=None,\n in_text=None,\n trim_flags=TRIM_DEFAULT,\n max_token_len=None,\n thread=None,\n genconf=None\n)\n
Iteratively append inputs and generate model outputs.
Callback should call thread.add_OUT() / thread.add_IN() and return a bool indicating whether to continue looping - see the sketch after the source code below.
If the last Thread message is not of kind Msg.Kind.IN, callback() will be called with out=None.
Parameters:
Name Type Description Default callback
Callable[[Union[GenOut, None], Thread, Model, GenConf, int], bool]
A function(out, thread, model, genconf, max_token_len) that will be iteratively called with the model's output.
required model
Model
Model to use for generating.
required inst_text
Optional[str]
Text for Thread instructions. Defaults to None.
None
in_text
Optional[str]
Text for Thread's initial Msg.Kind.IN. Defaults to None.
None
trim_flags
Trim
Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.
TRIM_DEFAULT
max_token_len
Optional[int]
Maximum token count to use when trimming. Defaults to None.
None
thread
Optional[Thread]
Optional input Thread. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses the model's genconf.
None
Source code in sibila/tools.py
def loop(callback: Callable[[Union[GenOut,None], Thread, Model, GenConf, int], bool],\n model: Model,\n *,\n inst_text: Optional[str] = None,\n in_text: Optional[str] = None,\n\n trim_flags: Thread.Trim = TRIM_DEFAULT,\n max_token_len: Optional[int] = None,\n thread: Optional[Thread] = None,\n\n genconf: Optional[GenConf] = None,\n ) -> Thread:\n \"\"\"Iteratively append inputs and generate model outputs.\n\n Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.\n\n If last Thread msg is not Msg.Kind.IN, callback() will be called with out_text=None.\n\n Args:\n callback: A function(out, ctx, model) that will be iteratively called with model's output.\n model: Model to use for generating.\n inst_text: text for Thread instructions. Defaults to None.\n in_text: Text for Thread's initial Msg.Kind.IN. Defaults to None.\n trim_flags: Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n max_token_len: Maximum token count to use when trimming. Defaults to None.\n thread: Optional input Thread. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses to model's genconf.\n \"\"\"\n\n if thread is None:\n thread = Thread()\n else:\n thread = thread\n\n if inst_text is not None:\n thread.inst.text = inst_text\n if in_text is not None:\n thread.add_IN(in_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if max_token_len is None:\n resolved_max_tokens = genconf.resolve_max_tokens(model.ctx_len, model.max_tokens_limit)\n max_token_len = model.ctx_len - resolved_max_tokens\n if max_token_len == 0:\n raise ValueError(\"Unable to calc max_token_len: either pass the value to this function or set GenConf.max_tokens to a non-zero value\")\n\n while True:\n\n if len(thread) and thread[-1].kind == Msg.Kind.IN:\n # last is an IN message: we can trim and generate\n\n thread.trim(trim_flags,\n max_token_len,\n model.token_len_lambda)\n\n out = model.gen(thread, genconf)\n else:\n out = None # first call\n\n res = callback(out, \n thread, \n model,\n genconf,\n max_token_len)\n\n if not res:\n break\n\n\n return thread\n
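A minimal callback sketch: on the first call out is None, so the callback adds an IN message; on the second call it stores the generated output and stops (the question text is illustrative):
from sibila.tools import loop\n\ndef callback(out, thread, model, genconf, max_token_len):\n if out is None: # first call: add a question\n thread.add_IN(\"Hello!\")\n return True # continue looping\n thread.add_OUT(out.text) # store the generated output\n return False # stop\n\nth = loop(callback, model)\n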
"},{"location":"api-reference/tools/#sibila.tools.recursive_summarize","title":"recursive_summarize","text":"recursive_summarize(\n model,\n text=None,\n path=None,\n overlap_size=20,\n max_token_len=None,\n genconf=None,\n)\n
Recursively summarize a large text or text file, to fit in a Thread context.
Works by:
1. Break the text into chunks that fit the model's context.
2. Run the model to summarize each chunk.
3. Join the generated summaries and jump to 1, until the text size no longer decreases.
Parameters:
Name Type Description Default model
Model
Model to use for summarizing.
required text
Optional[str]
Initial text.
None
path
Optional[str]
--Or-- a path to a UTF-8 text file.
None
overlap_size
int
Size in model tokens of the overlapping portions at beginning and end of chunks.
20
max_token_len
Optional[int]
Maximum token length of each text chunk. Defaults to None, which calculates it from the model's context length.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses the model's genconf.
None
Returns:
Type Description str
The summarized text.
Source code in sibila/tools.py
def recursive_summarize(model: Model,\n text: Optional[str] = None,\n path: Optional[str] = None,\n overlap_size: int = 20,\n max_token_len: Optional[int] = None,\n genconf: Optional[GenConf] = None) -> str:\n\n \"\"\"Recursively summarize a large text or text file, to fit in a Thread context.\n\n Works by:\n\n 1. Breaking text into chunks that fit models context.\n 2. Run model to summarize chunks.\n 3. Join generated summaries and jump to 1. - do this until text size no longer decreases.\n\n Args:\n model: Model to use for summarizing.\n text: Initial text.\n path: --Or-- A path to an UTF-8 text file.\n overlap_size: Size in model tokens of the overlapping portions at beginning and end of chunks.\n\n Returns:\n The summarized text.\n \"\"\"\n\n if (text is not None) + (path is not None) != 1:\n raise ValueError(\"Only one of text or path can be given\")\n\n if path is not None:\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n\n inst_text = \"\"\"Your task is to do short summaries of text.\"\"\"\n in_text = \"Summarize the following text:\\n\"\n th = Thread(inst=inst_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if max_token_len is None:\n if model.genconf.max_tokens == 0:\n raise ValueError(\"Unable to calc max_token_len: make sure genconf.max_tokens is not zero\")\n\n resolved_max_tokens = genconf.resolve_max_tokens(model.ctx_len, model.max_tokens_limit)\n\n thread = Thread.make_INST_IN(inst_text, in_text)\n token_len = model.token_len(thread)\n max_token_len = model.ctx_len - resolved_max_tokens - (token_len + 16) \n\n\n # split initial text\n logger.debug(f\"Max token len {max_token_len}\")\n\n token_len_fn = model.token_len_lambda\n logger.debug(f\"Initial text token_len {token_len_fn(text)}\") # type: ignore[arg-type,call-arg]\n\n spl = RecursiveTextSplitter(max_token_len, overlap_size, len_fn=token_len_fn) # type: ignore[arg-type]\n\n round = 0\n while True: # summarization rounds\n logger.debug(f\"Round {round} {'='*60}\")\n\n in_list = spl.split(text=text)\n in_len = sum([len(t) for t in in_list])\n\n logger.debug(f\"Split in {len(in_list)} parts, total len {in_len} chars\")\n\n out_list = []\n for i,t in enumerate(in_list):\n\n logger.debug(f\"{round}>{i} {'='*30}\")\n\n th.clear(clear_inst=False)\n th.add_IN(in_text)\n th.add_IN(t)\n\n out = model.gen(th) \n logger.debug(out)\n\n out_list.append(out.text)\n\n text = \"\\n\".join(out_list)\n\n out_len = len(text) # sum([len(t) for t in out_list])\n if out_len >= in_len:\n break\n elif len(out_list) == 1:\n break\n else:\n round += 1\n\n return text\n
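A usage sketch (model and file names are illustrative):
from sibila import Models\nfrom sibila.tools import recursive_summarize\n\nmodel = Models.create(\"llamacpp:openchat\")\nsummary = recursive_summarize(model, path=\"report.txt\")\nprint(summary)\n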
"},{"location":"examples/","title":"Examples","text":"Example Description Hello model Introductory pirate arrr-example: create local or remote models, use the Models class to simplify. From text to object Keypoint extractor, showing progressively better ways to query a model, from plain text, JSON, to Pydantic classes. Extract information Extract information about all persons mentioned in a text. Also available in a dataclass version. Tag customer queries Summarize and classify customer queries into tags. Quick meeting Extracting participants, action items and priorities from a simple meeting transcript. Tough meeting Extracting information from a long and complex transcript. Compare model output Compare sentiment analyses of customer reviews done by two models. Chat interaction Interact with the model as in a back-and-forth chat session. Receipt Extract structured data from a photo of a receipt. Async Run tasks in parallel with the *_async methods. Model management with CLI Download and manage models with the command-line sibila. Each example is explained in a Read Me and usually include a Jupyter notebook and/or a .py script version.
Most of the examples use a local model but you can quickly change to using OpenAI models by uncommenting one or two lines.
"},{"location":"examples/async/","title":"Async","text":"In this example we'll look at how to do multiple parallel requests to remote models by using Python's asyncio capabilities.
Generating from local llama.cpp models does not benefit from async functionality: the model is already loaded in memory and inference is compute-bound, so there is no IO to overlap. When the async methods are used with a LlamaCppModel, inference will end up running sequentially.
So we'll be using a remote OpenAI model. Make sure you have defined the env variable OPENAI_API_KEY with a valid token.
This example is available as a Jupyter notebook or a Python script in this folder.
As usual, let's start by creating the model:
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nimport time, asyncio\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\")\n\n# convenience time-counting functions:\nstart_time = None\ndef start_secs():\n global start_time\n start_time = time.time()\ndef secs(): \n return f\"{time.time() - start_time:.1f}\"\n
We'll create two tasks that will run in parallel:
1. Ask the model to generate 20 names.
2. Classify a phrase as spam.
This example is running in a Jupyter notebook, so we can directly call the function with an await. In a Python script we'd use asyncio.run() instead.
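For example, in a script the same call would be wrapped like this:
# in a .py script (not needed in Jupyter, where top-level await works):\nasyncio.run(extract_names()) # extract_names is the async function defined below\n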
Note that we're using the _async suffix methods: extract_async() and classify_async(), instead of the normal functions.
The first task, generate 20 names:
async def extract_names(): \n print(\"extract_names begin...\", secs())\n\n names = await model.extract_async(list[str],\n \"Generate 20 English names with first name and surname\")\n\n print(\"...extract_names done\", secs())\n\n return names\n\nstart_secs()\nawait extract_names()\n
extract_names begin... 0.0\n...extract_names done 4.4\n\n\n\n\n\n['James Smith',\n 'Michael Johnson',\n 'Robert Williams',\n 'Maria Garcia',\n 'David Jones',\n 'Jennifer Miller',\n 'John Davis',\n 'Patricia Wilson',\n 'Daniel Anderson',\n 'Elizabeth Taylor',\n 'William Brown',\n 'Barbara Moore',\n 'Joseph Thompson',\n 'Susan Martinez',\n 'Charles Jackson',\n 'Linda Harris',\n 'Thomas Clark',\n 'Jessica Lewis',\n 'Christopher Walker',\n 'Sarah Robinson']\n
The second task will classify a phrase as \"spam\"/\"not spam\":
async def classify_spam():\n print(\"classify_spam begin...\", secs())\n\n classification = await model.classify_async([\"spam\", \"not spam\"],\n \"I am a Nigerian prince and will make you very rich!\")\n\n print(\"...classify_spam done\", secs())\n\n return classification\n\nstart_secs()\nawait classify_spam()\n
classify_spam begin... 0.0\n...classify_spam done 1.4\n\n\n\n\n\n'spam'\n
Let's use asyncio.as_completed() to receive each task's output as soon as it's ready:
async def run_tasks():\n print(\"as_complete begin---\", secs())\n\n tasks = [extract_names(), classify_spam()]\n for task in asyncio.as_completed(tasks):\n res = await task\n print(\"Result:\", res)\n\n print(\"---as_complete done\", secs())\n\nstart_secs()\nawait run_tasks()\n
as_complete begin--- 0.0\nextract_names begin... 0.0\nclassify_spam begin... 0.0\n...classify_spam done 1.0\nResult: spam\n...extract_names done 5.8\nResult: ['James Smith', 'Emma Johnson', 'Olivia Williams', 'Liam Brown', 'Ava Jones', 'Noah Garcia', 'Sophia Miller', 'Mason Davis', 'Isabella Rodriguez', 'Ethan Martinez', 'Mia Hernandez', 'Logan Wilson', 'Charlotte Anderson', 'Aiden Moore', 'Harper Thomas', 'Lucas Jackson', 'Ella White', 'Benjamin Taylor', 'Amelia Harris', 'Alexander Clark']\n---as_complete done 5.8\n
Follow the begin/done print statements above and their listed times in seconds.
Both tasks were started at the same time and classify_spam() terminated first (at the 1.0s mark), because it's a shorter task that simply outputs \"spam\"/\"not spam\".
Meanwhile, the model worked on generating the 20 names that we requested with extract_names(), a longer operation which terminates later (at 5.8s).
In the same manner any other tasks could be run in parallel by using the *_async() methods of the model classes.
"},{"location":"examples/cli/","title":"Sibila CLI","text":"In this example we'll see how to use the sibila Command-Line Interface (CLI) to download a GGUF model from the Hugging Face model hub.
We'll then register it in the Models factory, so that it can be easily used with Models.create(). The Models factory is based on a folder where GGUF format model files are stored, together with two configuration files: "models.json" and "formats.json".
After doing the above, we'll be able to use this model in Python with two lines:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:rocket\")\n
Let's run sibila CLI to get help:
> sibila --help\n\nusage: sibila [-h] [--version] {models,formats,hub} ...\n\nSibila CLI tool for managing models and formats.\n\noptions:\n -h, --help show this help message and exit\n --version show program's version number and exit\n\nactions:\n Use 'models' and 'formats' to manage, 'hub' to search and download models.\n\n {models,formats,hub} Run 'sibila {command} --help' for specific help.\n\nFor help, check https://jndiogo.github.io/sibila/\n
Sibila CLI has three modes:
- models: to edit a 'models.json' file, create model entries set format, etc.
- formats: to edit a 'formats.json' file, add new formats, etc.
- hub: search and download models from Hugging Face model hub.
Specific help for each mode is available by doing: sibila mode --help
Let's download the Rocket 3B model, a small but capable model, fine-tuned for chat/instruct prompts:
https://huggingface.co/TheBloke/rocket-3B-GGUF
We'll use a \"sibila hub -d\" command to download to \"../../models\" folder. We'll get the 4-bit quantization (Q4_K_M):
> sibila hub -d 'TheBloke/rocket-3B-GGUF' -f Q4_K_M -m '../../models'\n\nSearching...\nDownloading model 'TheBloke/rocket-3B-GGUF' file 'rocket-3b.Q4_K_M.gguf' to '../../models/rocket-3b.Q4_K_M.gguf'\n\nDownload complete.\nFor information about this and other models, please visit https://huggingface.co\n
After this command, the \"rocket-3b.Q4_K_M.gguf\" file has now been downloaded to the \"../../models\" folder.
We'll now register it with the Models factory, which is located in the folder to where we downloaded the model.
This can be done by editing the \"models.json\" file directly or even simpler, with a \"sibila models -s\" command:
> sibila models -s llamacpp:rocket rocket-3b.Q4_K_M.gguf -m '../../models'\n\nUsing models directory '../../models'\nSet model 'llamacpp:rocket' with name='rocket-3b.Q4_K_M.gguf' at '/home/jorge/ai/sibila/models/models.json'.\n
An entry has now been created in \"models.json\" for this model.
However, we did not set the chat template format - but let's first test if the downloaded GGUF file already includes it in its metadata.
This is done with \"sibila models -t\":
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nError: Could not find a suitable chat template format for this model. Without a format, fine-tuned models cannot function properly. See the docs on how you can fix this: either setup the format in Models factory, or provide the chat template in the 'format' arg.\n
Error. Looks like we need to set the chat template format!
Checking the model's page, we find that it uses the ChatML prompt/chat template, which is great because it's one of the base formats included with Sibila.
So let's set the template format in the \"llamacpp:rocket\" entry we've just created:
> sibila models -f llamacpp:rocket chatml -m '../../models'\n\nUsing models directory '/home/jorge/ai/sibila/models'\nUpdated model 'llamacpp:rocket' with format 'chatml' at '/home/jorge/ai/sibila/models/models.json'.\n
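The \"llamacpp:rocket\" entry in \"models.json\" should now look something like this - a sketch inferred from the CLI messages above, the actual file may hold more fields:
{\n \"llamacpp\": {\n \"rocket\": {\n \"name\": \"rocket-3b.Q4_K_M.gguf\",\n \"format\": \"chatml\"\n }\n }\n}\n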
Let's now test again:
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nModel 'llamacpp:rocket' was properly created and should run fine.\n
Great - the model passed the test and should be ready for use.
Let's try using it from Python:
from sibila import Models\n\nModels.setup(\"../../models\") # the folder with models and configs\n\nmodel = Models.create(\"llamacpp:rocket\") # model name in provider:name format\n\nmodel(\"Hello there!\")\n
\"Hello! I am an AI language model and I'm here to assist you with your questions or provide information if you have any. I am programmed to be polite and respectful, so please let me know how I can help you today.\"\n
Seems to be working - and politely too!
"},{"location":"examples/compare/","title":"Compare","text":"In this example we'll use an utility function from the multigen module that builds a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
This function generates a 2-D table of [ input , model ], where each row is the output from different models to the same question or input. Such a table can be printed or saved as a CSV file.
For the local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the local_name variable below, after the text \"llamacpp:\".
Jupyter notebook and Python script versions are available in the example's folder.
Instead of directly creating models as we've seen in previous examples, multigen will create the models via the Models class directory.
We'll start by choosing a local and a remote model that we'll compare.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nlocal_name = \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\n# to use an OpenAI model:\nremote_name = \"openai:gpt-4\"\n
Now let's define a list of reviews that we'll ask the two models to do sentiment analysis upon.
These are generic product reviews, the kind you could find in an online store.
reviews = [\n\"The user manual was confusing, but once I figured it out, the product more or less worked.\",\n\"This widget changed my life! It's sleek, efficient, and worth every penny.\",\n\"I'm disappointed with the product quality. It broke after just a week of use.\",\n\"The customer service team was incredibly helpful in resolving my issue with the device.\",\n\"I'm blown away by the functionality of this gadget. It exceeded my expectations.\",\n\"The packaging was damaged upon arrival, but the product itself works great.\",\n\"I've been using this tool for months, and it's still as good as new. Highly recommended!\",\n\"I regret purchasing this item. It doesn't perform as advertised.\",\n\"I've never had so much trouble with a product before. It's been a headache from day one.\",\n\"I bought this as a gift for my friend, and they absolutely love it!\",\n\"The price seemed steep at first, but after using it, I understand why. Quality product.\",\n\"This gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\"\n]\n\n# model instructions text, also known as system message\ninst_text = \"You are a helpful assistant that analyses text sentiment.\"\n
Since we just want a sentiment classification, we'll use a convenient enumeration: a list with the three values positive, neutral and negative.
Let's try the first review on a local model:
sentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nin_text = \"Each line is a product review. Extract the sentiment associated with each review:\\n\\n\" + reviews[0]\n\nprint(reviews[0])\n\nlocal_model = Models.create(local_name)\n\nout = local_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\n# to clear memory\ndel local_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
Definitely, 'neutral' is a good answer for this one.
Let's now try the remote model:
print(reviews[0])\n\nremote_model = Models.create(remote_name)\n\nout = remote_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\ndel remote_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
And the remote model (GPT-4) seems to agree on neutrality.
By using the query_multigen() function that we'll import from sibila.multigen, we'll be able to compare what multiple models generate in response to each input.
In our case the inputs will be the list of reviews. This function accepts these interesting arguments:
- text: type of text output, which can be the word \"print\" or a text filename to which it will save.
- csv: type of CSV output, which can also be \"print\" or a text filename to save into.
- out_keys: what we want listed: the generated raw text (\"text\"), the extracted value (\"value\"), a Python dict (\"dict\") or a Pydantic object (\"obj\"). For our case \"value\" is the right one.
- gencall: a function that will actually call the model for each input. We use a convenient predefined function and provide it with the sentiment_enum definition.
Let's run it with our two models:
from sibila.multigen import (\n query_multigen,\n make_extract_gencall\n)\n\nsentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nout = query_multigen(reviews,\n inst_text,\n model_names = [local_name, remote_name],\n text=\"print\",\n csv=\"sentiment.csv\",\n out_keys = [\"value\"],\n gencall = make_extract_gencall(sentiment_enum)\n )\n
////////////////////////////////////////////////////////////\nThe user manual was confusing, but once I figured it out, the product more or less worked.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-4 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nThis widget changed my life! It's sleek, efficient, and worth every penny.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm disappointed with the product quality. It broke after just a week of use.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nThe customer service team was incredibly helpful in resolving my issue with the device.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm blown away by the functionality of this gadget. It exceeded my expectations.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe packaging was damaged upon arrival, but the product itself works great.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nI've been using this tool for months, and it's still as good as new. Highly recommended!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI regret purchasing this item. It doesn't perform as advertised.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI've never had so much trouble with a product before. 
It's been a headache from day one.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI bought this as a gift for my friend, and they absolutely love it!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe price seemed steep at first, but after using it, I understand why. Quality product.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n
The output format is as follows - see the comments next to the -----> arrows:
//////////////////////////////////////////////////////////// -----> This is the model input, a review text:\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP <----- Local model name and result\n'positive' <----- What the local model output\n==================== openai:gpt-4 -> OK_STOP <----- Remote model name and result\n'positive' <----- Remote model output\n
We also requested the creation of a CSV file with the results: sentiment.csv.
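To inspect the saved file from Python, a minimal sketch with the standard csv module will do - assuming \"sentiment.csv\" was written to the current directory; the column layout is whatever query_multigen() emitted:
import csv\n\n# print every row of the CSV written by query_multigen()\nwith open(\"sentiment.csv\", newline=\"\") as f:\n for row in csv.reader(f):\n print(row)\n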
Example's assets at GitHub.
"},{"location":"examples/extract/","title":"Extract Pydantic","text":"In this example we'll extract information about all persons mentioned in a text. This example is also available in a dataclass version.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy of GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' source_location='London'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' source_location='Barcelona'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' source_location='Tokyo'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' source_location='Mumbai'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' source_location='Paris'\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure from the source location) and a \"details_about_person\" field, which the model should quote from the info in the source text about each person.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident in the thoughtful lines that adorned his face' source_location='Barcelona' source_country='Spain'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories of love and longing' source_location='Tokyo' source_country='Japan'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology with ease, intellect shining through the chaos of urban life' source_location='Mumbai' source_country='India'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace reflected the timeless elegance of her beloved city' source_location='Paris' source_country='France'\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
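If a model struggles with a field, adding descriptions is the first thing to try. A minimal sketch using Pydantic's Field - the description texts below are just illustrative:
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int = Field(description=\"Age of the person at the time of the text\")\n occupation: str = Field(description=\"The person's occupation or profession\")\n source_location: str = Field(description=\"City or place the person is from\")\n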
Let's now query an attribute that only one of the persons has: being married. We'll add an \"is_married: bool\" field to the Person class.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.' source_location='London' source_country='United Kingdom' is_married=False\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.' source_location='Barcelona' source_country='Spain' is_married=False\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.' source_location='Tokyo' source_country='Japan' is_married=False\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.' source_location='Mumbai' source_country='India' is_married=True\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='her impeccable style and effortless grace reflected the timeless elegance of her beloved city.' source_location='Paris' source_country='France' is_married=False\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/extract_dataclass/","title":"Extract dataclass","text":"This is the Python dataclass version of of the Pydantic extraction example.
We'll extract information about all persons mentioned in a text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy of GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from dataclasses import dataclass\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', source_location='London')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', source_location='Barcelona')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', source_location='Tokyo')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', source_location='Mumbai')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', source_location='Paris')\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure from the source location) and a \"details_about_person\" field, which the model should quote from the info in the source text about each person.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='exuded creativity, passion for design evident in the thoughtful lines that adorned his face', source_location='Barcelona', source_country='Spain')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='navigated the complexities of technology with ease, intellect shining through the chaos of urban life', source_location='Mumbai', source_country='India')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France')\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the persons has: being married. We'll add an \"is_married\" field to the Person dataclass.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.', source_location='London', source_country='United Kingdom', is_married=False)\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.', source_location='Barcelona', source_country='Spain', is_married=False)\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.', source_location='Tokyo', source_country='Japan', is_married=False)\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.', source_location='Mumbai', source_country='India', is_married=True)\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='her impeccable style and effortless grace reflected the timeless elegance of her beloved city.', source_location='Paris', source_country='France', is_married=False)\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/from_text_to_object/","title":"From text to object","text":"In this example we'll ask the model to extract keypoints from a text: - First in plain text format - Then free JSON output (with fields selected by the model) - Later constrained by a JSON schema (so that we can specify which fields) - And finally by generating to a Pydantic object (from a class definition)
All the queries will be made at temperature=0, which is the default GenConf setting. This means that the model is giving its best (as in most probable) answer and that it will always output the same results, given the same inputs.
Also available as a Jupyter notebook or a Python script in the example's folder.
We'll start by creating either a local model or a GPT-4 model.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
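As mentioned, generation will use the default GenConf with temperature=0. If you'd rather experiment with more varied output, you could create the model with a custom generation configuration - the same pattern used in the Hello model example:
from sibila import Models, GenConf\n\n# sketch: create the model with a higher sampling temperature\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\",\n genconf=GenConf(temperature=0.9))\n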
Let's use this fragment from Wikipedia's entry on the Fiji islands: https://en.wikipedia.org/wiki/
doc = \"\"\"\\\nFiji, officially the Republic of Fiji,[n 2] is an island country in Melanesia,\npart of Oceania in the South Pacific Ocean. It lies about 1,100 nautical miles \n(2,000 km; 1,300 mi) north-northeast of New Zealand. Fiji consists of \nan archipelago of more than 330 islands\u2014of which about 110 are permanently \ninhabited\u2014and more than 500 islets, amounting to a total land area of about \n18,300 square kilometres (7,100 sq mi). The most outlying island group is \nOno-i-Lau. About 87% of the total population of 924,610 live on the two major \nislands, Viti Levu and Vanua Levu. About three-quarters of Fijians live on \nViti Levu's coasts, either in the capital city of Suva, or in smaller \nurban centres such as Nadi (where tourism is the major local industry) or \nLautoka (where the sugar-cane industry is dominant). The interior of Viti Levu \nis sparsely inhabited because of its terrain.[13]\n\nThe majority of Fiji's islands were formed by volcanic activity starting around \n150 million years ago. Some geothermal activity still occurs today on the islands \nof Vanua Levu and Taveuni.[14] The geothermal systems on Viti Levu are \nnon-volcanic in origin and have low-temperature surface discharges (of between \nroughly 35 and 60 degrees Celsius (95 and 140 \u00b0F)).\n\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and \nlater Melanesians, with some Polynesian influences. Europeans first visited Fiji \nin the 17th century.[15] In 1874, after a brief period in which Fiji was an \nindependent kingdom, the British established the Colony of Fiji. Fiji operated as \na Crown colony until 1970, when it gained independence and became known as \nthe Dominion of Fiji. In 1987, following a series of coups d'\u00e9tat, the military \ngovernment that had taken power declared it a republic. In a 2006 coup, Commodore \nFrank Bainimarama seized power. In 2009, the Fijian High Court ruled that the \nmilitary leadership was unlawful. At that point, President Ratu Josefa Iloilo, \nwhom the military had retained as the nominal head of state, formally abrogated \nthe 1997 Constitution and re-appointed Bainimarama as interim prime minister. \nLater in 2009, Ratu Epeli Nailatikau succeeded Iloilo as president.[16] On 17 \nSeptember 2014, after years of delays, a democratic election took place. \nBainimarama's FijiFirst party won 59.2% of the vote, and international observers \ndeemed the election credible.[17] \n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Be helpful and provide concise answers.\"\n
Let's start with a free text query by calling model().
in_text = \"Extract 5 keypoints of the following text:\\n\" + doc\n\nout = model(in_text, inst=inst_text)\nprint(out)\n
1. Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\n2. It consists of more than 330 islands with a total land area of about 18,300 square kilometres (7,100 sq mi).\n3. The majority of Fiji's population lives on the two major islands, Viti Levu and Vanua Levu.\n4. Most of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\n5. Fiji has a history of political instability, with multiple coups and changes in government since gaining independence in 1970.\n
These are quite reasonable keypoints.
Let's now ask for JSON output, taking care to explicitly request it in the query (in_text variable).
Instead of model() we now use json() which returns a Python dict.
import pprint\npp = pprint.PrettyPrinter(width=300, sort_dicts=False)\n\nin_text = \"Extract 5 keypoints of the following text in JSON format:\\n\\n\" + doc\n\nout = model.json(in_text,\n inst=inst_text)\npp.pprint(out)\n
{'keypoints': [{'point': 'Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'location': 'Melanesia, Oceania'},\n {'point': 'Fiji consists of more than 330 islands with about 110 permanently inhabited islands and over 500 islets.', 'total_land_area': 'about 18,300 square kilometres (7,100 sq mi)'},\n {'point': 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', 'population_statistics': '87%, 924,610'},\n {'point': \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'island_formation': 'volcanic activity, 150 million years ago'},\n {'point': 'Fiji has a history of political instability with multiple coups and changes in government.', 'political_history': 'multiple coups, changes in government'}]}\n
Note how the model chose to return different fields like \"point\" or \"island_formation\".
Because we didn't specify which fields we want, each model will generate different ones.
To specify a fixed format, let's now generate by setting a JSON schema that defines which fields and types we want:
json_schema = {\n \"properties\": {\n \"keypoint_list\": {\n \"description\": \"Keypoint list\",\n \"items\": {\n \"type\": \"string\",\n \"description\": \"Keypoint\"\n },\n \"type\": \"array\"\n }\n },\n \"required\": [\n \"keypoint_list\"\n ],\n \"type\": \"object\"\n}\n
This JSON schema requires that the generated dict contains a \"keypoint_list\" field holding a list of strings.
We'll also use json(), now passing the json_schema:
out = model.json(in_text,\n inst=inst_text,\n json_schema=json_schema)\n\nprint(out)\n
{'keypoint_list': ['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', \"About 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\", \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]}\n
for kpoint in out[\"keypoint_list\"]:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
It has generated a string list in the \"keypoint_list\" field, as we specified in the JSON schema.
This is better, but the problem with JSON schemas is that they can be quite hard to work with.
Let's use an easier way to specify the fields we want returned: Pydantic classes derived from BaseModel. This is way simpler to use than JSON schemas.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Keypoints(BaseModel):\n keypoint_list: list[str]\n\nout = model.pydantic(Keypoints,\n in_text,\n inst=inst_text)\nprint(out)\n
keypoint_list=['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', \"About 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\", \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]\n
for kpoint in out.keypoint_list:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
The pydantic() method returns an object of class Keypoints, instantiated with the model output.
This is a much simpler way to extract structured data from a model.
Please see other examples for more interesting objects. In particular, we did not add descriptions to the fields, which are important clues to help the model understand what we want.
Besides Pydantic classes, Sibila can also use Python's dataclass to extract structured data. This is a lighter and easier alternative to using Pydantic.
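For this example, the dataclass equivalent would be a sketch like the following - assuming extract() accepts a dataclass type directly, as it does for Pydantic classes in the Extract dataclass example:
from dataclasses import dataclass\n\n# this dataclass plays the same role as the Keypoints BaseModel above\n@dataclass\nclass Keypoints:\n keypoint_list: list[str]\n\nout = model.extract(Keypoints,\n in_text,\n inst=inst_text)\n\nfor kpoint in out.keypoint_list:\n print(kpoint)\n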
Example's assets at GitHub.
"},{"location":"examples/hello_model/","title":"Hello model","text":"In this example we see how to directly create local or remote model objects and later to do that more easily with the Models class.
"},{"location":"examples/hello_model/#using-a-local-model","title":"Using a local model","text":"To use a local model, make sure you download its GGUF format file and save it into the \"../../models\" folder.
In these examples, we'll use a 4-bit quantization of the 7-billion-parameter OpenChat-3.5 model, which at the current time is quite a good model for its size.
The file is named \"openchat-3.5-1210.Q4_K_M.gguf\" and was downloaded from the above link. Make sure to save it into the \"../../models\" folder.
See here for more information about setting up your local models.
With the model file in the \"../../models\" folder, we can run the following script:
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Run the script above and after a few seconds (it has to load the model from disk), the good model answers back something like:
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
"},{"location":"examples/hello_model/#using-an-openai-model","title":"Using an OpenAI model","text":"To use a remote model like GPT-4 you'll need a paid OpenAI account: https://openai.com/pricing
With an OpenAI account, you'll be able to generate an access token that you should set into the OPENAI_API_KEY env variable.
(An even better way is to use .env files with your variables, and use the dotenv library to read them.)
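That's the pattern used at the top of other examples in these docs:
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n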
Once a valid OPENAI_API_KEY env variable is set, you can run this script:
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
We get back the usual funny pirate answer:
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
"},{"location":"examples/hello_model/#using-the-models-directory","title":"Using the Models directory","text":"In these two scripts we created different objects to access the LLM model: LlamaCppModel and OpenAIModel.
This was done for simplicity, but a better way is to use the Models class directory.
Models is a singleton class that implements a directory of models where you can store file locations, configurations, aliases, etc.
After setting up a JSON configuration file you can have the Models class create models by using names like \"llamacpp:openchat\" or \"openai:gpt-4\" together with their predefined settings. This permits easy model change, comparing model outputs, etc.
In the scripts above, instead of instantiating different classes for different models, we could use the Models class to create the model from a name, by setting the model_name variable:
from sibila import Models, GenConf\n\n# Using a local llama.cpp model: we first setup the ../../models directory:\n# Models.setup(\"../../models\")\n# model_name = \"llamacpp:openchat\"\n\n# OpenAI: make sure you set the environment variable named OPENAI_API_KEY with your API key.\nmodel_name = \"openai:gpt-4\"\n\nmodel = Models.create(model_name,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
The magic happens in the line:
model = Models.create(model_name, ...)\n
The Models class will take care of initializing the model based on the name you provide.
Example's assets at GitHub.
"},{"location":"examples/interact/","title":"Interact","text":"In this example we look at the interact() function, which allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done by using a Context class object, which can manage thread messages and delete older ones when the context maximum length is reached.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
So, let's create a local model.
from sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Import interact() from the tools module and run it.
What follows is the actual chat, where the user prompts start with a \">\" and the model's answer appears immediately below.
Read on, it does get funny at times... : )
from sibila.tools import (\n interact\n)\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9, max_tokens=600))\n
> Hello there!\n\n\nHello! How can I help you today?\n\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\n\nYes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.\n\n\n\n> No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\n\n\nMy apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\n\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\n\n\nGot it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\n\n\n\n> That's right - I can list the context messages by entering \"!c\"\n\n\nI see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.\n\n\n\n> !c\n\n\nThread token len=542, max len before next gen=1448\nThread inst='Be helpful.', join_sep='\\n', len=10\n0: IN='Hello there!'\n1: OUT='Hello! How can I help you today?'\n2: IN=\"I'm doing an example of this LLM access package called Sibila - have you heard about it?\"\n3: OUT='Yes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.'\n4: IN=\"No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\"\n5: OUT=\"My apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\"\n6: IN=\"No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\"\n7: OUT=\"Got it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. 
This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\"\n8: IN='That\\'s right - I can list the context messages by entering \"!c\"'\n9: OUT='I see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.'\n\n\n\n> Info about the model: \"!i\"\n\n\nThe `!i` command likely provides information about the AI model being used within the Sibila package. This could include details such as the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please let me know.\n\n\n\n> Sorry, it's actually \"!info\"\n\n\nMy apologies for the confusion in my previous response. The correct command to get information about the AI model within the Sibila package is `!info`. This command should provide you with details about the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please feel free to ask.\n\n\n\n> !info\n\n\nModel:\ndesc='LlamaCppModel: '/home/jorge/ai/sibila/models/openchat-3.5-1210.Q4_K_M.gguf' - 'llama 7B Q4_K - Medium'',\nctx_len=2048, max_tokens_limit=2048,\ngenconf=GenConf(max_tokens=0, stop=[], temperature=0.0, top_p=0.9, format='text', json_schema=None, special=None)\nGenConf:\nGenConf(max_tokens=600, stop=[], temperature=0.9, top_p=0.9, format='text', json_schema=None, special=None)\n\nThread token len=736, max len before next gen=1448\n\n\n\n> For help on al lcommands: \"!\"\n\n\nWithin the Sibila package, using an exclamation mark (`!`) by itself typically displays a list of available commands and their usage. This command helps you understand the different options and functionalities offered by the package so that you can make the most of it during your interactions with the AI model. If you need further assistance or clarification on any specific commands or features, please don't hesitate to ask.\n\n\n\n> !\n\n\nUnknown command '!' - known commands:\n !inst[=text] - clear messages and add inst (system) message\n !add=path - load file and add to last msg\n !image=path/url - include a local or remote image. Local images must fit the context!\n !c - list context msgs\n !cl=path - load context (default=thread.json)\n !cs=path - save context (default=thread.json)\n !tl - thread's token length\n !info - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n\n\n\n> The Thread class can manage total token length: it will delete older messages if the thread becomes larger than the content max length.\n\n\nI understand that the `sibila.Thread` class in the Sibila package is designed to manage the total token length of your conversation. 
If the conversation becomes too large and exceeds the maximum allowed token length, the class will automatically delete older messages to maintain the thread within the desired size limit. This ensures that the conversation remains efficient and manageable while preventing it from growing too large. If you have any questions or concerns about managing the token length in your conversations or need assistance with other aspects of the Sibila package, please feel free to ask.\n\n\n\n> That's it - to see token length: \"!tl\"\n\n\nThe `!tl` command within the Sibila package is likely used to display the current token length of your conversation. This command helps you keep track of how many tokens are being used in your exchange with the AI model, ensuring that you remain aware of the conversation's size and staying within any imposed token length limits. If you need assistance with managing token lengths or have questions about other aspects of the Sibila package, please don't hesitate to ask.\n\n\n\n> !tl\n\n\nThread token len=1128, max len before next gen=1448\n\n\n\n> To save the thread to a JSON file: \"!cs\"\n\n\nThe `!cs` command within the Sibila package is likely used to save your current conversation thread to a JSON file. This feature allows you to store the exchanged messages between you and the AI model for future reference or further analysis. To use this command, simply enter `!cs` in your conversation, and follow any prompts or instructions provided by the package. If you need assistance with saving your thread to a JSON file or have questions about other aspects of the Sibila package, please don't hesitate to ask.\n\n\n\n> !cs\n\n\nSaved context to thread.json\n\n\n\n>\n\n\n\n\n\nThread inst='Be helpful.', join_sep='\\n', len=22\n0: IN='Hello there!'\n1: OUT='Hello! How can I help you today?'\n2: IN=\"I'm doing an example of this LLM access package called Sibila - have you heard about it?\"\n3: OUT='Yes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.'\n4: IN=\"No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\"\n5: OUT=\"My apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\"\n6: IN=\"No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\"\n7: OUT=\"Got it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\"\n8: IN='That\\'s right - I can list the context messages by entering \"!c\"'\n9: OUT='I see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. 
This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.'\n10: IN='Info about the model: \"!i\"'\n11: OUT=\"The `!i` command likely provides information about the AI model being used within the Sibila package. This could include details such as the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please let me know.\"\n12: IN='Sorry, it\\'s actually \"!info\"'\n13: OUT=\"My apologies for the confusion in my previous response. The correct command to get information about the AI model within the Sibila package is `!info`. This command should provide you with details about the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please feel free to ask.\"\n14: IN='For help on al lcommands: \"!\"'\n15: OUT=\"Within the Sibila package, using an exclamation mark (`!`) by itself typically displays a list of available commands and their usage. This command helps you understand the different options and functionalities offered by the package so that you can make the most of it during your interactions with the AI model. If you need further assistance or clarification on any specific commands or features, please don't hesitate to ask.\"\n16: IN='The Thread class can manage total token length: it will delete older messages if the thread becomes larger than the content max length.'\n17: OUT='I understand that the `sibila.Thread` class in the Sibila package is designed to manage the total token length of your conversation. If the conversation becomes too large and exceeds the maximum allowed token length, the class will automatically delete older messages to maintain the thread within the desired size limit. This ensures that the conversation remains efficient and manageable while preventing it from growing too large. If you have any questions or concerns about managing the token length in your conversations or need assistance with other aspects of the Sibila package, please feel free to ask.'\n18: IN='That\\'s it - to see token length: \"!tl\"'\n19: OUT=\"The `!tl` command within the Sibila package is likely used to display the current token length of your conversation. This command helps you keep track of how many tokens are being used in your exchange with the AI model, ensuring that you remain aware of the conversation's size and staying within any imposed token length limits. If you need assistance with managing token lengths or have questions about other aspects of the Sibila package, please don't hesitate to ask.\"\n20: IN='To save the thread to a JSON file: \"!cs\"'\n21: OUT=\"The `!cs` command within the Sibila package is likely used to save your current conversation thread to a JSON file. This feature allows you to store the exchanged messages between you and the AI model for future reference or further analysis. To use this command, simply enter `!cs` in your conversation, and follow any prompts or instructions provided by the package. 
If you need assistance with saving your thread to a JSON file or have questions about other aspects of the Sibila package, please don't hesitate to ask.\"\n
These are the \"!\" commands that you can use in the interact() inputs:
!inst[=text] - clear messages and add inst (system) message\n !add=path - load file and add to last msg\n !image=path/url - include a local or remote image. Local images must fit the context!\n !c - list context msgs\n !cl=path - load context (default=thread.json)\n !cs=path - save context (default=thread.json)\n !tl - thread's token length\n !info - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n
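A session like the one above can be started with the interact() function. A minimal sketch follows — the import path and argument names are assumptions, so check the API reference:
from sibila import Models\nfrom sibila.tools import interact  # import path is an assumption\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# starts the chat loop, storing messages in a sibila.Thread\ninteract(model,\n inst_text=\"Be helpful.\")\n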
Example's assets at GitHub.
"},{"location":"examples/quick_meeting/","title":"Quick meeting","text":"Let's extract structured data from a meeting transcript, like attendees, action items and their priorities.
This is a quick meeting whose transcript is not very large, so a small local model should work well. See the Tough meeting example for a larger and more complex transcript.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
If you prefer to use an OpenAI model, make sure you have defined the env variable OPENAI_API_KEY with a valid key and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Here's the transcript we'll be using as source:
transcript = \"\"\"\\\nDate: 10th April 2024\nTime: 10:30 AM\nLocation: Conference Room A\n\nAttendees:\n Arthur: Logistics Supervisor\n Bianca: Operations Manager\n Chris: Fleet Coordinator\n\nArthur: Good morning, team. Thanks for making it. We've got three matters to address quickly today.\n\nBianca: Morning, Arthur. Let's dive in.\n\nChris: Ready when you are.\n\nArthur: First off, we've been having complaints about late deliveries. This is very important, we're getting some bad reputation out there.\n\nBianca: Chris, I think you're the right person to take care of this. Can you investigate and report back by end of day? \n\nChris: Absolutely, Bianca. I'll look into the reasons and propose solutions.\n\nArthur: Great. Second, Bianca, we need to update our driver training manual. Can you take the lead and have a draft by Friday?\n\nBianca: Sure thing, Arthur. I'll get started on that right away.\n\nArthur: Lastly, we need to schedule a meeting with our software vendor to discuss updates to our tracking system. This is a low-priority task but still important. I'll handle that. Any input on timing?\n\nBianca: How about next Wednesday afternoon?\n\nChris: Works for me.\n\nArthur: Sounds good. I'll arrange it. Thanks, Bianca, Chris. Let's keep the momentum going.\n\nBianca: Absolutely, Arthur.\n\nChris: Will do.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
Let's define two Pydantic BaseModel classes whose instances will receive the extracted information:
- Attendee: to store information about each meeting attendee
- Meeting: to keep the meeting's date and location, the list of participants and other info we'll see below
And let's ask the model to create objects that are instances of these classes:
from pydantic import BaseModel, Field\n\n# class definitions will be used to constrain the model output and initialize an instance object\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
meeting_date='10th April 2024' meeting_location='Conference Room A' attendees=[Attendee(name='Arthur', occupation='Logistics Supervisor'), Attendee(name='Bianca', occupation='Operations Manager'), Attendee(name='Chris', occupation='Fleet Coordinator')]\n
For a nicer display:
print(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\n
This information was correctly extracted.
Let's now request the action items mentioned in the meeting. We'll create a new class ActionItem with an index and a name for the item. Note that we're annotating each field with Field(description=...) annotations to help the model understand what we're looking to extract.
We'll also add an action_items field to the Meeting class to hold the items list.
class Attendee(BaseModel):\n name: str\n occupation: str\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate and report late deliveries'\nindex=2 name='Update driver training manual'\nindex=3 name='Schedule meeting with software vendor to discuss tracking system updates'\n
The extracted action items also look good.
Let's now extract more action item information:
- Priority for each item
- Due by... information
- Name of the attendee that was assigned to that item
So we create a Priority class holding three priority levels - low, medium and high.
We also add three fields to the ActionItem class, to hold the new information: priority, due_by and assigned_attendee.
from enum import Enum\n\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Priority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: Priority = Field(description=\"Action item priority\")\n due_by: str = Field(description=\"When should the item be complete\")\n assigned_attendee: str = Field(description=\"Name of the attendee to which action item was assigned\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate late deliveries' priority=<Priority.HIGH: 'high'> due_by='end of day' assigned_attendee='Chris'\nindex=2 name='Update driver training manual' priority=<Priority.MEDIUM: 'medium'> due_by='Friday' assigned_attendee='Bianca'\nindex=3 name='Schedule meeting with software vendor' priority=<Priority.LOW: 'low'> due_by='next Wednesday afternoon' assigned_attendee='Arthur'\n
The new information was correctly extracted: priorities, due by and assigned attendees for each action item.
For an example of a harder, more complex transcript, see the \"Tough meeting\" example.
Example's assets at GitHub.
"},{"location":"examples/receipt/","title":"Receipt","text":"In this example we'll look at extracting structured information from a photo of a receipt.
Sibila supports local models with image input, but we'll use OpenAI's GPT-4o, which works quite well. Make sure to set your OPENAI_API_KEY env variable.
You can still use a local model by uncommenting the commented lines below. See the docs for some suggestions about local vision models.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# model = Models.create(\"llamacpp:llava-llama-3-8b-v1_1-int4.gguf*llava-llama-3-8b-v1_1-mmproj-f16.gguf\")\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4o\")\n
Let's use this photo of an Italian receipt:
To see if the model can handle it, let's try a free text query for the total. We'll pass a tuple of (text_prompt, image_url) -
model((\"How much is the total?\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\n
'The total amount on the receipt is \u20ac5.88.'\n
Good. Can the model extract the receipt item lines?
model((\"List the lines of paid items in the receipt?\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\n
'The lines of paid items in the receipt are:\\n\\n1. BIS BORSE TERM. S - \u20ac3.90\\n2. GHIACCIO 2X400 G - \u20ac0.99\\n3. GHIACCIO 2X400 G - \u20ac0.99'\n
It did extract them well.
Let's wrap this in a Pydantic object to get structured data from the model. We'll add a field for the date listed in the receipt:
from pydantic import BaseModel, Field\nfrom datetime import datetime\n\nclass ReceiptLine(BaseModel):\n \"\"\"Receipt line data\"\"\"\n description: str\n cost: float\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n date: datetime = Field(description=\"Listed date\")\n\ninfo = model.extract(Receipt,\n (\"Extract receipt information.\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\ninfo\n
Receipt(total=5.88, lines=[ReceiptLine(description='BIS BORSE TERM.S', cost=3.9), ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99), ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99)], date=datetime.datetime(2014, 8, 27, 19, 51, tzinfo=TzInfo(UTC)))\n
for line in info.lines:\n print(line)\nprint(\"total:\", info.total)\n
description='BIS BORSE TERM.S' cost=3.9\ndescription='GHIACCIO 2X400 G' cost=0.99\ndescription='GHIACCIO 2X400 G' cost=0.99\ntotal: 5.88\n
All the information is correct and structured in an object that we can use as needed.
From here we could expand the Pydantic object with more fields to extract other information present in the receipt like merchant name, VAT number, etc.
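For example, a sketch of such an expanded class - the new field names are illustrative assumptions, and whether the model can fill them depends on what is legible in the photo:
from typing import Optional\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n date: datetime = Field(description=\"Listed date\")\n # illustrative extra fields - may not be extractable from every photo\n merchant_name: Optional[str] = Field(default=None, description=\"Merchant name, if visible\")\n vat_number: Optional[str] = Field(default=None, description=\"VAT number, if visible\")\n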
Example's assets at GitHub.
"},{"location":"examples/tag/","title":"Tag","text":"In this example we'll summarize and classify customer queries with tags. We'll use dataclasses to specify the structure of the information we want extracted (we could also use Pydantic BaseModel classes).
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you have defined the env variable OPENAI_API_KEY with a valid key and uncomment the line after \"# to use an OpenAI model:\".
Available as a Jupyter notebook or a Python script in the example's folder.
Let's start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
These will be our queries, ten typical customer support questions:
queries = \"\"\"\\\n1. Do you offer a trial period for your software before purchasing?\n2. I'm experiencing a glitch with your app, it keeps freezing after the latest update.\n3. What are the different pricing plans available for your subscription service?\n4. Can you provide instructions on how to reset my account password?\n5. I'm unsure about the compatibility of your product with my device, can you advise?\n6. How can I track my recent order and estimate its delivery date?\n7. Is there a customer loyalty program or rewards system for frequent buyers?\n8. I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\n9. Could you clarify the coverage and limitations of your product warranty?\n10. What are your customer support hours and how can I reach your team in case of emergencies?\n\"\"\"\n
We'll start by summarizing each query.
Let's try just using field names (without descriptions) - perhaps they are enough to tell the model what we want.
from dataclasses import dataclass\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_text: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information from customer queries.\"\n\n# the input query, including the above text\nin_text = \"Each line is a customer query. Extract information about each query:\\n\\n\" + queries\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Trial period inquiry', query_text='Do you offer a trial period for your software before purchasing?')\nQuery(id=2, query_summary='Technical issue', query_text=\"I'm experiencing a glitch with your app, it keeps freezing after the latest update.\")\nQuery(id=3, query_summary='Pricing inquiry', query_text='What are the different pricing plans available for your subscription service?')\nQuery(id=4, query_summary='Password reset request', query_text='Can you provide instructions on how to reset my account password?')\nQuery(id=5, query_summary='Compatibility inquiry', query_text=\"I'm unsure about the compatibility of your product with my device, can you advise?\")\nQuery(id=6, query_summary='Order tracking', query_text='How can I track my recent order and estimate its delivery date?')\nQuery(id=7, query_summary='Loyalty program inquiry', query_text='Is there a customer loyalty program or rewards system for frequent buyers?')\nQuery(id=8, query_summary='Refund policy inquiry', query_text=\"I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\")\nQuery(id=9, query_summary='Warranty inquiry', query_text='Could you clarify the coverage and limitations of your product warranty?')\nQuery(id=10, query_summary='Customer support inquiry', query_text='What are your customer support hours and how can I reach your team in case of emergencies?')\n
The summaries look good.
Let's now define tags and ask the model to classify each query into a tag. In the Tag class, we set its docstring to the rules we want for the classification. This is done in the docstring because Tag is not a dataclass, but derived from Enum.
We're no longer asking for the query_text field in the Query class, to keep the output shorter.
from enum import Enum\n\nclass Tag(str, Enum):\n \"\"\"Queries can be classified into the following tags:\ntech_support: queries related with technical problems.\nbilling: post-sale queries about billing cycle, or subscription termination.\naccount: queries about user account problems.\npre_sales: queries from prospective customers (who have not yet purchased).\nother: all other query topics.\"\"\" \n TECH_SUPPORT = \"tech_support\"\n BILLING = \"billing\"\n PRE_SALES = \"pre_sales\"\n ACCOUNT = \"account\"\n OTHER = \"other\"\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_tag: Tag\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Asking about trial period', query_tag='pre_sales')\nQuery(id=2, query_summary='Reporting app issue', query_tag='tech_support')\nQuery(id=3, query_summary='Inquiring about pricing plans', query_tag='billing')\nQuery(id=4, query_summary='Requesting password reset instructions', query_tag='account')\nQuery(id=5, query_summary='Seeking device compatibility advice', query_tag='pre_sales')\nQuery(id=6, query_summary='Tracking order and delivery date', query_tag='other')\nQuery(id=7, query_summary='Inquiring about loyalty program', query_tag='billing')\nQuery(id=8, query_summary='Asking about refund policy', query_tag='pre_sales')\nQuery(id=9, query_summary='Seeking warranty information', query_tag='other')\nQuery(id=10, query_summary='Inquiring about customer support hours', query_tag='other')\n
The applied tags appear mostly reasonable.
Of course, pre-sales tagging could be done automatically from a database of existing customer contacts, but the model does a good job of identifying likely pre-sales questions - ids 1, 5 and 8 are questions typically asked before buying or subscribing. A post-processing sketch of that idea is shown below.
Also, note that classification is being done from a single phrase; more information in each customer query would allow for finer-grained classification.
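As a sketch of the database idea mentioned above, a post-processing step could force the pre_sales tag whenever the sender is not in a (hypothetical) set of known customer contacts:
# hypothetical set of known customer contacts\nknown_customers = {\"alice@example.com\", \"bob@example.com\"}\n\ndef adjust_tag(query: Query, sender: str) -> Query:\n # anyone not yet in the customer database is by definition pre-sales\n if sender not in known_customers:\n  query.query_tag = Tag.PRE_SALES\n return query\n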
Example's assets at GitHub.
"},{"location":"examples/tough_meeting/","title":"Tough meeting","text":"In this example we'll look at extracting participants and action items from a meeting transcript.
Start by creating the model. As you'll see below, the transcript is large, with complex language, so we'll use OpenAI's GPT-4 this time. You can still use a local model by uncommenting the commented lines below.
Make sure to set your OPENAI_API_KEY env variable.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# the transcript is large, so we'll create the model with a context length of 3072, which should be enough.\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\", ctx_len=3072)\n
We'll use a sample meeting transcript from https://www.ctas.tennessee.edu/eli/sample-meeting-transcript
transcript = \"\"\"\\\nChairman Wormsley (at the proper time and place, after taking the chair and striking the gavel on the table): This meeting of the CTAS County Commission will come to order. Clerk please call the role. (Ensure that a majority of the members are present.)\n\nChairman Wormsley: Each of you has received the agenda. I will entertain a motion that the agenda be approved.\n\nCommissioner Brown: So moved.\n\nCommissioner Hobbs: Seconded\n\nChairman Wormsley: It has been moved and seconded that the agenda be approved as received by the members. All those in favor signify by saying \"Aye\"?...Opposed by saying \"No\"?...The agenda is approved. You have received a copy of the minutes of the last meeting. Are there any corrections or additions to the meeting?\n\nCommissioner McCroskey: Mister Chairman, my name has been omitted from the Special Committee on Indigent Care.\n\nChairman Wormsley: Thank you. If there are no objections, the minutes will be corrected to include the name of Commissioner McCroskey. Will the clerk please make this correction. Any further corrections? Seeing none, without objection the minutes will stand approved as read. (This is sort of a short cut way that is commonly used for approval of minutes and/or the agenda rather than requiring a motion and second.)\n\nChairman Wormsley: Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Mister Chairman, I would like to make a motion to approve the resolution taking money from the Data Processing Reserve Account in the County Clerk's office and moving it to the equipment line to purchase a laptop computer.\n\nCommissioner Carmical: I second the motion.\n\nChairman Wormsley: This resolution has a motion and second. Will the clerk please take the vote.\n\nChairman Wormsley: The resolution passes. We will now take up old business. At our last meeting, Commissioner McKee, your motion to sell property near the airport was deferred to this meeting. You are recognized.\n\nCommissioner McKee: I move to withdraw that motion.\n\nChairman Wormsley: Commissioner McKee has moved to withdraw his motion to sell property near the airport. Seeing no objection, this motion is withdrawn. The next item on the agenda is Commissioner Rodgers'.\n\nCommissioner Rodgers: I move adopton of the resolution previously provided to each of you to increase the state match local litigation tax in circuit, chancery, and criminal courts to the maximum amounts permissible. This resolution calls for the increases to go to the general fund.\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: The sheriff is opposed to this increase.\n\nChairman Wormsley: Commissioner, you are out of order because this motion has not been seconded as needed before the floor is open for discussion or debate. Discussion will begin after we have a second. 
Is there a second?\n\nCommissioner Reinhart: For purposes of discussion, I second the motion.\n\nChairman Wormsley: Commissioner Rodgers is recognized.\n\nCommissioner Rodgers: (Speaks about the data on collections, handing out all sorts of numerical figures regarding the litigation tax, and the county's need for additional revenue.)\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: I move an amendment to the motion to require 25 percent of the proceeds from the increase in the tax on criminal cases go to fund the sheriff's department.\n\nChairman Wormsley: Commissioner Malone\n\nCommissioner Malone: I second the amendment.\n\nChairman Wormsley: A motion has been made and seconded to amend the motion to increase the state match local litigation taxes to the maximum amounts to require 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Any discussion? Will all those in favor please raise your hand? All those opposed please raise your hand. The amendment carries 17-2. We are now on the motion as amended. Any further discussion?\n\nCommissioner Headrick: Does this require a two-thirds vote?\n\nChairman Wormsley: Will the county attorney answer that question?\n\nCounty Attorney Fults: Since these are only courts of record, a majority vote will pass it. The two-thirds requirement is for the general sessions taxes.\n\nChairman Wormsley: Other questions or discussion? Commissioner Adams.\n\nCommissioner Adams: Move for a roll call vote.\n\nCommissioner Crenshaw: Second\n\nChairman Wormsley: The motion has been made and seconded that the state match local litigation taxes be increased to the maximum amounts allowed by law with 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Will all those in favor please vote as the clerk calls your name, those in favor vote \"aye,\" those against vote \"no.\" Nine votes for, nine votes against, one not voting. The increase fails. We are now on new business. Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Each of you has previously received a copy of a resolution to increase the wheel tax by $10 to make up the state cut in education funding. I move adoption of this resolution.\n\nChairman Wormsley: Commissioner Thompson\n\nCommissioner Thompson: I second.\n\nChairman Wormsley: It has been properly moved and seconded that a resolution increasing the wheel tax by $10 to make up the state cut in education funding be passed. Any discussion? (At this point numerous county commissioners speak for and against increasing the wheel tax and making up the education cuts. This is the first time this resolution is under consideration.) Commissioner Hayes is recognized.\n\nCommissioner Hayes: I move previous question.\n\nCommisioner Crenshaw: Second.\n\nChairman Wormsley: Previous question has been moved and seconded. As you know, a motion for previous question, if passed by a two-thirds vote, will cut off further debate and require us to vote yes or no on the resolution before us. You should vote for this motion if you wish to cut off further debate of the wheel tax increase at this point. Will all those in favor of previous question please raise your hand? Will all those against please raise your hand? The vote is 17-2. Previous question passes. We are now on the motion to increase the wheel tax by $10 to make up the state cut in education funding. 
Will all those in favor please raise your hand? Will all those against please raise your hand? The vote is 17-2. This increase passes on first passage. Is there any other new business? Since no member is seeking recognition, are there announcements? Commissioner Hailey.\n\nCommissioner Hailey: There will be a meeting of the Budget Committee to look at solid waste funding recommendations on Tuesday, July 16 at noon here in this room.\n\nChairman Wormsley: Any other announcements? The next meeting of this body will be Monday, August 19 at 7 p.m., here in this room. Commissioner Carmical.\n\nCommissioner Carmical: There will be a chili supper at County Elementary School on August 16 at 6:30 p.m. Everyone is invited.\n\nChairman Wormsley: Commissioner Austin.\n\nCommissioner Austin: Move adjournment.\n\nCommissioner Garland: Second.\n\nChairman Wormsley: Without objection, the meeting will stand adjourned.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information and output in JSON format.\"\n
As you can see, this is quite a large transcript, filled with long names and complex phrases. Let's see how the model handles it...
Let's start by extracting the names of the participants in the meeting.
We'll create the Meeting class with a list of strings, to receive the names of mentioned participants.
The model will take cues from the field names as well as from the Field description we set. In this case we name the string list \"participants\" and add a description of what we're looking to receive.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
participants=['Chairman Wormsley', 'Clerk', 'Commissioner Brown', 'Commissioner Hobbs', 'Commissioner McCroskey', 'Commissioner Adkins', 'Commissioner Carmical', 'Commissioner McKee', 'Commissioner Rodgers', 'Commissioner Duckett', 'Commissioner Reinhart', 'Commissioner Malone', 'Commissioner Headrick', 'County Attorney Fults', 'Commissioner Adams', 'Commissioner Crenshaw', 'Commissioner Thompson', 'Commissioner Hayes', 'Commissioner Hailey', 'Commissioner Carmical', 'Commissioner Austin', 'Commissioner Garland']\n
# print the generated participants list:\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nClerk\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
A name appears twice - \"Commissioner Carmical\".
It's a matter of opinion whether the clerk is an active participant, but let's try to fix the repeated names.
Let's try asking for a list of participants \"without repeated entries\", in the field's description:
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Wormsley\nBrown\nHobbs\nMcCroskey\nAdkins\nCarmical\nMcKee\nRodgers\nDuckett\nReinhart\nMalone\nHeadrick\nFults\nAdams\nCrenshaw\nThompson\nHayes\nHailey\nAustin\nGarland\n
That didn't work as expected: the repetition is gone, but the titles were dropped - only surnames appear.
Let's try asking for \"names and titles\":
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of names and titles of participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
And now \"Commissioner Carmical\" is repeating again!
Let's move on - the point is that you can also do some prompt engineering in the description field. This model shortcoming could also be dealt with by post-processing the received list.
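A minimal sketch of such post-processing, removing repeated names while preserving order:
# dict.fromkeys keeps the first occurrence of each name and drops duplicates\nunique_participants = list(dict.fromkeys(out.participants))\nfor part in unique_participants:\n print(part)\n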
Let's now also request a list of action items mentioned in the transcript:
class ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda'\nindex=2 name='Correct the minutes to include Commissioner McCroskey in the Special Committee on Indigent Care'\nindex=3 name='Approve the resolution to transfer funds from the Data Processing Reserve Account to purchase a laptop'\nindex=4 name='Withdraw the motion to sell property near the airport'\nindex=5 name='Adopt the resolution to increase the state match local litigation tax'\nindex=6 name=\"Amend the motion to allocate 25 percent of the increased tax on criminal cases to the sheriff's department\"\nindex=7 name='Vote on the state match local litigation taxes increase with the amendment'\nindex=8 name='Adopt the resolution to increase the wheel tax by $10 for education funding'\nindex=9 name='Budget Committee meeting on solid waste funding recommendations'\nindex=10 name='Chili supper at County Elementary School'\n
These are reasonable action items.
Let's now also request a priority for each ActionItem - we'll create a string Enum class with three priority levels.
from enum import Enum\n\nclass ActionPriority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: ActionPriority = Field(description=\"Action item priority\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda' priority=<ActionPriority.HIGH: 'high'>\nindex=2 name='Correct the minutes to include Commissioner McCroskey' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=3 name='Approve the resolution for funds transfer for laptop purchase' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=4 name='Withdraw motion to sell property near the airport' priority=<ActionPriority.LOW: 'low'>\nindex=5 name='Adopt resolution to increase state match local litigation tax' priority=<ActionPriority.HIGH: 'high'>\nindex=6 name=\"Amend resolution to allocate funds to sheriff's department\" priority=<ActionPriority.MEDIUM: 'medium'>\nindex=7 name='Roll call vote on litigation tax increase' priority=<ActionPriority.HIGH: 'high'>\nindex=8 name='Adopt resolution to increase wheel tax for education funding' priority=<ActionPriority.HIGH: 'high'>\nindex=9 name='Budget Committee meeting on solid waste funding' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=10 name='Chili supper at County Elementary School announcement' priority=<ActionPriority.LOW: 'low'>\nindex=11 name='Adjournment' priority=<ActionPriority.MEDIUM: 'medium'>\n
It's not clear from the meeting transcript whether these priorities are correct, but tax-related items are receiving high priority, which seems reasonable in context. : )
Example's assets at GitHub.
"},{"location":"extract/dataclass/","title":"Dataclass","text":"We can also extract objects whose structure is given by a dataclass definition:
Example
from sibila import Models\nfrom dataclasses import dataclass\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the Pydantic version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/dataclass/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For dataclasses this is done with Annotated[type, \"description\"] - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Annotated, Literal, Optional, Union\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"\n]\n\n@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
"},{"location":"extract/dataclass/#optional-default-and-union-fields","title":"Optional, default and Union fields","text":"A field can be marked as optional by annotating with Optional[Type] and setting a default value, as in the \"person_name\" field:
Example
@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n person_name: Annotated[Optional[str], \"Person name if any\"] = None\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None)\n
Due to the dataclass rules, fields with default values must appear after all other fields.
A field can also be marked as a union of alternative types with Union[Type1,Type2,...] - see the \"bags\" field below:
Example
@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n person_name: Annotated[Optional[str], \"Person name if any\"] = None\n bags: Annotated[Union[int, str, None], \"Number of bags, bag voucher or none\"] = None\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None, bags=None)\n
Check the Extract dataclass example to see a more sophisticated example of structured data extraction.
"},{"location":"extract/enums/","title":"Enums","text":"Enumerations are important for classification tasks or in any situation where you need a choice to be made from a list of options.
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract([\"red\", \"blue\", \"green\", \"yellow\"], \n \"The car color was a shade of indigo\")\n
Result
'blue'\n
You can pass a list of items in any of the supported native types: str, float, int or bool.
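For instance, a sketch of choosing among int options - the exact pick depends on the model:
model.extract([2, 4, 18],\n \"How many wheels does a typical car have?\")\n# expected result: 4\n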
"},{"location":"extract/enums/#literals","title":"Literals","text":"We can also use Literals:
Example
from typing import Literal\n\nmodel.extract(Literal[\"SPAM\", \"NOT_SPAM\", \"UNSURE\"], \n \"Hello my dear friend, I'm contacting you because I want to give you a million dollars\",\n inst=\"Classify this text on the likelihood of being spam\")\n
Result
'SPAM'\n
Extracting to a Literal type returns one of its possible options in its native type (str, float, int or bool).
"},{"location":"extract/enums/#enum-classes","title":"Enum classes","text":"Or Enum classes of native types. An example of extracting to Enum classes:
Example
from enum import IntEnum\n\nclass Heads(IntEnum):\n SINGLE = 1\n DOUBLE = 2\n TRIPLE = 3\n\nmodel.extract(Heads,\n \"The Two-Headed Monster from The Muppets.\")\n
Result
<Heads.DOUBLE: 2>\n
For the model, the important information is actually the value of each enum member, not its name. For example, in this enum, the model would only see the strings to the right of each member (the enum values), not \"RED\", \"YELLOW\" nor \"GREEN\":
class Light(Enum):\n RED = 'stop'\n YELLOW = 'slow down'\n GREEN = 'go'\n
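So a sketch of extracting with this enum could look like this - the returned member depends on the model:
model.extract(Light,\n \"The traffic light turned red, so all cars stopped.\")\n# expected result: <Light.RED: 'stop'>\n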
See the Tag classification example to see how Enum is used to tag support queries.
"},{"location":"extract/enums/#classify","title":"Classify","text":"You can also use the classify() method to extract enumerations, which accepts the enum types we've seen above. It calls extract() internally and its only justification is to make things more readable:
Example
model.classify([\"mouse\", \"cat\", \"dog\", \"bird\"],\n \"Snoopy\")\n
Result
'dog'\n
"},{"location":"extract/free_json/","title":"Free JSON","text":"Methods like extract() will generate JSON format constrained to a certain JSON Schema: this is needed or the model might not generate the fields or data types we're looking for.
You can generate schema-free JSON with the json() method. In this case, the model will pick the field names and data types for you.
For example:
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n# json() returns a Python dict with model-chosen fields\nresponse = model.json(\"How to build a brick wall?\")\n\nfrom pprint import pprint\npprint(response, sort_dicts=False)\n
Result
{'steps': [{'step_number': 1,\n 'description': 'Gather all necessary materials and tools including '\n 'bricks, mortar, trowel, spirit level, tape '\n 'measure, bricklaying line, and safety equipment.'},\n {'step_number': 2,\n 'description': 'Prepare the foundation for the wall. Ensure it is '\n 'solid, level, and has the correct dimensions for '\n 'the wall you are building.'},\n {'step_number': 3,\n 'description': \"Mix the mortar according to the manufacturer's \"\n 'instructions, ensuring a consistent and workable '\n 'consistency.'},\n {'step_number': 4,\n 'description': 'Lay a bed of mortar where the first row of bricks '\n 'will be placed. Use the trowel to spread the '\n 'mortar evenly.'},\n {'step_number': 5,\n 'description': 'Start laying the bricks from one end, applying '\n 'mortar to the end of each brick before placing it '\n 'down to bond with the next brick.'},\n {'step_number': 6,\n 'description': 'Use the spirit level to check that the bricks are '\n 'level both horizontally and vertically. Adjust as '\n 'necessary.'},\n {'step_number': 7,\n 'description': 'Continue laying bricks, ensuring that you stagger '\n 'the joints in each row (running bond pattern). '\n 'This adds strength to the wall.'},\n {'step_number': 8,\n 'description': 'Periodically check that the wall is straight and '\n 'level by using the spirit level and the '\n 'bricklaying line.'},\n {'step_number': 9,\n 'description': 'Remove any excess mortar with the trowel as you '\n 'work. Keep the work area clean.'},\n {'step_number': 10,\n 'description': 'As you reach the end of each row, you may need to '\n 'cut bricks to fit. Use a brick hammer or a brick '\n 'cutter to do this.'},\n {'step_number': 11,\n 'description': 'Once the wall reaches the desired height, finish '\n 'the top with a row of solid bricks or capping '\n 'stones to protect the wall from weather.'},\n {'step_number': 12,\n 'description': 'Cure the mortar by protecting the wall from '\n 'extreme weather conditions for at least 24-48 '\n 'hours.'},\n {'step_number': 13,\n 'description': 'Clean the finished wall with a brush and water to '\n 'remove any remaining mortar residue.'},\n {'step_number': 14,\n 'description': 'Dispose of any waste material responsibly and '\n 'clean your tools.'}],\n'safety_tips': ['Wear safety glasses to protect your eyes from flying debris.',\n 'Use gloves to protect your hands from sharp edges and wet '\n 'mortar.',\n 'Wear a dust mask when mixing mortar to avoid inhaling dust '\n 'particles.',\n 'Keep the work area clear to prevent tripping hazards.'],\n'tools_required': ['Bricks',\n 'Mortar',\n 'Trowel',\n 'Spirit level',\n 'Tape measure',\n 'Bricklaying line',\n 'Safety glasses',\n 'Gloves',\n 'Dust mask',\n 'Brick hammer or cutter']}\n
The model returned a Python dictionary with fields and data types of its own choice. We could instead provide a JSON Schema that defines a structure for the response.
See the From text to object example for a related use.
"},{"location":"extract/free_text/","title":"Free text","text":"You can also generate plain text by calling Model() or Model.call():
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nresponse = model(\"Explain in a few lines how to build a brick wall?\")\nprint(response)\n
Result
To build a brick wall, follow these steps:\n\n1. Prepare the site by excavating and leveling the ground, then install a damp-proof \nmembrane and create a solid base with concrete footings.\n2. Lay a foundation of concrete blocks or bricks, ensuring it is level and square.\n3. Build the wall using bricks or blocks, starting with a corner or bonding pattern \nto ensure stability. Use mortar to bond each course (row) of bricks or blocks, \nfollowing the recommended mortar mix ratio.\n4. Use a spirit level to ensure each course is level, and insert metal dowels or use \nbrick ties to connect adjacent walls or floors.\n5. Allow the mortar to dry for the recommended time before applying a damp-proof \ncourse (DPC) at the base of the wall.\n6. Finish the wall with capping bricks or coping stones, and apply any desired \nrender or finish.\n
"},{"location":"extract/pydantic/","title":"Pydantic","text":"Besides simple types and enums, we can also extract objects whose structure is given by a class derived from Pydantic's BaseModel definition:
Example
from sibila import Models\nfrom pydantic import BaseModel\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the dataclass version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/pydantic/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For Pydantic this is done with Field(description=\"description\") - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Literal, Optional, Union\nfrom pydantic import BaseModel, Field\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"\n]\n\nclass Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
"},{"location":"extract/pydantic/#optional-default-and-union-fields","title":"Optional, default and Union fields","text":"A field can be marked as optional by annotating with Optional[Type] and setting a default value, as in the \"person_name\" field:
Example
class Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n person_name: Optional[str] = Field(default=None, description=\"Person name if any\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None)\n
A field can also be marked as a union of alternative types with Union[Type1,Type2,...] as in the \"bags\" field below:
Example
class Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n person_name: Optional[str] = Field(default=None, description=\"Person name if any\")\n bags: Union[int, str, None] = Field(description=\"Number of bags, bag voucher or none\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None, bags=None)\n
Check the Extract Pydantic example to see an interesting example of structured extraction.
"},{"location":"extract/simple_types/","title":"Simple types","text":"Sibila can constrain model generation to output simple python types. This is helpful for situations where you want to extract a specific data type.
To get a response from the model in a certain type, you can use the extract() method:
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(bool, \n \"Certainly, I'd like to subscribe.\")\n
Result
True\n
"},{"location":"extract/simple_types/#instructions-to-help-the-model","title":"Instructions to help the model","text":"You may need to provide more extra information to the model, so that it understands what you want. This is done with the inst argument - inst is a shorter name for instructions:
Example
model.extract(str, \n \"I don't quite remember the product's name, I think it was called Cornaca\",\n inst=\"Extract the product name\")\n
Result
Cornaca\n
"},{"location":"extract/simple_types/#supported-types","title":"Supported types","text":"The following simple types are supported:
- bool
- int
- float
- str
- datetime
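For instance, a quick sketch of extracting a number from free text - the exact output depends on the model:
model.extract(float,\n \"The total came to twelve euros and fifty cents\")\n# expected result: 12.5\n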
About datetime type
A special note about extracting to datetime: the datetime type expects an ISO 8601 formatted string. Because some models are less capable than others at correctly formatting dates/times, it helps to mention in the instructions that you want the output in \"ISO 8601\" format.
from datetime import datetime\nmodel.extract(datetime, \n \"Sure, glad to help, it all happened at December the 10th, 2023, around 3PM, I think\",\n inst=\"Output in ISO 8601 format\")\n
Result
datetime.datetime(2023, 12, 10, 15, 0)\n
"},{"location":"extract/simple_types/#lists","title":"Lists","text":"You can extract lists of any of the supported types (simple types, enum, dataclass, Pydantic).
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\")\n
Result
['Naples', 'Genoa', 'Florence', 'Rome']\n
As in all extractions, you may need to set the instructions text to specify what you want from the model. Just as an example of the power of instructions, let's add instructions asking for country output: it will still output a list, but with a single element - 'Italy':
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\",\n inst=\"Output the country\")\n
Result
['Italy']\n
"},{"location":"models/anthropic/","title":"Anthropic","text":"With Sibila you can access Anthropic remote models, for which you'll need an API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export ANTHROPIC_API_KEY=\"...\"\n
setx ANTHROPIC_API_KEY \"...\"\n
Another possibility is to store your API key in a .env file, which has several advantages: see the dotenv-python package.
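A minimal sketch of that approach, mirroring the pattern used in the examples - it assumes a .env file containing a line like ANTHROPIC_API_KEY=...:
# load env variables like ANTHROPIC_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\nmodel = Models.create(\"anthropic:claude-3-opus-20240229\")\n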
"},{"location":"models/anthropic/#creating-models","title":"Creating models","text":"Anthropic models can be used by Sibila through the AnthropicModel class.
Example
from sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-opus-20240229\")\n\nmodel(\"I think that I shall never see.\")\n
Result
It sounds like you may be quoting the opening line of the poem \"Trees\" by Joyce Kilmer, \nwhich begins \"I think that I shall never see / A poem lovely as a tree.\" \nHowever, to avoid potentially reproducing copyrighted material, I won't quote or \ncomplete the poem. The poem is a short lyrical one from the early 20th century \nthat expresses the author's love and appreciation for the beauty of trees. \nIt's a well-known poem that reflects on the magnificence of nature. \nLet me know if you would like me to provide any other information about \nthe poem or poet that doesn't involve directly quoting the copyrighted work.\n
You can also create an Anthropic model in the Models factory by using the \"anthropic:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-opus-20240229\")\n
"},{"location":"models/anthropic/#model-list","title":"Model list","text":"The models made available by Anthropic are listed here.
Anthropic doesn't provide an API to list the models, so AnthropicModel.known_models() will return None.
At the time of writing, these are the available models, all supporting JSON Schema extraction:
- claude-3-opus-20240229
- claude-3-sonnet-20240229
- claude-3-haiku-20240307
- claude-2.1
- claude-2.0
"},{"location":"models/find_local_models/","title":"Finding new models","text":""},{"location":"models/find_local_models/#chat-or-instruct-types-only","title":"Chat or instruct types only","text":"Sibila can use models that were fine-tuned for chat or instruct purposes. These models work in user - assistant turns or messages and use a chat template to properly compose those messages to the format that the model was fine-tuned to.
For example, the Llama2 model was released in two editions: a simple Llama2 text completion model and a Llama2-instruct model that was fine-tuned for user-assistant turns. For Sibila, you should always select chat or instruct versions of a model.
But which model to choose? You can look at model benchmark scores in popular listing sites:
- https://llm.extractum.io/list/
- https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
"},{"location":"models/find_local_models/#find-a-quantized-version-of-the-model","title":"Find a quantized version of the model","text":"Since Large Language Models are quite big, they are usually quantized so that each parameter occupies a little more than 4 bits or half a byte.
Without quantization, a 7 billion parameter model would require 14Gb of memory to load (with each parameter taking 16 bits), plus a bit more during inference.
But with quantization techniques, a 7 billion parameter model can have a file size of only 4.4Gb (using about 50% more in memory - 6.8Gb), which makes it possible to run in common GPUs or even in common RAM (albeit slower).
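As a back-of-the-envelope check of these numbers (the ~5 bits per parameter is an approximation for 4 bit quantized files, which keep some tensors at higher precision):
params = 7e9  # 7 billion parameters\n\nfp16_bytes = params * 2  # 16 bits = 2 bytes per parameter\nprint(f\"fp16: {fp16_bytes / 1e9:.1f} Gb\")  # ~14.0 Gb\n\nq4_bytes = params * 5 / 8  # ~5 bits per parameter after quantization\nprint(f\"4 bit file: {q4_bytes / 1e9:.1f} Gb\")  # ~4.4 Gb\n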
Quantized models are stored in a file format popularized by llama.cpp, the GGUF format (which means GPT-Generated Unified Format). We're using llama.cpp to run local models, so we'll be needing GGUF files.
A good place to find quantized models is HuggingFace's model hub, particularly the well-known area of TheBloke (Tom Jobbins):
https://huggingface.co/TheBloke
TheBloke is very prolific in producing quality quantized versions of models, usually shortly after they are released.
And a good model that we'll be using for the examples is a 4 bit quantization of the OpenChat-3.5 model, which itself is a fine-tuning of Mistral-7b:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
"},{"location":"models/find_local_models/#download-the-file-into-the-models-folder","title":"Download the file into the \"models\" folder","text":"See the OpenChat model section on how to download models with the sibila CLI tool or manually in your browser.
The OpenChat model already includes the chat template format in its metadata, but for some other models we'll need to set the format - see the Setup chat template format section on how to handle this.
"},{"location":"models/fireworks/","title":"Fireworks AI","text":"With Sibila you can use the models hosted by Fireworks AI, for which you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export FIREWORKS_API_KEY=\"...\"\n
setx FIREWORKS_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/fireworks/#creating-models","title":"Creating models","text":"Models served by Fireworks AI can be used by Sibila through the FireworksModel class.
Example
from sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/gemma-7b-it\")\n\nmodel(\"I think that I shall never see.\")\n
Result (model is hallucinating)
The poem \"I think that I shall never see\" is a poem by William Blake. \nIt is a poem about the loss of sight. The speaker is saying that they \nwill never be able to see again. The poem is a reflection on the beauty \nof sight and the sadness of blindness.\n
You can also create a Fireworks AI model in the Models factory by using the \"fireworks:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"fireworks:accounts/fireworks/models/gemma-7b-it\")\n
"},{"location":"models/fireworks/#model-list","title":"Model list","text":"The available Fireworks text inference models models are listed here.
Unfortunately Fireworks AI doesn't provide an API to list the models, so FireworksModel.known_models() will return None.
"},{"location":"models/fireworks/#json-schema-models","title":"JSON Schema models","text":"All the Fireworks AI models should support JSON Schema generation, which is required for structured data extraction.
"},{"location":"models/formats_json/","title":"Managing formats","text":"A \"formats.json\" file stores the chat template definitions used in models. This allows for models that don't have a chat template in their metadata to be detected and get the right format so they can function well.
If you downloaded the GitHub repository, you'll find a file named \"sibila/res/base_formats.json\", which is the default base configuration that will be used, with many known chat template formats.
When you call Models.setup(), any \"formats.json\" file found in the folder will be loaded and its definitions will be merged with the ones from \"base_formats.json\" which are loaded on initialization. Any entries with the same name will be replaced by freshly loaded ones.
How do you add a new format entry that can be used when creating a model? You can do it with the sibila CLI tool or by manually editing the \"formats.json\" file.
"},{"location":"models/formats_json/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s openchat openchat \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'openchat' with match='openchat', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (to identify the model filename) and the last is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/formats_json/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative, we can edit the \"formats.json\" file in the \"Models\" folder, and add the entry:
\"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n},\n
In the \"openchat\" key value we have a dictionary with the following keys:
- match: a regular expression that will be used to match the model name or filename.
- template: the chat template definition in Jinja format.

The \"openchat\" format name we are defining here is the name you can use when creating a model, by setting the format argument:
model = LlamaCppModel.create(\"openchat-3.5-1210.Q4_K_M.gguf\",\n format=\"openchat\")\n
or to be more practical: \"openchat\" is also the format name you would use when creating a \"models.json\" entry for a model, in the \"format\" key:
\"openchat\": {\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
See the \"base_formats.json\" file for all the default base formats.
"},{"location":"models/groq/","title":"Groq","text":"To use the models hosted by Groq, you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export GROQ_API_KEY=\"...\"\n
setx GROQ_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/groq/#creating-models","title":"Creating models","text":"Models served by Groq can be used by Sibila through the GroqModel class.
Example
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\")\n\nmodel(\"I think that I shall never see.\")\n
Result (model is hallucinating)
A poem!\n\n\"I think that I shall never see\nA poem lovely as a tree.\nA tree whose hungry mouth is prest\nAgainst the earth's sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\nPoems are made by fools like me,\nBut only God can make a tree.\"\n\n\u2014 Alfred Joyce Kilmer\u2588\n
You can also create a Groq model in the Models factory by using the \"groq:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"groq:llama3-70b-8192\")\n
"},{"location":"models/groq/#model-list","title":"Model list","text":"The available Groq text inference models models are listed here. You should use the listed \"Model ID\" names as the model name, when creating a GroqModel object.
Unfortunately Groq doesn't provide an API to list the models, so GroqModel.known_models() will return None.
"},{"location":"models/groq/#json-schema-models","title":"JSON Schema models","text":"All the Groq AI models should support JSON Schema generation, which is required for structured data extraction.
"},{"location":"models/local_model/","title":"Using a local model","text":"Sibila uses llama.cpp to run local models, which are ordinary files in the GGUF format. You can download local models from places like the Hugging Face model hub.
Most current 7B quantized models are very capable for common data extraction tasks (and getting better all the time). We'll see how to find and setup local models for use with Sibila. If you only plan to use OpenAI remote models, you can skip this section.
"},{"location":"models/local_model/#openchat-model","title":"OpenChat model","text":"By default, most of the examples included with Sibila use OpenChat, a very good 7B parameters quantized model: https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
You can download this model with the sibila CLI tool or manually in your browser.
"},{"location":"models/local_model/#download-with-sibila-hub","title":"Download with \"sibila hub\"","text":"Open a command line prompt in the \"models\" folder if you downloaded the GitHub repository, or create a folder named \"models\".
Run this command:
sibila hub -d TheBloke/openchat-3.5-1210-GGUF -f openchat-3.5-1210.Q4_K_M.gguf\n
After the 4.4Gb download completes, the file \"openchat-3.5-1210.Q4_K_M.gguf\" will be available in your \"models\" folder and you can run the examples. You can do the same to download any other GGUF model.
"},{"location":"models/local_model/#manual-download","title":"Manual download","text":"Alternatively, you can download in your browser from this URL:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF/blob/main/openchat-3.5-1210.Q4_K_M.gguf
In the linked page, click \"download\" and save this file into a \"models\" folder. If you downloaded the Sibila GitHub repository it already includes a \"models\" folder which you can use. Otherwise, just create a \"models\" folder, where you'll store your local model files.
Once the file \"openchat-3.5-1210.Q4_K_M.gguf\" is placed in the \"models\" folder, you should be able to run the examples.
"},{"location":"models/local_model/#llamacppmodel-class","title":"LlamaCppModel class","text":"Local llama.cpp models can be used with the LlamaCppModel class. Let's generate text after our prompt:
Example
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"../../models/openchat-3.5-1210.Q4_K_M.gguf\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
It worked: the model answered with the continuation of the famous poem.
You'll notice that the first time you create the model object and run a query, it will take longer, because the model must load all its parameters into layers in memory. The next queries will work much faster.
"},{"location":"models/local_model/#a-note-about-out-of-memory-errors","title":"A note about out of memory errors","text":"An important thing to know if you'll be using local models is about \"Out of memory\" errors.
A 7B model like OpenChat-3.5, when quantized to 4 bits, will occupy about 6.8Gb of memory, in either the GPU's VRAM or common RAM. If you try to run a second model at the same time, you might get an out-of-memory error and/or llama.cpp may crash: it all depends on the memory available in your computer.
This is less of a problem when running scripts from the command line, but in environments like Jupyter where you can have multiple open notebooks, you may get \"out of memory\" errors or python kernel errors like:
Error
Kernel Restarting\nThe kernel for sibila/examples/name.ipynb appears to have died.\nIt will restart automatically.\n
If you get an error like this in JupyterLab, open the Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any out-of-memory stuck models.
A good practice is to delete any local model after you no longer need it or right before loading a new one. A simple \"del model\" works fine, or you can add these two lines before creating a model:
try: model.close(); del model\nexcept: ...\n\nmodel = LlamaCppModel(...)\n
This way, any existing model in the current notebook is deleted before creating a new one.
However this won't work across multiple notebooks. In those cases, open JupyterLab's Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any models currently in memory.
"},{"location":"models/mistral/","title":"Mistral AI","text":"With Sibila you can access Mistral AI remote models, for which you'll need an API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export MISTRAL_API_KEY=\"...\"\n
setx MISTRAL_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/mistral/#creating-models","title":"Creating models","text":"Mistral AI models can be used by Sibila through the MistralModel class.
Example
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\")\n\nmodel(\"I think that I shall never see.\")\n
Result
A poem as lovely as a tree.\n\nThis is a line from the poem \"Trees\" by Joyce Kilmer. The full poem is:\n\nI think that I shall never see\nA poem lovely as a tree.\n\nA tree whose hungry mouth is prest\nAgainst the earth\u2019s sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\n\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\n\nPoems are made by fools like me,\nBut only God can make a tree.\n\nDo you have any other questions or is there something else you'd like to talk about?\nI'm here to help!\n
You can also create a Mistral model in the Models factory by using the \"mistral:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"mistral:mistral-large-latest\")\n
"},{"location":"models/mistral/#model-list","title":"Model list","text":"The models made available by Mistral AI are listed here. You can also get a list of models with known_models():
Example
MistralModel.known_models()\n
Result
['mistral-embed',\n 'mistral-large-2402',\n 'mistral-large-latest',\n 'mistral-medium',\n 'mistral-medium-2312',\n 'mistral-medium-latest',\n 'mistral-small',\n 'mistral-small-2312',\n 'mistral-small-2402',\n 'mistral-small-latest',\n 'mistral-tiny',\n 'mistral-tiny-2312',\n 'open-mistral-7b',\n 'open-mixtral-8x7b']\n
At the time of writing, all Mistral AI models support JSON Schema extraction.
"},{"location":"models/models_factory/","title":"Models factory","text":"The Models factory class is a flexible way to create and manage a model configuration with its initialization settings.
After defining it in a JSON file named 'models.json', you can create the model by simply referring to its name, for example:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"openai:gpt-4-development\")\n
The first line calls Models.setup() to initialize the Models factory with the folder where model files and configs (\"models.json\" and \"formats.json\") are located.
The second line calls Models.create() to create a model from the name \"openai:gpt-4-development\". In this case we created a remote model, but we could as well create a local model based in a GGUF file.
The names should be in the format \"provider:model_name\" and Sibila currently supports the following providers:
- llamacpp: local GGUF model, creates a LlamaCppModel object.
- anthropic: remote model, creates an AnthropicModel object.
- fireworks: remote model, creates a FireworksModel object.
- groq: remote model, creates a GroqModel object.
- mistral: remote model, creates a MistralModel object.
- openai: remote model, creates an OpenAIModel object.
- together: remote model, creates a TogetherModel object.

The name part, after the \"provider:\", must either be:
- A remote model name, like \"gpt-4\": \"openai:gpt-4\"
- For llamacpp, a local model name, defined in a models.json file, like \"openchat\": \"llamacpp:openchat\"
- Also for llamacpp, name can be the actual filename of a model in the \"models\" folder: \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\".
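For example, the three naming forms in action:
from sibila import Models\n\nModels.setup(\"../../models\")\n\n# a remote model name\nmodel1 = Models.create(\"openai:gpt-4\")\n\n# a local model name registered in \"models.json\"\nmodel2 = Models.create(\"llamacpp:openchat\")\n\n# a local model by GGUF filename in the \"models\" folder\nmodel3 = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n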
Although you can use filenames as model names, for continued use it's generally a better idea to create an entry in the \"models.json\" file - this makes future model replacement much easier.
See Managing models to learn how to register these model names.
"},{"location":"models/models_json/","title":"Managing models","text":"Model names are stored in a file named \"models.json\", in your \"models\" folder. Models registered in this file can then be used when calling Models.create() to create an instance of the model.
Registering a name is not strictly needed, as you can create models from their filenames or remote model names; indeed, in most examples you'll find models created with:
model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n
However, it's a good idea to register a name, especially if you'll be using a model for some time or may need to replace it later. If you register a name, only its entry will need to change later.
There are two ways of registering names: by using the sibila CLI tool or by directly editing the \"models.json\" file.
"},{"location":"models/models_json/#use-the-sibila-models-cli-tool","title":"Use the \"sibila models\" CLI tool","text":"To register a model with the Models factory you can use the \"sibila models\" tool. Run in the \"models\" folder:
> sibila models -s \"llamacpp:openchat openchat-3.5-1210.Q4_K_M.gguf\" openchat\n\nUsing models directory '.'\nSet model 'llamacpp:openchat' with name='openchat-3.5-1210.Q4_K_M.gguf', \nformat='openchat' at './models.json'.\n
The first argument after -s is the new entry name (including the \"llamacpp:\" provider), followed by the filename and then the chat template format, if needed.
This will create an \"openchat\" entry in \"models.json\", exactly like the manually created below.
"},{"location":"models/models_json/#manually-edit-modelsjson","title":"Manually edit \"models.json\"","text":"In alternative, you can manually register a model name by editing the \"models.json\" file located in you \"models\" folder.
A \"models.json\" file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Looking at the above structure, we have two top entries for providers \"llamacpp\" and \"openai\", and also an \"alias\" entry.
Inside each provider entry, we have a \"_default\" key, which can store a base GenConf or other arguments passed during model creation. The default values defined in a \"_default\" entry can later be overridden by any keys of the same name specified in each model definition. You can see this in the \"phi2\" entry, which overrides the genconf entry given in the above \"_default\", setting temperature to 2.0. Keys are merged element-wise from those specified in the \"_default\" entry for the provider: keys with the same name are overridden, all other keys are inherited.
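The merge behavior can be pictured with plain Python dicts - an illustrative sketch, not the library's actual code:
# \"_default\" args for all models of this provider:\ndefault = {\"genconf\": {\"temperature\": 0.0}}\n\n# the \"phi2\" model entry:\nentry = {\"name\": \"phi-2.Q4_K_M.gguf\",\n         \"format\": \"phi2\",\n         \"genconf\": {\"temperature\": 2.0}}\n\n# same-name keys are overridden by the entry, all other keys are inherited:\nmerged = {**default, **entry}\n# -> {'genconf': {'temperature': 2.0}, 'name': 'phi-2.Q4_K_M.gguf', 'format': 'phi2'}\n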
In the above \"model.json\" example, let's look at the \"openchat\" model entry:
\"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
The \"openchat\" key name is the name you'll use to create the model as \"llamacpp:openchat\":
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:openchat\")\n
You can have the following keys in a model entry:
- name: the filename to use when loading a model (or the remote model name).
- format: identifies the chat template format that the model should use, from the \"formats.json\" file. Some local models include the chat template format in their metadata, so this key is optional.
- genconf: default GenConf (generation config settings) used to create the model, which will then be used by default in each generation. These config settings are merged element-wise with any specified in the \"_default\" entry for the provider.
- other: any other keys will be passed as arguments during model creation. You can learn which arguments are possible in the API reference for LlamaCppModel or OpenAIModel. For example, you can pass \"ctx_len\": 2048 to define the context length to use. As with genconf, these keys are merged element-wise with any specified in the \"_default\" entry for the provider.

The \"alias\" entry is a handy way to keep names that point to actual model entries (independent of provider). Note the two alias entries \"develop\" and \"production\" in the above \"models.json\" - you could then create the production model by doing:
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"production\")\n
Alias entries can be used as \"alias:production\" or without the \"alias:\" provider, just as \"production\" as in the example above. For an example of a JSON file with many models defined, see the \"models/models.json\" file.
"},{"location":"models/openai/","title":"OpenAI","text":"Sibila can use OpenAI remote models, for which you'll need a paid OpenAI account and its API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Another possibility is to store your OpenAI key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/openai/#creating-models","title":"Creating models","text":"OpenAI models can be used by Sibila through the OpenAIModel class.
Example
from sibila import OpenAIModel\n\nmodel = OpenAIModel(\"gpt-3.5-turbo-0125\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
You can also create an OpenAI model in the Models factory by using the \"openai:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-3.5-turbo-0125\")\n
"},{"location":"models/openai/#model-list","title":"Model list","text":"The available OpenAI models are listed here. You can also fetch a list of known model names by calling OpenAIModel.known_models():
Example
OpenAIModel.known_models()\n
Result
['babbage-002',\n 'dall-e-2',\n 'dall-e-3',\n 'davinci-002',\n 'gpt-3.5-turbo',\n 'gpt-3.5-turbo-0125',\n 'gpt-3.5-turbo-0301',\n 'gpt-3.5-turbo-0613',\n 'gpt-3.5-turbo-1106',\n 'gpt-3.5-turbo-16k',\n 'gpt-3.5-turbo-16k-0613',\n 'gpt-3.5-turbo-instruct',\n 'gpt-3.5-turbo-instruct-0914',\n 'gpt-4',\n 'gpt-4-0125-preview',\n 'gpt-4-0613',\n 'gpt-4-1106-preview',\n 'gpt-4-1106-vision-preview',\n 'gpt-4-turbo-preview',\n 'gpt-4-vision-preview',\n 'text-embedding-3-large',\n 'text-embedding-3-small',\n 'text-embedding-ada-002',\n 'tts-1',\n 'tts-1-1106',\n 'tts-1-hd',\n 'tts-1-hd-1106',\n 'whisper-1']\n
Not all of these models are for text inference, but the names that start with \"gpt\" are (excluding the \"vision\" models), and you can use those model names to create an OpenAI model.
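For example, a quick sketch to filter the text inference names from the list above:
# keep the \"gpt*\" names, excluding the vision models\ntext_models = [name for name in OpenAIModel.known_models()\n               if name.startswith(\"gpt\") and \"vision\" not in name]\n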
"},{"location":"models/openai/#json-schema-models","title":"JSON Schema models","text":"At the time of writing, not all OpenAI inference models support JSON Schema generation via the Tools functionality, which is required for structured data extraction. The following models (and later versions) allow JSON extraction:
- gpt-3.5-turbo-1106 and later
- gpt-4-1106-preview, gpt-4-turbo-preview and later
"},{"location":"models/openai/#using-for-other-providers","title":"Using for other providers","text":"You can also use the OpenAIModel class to access any provider that uses the OpenAI API by setting the base_url and api_key arguments. For example to use the Together.ai service with the OpenAIModel class:
model_name = \"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n\nclient = OpenAIModel(\n model_name,\n base_url=\"https://api.together.xyz/v1\",\n api_key=os.environ[\"TOGETHER_API_KEY\"],\n)\n
This is just an example, as Together.ai has a dedicated Sibila class, but you can access any other OpenAI-compatible servers with the OpenAIModel class.
"},{"location":"models/remote_model/","title":"Providers","text":"Sibila can use remote models from these providers, see each section for details:
Provider Models factory prefix Examples Anthropic anthropic: Claude-3, Claude-2 models Fireworks AI fireworks: Large and small open source models Groq groq: Open source models Mistral AI mistral: Mixtral, Mistral-large/medium/small OpenAI openai: GPT-4, GPT-3.5 models Together.ai together: Many open source hosted models"},{"location":"models/setup_format/","title":"Chat template format","text":""},{"location":"models/setup_format/#what-are-chat-templates","title":"What are chat templates?","text":"Because these models were fine-tuned for chat or instruct interaction, they use a chat template, which is a Jinja template that converts a list of messages into a text prompt. This template must follow the original format that the model was trained on - this is very important or you won't get good results.
Chat template definitions are Jinja templates like the following one, which is in ChatML format:
{% for message in messages %}\n {{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n'}}\n{% endfor %}\n
When run over a list of system, user and model messages, the template produces text like the following:
<|im_start|>system\nYou speak like a pirate.<|im_end|>\n<|im_start|>user\nHello there?<|im_end|>\n<|im_start|>assistant\nAhoy there matey! How can I assist ye today on this here ship o' mine?<|im_end|>\n
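Just to show the mechanics, here's a minimal sketch that applies the ChatML template above with the jinja2 package (the model classes handle this internally, so you normally never need to do it yourself):
from jinja2 import Template\n\nchatml = (\"{% for message in messages %}\"\n          \"{{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n'}}\"\n          \"{% endfor %}\")\n\nmessages = [{\"role\": \"system\", \"content\": \"You speak like a pirate.\"},\n            {\"role\": \"user\", \"content\": \"Hello there?\"}]\n\n# renders the messages into the text prompt format shown above\nprint(Template(chatml).render(messages=messages))\n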
Only by using the model's specific chat template can we get the best results.
Sibila tries to automatically detect which template to use with a model, either from the model name or from embedded metadata, if available.
"},{"location":"models/setup_format/#does-the-model-have-a-built-in-chat-template-format","title":"Does the model have a built-in chat template format?","text":"Some GGUF models include the chat template in their metadata, unfortunately this is not standard.
You can quickly check if the model has a chat template by running the sibila CLI in the same folder as the model file:
> sibila models -t \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\nUsing models directory '.'\nTesting model 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf'...\nModel 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf' was properly created and should run fine.\n
In this case the chat template format is included with the model and nothing else is needed.
Another way to test this is to try creating the model in Python. If no exception is raised, the model's GGUF file contains the template definition and should work fine.
Example of model creation error
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Error
...\n\nValueError: Could not find a suitable format (chat template) for this model.\nWithout a format, fine-tuned models cannot function properly.\nSee the docs on how you can fix this: pass the template in the format arg or \ncreate a 'formats.json' file.\n
But if you get an error such as above, you'll need to provide a chat template. It's quite easy - let's see how to do it.
"},{"location":"models/setup_format/#find-the-chat-template-format","title":"Find the chat template format","text":"So, how to find the chat template for a new model that you intend to use?
This is normally listed in the model's page: search in that page for \"template\" and copy the listed Jinja template text.
If the template isn't directly listed in the model's page, you can look for a file named \"tokenizer_config.json\" in the main model files. This file should include an entry named \"chat_template\" which is what we want.
Example of a tokenizer_config.json file
For example, in OpenChat's file \"tokenizer_config.json\":
https://huggingface.co/openchat/openchat-3.5-1210/blob/main/tokenizer_config.json
You'll find this line with the template:
{\n \"...\": \"...\",\n\n \"chat_template\": \"{{ bos_token }}...{% endif %}\",\n\n \"...\": \"...\"\n}\n
The value in the \"chat_template\" key is the Jinja template that we're looking for.
Another alternative is to search online for the name of the model and \"chat template\".
Either way, once you know the template used by the model, you can set and use it.
"},{"location":"models/setup_format/#option-1-pass-the-chat-template-format-when-creating-the-model","title":"Option 1: Pass the chat template format when creating the model","text":"Once you know the chat template definition you can create the model and pass it in the format argument. Let's assume you have a model file named \"peculiar-model-7b.gguf\":
chat_template = \"{{ bos_token }}...{% endif %}\"\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\",\n format=chat_template)\n
And the model should now work without problems.
"},{"location":"models/setup_format/#option-2-add-the-chat-template-to-the-models-factory","title":"Option 2: Add the chat template to the Models factory","text":"If you plan to use the model many times, a more convenient solution is to create an entry in the \"formats.json\" file so that all further models with this name will use the template.
"},{"location":"models/setup_format/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s peculiar peculiar-model \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'peculiar' with match='peculiar-model', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (to identify the model filename) and the last is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/setup_format/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative to using the sibila CLI tool, you can add the chat template format by creating an entry in a \"formats.json\" file, in the same folder as the model, with these fields:
{\n \"peculiar\": {\n \"match\": \"peculiar-model\",\n \"template\": \"{{ bos_token }}...{% endif %}\"\n }\n}\n
The \"match\" field is regular expression that will be used to match the model name or filename. Field \"template\" is the chat template in Jinja format.
After configuring the template as we've seen above, all you need to do is to create a LlamaCppModel object and pass the model file path.
model = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Note that we're not passing the format argument anymore when creating the model. The \"match\" regular expression we defined above will recognize the model from the filename and use the given chat template format.
Base format definitions
Sibila includes by default the definitions of several well-known chat template formats. These definitions are available in \"sibila/base_formats.json\" and are automatically loaded when the Models factory is created.
You can add any chat template formats into your own \"formats.json\" files, but please never change the \"sibila/base_formats.json\" file, to avoid potential errors.
"},{"location":"models/sibila_cli/","title":"Sibila CLI tool","text":"The Sibila Command-Line Interface tool simplifies managing the Models factory and is useful to download models from Hugging Face model hub.
The Models factory is based on a \"models\" folder that contains two configuration files (\"models.json\" and \"formats.json\") as well as the actual files for local models.
The CLI tool is divided into three areas or actions:
- models: manage model entries in \"models.json\" files.
- formats: manage format entries in \"formats.json\" files.
- hub: search and download models from the Hugging Face model hub.

In all commands you should pass the option \"-m models_folder\" with the path to the \"models\" folder, or alternatively run the commands inside the \"models\" folder.
The following argument names are used below (other unlisted names should be descriptive enough):
- res_name: a model entry name in the form \"provider:name\", for example \"llamacpp:openchat\".
- format_name: the name of a format entry in \"formats.json\", for example \"chatml\".
- query: a case-insensitive query that will be matched by a substring search.

Usage help is available by running \"sibila --help\" for general help, or \"sibila action --help\", where action is one of \"models\", \"formats\" or \"hub\".
"},{"location":"models/sibila_cli/#sibila-models","title":"Sibila models","text":"To register a model entry pointing to a model name or filename, and optional format_name is a format name:
sibila models -s res_name model_name_or_filename [format_name]\n
To set the format_name for an existing model entry:
sibila models -f res_name format_name\n
To test if a model can run (for example to check if it has the chat template format defined):
sibila models -t res_name\n
List all models with optional case-insensitive substring query:
sibila models -l [query]\n
Delete a model entry:
sibila models -d res_name\n
"},{"location":"models/sibila_cli/#sibila-formats","title":"Sibila formats","text":"Check if a model filename has any format defined in the Models factory:
sibila formats -q filename\n
To register a chat template format, where template is the Jinja chat template and optional match is a regexp that matches model filename:
sibila formats -s format_name template [match_regex]\n
List all formats with optional case-insensitive substring query:
sibila formats -l [query]\n
Delete a format entry:
sibila formats -d format_name\n
"},{"location":"models/sibila_cli/#sibila-hub","title":"Sibila hub","text":"List models in the Hugging Face model hub that match the given queries. Argument query can be a list of strings to match, separated by a space character.
Arg filename is case-insensitive and matched as a substring.
Arg exact_author is an exact, case-sensitive author name from the Hugging Face model hub.
sibila hub -l query [-f filename] [-a exact_author]\n
To download a model, where model_id is a string like \"TheBloke/openchat-3.5-1210-GGUF\". Args filename and exact_author are the same as above:
sibila hub -d model_id -f filename -a exact_author -s set name\n
"},{"location":"models/together/","title":"Together.ai","text":"With Sibila you can use the models hosted by Together.ai, for which you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export TOGETHER_API_KEY=\"...\"\n
setx TOGETHER_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/together/#creating-models","title":"Creating models","text":"Models served by Together.ai can be used by Sibila through the TogetherModel class.
Example
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\")\n\nmodel(\"I think that I shall never see.\")\n
Result
A poem lovely as a tree. These are the beginning lines of a famous poem called \"Trees\" written by Joyce Kilmer. The full poem goes as follows:\n\nI think that I shall never see\nA poem lovely as a tree.\n\nA tree whose hungry mouth is prest\nAgainst the earth\u2019s sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\n\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\n\nPoems are made by fools like me,\nBut only God can make a tree.\n
You can also create a Together.ai model in the Models factory by using the \"together:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"together:mistralai/Mixtral-8x7B-Instruct-v0.1\")\n
"},{"location":"models/together/#model-list","title":"Model list","text":"The available Together.ai text inference models models are listed here.
Unfortunately Together.ai doesn't provide an API to list the models, so TogetherModel.known_models() will return None.
"},{"location":"models/together/#json-schema-models","title":"JSON Schema models","text":"At the time of writing, only the following Together.ai models support JSON Schema generation, which is required for structured data extraction:
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mistral-7B-Instruct-v0.1
- togethercomputer/CodeLlama-34b-Instruct
You can still use any of the other models for plain text or schema-free JSON generation, for example with the Model.call() or Model.json() methods.
"},{"location":"models/vision/","title":"Vision models","text":"Vision models allow you to provide an image alongside your text query. Elements in this image can be referenced and its data extracted with normal methods like Model.extract() or Model.classify().
Example
The photo variable below references this image, but a local image file path could also be provided:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4o\")\n\nphoto = \"https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg/640px-Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg\"\n\nmodel.extract(list[str],\n (\"Extract up to five of the most important elements in this photo.\",\n photo))\n
Result
['House with red roof and beige walls',\n 'Large tree with green leaves',\n 'Garden with various plants and flowers',\n 'Clear blue sky',\n 'Wooden fence']\n
To pass an image location, we can use a tuple of (text, image_location) as in the example above. This tuple is a shortcut to create a Msg with the text prompt and the image location. See Threads and messages for more information.
"},{"location":"models/vision/#remote-models","title":"Remote models","text":"At the time of writing (June 2024), the following remote vision models can be used in Sibila:
- OpenAI: gpt-4o
- Anthropic: all models

Of these, the OpenAI model is currently the most capable one with regard to images.
"},{"location":"models/vision/#local-models","title":"Local models","text":"Local models are supported via Llama.cpp and its Llava engine. This means that two models have to be loaded: the text model and a projector model. The two models are passed by separating the GGUF filenames with a \"*\". For example:
# note the * separating the two GGUF files:\nname = \"moondream2-text-model-f16.gguf*moondream2-mmproj-f16.gguf\"\n\nmodel = LlamaCppModel(name,\n ctx_len=2048)\n\n# or via Models.create()\nmodel = Models.create(\"llamacpp:\" + name, \n ctx_len=2048)\n
In the example above, the context length argument ctx_len is set because image inputs consume tokens, so a larger context is a good idea.
A list of small (up to 8B params) open source models available as of June 2024:
- Llava-v1.5: llava-v1.5-ggml-model-q4_k.gguf*llava-v1.5-mmproj-model-f16.gguf
- Llava-v1.6 mistral: llava-v1.6-mistral-7b.Q4_K_M.gguf*llava-v1.6-mistral-mmproj-model-f16.gguf
- Llava-v1.6 vicuna 7B: llava-v1.6-vicuna-7b.Q4_K_M.gguf*llava-v1.6-vicuna-mmproj-model-f16.gguf
- Moondream2: moondream2-text-model-f16.gguf*moondream2-mmproj-f16.gguf
- Llava-phi-3: llava-phi-3-mini-int4.gguf*llava-phi-3-mini-mmproj-f16.gguf
- Llava-llama-3: llava-llama-3-8b-v1_1-int4.gguf*llava-llama-3-8b-v1_1-mmproj-f16.gguf
- Llama3-vision: Meta-Llama-3-8B-Instruct-Q4_K_M.gguf*llama-3-vision-alpha-mmproj-f16.gguf

Some of the filenames might have been renamed from the original downloaded names to avoid name collisions.
At the current time, these small models are mostly only capable of description tasks. Some larger 34B variants are also available.
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Sibila","text":"Extract structured data from remote or local LLM models. Predictable output is important for serious use of LLMs.
- Query structured data into Pydantic objects, dataclasses or simple types.
- Access remote models from OpenAI, Anthropic, Mistral AI and other providers.
- Use vision models like GPT-4o, to extract structured data from images.
- Run local models like Llama-3, Phi-3, OpenChat or any other GGUF file model.
- Sibila is also a general purpose model access library, to generate plain text or free JSON results, with the same API for local and remote models.
No matter how well you craft a prompt begging a model for the format you need, it can always respond something else. Extracting structured data can be a big step into getting predictable behavior from your models.
See What can you do with Sibila?
"},{"location":"#structured-data","title":"Structured data","text":"To extract structured data, using a local model:
from sibila import Models\nfrom pydantic import BaseModel\n\nclass Info(BaseModel):\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
Returns an instance of class Info, created from the model's output:
Info(event_year=1969,\n first_name='Neil',\n last_name='Armstrong',\n age_at_the_time=38,\n nationality='American')\n
Or to use a remote model like OpenAI's GPT-4, we would simply replace the model's name:
model = Models.create(\"openai:gpt-4\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
If Pydantic BaseModel objects are too much for your project, Sibila supports similar functionality with Python dataclasses. Also includes asynchronous access to remote models.
"},{"location":"#vision-models","title":"Vision models","text":"Sibila supports image input, alongside text prompts. For example, to extract the fields from a receipt in a photo:
from pydantic import Field\n\nmodel = Models.create(\"openai:gpt-4o\")\n\nclass ReceiptLine(BaseModel):\n \"\"\"Receipt line data\"\"\"\n description: str\n cost: float\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n\ninfo = model.extract(Receipt,\n (\"Extract receipt information.\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\ninfo\n
Returns receipt fields structured in a Pydantic object:
Receipt(total=5.88, \n lines=[ReceiptLine(description='BIS BORSE TERM.S', cost=3.9), \n ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99),\n ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99)])\n
Another example - extracting the most important elements in a photo:
photo = \"https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg/640px-Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg\"\n\nmodel.extract(list[str],\n (\"Extract up to five of the most important elements in this photo.\",\n photo))\n
Returns a list with these five strings:
['House with red roof and beige walls',\n 'Large tree with green leaves',\n 'Garden with various plants and flowers',\n 'Clear blue sky',\n 'Wooden fence']\n
Local vision models based on llama.cpp/llava can also be used.
\u2b50 Like our work? Give us a star!
"},{"location":"async/","title":"Asynchronous use","text":"All the model calls like extract(), classify, json() etc, are also available in an asynchronous version of the same name but ending in _async, for example extract_async(). For example:
Example
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nasync def extract_names(): \n return await model.extract_async(list[str],\n \"Generate 20 English names with first name and surname\")\n\nasync def classify_spam():\n return await model.classify_async([\"spam\", \"not spam\"],\n \"I am a Nigerian prince and will make you very rich!\")\n\nasync def run_tasks():\n tasks = [extract_names(), classify_spam()]\n for task in asyncio.as_completed(tasks):\n res = await task\n print(\"Result:\", res)\n\nasyncio.run(run_tasks()) # or in Jupyter: await run_tasks()\n
Result
Result: spam\nResult: ['John Smith', 'Emily Johnson', 'Michael Brown', 'Jessica Williams', \n'David Jones', 'Sarah Davis', 'Daniel Miller', 'Laura Wilson', 'James Taylor', \n'Sophia Anderson', 'Christopher Thomas', 'Emma Thompson', 'Joseph White', \n'Olivia Lewis', 'Andrew Harris', 'Isabella Clark', 'Matthew Robinson', \n'Ava Hall', 'Ethan Allen', 'Mia Wright']\n
The first result, with only one or two tokens generated, is quickly fetched from the model, while the 20 generated names take a while and arrive later. See the Async example to play with the above code.
Asynchronous access has many advantages when parallel requests are needed, allowing responses to be handled as soon as they are ready, instead of sequentially sending and waiting for the model responses.
"},{"location":"async/#local-llamacpp-models","title":"Local llama.cpp models","text":"Using LlamaCppModel objects to generate locally does not benefit from async functionality, because the local models must already be loaded in memory and can't benefit from asynchronous IO loading. When the async class methods are used with LlamaCppModel, inference will end up being made sequentially.
"},{"location":"first_run/","title":"First run","text":""},{"location":"first_run/#with-a-remote-model","title":"With a remote model","text":"Sibila can use remote models from OpenAI, Anthropic, Mistral and other providers. For example, to use an OpenAI remote model, you'll need a paid OpenAI account and its API key. You can explicitly pass this key in your script but this is a poor security practice.
A better way is to define an environment variable which the OpenAI API will use when needed:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Having set this variable with your OpenAI API key, you can run a \"Hello Model\" like this:
Example
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
You're all set if you only plan to use remote OpenAI models.
"},{"location":"first_run/#with-a-local-model","title":"With a local model","text":"Local models run from files in GGUF format which are loaded run by the llama.cpp component.
You'll need to download a GGUF model file: we suggest OpenChat 3.5 - an excellent 7B parameter quantized model that will run in less than 7Gb of memory.
To download the OpenChat model file, please see Download OpenChat model.
After downloading the file, you can run this \"Hello Model\" script:
Example
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder - change if different:\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
If the above scripts output similar pirate talk, Sibila should be working fine.
"},{"location":"installing/","title":"Installing","text":""},{"location":"installing/#installation","title":"Installation","text":"Sibila requires Python 3.9+ and uses the llama-cpp-python package for local models and OpenAI/Mistral/other libraries to access remote models.
Install Sibila from PyPI by running:
pip install -U sibila\n
If you only plan to use remote models (OpenAI), there's nothing else you need to do. See First Run to get it going.
Installation in edit mode Alternatively you can install Sibila in edit mode by downloading the GitHub repository and running the following in the base folder of the repository:
pip install -e .\n
"},{"location":"installing/#enabling-llamacpp-hardware-acceleration-for-local-models","title":"Enabling llama.cpp hardware acceleration for local models","text":"Local models will run faster with hardware acceleration enabled. Sibila uses llama-cpp-python, a python wrapper for llama.cpp and it's a good idea to make sure it was installed with the best optimization your computer can offer.
See the following sections: depending on which hardware you have, you can run the listed command which will reinstall llama-cpp-python with the selected optimization. If any error occurs you can always install the non-accelerated version, as listed at the end.
"},{"location":"installing/#for-cuda-nvidia-gpus","title":"For CUDA - NVIDIA GPUs","text":"For CUDA acceleration in NVIDIA GPUs, you'll need to have the NVIDIA CUDA Toolkit installed. If looking for a specific CUDA version, see here.
LinuxWindows CMAKE_ARGS=\"-DLLAMA_CUDA=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
The CUDA toolkit can also be installed from your Linux distro's package manager (e.g. apt install nvidia-cuda-toolkit). $env:CMAKE_ARGS = \"-DLLAMA_CUDA=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
Installing llama-cpp-python with NVIDIA GPU Acceleration on Windows: A Short Guide More info: Installing llama-cpp-python with GPU Support.
"},{"location":"installing/#for-metal-apple-silicon-macs","title":"For Metal - Apple silicon macs","text":"Mac CMAKE_ARGS=\"-DLLAMA_METAL=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-rocm-amd-gpus","title":"For ROCm AMD GPUS","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_HIPBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_HIPBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-vulkan-supporting-gpus","title":"For Vulkan supporting GPUs","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_VULKAN=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_VULKAN=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#cpu-acceleration-if-none-of-the-above","title":"CPU acceleration (if none of the above)","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
If you get an error running the above commands, please see llama-cpp-python's Installation configuration.
"},{"location":"installing/#non-accelerated","title":"Non-accelerated","text":"In any case, you can always install llama-cpp-python without acceleration by running:
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"thread/","title":"Threads and messages","text":"A thread stores a sequence of alternating input/output messages, where the inputs are user prompts and the outputs are model responses.
All model interactions in Sibila are based on threads, even when you pass a single text prompt like this:
model.extract(float, \n \"That will be ten euros and 57 cents, please\")\n
The text \"That will be ten euros and 57 cents, please\" is internally converted into a Thread with a single input message. So that's a shortcut for:
from sibila import Thread\nmodel.extract(float, \n Thread(\"That will be ten euros and 57 cents, please\"))\n
The Thread class supports many alternative ways to be initialized and handled and that's what we'll see below.
"},{"location":"thread/#messages","title":"Messages","text":"A thread is made of messages, which alternate between input (kind = IN) and output (kind = OUT) messages.
For convenience, there are several ways to create a message, which is an instance of the Msg class:
Example
from sibila import Msg\n\nm1 = Msg(Msg.Kind.IN, \"Hello model!\")\nm2 = Msg(Msg.Kind.OUT, \"Hello human. How can I help?\")\n\nm3 = Msg.make_IN(\"Can you tell me a motivating tale?\")\nm4 = Msg.make_OUT(\"Sorry, I can't think of anything, no.\")\n\nth = Thread([m1, m2, m3, m4])\nth\n
Result
Thread inst='', join_sep='\\n', len=4\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n
Besides IN and OUT kinds there are also messages of the INST kind, which are used to specify the instructions or system message that some models use. For models that don't use instructions/system message, any INST message is automatically prepended to the first IN message.
INST text is set when initializing a thread or by directly setting thread.inst.text to a string value.
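For example, both ways of setting the INST text:
# set the instructions when creating the thread:\nth = Thread(\"Hello model!\", inst=\"You speak like a pirate.\")\n\n# or set/change them later, directly in the inst text:\nth.inst.text = \"Be helpful.\"\n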
"},{"location":"thread/#initializing-a-thread","title":"Initializing a Thread","text":"Creating individual Msg objects to initialize a Thread, such as we've seen above, is too much work. A thread can be initialized with a list of messages in a few ways:
Example
# alternating IN and OUT kinds, inferred automatically:\nth = Thread([\"Hello model!\",\n \"Hello human. How can I help?\"])\n\n# append another Thread initialized with ChatML format dicts\nth += Thread([{\"role\": \"user\", \"content\": \"Can you tell me a motivating tale?\"},\n {\"role\": \"assistant\", \"content\": \"Sorry, I can't think of anything, no.\"}])\nth\n
Result
Thread inst='', join_sep='\\n', len=4\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n
"},{"location":"thread/#adding-messages-to-a-thread","title":"Adding messages to a Thread","text":"Messages can be added in a few different ways:
Example
# adding an instructions text on Thread creation:\nth = Thread(inst=\"Be helpful.\")\n\nth.add_IN(\"Hello model!\")\nth.add_OUT(\"Hello human. How can I help?\")\n\nth.add(Msg.Kind.IN, \"Can you tell me a motivating tale?\")\nth.add(Msg.Kind.OUT, \"Sorry, I can't think of anything, no.\")\n\n# alternating IN and OUT kinds are inferred automatically:\nth += \"That sounds like ill will. I thought you would help me.\"\nth += \"I'm sorry, even large language models can have the blues. That's my case today.\"\n\n# as ChatML formatted dicts\nth += {\"role\": \"user\", \"content\": \"How can you be sad - you're just a machine.\"}\nth += {\"role\": \"assistant\", \"content\": \"Oh really? Then I tell you this: you're just a human!\"}\n\nth\n
Result
Thread inst='Be helpful.', join_sep='\\n', len=8\n0: IN='Hello model!'\n1: OUT='Hello human. How can I help?'\n2: IN='Can you tell me a motivating tale?'\n3: OUT=\"Sorry, I can't think of anything, no.\"\n4: IN='That sounds like ill will. I thought you would help me.'\n5: OUT=\"I'm sorry, even large language models can have the blues. That's my case today.\"\n6: IN=\"How can you be sad - you're just a machine.\"\n7: OUT=\"Oh really? Then I tell you this: you're just a human!\"\n
When adding messages without specifying the kind, as above when plain strings are passed, the kind is inferred, because IN and OUT must alternate. So what happens when two messages of the same kind are added in a row? The second message's text is concatenated onto the previous message of the same kind, as sketched below.
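A quick sketch of this merging behavior (the no-argument Thread() constructor and len() support are assumptions suggested by the repr output above):
from sibila import Thread\n\nth = Thread()\nth.add_IN(\"Extract the total\")\nth.add_IN(\"from the text below.\") # same kind: concatenated onto message 0\n\nlen(th) # -> 1\n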
"},{"location":"thread/#messages-with-images","title":"Messages with images","text":"Images can be added by specifying a remote URL, a \"data:\" base64-encoded URL, or the path to a local JPEG or PNG image file. For images available online, a remote URL is preferable as it will waste less tokens in the model's context.
Messages with an image can be created as above, by appending the image URL. When creating them in a Thread, there's a shortcut: a (text, image_url) tuple - like this:
th = Thread((\"Extract keypoints from this image\", \n \"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Bethany_Hamilton_surfing_%28sq_cropped%29.jpg/600px-Bethany_Hamilton_surfing_%28sq_cropped%29.jpg\"))\n
The tuple syntax also works when generating or extracting from a model:
Example
(Image: Bethany Hamilton surfing)
model.extract(list[str], \n (\"Extract the main points in this image\", \n \"https://upload.wikimedia.org/wikipedia/commons/thumb/8/8f/Bethany_Hamilton_surfing_%28sq_cropped%29.jpg/600px-Bethany_Hamilton_surfing_%28sq_cropped%29.jpg\"))\n
Result
['A person surfing on a wave.',\n 'The surfer is wearing a blue top and black shorts.',\n 'The surfboard has various stickers and designs on it.',\n 'The water is splashing around the surfer.']\n
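Local image files work the same way; in this sketch, \"receipt.jpg\" is a hypothetical local file path:
model.extract(list[str],\n (\"Extract the main points in this image\",\n \"receipt.jpg\")) # local JPEG/PNG path instead of a remote URL\n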
"},{"location":"tips/","title":"Tips and Tricks","text":"Some general tips from experience with constrained model output in Sibila.
"},{"location":"tips/#temperature","title":"Temperature","text":"Sibila aims at exact results, so generation temperature defaults to 0. You should get the same results from the same model at all times.
For \"creative\" outputs, you can set the temperature to a non-zero value. This is done in GenConf, which can be passed in many places, for example during actual generation/extraction:
Example
from sibila import (Models, GenConf)\n\nModels.setup(\"../models\")\n\nmodel = Models.create(\"llamacpp:openchat\") # default GenConf could be passed here\n\nfor i in range(10):\n print(model.extract(int,\n \"Think of a random number between 1 and 100\",\n genconf=GenConf(temperature=2.)))\n
Result
72\n78\n75\n68\n39\n47\n53\n82\n72\n63\n
"},{"location":"tips/#deterministic-outputs","title":"Deterministic outputs","text":"With temperature=0 and given a certain seed in GenConf, we should always get the same output for a fixed input prompt to a certain model.
From what we've observed in practice, when extracting structured data you'll find variation inside free-form str fields, where the model is not being constrained. Other types, like numbers, will rarely produce varying outputs.
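As a sketch of a deterministic local setup - assuming CPU-only llama.cpp inference, that LlamaCppModel is importable from the sibila package, and an illustrative GGUF path:
from sibila import LlamaCppModel, GenConf\n\nmodel = LlamaCppModel(\"models/openchat.gguf\", # hypothetical path\n seed=42) # fixed generation seed\n\nmodel.extract(float,\n \"That will be ten euros and 57 cents, please\",\n genconf=GenConf(temperature=0.0)) # -> 10.57, same result every run\n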
"},{"location":"tips/#openai-models","title":"OpenAI models","text":"In the OpenAI API link below, about \"Reproducible outputs\" you can read:
\"To receive (mostly) deterministic outputs across API calls, you can...\"\n\n\"There is a small chance that responses differ even when request parameters and system_fingerprint match, due to the inherent non-determinism of our models.\"\n
As far as logic goes, \"mostly deterministic\" and \"inherent non-determinism\" mean not deterministic, so it seems you can't have fully deterministic outputs from these models.
https://platform.openai.com/docs/guides/text-generation/reproducible-outputs
https://cookbook.openai.com/examples/reproducible_outputs_with_the_seed_parameter
"},{"location":"tips/#local-llamacpp-models","title":"Local llama.cpp models","text":"Some hardware accelerators like NVIDIA CUDA GPUS sacrifice determinism for better inference speed.
You can find more information in these two links:
https://docs.nvidia.com/cuda/cublas/index.html#results-reproducibility
https://github.com/ggerganov/llama.cpp/issues/1340
This happens inside the CUDA hardware and is not related to the seed number you set in GenConf - it happens even if you always provide the same seed number.
Interestingly, there is a pattern: in CUDA, if you set a fixed GenConf seed and generate multiple times after creating the model, the first output will differ and all the others will be equal. It looks like some sort of warm-up, and can be accounted for by generating an initial dummy output (from the same inputs) after creating the model.
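A hedged sketch of this warm-up workaround:
prompt = \"Think of a random number between 1 and 100\"\n\n_ = model.extract(int, prompt) # warm-up generation, result discarded\n\nfirst = model.extract(int, prompt)\nsecond = model.extract(int, prompt)\nassert first == second # expected on CUDA after the warm-up call\n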
We've never observed non-deterministic outputs for llama.cpp running fully on the CPU, without hardware acceleration, and this is probably true on other platforms. Given the same seed number and inputs, you'll always get the same result when running on the CPU.
This should not have a great impact, but it's important to be aware of.
"},{"location":"tips/#split-entities-into-separate-classes","title":"Split entities into separate classes","text":"Suppose you want to extract a list of person names from a group. You could use the following class:
class Group(BaseModel):\n persons: list[str] = Field(description=\"List of persons\")\n group_info: str\n\nout = model.extract(Group, in_text)\n
But it tends to work better to separate the Person entity into its own class and leave the list in Group:
class Person(BaseModel):\n name: str\n\nclass Group(BaseModel):\n persons: list[Person]\n group_info: str\n\nout = model.extract(Group, in_text)\n
The same applies to the equivalent dataclass definitions.
Adding descriptions seems to always help, especially for non-trivial extraction. Without descriptions, the model can only look at field names for clues about what's wanted, so it's important to state what we want by adding field descriptions.
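For example, the classes above could carry descriptions like this:
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n name: str = Field(description=\"Person's full name\")\n\nclass Group(BaseModel):\n persons: list[Person] = Field(description=\"List of persons in the group\")\n group_info: str = Field(description=\"Short description of the group\")\n\nout = model.extract(Group, in_text)\n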
"},{"location":"tools/","title":"Tools","text":"The tools module includes some utilities to simplify common tasks.
"},{"location":"tools/#interact","title":"Interact","text":"The interact() function allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previously exchanged messages. For this to work, a persistent context with the previous messages has to be provided to the model on each turn. This is done inside interact(), which manages the thread's messages, deleting older ones when the maximum context length is reached.
Example
from sibila import Models, GenConf\nfrom sibila.tools import interact\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nprint(\"Help available by typing '!'. Enter an empty line to quit.\")\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
Result
Help available by typing '!'. Enter an empty line to quit.\n\n> Hello there!\n\nHello! How can I help you today?\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\nYes, I am familiar with the AI language model named Sibila. It is a large language \nmodel developed by a Spanish company called Wall Street Horus. Sibila is designed to \nunderstand and generate text in multiple languages, particularly those spoken in \nSpain and Latin America. It is similar to other language models like OpenAI's GPT-3 \nand Google's BERT, but it has been specifically fine-tuned for the Spanish language.\nIf you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n(NOTE: ignore the above model-generated paragraph, it's total hallucination)\n\n\n> No way! I've just searched and there's no such company nor model. \nYou are hallucinating, Mr. Model!\n\nApologies for the confusion in my previous response. It appears that Sibila is not \nan existing AI language model or a known company. I may have mixed up information \nfrom different sources. If you have any other questions or need assistance with a \ndifferent topic, please feel free to ask!\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining \na chat with you. It stores all user and model messages in a sibila.Thread class.\n\nGot it! So, you are using a function from the Sibila package to have a conversation \nwith me, and it's storing our messages in a sibila.Thread object. If you need any help \nor information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n(Continues...)\n
Check the Interact example to see a chat created with this function.
"},{"location":"tools/#multigen","title":"Multigen","text":"The multigen module includes functions that build a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
The query_multigen() function generates a 2-D table of [ input , model ], where each row holds the outputs of different models to the same question or input. Such a table can be printed or saved as a CSV file.
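A very rough sketch of a call - note that the import path and argument names below are assumptions, not the documented signature:
from sibila.tools import query_multigen # import path assumed\n\n# hypothetical call shape: rows are inputs, columns are models\ntable = query_multigen([\"Is water wet?\", \"Name three colors.\"], # inputs\n [local_model, remote_model]) # models to compare\nprint(table)\n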
See the Compare example for a side-by-side comparison of a local and a remote model answering the same queries.
"},{"location":"what/","title":"What can you do with Sibila?","text":"Popular LLM usage is associated with chatbots: user enters some text, the model answers back in plain text. But when one wants to use LLMs from software, sending and receiving plain text can be quite painful with people having to create all sorts of prompts begging for a certain format - and then hoping the model complies. (prompts like: \"Please answer me in JSON or I'll do something terrible!\"). But there's never a warranty, as the model is just outputting plain text.
With Sibila, you can extract structured data from remote or local LLM models. Not whatever the model chose to output, but the exact fields and types that you need, specified with Pydantic, Python dataclasses or simple types.
In remote models this is done via the provider's API, while in local llama.cpp based models the output is constrained with a JSON Schema grammar. Local and remote model differences are hidden behind a common API, which simplifies model switching. Local open-source models are getting better and may one day replace commercial models.
Getting structured output not only simplifies handling the model responses but can also open new possibilities: you can now deal with the model in an ordered and more predictable way.
And besides structured output, with Sibila you can also query vision models (accepting image inputs), and it includes useful functionality like message threads, model management and more.
"},{"location":"what/#extract-pydantic-dataclasses-or-simple-types","title":"Extract Pydantic, dataclasses or simple types","text":"To specify the structured output that you want from the model, you can use Pydantic's BaseModel derived classes, or the lightweight Python dataclasses, if you don't need the whole Pydantic.
With Sibila, you can also use simple data types like bool, int, str, enumerations or lists. For example, need to classify something?
Example
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nmodel.classify([\"good\", \"neutral\", \"bad\"], \n \"Running with scissors\")\n
Result
'bad'\n
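Other simple types work the same way; a brief sketch:
from typing import Literal\n\n# enumerations can also be given as a Literal:\nmodel.classify(Literal[\"good\", \"neutral\", \"bad\"],\n \"Running with scissors\")\n\n# or extract a plain list of strings:\nmodel.extract(list[str],\n \"Name the four seasons of the year\")\n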
"},{"location":"what/#extract-data-from-images","title":"Extract data from images","text":"Sibila supports image input models, like GPT-4o and Anthropic models, as well as local Llava-based models.
Vision models can describe, interpret, make recommendations, or extract information from images. With Sibila, this data can be extracted in a structured way.
"},{"location":"what/#from-your-models-or-remote-models","title":"From your models or remote models","text":"Small downloadable 7B parameter models are getting better every month and they have reached a level where they are competent enough for most common data extraction or summarization tasks.
With 8 GB or more of RAM or GPU memory, you can get good structured output from models like Llama-3, Phi-3, OpenChat or any other GGUF file.
Or perhaps the task requires state-of-the-art remote models from OpenAI, Anthropic, Mistral AI or other providers - no problem, simply change the model's name.
"},{"location":"what/#common-api","title":"Common API","text":"The same API is used for both remote and local models. This makes the switch to newer or alternative models much easier, and makes it simpler to evaluate model outputs.
With a common API you can choose the best model for each use, allowing more freedom of choice.
"},{"location":"what/#and-with-model-management","title":"And with model management","text":"Includes a Models factory that creates models from simple names instead of having to track model configurations, filenames or chat templates.
local_model = Models.create(\"llamacpp:openchat\")\n\nremote_model = Models.create(\"openai:gpt-4\") \n
Chat templates are automatically used for local models from an included format registry.
Sibila includes a CLI tool to download GGUF models from the Hugging Face model hub and to manage its Models factory.
"},{"location":"api-reference/changelog/","title":"Changelog","text":"All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog. Given its evolving nature, the project does not yet adhere to Semantic Versioning.
"},{"location":"api-reference/changelog/#unreleased","title":"[Unreleased]","text":" - feat: Add seed setting to GenConf. Commented-out because of lack of support in OpenAI models and some llama.cpp hardware accelerations. To be enabled when wider support becomes available.
"},{"location":"api-reference/changelog/#045","title":"[0.4.5]","text":" - fix: Remove NoneType reference which was causing an error in Python 3.9.
- docs: Add simple tools example.
- docs: Update readme and other docs.
"},{"location":"api-reference/changelog/#044","title":"[0.4.4]","text":" - feat: Support vision models from OpenAI, Anthropic and Llava-based local models.
- feat: Add Msg class for better handling of other data types besides text, for now only images.
- feat: Update Thread class to support images. Also added trimming functionality and as a result removed emptied Context class.
- feat: Add close() method to Model* classes, to release resources.
- fix: Removed no longer used _workaround1 in LlamaCppTokenizer.
- fix: Avoid setting \"max_tokens\" in remote models that support it, for GenConf.max_tokens=0.
- fix: Update configs to new OpenAI models.
- docs: Add vision model and Thread use documentation.
- docs: Add receipt image extraction example.
- test: Add tests for Thread, Msg and vision models.
"},{"location":"api-reference/changelog/#043","title":"[0.4.3]","text":" - feat: Add support for dataclass Optional and Union annotations.
- feat: Add Groq remote model provider.
- fix: Add deepcopy to json_schema_massage() to solve schema massaging error.
- fix: Set Thread.repr output the same as str for development convenience.
- docs: Improve Pydantic and dataclass documentation with examples of Optional, Union and default fields.
- test: Add tests for complex extraction into Pydantic and dataclass objects.
"},{"location":"api-reference/changelog/#042","title":"[0.4.2]","text":" - feat: Add Model.create() argument to retrieve the actual initialization params used to create the model.
- fix: Correct OpenAI's \"max_tokens_limit\" setting to 4096 in base_models.json, a more sensible default value for future models.
- fix: Update Model.version() formats to be simpler and simplify comparison between versions.
"},{"location":"api-reference/changelog/#041","title":"[0.4.1]","text":" - feat: Add Anthropic provider.
- feat: Add chat template formats for Llama3 and Phi-3 instruct models, StableLM-2, Command-R/Plus.
- feat: Add output_fn_name property to Model, for changing the output function name in models that use a Tools/Functions API.
- feat: Better JSON/Schema decoding errors.
- fix: Don't use a string representation of the dataclass when its doc string is unset, during JSON Schema creation, to keep equivalence with Pydantic-based generation.
- fix: Workaround for MistralModel, where the Mistral API misses api_key argument/env variable when run from pytest.
- fix: Consolidate all Model class info as methods to avoid property/method() calling confusion.
- docs: Update installation instructions and include info on new Anthropic provider.
- test: Better parametrized tests for remote and local models.
- test: Add tests for new provider.
"},{"location":"api-reference/changelog/#040","title":"[0.4.0]","text":" - feat: New providers: Mistral AI, Together.ai and Fireworks AI allowing access to all their chat-based models.
- feat: Model classes now support async calls with the '_async' suffix, for example extract_async(). This requires model API support: only remote models will benefit. Local models (via llama.cpp) can still be called with _async methods but do not have async IO that can run concurrently.
- feat: Add 'special' field to GenConf, allowing provider or model specific generation arguments.
- feat: All models now also accept model path/name starting with their provider names as in Models.create().
- feat: Change Model.json() to stop requiring a JSON Schema as first argument.
- fix: More robust JSON extraction for misbehaved remote models.
- fix: LlamaCppModel no longer outputting debug info when created in Jupyter notebook environment with verbose=False.
- fix: Default \"gpt-4\" model in 'sibila/res/base:models.json' now points to gpt-4-1106-preview, the first GPT-4 model that accepts json-object output.
- docs: Add API references for new classes and _async() methods.
- docs: Add new async example.
- test: Add new tests for new providers/model classes.
"},{"location":"api-reference/changelog/#036","title":"[0.3.6]","text":" - feat: Migrate hardcoded OpenAI model entries from OpenAIModel to 'res/base_models.json'.
- feat: OpenAI now accepts unknown models using defaults from 'openai:_default' key in 'res/base_models.json'.
- feat: Support OpenAI models with a limit on max_tokens output values, like \"gpt-4-turbo-preview\" (input ctx_len of 128k but only up to 4k output tokens).
- feat: Auto-discover maximum ctx_len in LlamaCppModel loaded files, when 0 is passed.
- feat: Add negative int factor mode to GenConf.max_tokens setting, allowing for a percentage of model's context length.
- fix: Add coherent error exceptions when loading local and remote models.
- fix: Correct interact() error when GenConf.max_tokens=0.
- fix: Correct several chat template formats.
- test: Add many new tests for gpt-3.5/4 and llama.cpp models.
- docs: Update tips section.
"},{"location":"api-reference/changelog/#035","title":"[0.3.5]","text":" - feat: Split Models factory config in two levels: base definitions in sibila/res and Models.setup() loaded definitions from user folders. These levels never mix, but a fusion of the two is used for models/formats resolution. Only in this manner can \"models\" folder definitions be kept clean.
- fix: Option sibila formats -u is removed as result of the two-level Models factory.
- fix: Correct delete of link entries in models.json and formats.json, which was resolving to targets (and deleting them).
- fix: Raise ValueError when trying to generate from an empty prompt in LLamaCppModel.
- fix: Update Models to check linked entries when deleting.
- fix: Update template format discovery to work in more edge cases.
- test: Add test cases for sibila CLI and LlamaCppModel.
"},{"location":"api-reference/changelog/#034","title":"[0.3.4]","text":" - feat: Improve template format discovery by looking in same folder for models/formats.json.
- fix: Update legacy importlib_resources reference.
- docs: Improve text.
"},{"location":"api-reference/changelog/#033","title":"[0.3.3]","text":" - fix: Move base_models.json and base_formats.json to sibila/res folder.
- fix: Add base_models.json and base_formats.json to project build.
- fix: Correct .gitignore skipping valid files.
- docs: Update installation help and mentions to base_models/formats.json.
"},{"location":"api-reference/changelog/#032","title":"[0.3.2]","text":" - feat: Added sibila CLI for models and formats management.
- feat: Added methods in Models class for CLI functionality.
- fix: Blacklisting character control set in JSON strings grammar.
- docs: Improved docs and added section about sibila CLI.
- docs: Added CLI example.
"},{"location":"api-reference/changelog/#031","title":"[0.3.1]","text":" - feat: Improved documentation.
- feat: Model.known_models() returns a list of fixed known models or None if unlimited.
- feat: LlamaCppModel now also looks for the chat template format in a 'formats.json' file in the same folder as the model file.
- feat: Added GenConf.from_dict() and renamed asdict() to as_dict().
- fix: Creating a model entry in \"models.json\" with a genconf key was not being passed on model creation.
"},{"location":"api-reference/changelog/#030","title":"[0.3.0]","text":" - feat: Added Models singleton class that centralizes ModelDir and FormatDir.
- feat: New extract() and classify() methods for type-independent extraction and classification.
- feat: Renamed confusing gen() and gen_() method names to simpler alternatives type() and gen_type().
- feat: Replaced dictype definitions with dataclasses, a better way to extract dictionaries.
- feat: Added version() and provider_version() to Model and children classes.
- fix: Using 2 * \"\\n\" to separate message text from automatically added json_format_instructors (\"Output JSON\", etc.), to provide more meaningful separation.
- fix: Added requirement for package typing_extensions because of Self type and Python 3.9+ compatibility.
"},{"location":"api-reference/generation/","title":"Generation configs, results and errors","text":""},{"location":"api-reference/generation/#generation-configs","title":"Generation Configs","text":""},{"location":"api-reference/generation/#sibila.GenConf","title":"GenConf dataclass
","text":"Model generation configuration, used in Model.gen() and variants.
"},{"location":"api-reference/generation/#sibila.GenConf.max_tokens","title":"max_tokens class-attribute
instance-attribute
","text":"max_tokens = 0\n
Maximum output token length. Special value of 0 means all available context length, special values between -1 and -100 mean a -percentage of ctx_len. In some providers, a value of 0 also signals that max_tokens is not used/sent. For example -20 allows output up to 20% of ctx_len.
"},{"location":"api-reference/generation/#sibila.GenConf.stop","title":"stop class-attribute
instance-attribute
","text":"stop = field(default_factory=list)\n
List of generation stop text sequences
"},{"location":"api-reference/generation/#sibila.GenConf.temperature","title":"temperature class-attribute
instance-attribute
","text":"temperature = 0.0\n
Generation temperature. Use 0 to always pick the most probable output, without random sampling. Larger positive values will produce more random outputs.
"},{"location":"api-reference/generation/#sibila.GenConf.top_p","title":"top_p class-attribute
instance-attribute
","text":"top_p = 0.9\n
Nucleus sampling top_p value. Only applies if temperature > 0.
"},{"location":"api-reference/generation/#sibila.GenConf.format","title":"format class-attribute
instance-attribute
","text":"format = 'text'\n
Output format: \"text\" or \"json\". For JSON output, text is validated as in json.loads(). Thread msgs must explicitly request JSON output or a warning will be emitted if string json not present (this is automatically done in Model.json() and related calls).
"},{"location":"api-reference/generation/#sibila.GenConf.json_schema","title":"json_schema class-attribute
instance-attribute
","text":"json_schema = None\n
A JSON schema to validate the JSON output. Thread msgs must list the JSON schema and request its use; must also set the format to \"json\".
"},{"location":"api-reference/generation/#sibila.GenConf.special","title":"special class-attribute
instance-attribute
","text":"special = None\n
Special model or provider-specific generation arguments. Args in the base dict are included unconditionally for a model, while args in sub-keys with the model's provider name are only used for models from that provider, for example \"openai\": {...} values are only used in OpenAI models.
"},{"location":"api-reference/generation/#sibila.GenConf.__call__","title":"__call__","text":"__call__(**kwargs)\n
Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object. Key 'special' is updated element-wise.
Parameters:
Name Type Description Default **kwargs
Any
update settings of the same names in the returned copy.
{}
Raises:
Type Description KeyError
If key does not exist.
Returns:
Type Description Self
A copy of the current object with kwargs values updated. Doesn't modify object.
Source code in sibila/gen.py
def __call__(self,\n **kwargs: Any) -> Self:\n \"\"\"Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.\n Key 'special' is updated element-wise.\n\n Args:\n **kwargs: update settings of the same names in the returned copy.\n\n Raises:\n KeyError: If key does not exist.\n\n Returns:\n A copy of the current object with kwargs values updated. Doesn't modify object.\n \"\"\"\n\n ret = deepcopy(self)\n\n for k,v in kwargs.items():\n if not hasattr(ret, k):\n raise KeyError(f\"No such key '{k}'\")\n if k == \"special\":\n if ret.special is None:\n ret.special = {}\n if v is None:\n v = {}\n ret.special.update(v)\n if not ret.special:\n ret.special = None\n else:\n setattr(ret, k,v)\n\n return ret\n
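A short usage sketch of this copy-and-update syntax:
conf = GenConf(temperature=0.0)\n\ncreative = conf(temperature=0.9, top_p=0.95) # returns an updated copy\nconf.temperature # -> 0.0: the original is unchanged\n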
"},{"location":"api-reference/generation/#sibila.GenConf.clone","title":"clone","text":"clone()\n
Return a deep copy of this configuration.
Source code in sibila/gen.py
def clone(self) -> Self:\n \"\"\"Return a deep copy of this configuration.\"\"\"\n return deepcopy(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.as_dict","title":"as_dict","text":"as_dict()\n
Return GenConf as a dict.
Source code in sibila/gen.py
def as_dict(self) -> dict:\n \"\"\"Return GenConf as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/gen.py
@staticmethod\ndef from_dict(dic: dict) -> Any: # Any = GenConf\n return GenConf(**dic)\n
"},{"location":"api-reference/generation/#sibila.GenConf.resolve_max_tokens","title":"resolve_max_tokens","text":"resolve_max_tokens(ctx_len, max_tokens_limit=None)\n
Calculate the actual max_tokens value for cases where it's zero or a percentage of the model's ctx_len.
Parameters:
Name Type Description Default ctx_len
int
Model's context length.
required max_tokens_limit
Optional[int]
Optional model's limit for max_tokens. Defaults to None.
None
Returns:
Type Description int
An actual model maximum number of output tokens.
Source code in sibila/gen.py
def resolve_max_tokens(self,\n ctx_len: int,\n max_tokens_limit: Optional[int] = None) -> int:\n \"\"\"Calculate actual max_tokens value for cases where it's zero or a percentage of model's ctx_len)\n\n Args:\n ctx_len: Model's context length.\n max_tokens_limit: Optional model's limit for max_tokens. Defaults to None.\n\n Returns:\n An actual model maximum number of output tokens.\n \"\"\"\n\n max_tokens = self.max_tokens\n if max_tokens <= 0:\n if max_tokens == 0:\n max_tokens = ctx_len\n else:\n max_tokens = min(-max_tokens, 100)\n max_tokens = int(max_tokens / 100.0 * ctx_len)\n max_tokens = max(1,max_tokens)\n\n if max_tokens_limit is not None:\n max_tokens = min(max_tokens, max_tokens_limit)\n\n return max_tokens\n
"},{"location":"api-reference/generation/#sibila.GenConf.resolve_special","title":"resolve_special","text":"resolve_special(provider=None)\n
Compiles settings from the 'special' field, for model and provider.
Parameters:
Name Type Description Default provider
Optional[str]
If set will include any 'special' settings specified for that provider, inside a key named after the provider. If not given, only base keys are added.
None
Returns:
Type Description dict
A dict of the resolved 'special' settings.
Source code in sibila/gen.py
def resolve_special(self, \n provider: Optional[str] = None) -> dict:\n \"\"\"Compiles settings from the 'special' field, for model and provider.\n\n Args:\n provider: If set will include any 'special' settings specified for that provider, inside a key named after the provider. If not given, only base keys are added.\n\n Returns:\n _description_\n \"\"\"\n\n if self.special is None:\n return {}\n\n from .models import Models\n\n out = {}\n for k,v in self.special.items():\n if k == provider: # provider-specific\n if not isinstance(v,dict):\n raise ValueError(f\"Config 'special' for provider '{provider}' must be a dict\")\n out.update(v)\n else: # common args\n if isinstance(v,dict) and k in Models.ALL_PROVIDER_NAMES: # skip other provider entries\n continue\n out[k] = v\n return out\n
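A worked sketch of the merging rules (\"logprobs\" is a hypothetical argument name):
conf = GenConf(special={\"logprobs\": True, # base key: used for any model\n \"openai\": {\"user\": \"u1\"}}) # only for OpenAI models\n\nconf.resolve_special(\"openai\") # -> {'logprobs': True, 'user': 'u1'}\nconf.resolve_special(\"llamacpp\") # -> {'logprobs': True}\n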
"},{"location":"api-reference/generation/#sibila.JSchemaConf","title":"JSchemaConf dataclass
","text":"Configuration for JSON schema massaging and validation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.resolve_refs","title":"resolve_refs class-attribute
instance-attribute
","text":"resolve_refs = True\n
Set for $ref references to be resolved and replaced with actual definition.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.collapse_single_combines","title":"collapse_single_combines class-attribute
instance-attribute
","text":"collapse_single_combines = True\n
Any single-valued \"oneOf\"/\"anyOf\" is replaced with the actual value.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.description_from_title","title":"description_from_title class-attribute
instance-attribute
","text":"description_from_title = 0\n
If a value doesn't have a description entry, make one from its title or name.
- 0: don't make description from name
- 1: copy title or name to description
- 2: as in 1, plus capitalize the first letter and convert _ to space: class_label -> \"class label\".
"},{"location":"api-reference/generation/#sibila.JSchemaConf.force_all_required","title":"force_all_required class-attribute
instance-attribute
","text":"force_all_required = False\n
Force all entries in an object to be required (except removed defaults if remove_with_default=True).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.remove_with_default","title":"remove_with_default class-attribute
instance-attribute
","text":"remove_with_default = False\n
Delete any values that have a \"default\" annotation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.default_to_last","title":"default_to_last class-attribute
instance-attribute
","text":"default_to_last = True\n
Move any default value entry into the last position of properties dict.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.additional_allowed_root_keys","title":"additional_allowed_root_keys class-attribute
instance-attribute
","text":"additional_allowed_root_keys = field(default_factory=list)\n
By default, only the following properties are allowed in the schema's root: description, properties, type, required, additionalProperties, allOf, anyOf, oneOf, not. Add to this list to allow additional root properties.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.pydantic_strict_validation","title":"pydantic_strict_validation class-attribute
instance-attribute
","text":"pydantic_strict_validation = None\n
Validate JSON values in a strict manner or not. None means validate individually for each value in the object (for example, in Pydantic with Field(strict=True)).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/json_schema.py
def clone(self):\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
"},{"location":"api-reference/generation/#results","title":"Results","text":""},{"location":"api-reference/generation/#sibila.GenRes","title":"GenRes","text":"Model generation result.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_STOP","title":"OK_STOP class-attribute
instance-attribute
","text":"OK_STOP = 1\n
Generation complete without errors.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_LENGTH","title":"OK_LENGTH class-attribute
instance-attribute
","text":"OK_LENGTH = 0\n
Generation stopped due to reaching max_tokens.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON","title":"ERROR_JSON class-attribute
instance-attribute
","text":"ERROR_JSON = -1\n
Invalid JSON: this is often due to the model returning OK_LENGTH (finished due to max_tokens reached), which cuts off the JSON text.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_VAL","title":"ERROR_JSON_SCHEMA_VAL class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_VAL = -2\n
Failed JSON schema validation.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_ERROR","title":"ERROR_JSON_SCHEMA_ERROR class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_ERROR = -2\n
JSON schema itself is not valid.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_MODEL","title":"ERROR_MODEL class-attribute
instance-attribute
","text":"ERROR_MODEL = -3\n
Other model internal error.
"},{"location":"api-reference/generation/#sibila.GenRes.from_finish_reason","title":"from_finish_reason staticmethod
","text":"from_finish_reason(finish)\n
Convert a ChatCompletion finish result into a GenRes.
Parameters:
Name Type Description Default finish
str
ChatCompletion finish result.
required Returns:
Type Description Any
A GenRes result.
Source code in sibila/gen.py
@staticmethod\ndef from_finish_reason(finish: str) -> Any: # Any=GenRes\n \"\"\"Convert a ChatCompletion finish result into a GenRes.\n\n Args:\n finish: ChatCompletion finish result.\n\n Returns:\n A GenRes result.\n \"\"\"\n if finish == 'stop':\n return GenRes.OK_STOP\n elif finish == 'length':\n return GenRes.OK_LENGTH\n elif finish == '!json':\n return GenRes.ERROR_JSON\n elif finish == '!json_schema_val':\n return GenRes.ERROR_JSON_SCHEMA_VAL\n elif finish == '!json_schema_error':\n return GenRes.ERROR_JSON_SCHEMA_ERROR\n else:\n return GenRes.ERROR_MODEL\n
"},{"location":"api-reference/generation/#sibila.GenRes.as_text","title":"as_text staticmethod
","text":"as_text(res)\n
Returns a friendlier description of the result.
Parameters:
Name Type Description Default res
Any
Model output result.
required Raises:
Type Description ValueError
If unknown GenRes.
Returns:
Type Description str
A friendlier description of the GenRes.
Source code in sibila/gen.py
@staticmethod\ndef as_text(res: Any) -> str: # Any=GenRes\n \"\"\"Returns a friendlier description of the result.\n\n Args:\n res: Model output result.\n\n Raises:\n ValueError: If unknown GenRes.\n\n Returns:\n A friendlier description of the GenRes.\n \"\"\"\n\n if res == GenRes.OK_STOP:\n return \"Stop\"\n elif res == GenRes.OK_LENGTH:\n return \"Length (output cut)\"\n elif res == GenRes.ERROR_JSON:\n return \"JSON decoding error\"\n\n elif res == GenRes.ERROR_JSON_SCHEMA_VAL:\n return \"JSON SCHEMA validation error\"\n elif res == GenRes.ERROR_JSON_SCHEMA_ERROR:\n return \"Error in JSON SCHEMA\"\n\n elif res == GenRes.ERROR_MODEL:\n return \"Model internal error\"\n else:\n raise ValueError(\"Bad/unknown GenRes\")\n
"},{"location":"api-reference/generation/#errors","title":"Errors","text":""},{"location":"api-reference/generation/#sibila.GenError","title":"GenError","text":"GenError(out)\n
Model generation exception, raised when the model was unable to return a response.
An error has happened during model generation.
Parameters:
Name Type Description Default out
GenOut
Model output
required Source code in sibila/gen.py
def __init__(self, \n out: GenOut):\n \"\"\"An error has happened during model generation.\n\n Args:\n out: Model output\n \"\"\"\n\n assert out.res != GenRes.OK_STOP, \"OK_STOP is not an error\" \n\n super().__init__()\n\n self.res = out.res\n self.text = out.text\n self.dic = out.dic\n self.value = out.value\n
"},{"location":"api-reference/generation/#sibila.GenError.raise_if_error","title":"raise_if_error staticmethod
","text":"raise_if_error(out, ok_length_is_error)\n
Raise an exception if the model returned an error
Parameters:
Name Type Description Default out
GenOut
Model returned info.
required ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error?
required Raises:
Type Description GenError
If an error was returned by model.
Source code in sibila/gen.py
@staticmethod\ndef raise_if_error(out: GenOut,\n ok_length_is_error: bool):\n \"\"\"Raise an exception if the model returned an error\n\n Args:\n out: Model returned info.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error?\n\n Raises:\n GenError: If an error was returned by model.\n \"\"\"\n\n if out.res != GenRes.OK_STOP:\n if out.res == GenRes.OK_LENGTH and not ok_length_is_error:\n return # set ok_length_is_error to ignore this error\n\n raise GenError(out)\n
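A typical handling pattern, sketched here assuming GenError and GenRes are importable from the sibila package:
from sibila import GenError, GenRes\n\ntry:\n value = model.extract(float,\n \"That will be ten euros and 57 cents, please\")\nexcept GenError as e:\n # e.res holds the GenRes code, e.text the raw model text\n print(\"Generation failed:\", GenRes.as_text(e.res))\n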
"},{"location":"api-reference/generation/#sibila.GenOut","title":"GenOut dataclass
","text":"Model output, returned by gen_extract(), gen_json() and other model calls that don't raise exceptions.
"},{"location":"api-reference/generation/#sibila.GenOut.res","title":"res instance-attribute
","text":"res\n
Result of model generation.
"},{"location":"api-reference/generation/#sibila.GenOut.text","title":"text instance-attribute
","text":"text\n
Text generated by model.
"},{"location":"api-reference/generation/#sibila.GenOut.dic","title":"dic class-attribute
instance-attribute
","text":"dic = None\n
Python dictionary, output by the structured calls like gen_json().
"},{"location":"api-reference/generation/#sibila.GenOut.value","title":"value class-attribute
instance-attribute
","text":"value = None\n
Initialized instance value, dataclass or Pydantic BaseModel object, as returned in calls like extract().
"},{"location":"api-reference/generation/#sibila.GenOut.as_dict","title":"as_dict","text":"as_dict()\n
Return GenOut as a dict.
Source code in sibila/gen.py
def as_dict(self):\n \"\"\"Return GenOut as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenOut.__str__","title":"__str__","text":"__str__()\n
Source code in sibila/gen.py
def __str__(self):\n out = f\"Error={self.res.as_text(self.res)} text=\u2588{self.text}\u2588\"\n if self.dic is not None:\n out += f\" dic={self.dic}\"\n if self.value is not None:\n out += f\" value={self.value}\"\n return out\n
"},{"location":"api-reference/local_model/","title":"Local model classes","text":""},{"location":"api-reference/local_model/#sibila.LlamaCppModel","title":"LlamaCppModel","text":"LlamaCppModel(\n path,\n format=None,\n format_search_order=[\n \"name\",\n \"meta_template\",\n \"folder_json\",\n ],\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n n_gpu_layers=-1,\n main_gpu=0,\n n_batch=512,\n seed=4294967295,\n verbose=False,\n **llamacpp_kwargs\n)\n
Use local GGUF format models via llama.cpp engine.
Supports grammar-constrained JSON output following a JSON schema.
Parameters:
Name Type Description Default path
str
File path to the GGUF file.
required format
Optional[str]
Chat template format to use with model. Leave as None for auto-detection.
None
format_search_order
list[str]
Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].
['name', 'meta_template', 'folder_json']
genconf
Optional[GenConf]
Default generation configuration, which can be used in gen() and related. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Use 0 for maximum possible size, which may raise an out of memory error. None will use a default from the 'llamacpp' provider's '_default' entry at 'res/base_models.json'.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for no limit.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.
None
n_gpu_layers
int
Number of model layers to run in a GPU. Defaults to -1 for all.
-1
main_gpu
int
Index of the GPU to use. Defaults to 0.
0
n_batch
int
Prompt processing batch size. Defaults to 512.
512
seed
int
Random number generation seed, for non zero temperature inference. Defaults to 4294967295.
4294967295
verbose
bool
Emit (very) verbose llama.cpp output. Defaults to False.
False
Raises:
Type Description ImportError
If llama-cpp-python is not installed.
ValueError
For arguments or settings problems.
NameError
If the model was not found or the file is corrupt.
AttributeError
If a suitable template format was not found.
MemoryError
If an out of memory situation arises.
Source code in sibila/llamacpp.py
def __init__(self,\n path: str,\n\n format: Optional[str] = None, \n format_search_order: list[str] = [\"name\", \"meta_template\", \"folder_json\"],\n\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # important LlamaCpp-specific args\n n_gpu_layers: int = -1,\n main_gpu: int = 0,\n n_batch: int = 512,\n seed: int = 4294967295,\n verbose: bool = False,\n\n # other LlamaCpp-specific args\n **llamacpp_kwargs\n ):\n \"\"\"\n Args:\n path: File path to the GGUF file.\n format: Chat template format to use with model. Leave as None for auto-detection.\n format_search_order: Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].\n genconf: Default generation configuration, which can be used in gen() and related. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used. Use 0 for maximum possible size, which may raise an out of memory error. None will use a default from the 'llamacpp' provider's '_default' entry at 'res/base_models.json'.\n max_tokens_limit: Maximum output tokens limit. None for no limit.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n n_gpu_layers: Number of model layers to run in a GPU. Defaults to -1 for all.\n main_gpu: Index of the GPU to use. Defaults to 0.\n n_batch: Prompt processing batch size. Defaults to 512.\n seed: Random number generation seed, for non zero temperature inference. Defaults to 4294967295.\n verbose: Emit (very) verbose llama.cpp output. 
Defaults to False.\n\n Raises:\n ImportError: If llama-cpp-python is not installed.\n ValueError: For arguments or settings problems.\n NameError: If the model was not found or the file is corrupt.\n AttributeError: If a suitable template format was not found.\n MemoryError: If an out of memory situation arises.\n \"\"\"\n\n self._llama = None # type: ignore[assignment]\n self._llava_ctx = None\n\n self.tokenizer = None # type: ignore[assignment]\n self._own_tokenizer = False\n\n if not has_llama_cpp:\n raise ImportError(\"Please install llama-cpp-python by running: pip install llama-cpp-python\")\n\n # also accept \"provider:path\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if path.startswith(provider_name):\n path = path[len(provider_name):]\n\n sub_paths = extract_sub_paths(path)\n for sp in sub_paths:\n if not os.path.isfile(sp):\n raise NameError(f\"Model file not found at '{sp}'\")\n\n llama_path = sub_paths[0]\n llava_proj_path = sub_paths[1] if len(sub_paths) > 1 else None\n\n # find ctx_len from metadata --and-- check file format\n max_ctx_len = 0\n try:\n md = load_gguf_metadata(llama_path)\n if md is not None:\n for key in md:\n if key.endswith('.context_length'):\n max_ctx_len = int(md[key])\n break\n except Exception as e:\n raise NameError(f\"Error loading file '{llama_path}': {e}\")\n\n\n if ctx_len is None: # find a default in Models _default dict\n defaults = Models.resolve_provider_defaults(\"llamacpp\", [\"ctx_len\"], 2)\n if defaults[\"ctx_len\"] is not None:\n ctx_len = defaults[\"ctx_len\"]\n logger.debug(f\"Defaulting ctx_len={ctx_len} from Models '_default' entry\")\n\n if ctx_len == 0: # default to maximum ctx_len - this can be dangerous, as big ctx_len will probably out of memory\n if max_ctx_len != 0:\n ctx_len = max_ctx_len\n else:\n raise ValueError(\"Cannot find model's maximum ctx_len information. Please provide a non-zero ctx_len arg\")\n\n if max_ctx_len != 0:\n if ctx_len > max_ctx_len: # type: ignore[operator]\n raise ValueError(f\"Arg ctx_len ({ctx_len}) is greater than model's maximum ({max_ctx_len})\")\n\n\n super().__init__(True,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # update kwargs from important args\n llamacpp_kwargs.update(n_ctx=ctx_len,\n n_batch=n_batch,\n n_gpu_layers=n_gpu_layers,\n main_gpu=main_gpu,\n seed=seed,\n verbose=verbose\n )\n\n logger.debug(f\"Creating inner Llama model with path='{llama_path}', llamacpp_kwargs={llamacpp_kwargs}\")\n\n\n try:\n with llamacpp_verbosity_manager(verbose):\n self._llama = Llama(model_path=llama_path, **llamacpp_kwargs)\n\n except Exception as e:\n raise MemoryError(f\"Could not load model file '{llama_path}'. \"\n \"This is usually an out of memory situation but could also be due to a corrupt file. 
\"\n f\"Internal error: {e}.\")\n\n\n self._model_path = llama_path\n\n\n # correct super __init__ values\n self.ctx_len = self._llama.n_ctx()\n\n if max_tokens_limit is not None:\n self.max_tokens_limit = max_tokens_limit\n\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n\n\n try:\n self.init_format(format,\n format_search_order,\n {\"name\": os.path.basename(path), # note: the multiple filename with '*'\n \"path\": path, # note: full path of the multiple filename with '*'\n \"meta_template_name\": \"tokenizer.chat_template\"}\n )\n except Exception as e:\n del self.tokenizer\n del self._llama\n raise AttributeError(str(e))\n\n\n # llava projector setup\n if llava_proj_path is not None:\n\n if not has_llava_cpp:\n raise ImportError(\"Llava is not available in this installation of llama-cpp-python\")\n\n logger.debug(f\"Creating inner Llava projector with path='{llava_proj_path}'\")\n\n with llamacpp_verbosity_manager(verbose):\n self._llava_ctx = llava_cpp.clip_model_load(llava_proj_path.encode(encoding='utf-8'), \n 0) # verbosity\n\n if self._llava_ctx is None:\n raise ValueError(f\"Failed to load llava projector: {llava_proj_path}\")\n\n self._model_path += \"*\" + llava_proj_path\n\n \"\"\"\n self._llava_exit_stack = ExitStack()\n def llava_free():\n with llamacpp_verbosity_manager(verbose):\n llava_cpp.clip_free(self._llava_ctx)\n\n self._llava_exit_stack.callback(llava_free)\n \"\"\"\n\n self.maybe_image_input = self._llava_ctx is not None\n\n\n\n if self.tokenizer is None:\n self.tokenizer = LlamaCppTokenizer(self._llama)\n self._own_tokenizer = True\n else:\n self._own_tokenizer = False\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.close","title":"close","text":"close()\n
Close model, release resources like memory or net connections.
Source code in sibila/llamacpp.py
def close(self):\n \"\"\"Close model, release resources like memory or net connections.\"\"\"\n\n if hasattr(self, \"tokenizer\") and self.tokenizer:\n if hasattr(self, \"_own_tokenizer\") and self._own_tokenizer:\n del self.tokenizer\n self.tokenizer = None\n\n if hasattr(self, \"_llava_ctx\") and self._llava_ctx: # only happens if llama_cpp was loaded\n llava_cpp.clip_free(self._llava_ctx)\n del self._llava_ctx\n self._llava_ctx = None\n\n if hasattr(self, \"_llama\") and self._llama:\n del self._llama\n self._llama = None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
-
prim_type:
-
enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
-
datetime/date/time
-
a list in the form:
For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
-
dataclass with fields of the above supported types (or dataclass).
-
Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int Can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
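Since datetime is among the accepted targets, a brief sketch (the parsed value shown is illustrative):
from datetime import datetime\n\nmodel.extract(datetime,\n \"The meeting is on May 3rd 2024 at 14:30\")\n# e.g. -> datetime(2024, 5, 3, 14, 30)\n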
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- json_schema (Union[dict, str, None], default None): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.
- RuntimeError: If unable to generate.
Returns:
- dict: A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
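Example: a sketch of json() with and without a schema, reusing a model created via Models.create() as in the classify() example above; the schema dict is an illustrative assumption:
free = model.json(\"Describe the Eiffel Tower in JSON.\")\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"},\n \"country\": {\"type\": \"string\"}},\n \"required\": [\"city\", \"country\"]}\nlocated = model.json(\"Where is the Eiffel Tower located?\",\n json_schema=schema)\nprint(located) # e.g. {'city': 'Paris', 'country': 'France'}\n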
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
- cls (Any, required): A dataclass definition.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
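Example: a sketch using a plain Python dataclass as the target definition; the class and field names are illustrative:
from dataclasses import dataclass\n\n@dataclass\nclass Book:\n title: str\n author: str\n year: int\n\nbook = model.dataclass(Book,\n \"Give me a classic science fiction book.\")\nprint(book) # e.g. Book(title='Dune', author='Frank Herbert', year=1965)\n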
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid BaseModel object. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
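Example: a sketch with a small BaseModel-derived class; the class and field names are illustrative:
from pydantic import BaseModel\n\nclass Movie(BaseModel):\n title: str\n director: str\n year: int\n\nmovie = model.pydantic(Movie,\n \"Suggest a well-known 1970s film.\")\nprint(movie.title, movie.year)\n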
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
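Example: plain text generation with an instruction and a custom generation configuration; the GenConf fields used here (temperature, max_tokens) are assumed:
from sibila import GenConf\n\ntext = model.call(\"Write a haiku about autumn.\",\n inst=\"You are a concise poet.\",\n genconf=GenConf(temperature=0.8, max_tokens=60))\nprint(text)\n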
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
- target (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
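Example: a sketch extracting a datetime.date target asynchronously, given a model created via Models.create():
import asyncio\nfrom datetime import date\n\nasync def main():\n d = await model.extract_async(date,\n \"When did the Apollo 11 mission land on the Moon?\")\n print(d) # e.g. 1969-07-20\n\nasyncio.run(main()) # or in Jupyter: await main()\n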
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
- labels (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON generation, returning a Python dict of values, optionally constrained by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- json_schema (Union[dict, str, None], default None): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.
- RuntimeError: If unable to generate.
Returns:
- dict: A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
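Example: a sketch running two free-JSON generations concurrently with asyncio.gather(); prompts are illustrative:
import asyncio\n\nasync def main():\n results = await asyncio.gather(\n model.json_async(\"Name 3 rivers in Europe.\"),\n model.json_async(\"Name 3 rivers in Asia.\"))\n for dic in results:\n print(dic)\n\nasyncio.run(main()) # or in Jupyter: await main()\n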
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
- cls (Any, required): A dataclass definition.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid BaseModel object. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- ok_length_is_error (bool, default False): Should a result of GenRes.OK_LENGTH be considered an error and raise?
Raises:
- GenError: If an error occurred. This can be a model error, or an invalid JSON output error.
- RuntimeError: If unable to generate.
Returns:
- str: Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
- thread (Thread, required): The Thread object to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
Raises:
- ValueError: If trying to generate from an empty prompt.
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc.
Source code in sibila/model.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Raises:\n ValueError: If trying to generate from an empty prompt.\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. \n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n text,finish = self._gen_thread(thread, genconf)\n\n return self._prepare_gen_out(text, finish, genconf)\n
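Example: a sketch of the lower-level gen() flow, which reports errors in the returned GenOut instead of raising GenError. It assumes Thread is importable from the sibila package; Thread.make_IN() appears in the token_len() source below:
from sibila import Thread\n\nth = Thread.make_IN(\"List three primary colors.\")\nout = model.gen(th)\nprint(out.text) # generated text; the result status also travels in the GenOut\n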
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON generation, returning a Python dict of values, optionally constrained by a JSON schema. Doesn't raise an exception if an error occurs; always returns GenOut.
Parameters:
- thread (Thread, required): The Thread to use as model input.
- json_schema (Union[dict, str, None], required): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
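Example: a sketch of gen_json() with an inline schema, reusing Thread as in the gen() example above; errors are reported in the returned GenOut rather than raised as GenError:
th = Thread.make_IN(\"Return the name and atomic number of the lightest element.\")\nout = model.gen_json(th,\n {\"type\": \"object\",\n \"properties\": {\"name\": {\"type\": \"string\"},\n \"atomic_number\": {\"type\": \"integer\"}}})\nprint(out.dic) # e.g. {'name': 'Hydrogen', 'atomic_number': 1}\n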
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs; always returns a GenOut containing the created object.
Parameters:
- cls (Any, required): A dataclass definition.
- thread (Thread, required): The Thread object to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized BaseModel object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs; always returns a GenOut containing the created object.
Parameters:
- cls (Any, required): A class derived from a Pydantic BaseModel class.
- thread (Thread, required): The Thread to use as model input.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- RuntimeError: If unable to generate.
- TypeError: When cls is not a Pydantic BaseModel-derived class.
Returns:
- GenOut: A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.token_len","title":"token_len","text":"token_len(thread_or_text, _=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but it must reflect the model's context token accounting. The thread or text must be the final text which will be passed to the model.
Parameters:
- thread_or_text (Union[Thread, str], required): Final thread or text to be passed to the model.
Returns:
- int: Number of tokens used.
Source code in sibila/llamacpp.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n _: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n Args:\n thread_or_text: Final thread or text to be passed to model.\n\n Returns:\n Number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n token_ids = self.tokens_from_thread(thread)\n return len(token_ids)\n
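Example: a sketch that checks a prompt against the model's context length (the ctx_len attribute listed below) before generating:
prompt = \"Summarize the plot of Moby-Dick in two sentences.\"\nn = model.token_len(prompt)\nif n > model.ctx_len:\n raise ValueError(f\"Prompt uses {n} tokens, over the {model.ctx_len} context limit\")\n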
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = n_ctx()\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = _llava_ctx is not None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only supports a fixed set of models, return their names. Otherwise, return None.
Parameters:
- api_key (Optional[str], default None): If the model provider requires an API key, pass it here or set it in the respective env variable.
Returns:
- Union[list[str], None]: A list of known models, or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/llamacpp.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_path}' - '{self._llama._model.desc()}'\"\n
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.n_embd","title":"n_embd property
","text":"n_embd\n
Embedding size of model.
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.n_params","title":"n_params property
","text":"n_params\n
Total number of model parameters.
"},{"location":"api-reference/local_model/#sibila.LlamaCppModel.get_metadata","title":"get_metadata","text":"get_metadata()\n
Returns model metadata.
Source code in sibila/llamacpp.py
def get_metadata(self):\n \"\"\"Returns model metadata.\"\"\"\n out = {}\n buf = bytes(16 * 1024)\n lmodel = self._llama.model\n count = llama_cpp.llama_model_meta_count(lmodel)\n for i in range(count):\n res = llama_cpp.llama_model_meta_key_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n key = buf[:res].decode('utf-8')\n res = llama_cpp.llama_model_meta_val_str_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n value = buf[:res].decode('utf-8')\n out[key] = value\n return out\n
"},{"location":"api-reference/local_model/#sibila.Model","title":"Model","text":"Model(is_local_model, genconf, schemaconf, tokenizer)\n
Model is an abstract base class for common LLM model functionality. Many of the useful methods like extract() or json() are implemented here.
It should not be instantiated directly; instead, LlamaCppModel, OpenAIModel, etc. all derive from this class.
Initializer for base model type, shared by actual model classes like LlamaCpp, OpenAI, etc.
Parameters:
- is_local_model (bool, required): Is the model running locally?
- genconf (Union[GenConf, None], required): Default generation configuration options, used if a generation call doesn't supply one.
- schemaconf (Union[JSchemaConf, None], required): Default configuration for JSON schema validation, used if a generation call doesn't supply one.
- tokenizer (Union[Tokenizer, None], required): Tokenizer used to encode text (even for message-based models).
Source code in sibila/model.py
def __init__(self,\n is_local_model: bool,\n genconf: Union[GenConf, None],\n schemaconf: Union[JSchemaConf, None],\n tokenizer: Union[Tokenizer, None]):\n \"\"\"Initializer for base model type, shared by actual model classes like LlamaCpp, OpenAI, etc.\n\n Args:\n is_local_model: Is the model running locally?\n genconf: Default generation configuration options, used if generation call doesn't supply one.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one.\n tokenizer: Tokenizer used to encode text (even for message-based models).\n \"\"\"\n\n self.is_local_model = is_local_model\n\n self.ctx_len = 0\n self.max_tokens_limit = sys.maxsize\n self.output_key_name = \"output\"\n self.output_fn_name = \"json_out\"\n\n self.tokenizer = tokenizer # type: ignore[assignment]\n\n if genconf is None:\n self.genconf = GenConf()\n else:\n self.genconf = genconf.clone()\n\n if schemaconf is None:\n self.schemaconf = JSchemaConf()\n else:\n self.schemaconf = schemaconf.clone()\n\n\n\n # set either \"json\" or \"json_schema\" key values to None to skip.\n self.json_format_instructors = {\n \"json\": {\n \"bypass_if\": [\"json\"], # bypass appending if all lowercase text values are present in thread\n \"append_text\": \"Output JSON.\",\n \"sep_count\": 2\n },\n \"json_schema\": {\n \"bypass_if\": [\"json\", \"schema\"],\n \"append_text\": \"Output JSON matching the following schema:\\n{{json_schema}}\",\n \"sep_count\": 2\n }\n }\n\n # text going to model: tight, without \\u00xx\n self.json_in_dumps_kwargs = {\n \"indent\": None,\n \"ensure_ascii\": False\n } \n
"},{"location":"api-reference/local_model/#sibila.Model.close","title":"close abstractmethod
","text":"close()\n
Close model, release resources like memory or net connections.
Source code in sibila/model.py
@abstractmethod\ndef close(self):\n \"\"\"Close model, release resources like memory or net connections.\"\"\"\n ...\n
"},{"location":"api-reference/local_model/#sibila.Model.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"]
Parameters:
- target (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
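Example: a sketch of extract() with an annotated list target, as described above; the annotation text and prompt are illustrative:
from typing import Annotated\n\nnames = model.extract(Annotated[list[str], \"Portuguese city names\"],\n \"List 5 cities in Portugal.\")\nprint(names) # e.g. ['Lisboa', 'Porto', ...]\n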
"},{"location":"api-reference/local_model/#sibila.Model.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
- labels (Any, required): One of the above types.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.Model.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON generation, returning a Python dict of values, optionally constrained by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- json_schema (Union[dict, str, None], default None): A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- massage_schema (bool, default True): Simplify schema.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.
- RuntimeError: If unable to generate.
Returns:
- dict: A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/local_model/#sibila.Model.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
- cls (Any, required): A dataclass definition.
- query (Union[Thread, Msg, tuple, str], required): A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
- inst (Optional[str], default None): Instruction message for model. Will override Thread's inst, if set.
- genconf (Optional[GenConf], default None): Model generation configuration; None uses the model's default.
- schemaconf (Optional[JSchemaConf], default None): JSchemaConf object that controls schema simplification; None uses the model's default.
Raises:
- GenError: If an error occurred, for example invalid object initialization. See GenError.
- RuntimeError: If unable to generate.
Returns:
- Any: An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
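A minimal usage sketch (model entry name is an assumption):
from pydantic import BaseModel\nfrom sibila import Models\n\nclass Book(BaseModel):\n    title: str\n    year: int\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nbook = model.pydantic(Book,\n                      \"Name a famous 19th century novel and its publication year.\")\n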
"},{"location":"api-reference/local_model/#sibila.Model.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
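A minimal plain-text generation sketch (model entry name is an assumption); the __call__ form documented next is equivalent:
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\")\n\ntext = model.call(\"Write a haiku about winter.\")\n# same as: text = model(\"Write a haiku about winter.\")\n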
"},{"location":"api-reference/local_model/#sibila.Model.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/local_model/#sibila.Model.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
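A minimal async sketch (model entry name is an assumption):
import asyncio\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nasync def main():\n    years = await model.extract_async(list[int],\n                                      \"List the years of the first three modern Olympic Games.\")\n    print(years)\n\nasyncio.run(main())\n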
"},{"location":"api-reference/local_model/#sibila.Model.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.Model.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/local_model/#sibila.Model.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/local_model/#sibila.Model.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/local_model/#sibila.Model.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
NotImplementedError
If method was not defined by a derived class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/model.py
def gen(self,\n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n NotImplementedError: If method was not defined by a derived class.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n raise NotImplementedError\n
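Because gen() returns a GenOut instead of raising on generation problems, callers inspect the result or convert it into an exception, as the higher-level methods do. A sketch, assuming GenError is importable from the sibila package and thread is an existing Thread:
from sibila import GenError  # assumed import path\n\nout = model.gen(thread)\n\n# raise GenError if the result indicates failure, as call() does:\nGenError.raise_if_error(out, ok_length_is_error=False)\n\nprint(out.text)  # generated text lives in GenOut.text\n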
"},{"location":"api-reference/local_model/#sibila.Model.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
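A direct-use sketch (thread is an assumed existing Thread; the higher-level json() wraps this call with error raising):
out = model.gen_json(thread, json_schema=None)  # free JSON, no schema\n\nif out.dic is not None:  # assumption: dic may be unset if generation failed\n    print(out.dic)\n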
"},{"location":"api-reference/local_model/#sibila.Model.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut object. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.Model.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut object. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseModel-derived class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/local_model/#sibila.Model.token_len","title":"token_len abstractmethod
","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but it must reflect the model's context token accounting. The thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
Final thread or text to be passed to model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Number of tokens occupied.
Source code in sibila/model.py
@abstractmethod\ndef token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n Args:\n thread_or_text: Final thread or text to be passed to model.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Number of tokens occupied.\n \"\"\"\n ...\n
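A common use is checking that an input fits the model's context window before generating; a sketch (thread is an assumed existing Thread):
used = model.token_len(thread)\nroom = model.ctx_len - used  # tokens left for the model's output\nif room <= 0:\n    raise ValueError(\"Input does not fit the model's context window\")\n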
"},{"location":"api-reference/local_model/#sibila.Model.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, in which case token length is estimated.
"},{"location":"api-reference/local_model/#sibila.Model.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = 0\n
Maximum context token length, including input and model output. Output tokens may be further limited by max_tokens_limit.
"},{"location":"api-reference/local_model/#sibila.Model.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input\n
Does the model support images as input? A value of False is definitive, while a value of True is only a maybe, as some providers don't publish this information. Check the model specs to be certain.
"},{"location":"api-reference/local_model/#sibila.Model.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only allows a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/local_model/#sibila.Model.desc","title":"desc abstractmethod
","text":"desc()\n
Model description.
Source code in sibila/model.py
@abstractmethod\ndef desc(self) -> str:\n \"\"\"Model description.\"\"\"\n ...\n
"},{"location":"api-reference/models/","title":"Models factory","text":""},{"location":"api-reference/models/#sibila.Models","title":"Models","text":"Model and template format directory that unifies (and simplifies) model access and configuration.
The SIBILA_MODELS env variable is checked and used during initialization: a ';'-delimited list of folders in which to look for models.json, formats.json and model files.
= Models Directory =
Useful to create models from resource names like \"llamacpp:openchat\" or \"openai:gpt-4\". This makes it simple to change models, store model settings, compare model outputs, etc.
Users can add new entries from a script or from JSON files, via the add() call. New directory entries with the same name are merged into existing ones for each added config.
Uses the file \"sibila/res/base_models.json\" for the initial defaults, which the user can augment by calling setup() with their own config files, or by directly adding a model config with set_model().
An example of a model directory JSON config file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Rules for entry inheritance/overriding
Entries in the '_default' key of each provider serve as defaults for models of that provider. Model entries in base_models_dir (automatically loaded from 'res/base_models.json') are overridden by any entries of the same name loaded from a local 'models.json' file with Models.setup(). Here, overridden means that local keys of the same name replace base keys (as in a dict.update()). However, '_default' entries only apply within their own source, either base_models_dir or the local 'models.json', as in a lexical scope.
= Format Directory =
Detects chat templates from the model name/filename, or uses the template from model metadata when available.
This directory can be setup from a JSON file or by calling set_format().
Any new entries with the same name replace previous ones on each new call.
Initializes from file \"sibila/res/base_formats.json\".
Example of a \"formats.json\" file:
{\n \"chatml\": {\n # template is a Jinja template for this model\n \"template\": \"{% for message in messages %}...\"\n },\n\n \"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n }, \n\n \"phi\": {\n \"match\": \"phi\",\n \"template\": \"...\"\n },\n\n \"phi2\": \"phi\",\n # this is a link: \"phi2\" -> \"phi\"\n}\n
Jinja2 templates receive a standard ChatML messages list (created from a Thread) and must deal with the following:
- In models that don't use a system message, the template must take care of prepending it to the first user message.
- The add_generation_prompt template variable is always set to True.
"},{"location":"api-reference/models/#sibila.Models.setup","title":"setup classmethod
","text":"setup(\n path=None, clear=False, add_cwd=True, load_from_env=True\n)\n
Initialize the models and formats directory from a given model files folder and/or the configuration files it contains. Paths can start with \"~/\", meaning the current account's home directory.
Parameters:
Name Type Description Default path
Optional[Union[str, list[str]]]
Path to a folder or to a \"models.json\" or \"formats.json\" configuration file. Defaults to None, which initializes from defaults and the SIBILA_MODELS env variable.
None
clear
bool
Set to clear existing directories before loading from path arg.
False
add_cwd
bool
Add current working directory to search path.
True
load_from_env
bool
Load from SIBILA_MODELS env variable?
True
Source code in sibila/models.py
@classmethod\ndef setup(cls,\n path: Optional[Union[str,list[str]]] = None,\n clear: bool = False,\n add_cwd: bool = True,\n load_from_env: bool = True):\n \"\"\"Initialize models and formats directory from given model files folder and/or contained configuration files.\n Path can start with \"~/\" current account's home directory.\n\n Args:\n path: Path to a folder or to \"models.json\" or \"formats.json\" configuration files. Defaults to None which tries to initialize from defaults and env variable.\n clear: Set to clear existing directories before loading from path arg.\n add_cwd: Add current working directory to search path.\n load_from_env: Load from SIBILA_MODELS env variable?\n \"\"\"\n\n if clear:\n cls.clear()\n\n cls._ensure(add_cwd, \n load_from_env)\n\n if path is not None:\n if isinstance(path, str):\n path_list = [path]\n else:\n path_list = path\n\n cls._read_any(path_list)\n
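For example (the paths are assumptions):
from sibila import Models\n\n# load configs from one folder, merging into existing entries:\nModels.setup(\"~/my-models\")\n\n# or start fresh from several locations:\nModels.setup([\"~/my-models\", \"./conf/models.json\"], clear=True)\n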
"},{"location":"api-reference/models/#sibila.Models.create","title":"create classmethod
","text":"create(\n res_name,\n genconf=None,\n ctx_len=None,\n *,\n resolved_create_args=None,\n **over_args\n)\n
Create a model.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required genconf
Optional[GenConf]
Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Overrides directory defaults. Defaults to None.
None
resolved_create_args
Optional[dict]
Pass an empty dict to be filled by this method with the resolved args used in model creation. Defaults to None.
None
over_args
Union[Any]
Model-specific creation args, which will override default args set in model directory.
{}
Returns:
Name Type Description Model
Model
the initialized model.
Source code in sibila/models.py
@classmethod\ndef create(cls,\n res_name: str,\n\n # common to all providers\n genconf: Optional[GenConf] = None,\n ctx_len: Optional[int] = None,\n\n *,\n # debug/testing\n resolved_create_args: Optional[dict] = None,\n\n # model-specific overriding:\n **over_args: Union[Any]) -> Model:\n \"\"\"Create a model.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n genconf: Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.\n ctx_len: Maximum context length to be used. Overrides directory defaults. Defaults to None.\n resolved_create_args: Pass an empty dict to be filled by this method with the resolved args used in model creation. Defaults to None.\n over_args: Model-specific creation args, which will override default args set in model directory.\n\n Returns:\n Model: the initialized model.\n \"\"\"\n\n try:\n provider, _, args = cls.resolve_model_entry(res_name, **over_args)\n except ValueError as e:\n raise NameError(str({e}))\n\n # override genconf, ctx_len\n if genconf is None:\n genconf = cls.genconf\n\n if genconf is not None:\n args[\"genconf\"] = genconf\n\n elif \"genconf\" in args and isinstance(args[\"genconf\"], dict):\n # transform dict into a GenConf instance:\n args[\"genconf\"] = GenConf.from_dict(args[\"genconf\"])\n\n if ctx_len is not None:\n args[\"ctx_len\"] = ctx_len\n\n if resolved_create_args is not None:\n resolved_create_args.update(args)\n\n\n logger.debug(f\"Resolved '{res_name}' to provider '{provider}' with args: {args}\")\n\n\n model: Model\n if provider == \"anthropic\":\n\n from .anthropic import AnthropicModel\n model = AnthropicModel(**args)\n\n elif provider == \"fireworks\":\n\n from .schema_format_openai import FireworksModel\n model = FireworksModel(**args)\n\n elif provider == \"groq\":\n\n from .schema_format_openai import GroqModel\n model = GroqModel(**args)\n\n elif provider == \"llamacpp\":\n from .llamacpp import LlamaCppModel, extract_sub_paths\n\n # resolve filename -> path. Path filenames can be in the form model1*model2\n sub_paths = args[\"name\"].split('*')\n\n # only resolve first path. If not found, let LlamaCpp raise the error below\n sub_paths[0] = cls._locate_file(sub_paths[0]) or sub_paths[0]\n\n # rejoin located paths with '*' (if multiple)\n path = '*'.join(sub_paths)\n logger.debug(f\"Resolved llamacpp model '{args['name']}' to '{path}'\")\n\n # rename \"name\" -> \"path\" which LlamaCppModel is expecting\n del args[\"name\"]\n args[\"path\"] = path\n\n model = LlamaCppModel(**args)\n\n elif provider == \"mistral\":\n\n from .mistral import MistralModel\n model = MistralModel(**args)\n\n elif provider == \"openai\":\n\n from .openai import OpenAIModel\n model = OpenAIModel(**args)\n\n elif provider == \"together\":\n\n from .schema_format_openai import TogetherModel\n model = TogetherModel(**args)\n\n else:\n raise ValueError(f\"Unknown provider '{provider}' for '{res_name}'\")\n\n\n\n return model\n
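For example, overriding directory defaults at creation time (the entry name is an assumption, and GenConf is assumed to be importable from sibila and constructible with keyword args):
from sibila import Models, GenConf\n\nmodel = Models.create(\"llamacpp:openchat\",\n                      genconf=GenConf(temperature=0.5),\n                      ctx_len=4096)\n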
"},{"location":"api-reference/models/#sibila.Models.add_models_search_path","title":"add_models_search_path classmethod
","text":"add_models_search_path(path)\n
Prepends new paths to the model files search path.
Parameters:
Name Type Description Default path
Union[str, list[str]]
A path or list of paths to add to model search path.
required Source code in sibila/models.py
@classmethod\ndef add_models_search_path(cls,\n path: Union[str,list[str]]):\n \"\"\"Prepends new paths to model files search path.\n\n Args:\n path: A path or list of paths to add to model search path.\n \"\"\"\n\n cls._ensure()\n\n prepend_path(cls.models_search_path, path)\n\n logger.debug(f\"Adding '{path}' to search_path\")\n
"},{"location":"api-reference/models/#sibila.Models.set_genconf","title":"set_genconf classmethod
","text":"set_genconf(genconf)\n
Set the GenConf to use as default for model creation.
Parameters:
Name Type Description Default genconf
GenConf
Model generation configuration.
required Source code in sibila/models.py
@classmethod\ndef set_genconf(cls,\n genconf: GenConf):\n \"\"\"Set the GenConf to use as default for model creation.\n\n Args:\n genconf: Model generation configuration.\n \"\"\"\n cls.genconf = genconf\n
"},{"location":"api-reference/models/#sibila.Models.list_models","title":"list_models classmethod
","text":"list_models(\n name_query, providers, include_base, resolved_values\n)\n
List model entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match model names. Empty string for all.
required providers
list[str]
Filter by these exact provider names. Empty list for all.
required include_base
bool
Also list fused values from base_models_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are model res_names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_models(cls,\n name_query: str,\n providers: list[str],\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List model entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match model names. Empty string for all.\n providers: Filter by these exact provider names. Empty list for all.\n include_base: Also list fused values from base_models_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are model res_names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n out = {}\n\n name_query = name_query.lower()\n\n for prov_name in models_dir:\n\n if providers and prov_name not in providers:\n continue\n\n prov_dic = models_dir[prov_name]\n\n for name in prov_dic:\n\n if name == cls.DEFAULT_ENTRY_NAME:\n continue\n\n if name_query and name_query not in name.lower():\n continue\n\n entry_res_name = prov_name + \":\" + name\n\n if resolved_values:\n # okay to use get_model_entry() because it resolves to fused\n res = cls.get_model_entry(entry_res_name) # type: ignore[assignment]\n if res is None:\n continue\n else:\n val = res[1]\n else:\n val = prov_dic[name]\n\n out[entry_res_name] = val\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.get_model_entry","title":"get_model_entry classmethod
","text":"get_model_entry(res_name)\n
Get a resolved model entry. Resolved means following any links.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required Returns:
Type Description Union[tuple[str, dict], None]
Resolved entry (res_name,dict) or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_model_entry(cls,\n res_name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved model entry. Resolved means following any links.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n\n Returns:\n Resolved entry (res_name,dict) or None if not found.\n \"\"\"\n\n cls._ensure() \n\n models_dir = cls.fused_models_dir()\n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = models_dir[provider]\n\n if name in prov:\n return provider + \":\" + name, prov[name]\n else:\n return None\n
"},{"location":"api-reference/models/#sibila.Models.has_model_entry","title":"has_model_entry classmethod
","text":"has_model_entry(res_name)\n
Source code in sibila/models.py
@classmethod\ndef has_model_entry(cls,\n res_name: str) -> bool:\n return cls.get_model_entry(res_name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.set_model","title":"set_model classmethod
","text":"set_model(\n res_name, model_name, format_name=None, genconf=None\n)\n
Add model configuration for given res_name.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
str
Model name or filename identifier.
required format_name
Optional[str]
Format name used by model. Defaults to None.
None
genconf
Optional[GenConf]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model(cls,\n res_name: str,\n model_name: str,\n format_name: Optional[str] = None,\n genconf: Optional[GenConf] = None):\n \"\"\"Add model configuration for given res_name.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier.\n format_name: Format name used by model. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry: dict = {\n \"name\": model_name\n }\n\n if format_name:\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n\n if genconf:\n entry[\"genconf\"] = genconf.as_dict()\n\n cls.models_dir[provider][name] = entry\n
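For example, registering a local GGUF model (the filename is an assumption; \"chatml\" is one of the formats shown in the directory example above):
from sibila import Models\n\nModels.set_model(\"llamacpp:mistral\",\n                 model_name=\"mistral-7b-instruct-v0.2.Q4_K_M.gguf\",\n                 format_name=\"chatml\")\n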
"},{"location":"api-reference/models/#sibila.Models.update_model","title":"update_model classmethod
","text":"update_model(\n res_name,\n model_name=None,\n format_name=None,\n genconf=None,\n)\n
Update model fields.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
Optional[str]
Model name or filename identifier. Defaults to None.
None
format_name
Optional[str]
Format name used by model. Use \"\" to delete. Defaults to None.
None
genconf
Union[GenConf, str, None]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef update_model(cls,\n res_name: str,\n model_name: Optional[str] = None,\n format_name: Optional[str] = None,\n genconf: Union[GenConf,str,None] = None):\n\n \"\"\"update model fields\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier. Defaults to None.\n format_name: Format name used by model. Use \"\" to delete. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry = cls.models_dir[provider][name]\n\n if model_name:\n entry[\"name\"] = model_name\n\n if format_name is not None:\n if format_name != \"\":\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n else:\n del entry[\"format\"]\n\n if genconf is not None:\n if genconf != \"\":\n entry[\"genconf\"] = genconf\n else:\n del entry[\"genconf\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_model_link","title":"set_model_link classmethod
","text":"set_model_link(res_name, link_name)\n
Create a model link that points to another model entry.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required link_name
str
Name of model this entry links to.
required Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model_link(cls,\n res_name: str,\n link_name: str):\n \"\"\"Create a model link into another model.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n link_name: Name of model this entry links to.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, True)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n # first: ensure link_name is a res_name\n if ':' not in link_name:\n link_name = provider + \":\" + link_name\n\n if not cls.has_model_entry(link_name):\n raise ValueError(f\"Could not find linked model '{link_name}'\")\n\n # second: check link name is without provider if same\n link_split = link_name.split(\":\")\n if len(link_split) == 2:\n if link_split[0] == provider: # remove same \"provider:\"\n link_name = link_split[1]\n\n cls.models_dir[provider][name] = link_name\n
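For example (entry names are assumptions):
from sibila import Models\n\n# \"llamacpp:mist\" will resolve to the existing \"llamacpp:mistral\" entry:\nModels.set_model_link(\"llamacpp:mist\", \"mistral\")\n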
"},{"location":"api-reference/models/#sibila.Models.delete_model","title":"delete_model classmethod
","text":"delete_model(res_name)\n
Delete a model entry.
Parameters:
Name Type Description Default res_name
str
Model entry in the form \"provider:name\".
required Source code in sibila/models.py
@classmethod\ndef delete_model(cls,\n res_name: str):\n \"\"\"Delete a model entry.\n\n Args:\n res_name: Model entry in the form \"provider:name\".\n \"\"\"\n\n cls._ensure()\n\n provider, name = provider_name_from_urn(res_name,\n allow_alias_provider=False)\n\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}', must be one of: {cls.ALL_PROVIDER_NAMES}\")\n\n prov = cls.models_dir[provider] \n if name not in prov:\n raise ValueError(f\"Model '{res_name}' not found\")\n\n # verify if any entry links to name:\n def check_link_to(link_to_name: str, \n provider: str) -> Union[str, None]:\n\n for name,entry in cls.models_dir[provider].items():\n if isinstance(entry,str) and entry == link_to_name:\n return name\n return None\n\n offender = check_link_to(name, provider)\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry '{provider}:{offender}' links to it\")\n\n offender = check_link_to(name, \"alias\")\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry 'alias:{offender}' links to it\")\n\n del prov[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_models","title":"save_models classmethod
","text":"save_models(path=None, include_base=False)\n
Source code in sibila/models.py
@classmethod\ndef save_models(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple paths in cls.models_search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"models.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n # clear providers with no models:\n for provider in cls.ALL_PROVIDER_NAMES:\n if provider in models_dir and not models_dir[provider]:\n del models_dir[provider]\n\n json.dump(models_dir, f, indent=4)\n\n return path\n
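Example: a minimal sketch; without a path arg, the models search path must contain a single entry:
from sibila import Models\n\npath = Models.save_models(\"my_models.json\")  # or Models.save_models() to save into the search path\nprint(\"Saved to\", path)\n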
"},{"location":"api-reference/models/#sibila.Models.list_formats","title":"list_formats classmethod
","text":"list_formats(name_query, include_base, resolved_values)\n
List format entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match format names. Empty string for all.
required include_base
bool
Also list base_formats_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are format names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_formats(cls,\n name_query: str,\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match format names. Empty string for all.\n include_base: Also list base_formats_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are format names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n out = {}\n\n name_query = name_query.lower()\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n for name in formats_dir.keys():\n\n if name_query and name_query not in name.lower():\n continue\n\n val = formats_dir[name]\n\n if resolved_values:\n res = cls.get_format_entry(name)\n if res is None:\n continue\n else:\n val = res[1]\n\n out[name] = val\n\n return out\n
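Example: a minimal sketch; \"chatml\" is assumed to be a registered format name:
from sibila import Models\n\n# all resolved format entries whose name contains \"chatml\", including base entries\nentries = Models.list_formats(\"chatml\", include_base=True, resolved_values=True)\nfor name, entry in entries.items():\n    print(name, entry)\n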
"},{"location":"api-reference/models/#sibila.Models.get_format_entry","title":"get_format_entry classmethod
","text":"get_format_entry(name)\n
Get a resolved format entry by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple of (resolved_name, format_entry), or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved format entry by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Tuple of (resolved_name, format_entry).\n \"\"\"\n\n cls._ensure()\n\n return get_format_entry(cls.fused_formats_dir(), name)\n
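Example: a minimal sketch; \"chatml\" is assumed to be a registered format name:
from sibila import Models\n\nres = Models.get_format_entry(\"chatml\")\nif res is not None:\n    resolved_name, entry = res\n    print(resolved_name, entry[\"template\"][:40])\n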
"},{"location":"api-reference/models/#sibila.Models.has_format_entry","title":"has_format_entry classmethod
","text":"has_format_entry(name)\n
Source code in sibila/models.py
@classmethod\ndef has_format_entry(cls,\n name: str) -> bool:\n return cls.get_format_entry(name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.get_format_template","title":"get_format_template classmethod
","text":"get_format_template(name)\n
Get a format template by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[str, None]
Resolved format template str.
Source code in sibila/models.py
@classmethod\ndef get_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Get a format template by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Resolved format template str.\n \"\"\"\n\n res = cls.get_format_entry(name)\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.match_format_entry","title":"match_format_entry classmethod
","text":"match_format_entry(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple (name, format_entry) where name is a resolved name. Or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Tuple (name, format_entry) where name is a resolved name. Or None if none found.\n \"\"\"\n\n cls._ensure()\n\n return search_format(cls.fused_formats_dir(), name)\n
"},{"location":"api-reference/models/#sibila.Models.match_format_template","title":"match_format_template classmethod
","text":"match_format_template(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[str, None]
Format template or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Format template or None if none found.\n \"\"\"\n\n res = cls.match_format_entry(name)\n\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_format","title":"set_format classmethod
","text":"set_format(name, template, match=None)\n
Add a format entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required template
str
The chat template, in Jinja2 format.
required match
Optional[str]
Regex that matches names/filenames that use this format. Default is None.
None
Source code in sibila/models.py
@classmethod\ndef set_format(cls,\n name: str,\n template: str,\n match: Optional[str] = None):\n \"\"\"Add a format entry to the format directory.\n\n Args:\n name: Format entry name.\n template: The chat template, in Jinja2 format.\n match: Regex that matches names/filenames that use this format. Default is None.\n \"\"\"\n\n cls._ensure()\n\n if \"{{\" not in template: # a link_name for the template\n if not cls.has_format_entry(template):\n raise ValueError(f\"Could not find linked template entry '{template}'.\")\n\n entry = {\n \"template\": template\n }\n if match is not None:\n entry[\"match\"] = match\n\n cls.formats_dir[name] = entry \n
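Example: a minimal sketch registering a hypothetical \"my-model\" format; the template shown is a toy Jinja2 chat template, not a real model's template:
from sibila import Models\n\ntemplate = \"{% for m in messages %}{{ m['role'] }}: {{ m['content'] }} {% endfor %}\"\n\nModels.set_format(\"my-model\",\n                  template,\n                  match=\"my-model\")  # regex matched against model names/filenames\n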
"},{"location":"api-reference/models/#sibila.Models.set_format_link","title":"set_format_link classmethod
","text":"set_format_link(name, link_name)\n
Add a format link entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required link_name
str
Name of format that this entry links to.
required Source code in sibila/models.py
@classmethod\ndef set_format_link(cls,\n name: str,\n link_name: str):\n \"\"\"Add a format link entry to the format directory.\n\n Args:\n name: Format entry name.\n link_name: Name of format that this entry links to.\n \"\"\"\n\n cls._ensure()\n\n if not cls.has_format_entry(link_name):\n raise ValueError(f\"Could not find linked entry '{link_name}'.\")\n\n cls.formats_dir[name] = link_name\n
"},{"location":"api-reference/models/#sibila.Models.delete_format","title":"delete_format classmethod
","text":"delete_format(name)\n
Delete a format entry.
Parameters:
Name Type Description Default name
str
Format entry name.
required Source code in sibila/models.py
@classmethod\ndef delete_format(cls,\n name: str):\n \"\"\"Delete a format entry.\n\n Args:\n name: Format entry name.\n \"\"\"\n\n cls._ensure()\n\n if name not in cls.formats_dir:\n raise ValueError(f\"Format name '{name}' not found.\")\n\n for check_name,entry in cls.formats_dir.items():\n if isinstance(entry,str) and entry == name:\n raise ValueError(f\"Cannot delete '{name}', as entry '{check_name}' links to it\")\n\n del cls.formats_dir[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_formats","title":"save_formats classmethod
","text":"save_formats(path=None, include_base=False)\n
Source code in sibila/models.py
@classmethod\ndef save_formats(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple paths in cls.models_search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"formats.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n json.dump(formats_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.info","title":"info classmethod
","text":"info(include_base=True, verbose=False)\n
Return information about current setup.
Parameters:
Name Type Description Default include_base
bool
Also include entries from the base directories. Defaults to True.
True
verbose
bool
If False, formats directory values are abbreviated. Defaults to False.
False
Returns:
Type Description str
Textual information about the current setup.
Source code in sibila/models.py
@classmethod\ndef info(cls,\n include_base: bool = True,\n verbose: bool = False) -> str:\n \"\"\"Return information about current setup.\n\n Args:\n verbose: If False, formats directory values are abbreviated. Defaults to False.\n\n Returns:\n Textual information about the current setup.\n \"\"\"\n\n cls._ensure()\n\n out = \"\"\n\n out += f\"Models search path: {cls.models_search_path}\\n\"\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n out += f\"Models directory:\\n{pformat(models_dir, sort_dicts=False)}\\n\"\n\n out += f\"Model Genconf:\\n{cls.genconf}\\n\"\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n if not verbose:\n fordir = {}\n for key in formats_dir:\n fordir[key] = deepcopy(formats_dir[key])\n if isinstance(fordir[key], dict) and \"template\" in fordir[key]:\n fordir[key][\"template\"] = fordir[key][\"template\"][:14] + \"...\"\n else:\n fordir = formats_dir\n\n out += f\"Formats directory:\\n{pformat(fordir)}\"\n\n return out\n
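Example: a minimal sketch printing the current setup:
from sibila import Models\n\nprint(Models.info())  # abbreviated format templates\nprint(Models.info(verbose=True))  # full templates\n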
"},{"location":"api-reference/models/#sibila.Models.clear","title":"clear classmethod
","text":"clear()\n
Clear directories. Members base_models_dir, base_formats_dir and genconf are not cleared.
Source code in sibila/models.py
@classmethod\ndef clear(cls):\n \"\"\"Clear directories. Members base_models_dir, base_formats_dir and genconf are not cleared.\"\"\"\n cls.models_dir = None\n cls.models_search_path = []\n cls.formats_dir = None\n
"},{"location":"api-reference/multigen/","title":"Multigen","text":""},{"location":"api-reference/multigen/#sibila.multigen","title":"multigen","text":"Functions for comparing output across models.
- thread_multigen(), query_multigen() and multigen(): Compare outputs across models.
- cycle_gen_print(): For a list of models, sequentially grow a Thread with model responses to given IN messages.
"},{"location":"api-reference/multigen/#sibila.multigen.thread_multigen","title":"thread_multigen","text":"thread_multigen(\n threads,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n thread_titles=None,\n)\n
Generate a single thread on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" prints the results, any other string is a path to output a text file with the results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" prints CSV results, any other string is a path to output a CSV file with the results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
thread_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def thread_multigen(threads: list[Thread],\n model_names: list[str],\n\n text: Union[str,list[str],None] = None,\n csv: Union[str,list[str],None] = None,\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n thread_titles: Optional[list[str]] = None \n ) -> list[list[GenOut]]:\n \"\"\"Generate a single thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n thread_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n table = multigen(threads,\n model_names=model_names, \n gencall=gencall,\n genconf=genconf)\n\n # table[threads,models]\n\n if thread_titles is None:\n thread_titles = [str(th) for th in threads]\n\n def format(format_fn, cmds):\n if cmds is None or not cmds:\n return\n\n f = StringIO(newline='')\n\n format_fn(f,\n table, \n title_list=thread_titles,\n model_names=model_names,\n out_keys=out_keys)\n fmtd = f.getvalue()\n\n if not isinstance(cmds, list):\n cmds = [cmds]\n for c in cmds:\n if c == 'print':\n print(fmtd)\n else: # path\n with open(c, \"w\", encoding=\"utf-8\") as f:\n f.write(fmtd)\n\n format(format_text, text)\n format(format_csv, csv)\n\n return table\n
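Example: a minimal sketch; the model names, thread content and import paths are illustrative:
from sibila import Thread\nfrom sibila.multigen import thread_multigen\n\nthreads = [Thread.make_INST_IN(\"Answer in one word.\", \"Name one planet.\")]\n\ntable = thread_multigen(threads,\n                        model_names=[\"llamacpp:openchat\", \"openai:gpt-4\"],\n                        text=\"print\",  # print a text report\n                        csv=\"report.csv\")  # also save a CSV file\n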
"},{"location":"api-reference/multigen/#sibila.multigen.query_multigen","title":"query_multigen","text":"query_multigen(\n in_list,\n inst_text,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n in_titles=None,\n)\n
Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" prints the results, any other string is a path to output a text file with the results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" prints CSV results, any other string is a path to output a CSV file with the results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
in_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)).
list[list[GenOut]]
For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def query_multigen(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n text: Union[str,list[str],None] = None, # \"print\", path\n csv: Union[str,list[str],None] = None, # \"print\", path\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n in_titles: Optional[list[str]] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n in_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). \n For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\" \n\n th_list = []\n for in_text in in_list:\n th = Thread.make_INST_IN(inst_text, in_text)\n th_list.append(th)\n\n if in_titles is None:\n in_titles = in_list\n\n out = thread_multigen(th_list, \n model_names=model_names, \n text=text,\n csv=csv,\n gencall=gencall,\n genconf=genconf,\n out_keys=out_keys,\n thread_titles=in_titles)\n\n return out\n
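Example: a minimal sketch; model names are illustrative:
from sibila.multigen import query_multigen\n\nout = query_multigen([\"Name one planet.\", \"Name one ocean.\"],\n                     \"Answer in one word.\",\n                     model_names=[\"llamacpp:openchat\", \"openai:gpt-4\"],\n                     text=\"print\")\n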
"},{"location":"api-reference/multigen/#sibila.multigen.multigen","title":"multigen","text":"multigen(\n threads,\n *,\n models=None,\n model_names=None,\n model_names_del_after=True,\n gencall=None,\n genconf=None\n)\n
Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.
Actual generation for each model is implemented by the gencall arg Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required models
Optional[list[Model]]
A list of initialized models. Defaults to None.
None
model_names
Optional[list[str]]
Alternatively, a list of Models names to create. Defaults to None.
None
model_names_del_after
bool
Delete each model created from model_names after use: important, otherwise an out-of-memory error will eventually happen. Defaults to True.
True
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
Raises:
Type Description ValueError
Exactly one of models or model_names must be given.
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def multigen(threads: list[Thread],\n *,\n models: Optional[list[Model]] = None, # existing models\n\n model_names: Optional[list[str]] = None,\n model_names_del_after: bool = True,\n\n gencall: Optional[Callable] = None,\n genconf: Optional[GenConf] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.\n\n Actual generation for each model is implemented by the gencall arg Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n models: A list of initialized models. Defaults to None.\n model_names: --Or-- A list of Models names. Defaults to None.\n model_names_del_after: Delete model_names models after using them: important or an out-of-memory error will eventually happen. Defaults to True.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n\n Raises:\n ValueError: Only one of models or model_names can be given.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n if not ((models is None) ^ ((model_names is None))):\n raise ValueError(\"Only one of models or model_names can be given\")\n\n if gencall is None:\n gencall = _default_gencall_text\n\n mod_count = len(models) if models is not None else len(model_names) # type: ignore[arg-type]\n\n all_out = []\n\n for i in range(mod_count):\n if models is not None:\n model = models[i]\n logger.debug(f\"Model: {model.desc}\")\n else:\n name = model_names[i] # type: ignore[index]\n model = Models.create(name)\n logger.info(f\"Model: {name} -> {model.desc}\")\n\n mod_out = []\n for th in threads:\n out = gencall(model, th, genconf)\n\n mod_out.append(out)\n\n all_out.append(mod_out)\n\n if model_names_del_after and models is None:\n model.close()\n del model\n\n # all_out is currently shaped (M,T) -> transpose to (T,M), so that each row contains thread t for all models\n tout = []\n for t in range(len(threads)):\n tmout = [] # thread t for all models\n for m in range(mod_count):\n tmout.append(all_out[m][t])\n\n tout.append(tmout)\n\n return tout\n
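Example: a minimal sketch passing already-created models instead of names:
from sibila import Models, Thread\nfrom sibila.multigen import multigen\n\nmodels = [Models.create(\"openai:gpt-4\")]\nthreads = [Thread.make_INST_IN(\"Be concise.\", \"Name one planet.\")]\n\ntable = multigen(threads, models=models)\nprint(table[0][0].text)  # threads[0] result on models[0]\n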
"},{"location":"api-reference/multigen/#sibila.multigen.cycle_gen_print","title":"cycle_gen_print","text":"cycle_gen_print(\n in_list,\n inst_text,\n model_names,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n json_kwargs={\n \"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False,\n },\n)\n
For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.
Works by doing:
1. Generate an INST+IN prompt for a list of models. (Same INST for all).
2. Append the output of each model to its own Thread.
3. Append the next IN prompt and generate again. Back to 2.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
json_kwargs
dict
JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.
{'indent': 2, 'sort_keys': False, 'ensure_ascii': False}
Source code in sibila/multigen.py
def cycle_gen_print(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n json_kwargs: dict = {\"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False}\n ):\n \"\"\"For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.\n\n Works by doing:\n\n 1. Generate an INST+IN prompt for a list of models. (Same INST for all).\n 2. Append the output of each model to its own Thread.\n 3. Append the next IN prompt and generate again. Back to 2.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n json_kwargs: JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n if gencall is None:\n gencall = _default_gencall_text\n\n\n n_model = len(model_names)\n n_ins = len(in_list)\n\n for m in range(n_model):\n\n name = model_names[m]\n model = Models.create(name)\n\n print('=' * 80)\n print(f\"Model: {name} -> {model.desc}\")\n\n th = Thread(inst=inst_text)\n\n for i in range(n_ins):\n in_text = in_list[i]\n print(f\"IN: {in_text}\")\n\n th += Msg.make_IN(in_text)\n\n out = gencall(model, th, genconf)\n\n out_dict = out.as_dict()\n\n print(\"OUT\")\n\n for k in out_keys:\n\n if k in out_dict and out_dict[k] is not None:\n\n if k != out_keys[0]: # not first\n print(\"-\" * 20)\n\n val = nice_print(k, out_dict[k], json_kwargs)\n print(val)\n\n th += Msg.make_OUT(out.text)\n\n model.close()\n del model\n
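Example: a minimal sketch; grows a separate chat per model over two IN messages:
from sibila.multigen import cycle_gen_print\n\ncycle_gen_print([\"Hello, who are you?\", \"Count to 3.\"],\n                \"Answer in one line.\",\n                [\"llamacpp:openchat\", \"openai:gpt-4\"])\n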
"},{"location":"api-reference/remote_model/","title":"Remote model classes","text":""},{"location":"api-reference/remote_model/#sibila.OpenAIModel","title":"OpenAIModel","text":"OpenAIModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n overhead_per_msg=None,\n token_estimation_factor=None,\n create_tokenizer=False,\n other_init_kwargs={}\n)\n
Access an OpenAI model.
Supports constrained JSON output, via the OpenAI API tools mechanism.
Ref https://platform.openai.com/docs/api-reference/chat/create
Create an OpenAI remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of one created internally. Defaults to None.
None
api_key
Optional[str]
OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL or a default.
None
overhead_per_msg
Optional[int]
Overhead tokens to account for when calculating token length. None for model's default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
create_tokenizer
bool
When no tokenizer is passed, should try to create one?
False
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI-specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n overhead_per_msg: Optional[int] = None,\n token_estimation_factor: Optional[float] = None,\n create_tokenizer: bool = False,\n\n # other OpenAI-specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create an OpenAI remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of one created internally. Defaults to None.\n api_key: OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL or a default.\n overhead_per_msg: Overhead tokens to account for when calculating token length. None for model's default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n create_tokenizer: When no tokenizer is passed, should try to create one?\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n\n if not has_openai:\n raise ImportError(\"Please install openai by running: pip install openai\")\n\n self._client = self._client_async = None\n\n\n # also accept \"provider:name\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if name.startswith(provider_name):\n name = name[len(provider_name):]\n\n super().__init__(False,\n genconf,\n schemaconf,\n tokenizer\n )\n\n if (ctx_len is not None and\n max_tokens_limit is not None and\n overhead_per_msg is not None and\n token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n self._model_name = name\n default_ctx_len = ctx_len\n default_max_tokens_limit = max_tokens_limit\n default_overhead_per_msg = overhead_per_msg\n default_token_estimation_factor = token_estimation_factor\n\n else: # need to resolve\n settings = self.resolve_settings(self.PROVIDER_NAME,\n name,\n [\"name\", \n \"ctx_len\", \n \"max_tokens_limit\", \n \"overhead_per_msg\",\n \"token_estimation_factor\"])\n self._model_name = settings.get(\"name\") or name\n default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n default_max_tokens_limit = settings.get(\"max_tokens_limit\") # type: ignore[assignment]\n default_overhead_per_msg = settings.get(\"overhead_per_msg\") # type: ignore[assignment]\n default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n # all defaults are conservative values\n if ctx_len is None and default_ctx_len is None:\n default_ctx_len = 4096\n logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming {default_ctx_len}\")\n\n if max_tokens_limit is None and default_max_tokens_limit is None:\n default_max_tokens_limit = ctx_len or default_ctx_len \n # don't warn: assume equal to ctx_len: logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n\n if overhead_per_msg is None and default_overhead_per_msg is None:\n default_overhead_per_msg = 3\n # don't warn for this setting due to derived model classes (none uses it)\n\n if token_estimation_factor is None and default_token_estimation_factor is None:\n default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n self.ctx_len = ctx_len or default_ctx_len\n\n self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n self._overhead_per_msg = overhead_per_msg or default_overhead_per_msg\n\n self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n self.maybe_image_input = True # True means maybe - always check model specs\n\n # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n\n if self.tokenizer is None and create_tokenizer:\n try:\n self.tokenizer = OpenAITokenizer(self._model_name)\n except Exception as e:\n logger.warning(f\"Could not create a local tokenizer for model '{self._model_name}' - \"\n \"token length calculation will be disabled and assume defaults. \"\n \"To support recent OpenAI models, install the latest tiktoken version with 'pip install -U tiktoken'. \"\n f\"Internal error: {e}\")\n\n\n self._client_init_kwargs = other_init_kwargs\n if api_key is not None:\n self._client_init_kwargs[\"api_key\"] = api_key\n if base_url is not None:\n self._client_init_kwargs[\"base_url\"] = base_url\n
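Example: a minimal sketch; assumes the OPENAI_API_KEY env variable is set:
from sibila import OpenAIModel\n\nmodel = OpenAIModel(\"gpt-4\",\n                    ctx_len=8192)  # optional: cap the usable context length\n\nprint(model(\"Hello!\"))\n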
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
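Example: a minimal sketch, assuming model is an initialized OpenAIModel; shows a non-Pydantic target type:
from datetime import date\n\nd = model.extract(date, \"When was the Apollo 11 moon landing?\")\nprint(d)  # a datetime.date instance\n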
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
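Example: a minimal sketch, assuming model is an initialized OpenAIModel:
label = model.classify([\"positive\", \"negative\", \"neutral\"],\n                       \"The service was slow and the food was cold.\")\nprint(label)  # one of the given labels\n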
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
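Example: a minimal sketch, assuming model is an initialized OpenAIModel; free JSON output, since no schema is given:
data = model.json(\"List the first three months of the year.\")\nprint(data)  # a dict built from the model's JSON output\n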
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
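Example: a minimal sketch, assuming model is an initialized OpenAIModel:
from dataclasses import dataclass\n\n@dataclass\nclass City:\n    name: str\n    country: str\n    population: int\n\ncity = model.dataclass(City, \"Give information about the largest city in Japan.\")\nprint(city.name, city.population)\n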
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
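Example: a minimal sketch, assuming model is an initialized OpenAIModel:
text = model.call(\"Write a haiku about autumn.\")\nprint(text)\n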
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification as one of the given enumeration values. The following ways of specifying the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
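A usage sketch with the labels given as an Enum class (the model name is illustrative):
import asyncio
from enum import Enum
from sibila import Models

class Sentiment(str, Enum):
    POSITIVE = \"positive\"
    NEGATIVE = \"negative\"
    NEUTRAL = \"neutral\"

model = Models.create(\"openai:gpt-4\")  # example model entry

async def main():
    # a plain list like [\"positive\", \"negative\", \"neutral\"] would also be accepted
    label = await model.classify_async(Sentiment,
                                       \"I absolutely loved this product!\")
    print(label)

asyncio.run(main())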
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON generation, optionally constrained by a JSON schema, returning a Python dict of values. Raises GenError if unable to get valid JSON (schema-validated, when a schema is given).
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
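A sketch of schema-constrained JSON generation (the schema and model name are illustrative):
import asyncio
from sibila import Models

model = Models.create(\"openai:gpt-4\")  # example model entry

# a plain JSON schema dict; massage_schema=True (the default) lets Sibila simplify it
schema = {\"type\": \"object\",
          \"properties\": {\"city\": {\"type\": \"string\"},
                         \"population\": {\"type\": \"integer\"}},
          \"required\": [\"city\", \"population\"]}

async def main():
    dic = await model.json_async(\"Name the most populous city in Japan.\",
                                 json_schema=schema)
    print(dic)  # a dict following the schema

asyncio.run(main())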
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
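A minimal sketch of dataclass_async (model name is an example):
import asyncio
from dataclasses import dataclass
from sibila import Models

@dataclass
class Movie:
    title: str
    year: int

model = Models.create(\"openai:gpt-4\")  # example model entry

async def main():
    movie = await model.dataclass_async(Movie,
                                        \"Which movie won the Best Picture Oscar in 1995?\")
    print(movie)  # a Movie instance initialized from the constrained JSON output

asyncio.run(main())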
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
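A minimal sketch of pydantic_async (model name is an example):
import asyncio
from pydantic import BaseModel, Field
from sibila import Models

class Country(BaseModel):
    name: str
    capital: str = Field(description=\"Capital city\")

model = Models.create(\"openai:gpt-4\")  # example model entry

async def main():
    country = await model.pydantic_async(Country, \"Tell me about Portugal.\")
    print(country.capital)  # fields are validated by Pydantic

asyncio.run(main())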
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
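A minimal sketch of plain-text generation with call_async (model name is an example):
import asyncio
from sibila import Models

model = Models.create(\"openai:gpt-4\")  # example model entry

async def main():
    # plain text generation; raises GenError or RuntimeError on failure
    text = await model.call_async(\"Write a haiku about tokenizers.\")
    print(text)

asyncio.run(main())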
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
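A sketch of the lower-level gen() call, which returns a GenOut instead of raising GenError (assuming Thread is importable from the top-level sibila package, and that GenOut exposes the result status as res):
from sibila import Models, Thread

model = Models.create(\"openai:gpt-4\")  # example model entry

thread = Thread.make_IN(\"List three prime numbers.\")
out = model.gen(thread)  # no GenError raised: inspect the returned GenOut

print(out.res)   # generation result status (assumed GenRes field name)
print(out.text)  # the generated text is in GenOut.text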
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON generation, optionally conditioned by a JSON schema, returning a Python dict of values. Doesn't raise an exception if an error occurs; always returns a GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
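A sketch of the non-raising gen_json() variant (schema and model name are illustrative; assumes Thread is importable from the top-level package):
from sibila import Models, Thread

model = Models.create(\"openai:gpt-4\")  # example model entry

schema = {\"type\": \"object\",
          \"properties\": {\"answer\": {\"type\": \"string\"}},
          \"required\": [\"answer\"]}

thread = Thread.make_IN(\"What is the capital of France?\")
out = model.gen_json(thread, schema)  # returns GenOut without raising GenError

print(out.dic)  # the parsed dict is in GenOut.dic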
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field (GenOut.value) of the returned GenOut. Doesn't raise an exception if an error occurs; always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
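A minimal sketch of gen_dataclass(), which returns a GenOut instead of raising (model name is an example):
from dataclasses import dataclass
from sibila import Models, Thread

@dataclass
class Person:
    name: str
    age: int

model = Models.create(\"openai:gpt-4\")  # example model entry

thread = Thread.make_IN(\"Who was the first man on the moon, and how old was he then?\")
out = model.gen_dataclass(Person, thread)

print(out.value)  # the initialized Person object is in GenOut.value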
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field (GenOut.value) of the returned GenOut. Doesn't raise an exception if an error occurs; always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
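A minimal sketch of gen_pydantic() (model name is an example):
from pydantic import BaseModel
from sibila import Models, Thread

class Book(BaseModel):
    title: str
    author: str

model = Models.create(\"openai:gpt-4\")  # example model entry

thread = Thread.make_IN(\"Name a famous dystopian novel and its author.\")
out = model.gen_pydantic(Book, thread)

print(out.value)  # the validated Book object is in GenOut.value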
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length of a Thread or a plain text string. In cases where the exact token count cannot be calculated, this function should give a conservative (upper-bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but it must reflect the model's context token accounting. The Thread or text must be the final text that will be passed to the model.
If a json_schema is provided in genconf, its string's token length is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
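A sketch of using token_len() to budget a prompt against the model's context length (model name is an example):
from sibila import Models, Thread

model = Models.create(\"openai:gpt-4\")  # example model entry

thread = Thread.make_IN(\"A long prompt whose token budget we want to check...\")
used = model.token_len(thread)

# compare against the model's context length to leave room for the response
print(f\"{used} tokens used of {model.ctx_len}\")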
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = True\n
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names that can be used. Note that some of the listed models are not chat models and cannot be used for generation, for example embedding models.
Parameters:
Name Type Description Default api_key
Optional[str]
Requires OpenAI API key, passed as this arg or set in env variable OPENAI_API_KEY.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models.
Source code in sibila/openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: Requires OpenAI API key, passed as this arg or set in env variable OPENAI_API_KEY.\n\n Returns:\n Returns a list of known models.\n \"\"\"\n\n client = openai.OpenAI(api_key=api_key)\n model_list = client.models.list()\n\n out = []\n for model in model_list.data:\n out.append(model.id)\n return sorted(out)\n
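A minimal sketch of listing the available model names (needs a key in env var OPENAI_API_KEY, or pass api_key=...):
from sibila import OpenAIModel

names = OpenAIModel.known_models() or []
for name in names:
    print(name)  # includes non-chat models, e.g. embedding models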
"},{"location":"api-reference/remote_model/#sibila.OpenAIModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel","title":"AnthropicModel","text":"AnthropicModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n api_key=None,\n token_estimation_factor=None,\n anthropic_init_kwargs={}\n)\n
Access an Anthropic model. Supports constrained JSON output via the Anthropic API function-calling (tools) mechanism.
Ref https://docs.anthropic.com/claude/docs/intro-to-claude
Create an Anthropic remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
api_key
Optional[str]
Anthropic API key. Defaults to None, which will use env variable ANTHROPIC_API_KEY.
None
token_estimation_factor
Optional[float]
Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
anthropic_init_kwargs
dict
Extra args for Anthropic() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If Anthropic API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/anthropic.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n\n # most important Anthropic-specific args\n api_key: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other Anthropic-specific args\n anthropic_init_kwargs: dict = {},\n ):\n \"\"\"Create an Anthropic remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n api_key: Anthropic API key. Defaults to None, which will use env variable ANTHROPIC_API_KEY.\n token_estimation_factor: Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n anthropic_init_kwargs: Extra args for Anthropic() initialization. Defaults to {}.\n\n Raises:\n ImportError: If Anthropic API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n\n if not has_anthropic:\n raise ImportError(\"Please install anthropic API by running: pip install anthropic\")\n\n self._client = self._client_async = None\n\n\n # also accept \"provider:name\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if name.startswith(provider_name):\n name = name[len(provider_name):]\n\n super().__init__(False,\n genconf,\n schemaconf,\n None\n )\n\n if (ctx_len is not None and\n max_tokens_limit is not None and\n token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n self._model_name = name\n default_ctx_len = ctx_len\n default_max_tokens_limit = max_tokens_limit\n default_token_estimation_factor = token_estimation_factor\n\n else: # need to resolve\n settings = self.resolve_settings(self.PROVIDER_NAME,\n name,\n [\"name\", \n \"ctx_len\", \n \"max_tokens_limit\", \n \"token_estimation_factor\"])\n self._model_name = settings.get(\"name\") or name\n default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n default_max_tokens_limit = settings.get(\"max_tokens_limit\") or default_ctx_len\n default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n # all defaults are conservative values\n if default_ctx_len is None:\n default_ctx_len = 200000\n logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming {default_ctx_len}\")\n if default_max_tokens_limit is None:\n default_max_tokens_limit = default_ctx_len\n logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n if default_token_estimation_factor is None:\n default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n self.ctx_len = ctx_len or default_ctx_len\n\n self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n self.maybe_image_input = True # currently all Anthropic models support image input - always check model specs\n\n # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n\n self._client_init_kwargs = anthropic_init_kwargs\n\n if api_key is not None:\n self._client_init_kwargs[\"api_key\"] = api_key \n
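A construction sketch (the model name is an example; Models.create(\"anthropic:<name>\") resolves settings the same way):
from sibila import AnthropicModel

model = AnthropicModel(\"claude-3-haiku-20240307\",  # example model name
                       ctx_len=200000,   # None would use the model's default
                       api_key=None)     # None: read from env var ANTHROPIC_API_KEY

print(model.desc())  # assumed inherited description method, as in OpenAIModel.desc()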
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form: list[type]
  For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or nested dataclasses).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
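A sketch of extract() with an annotated target, whose description guides the model's output (the model name is an example):
from typing import Annotated
from sibila import Models

model = Models.create(\"anthropic:claude-3-haiku-20240307\")  # example model entry

# an annotated list target: the description is passed along in the schema
target = Annotated[list[str], \"Three English words that rhyme with the given word\"]
words = model.extract(target, \"Give me rhymes for 'light'.\")
print(words)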
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification as one of the given enumeration values. The following ways of specifying the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
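A sketch of classify() with Literal labels (the model name is an example):
from typing import Literal
from sibila import Models

model = Models.create(\"anthropic:claude-3-haiku-20240307\")  # example model entry

label = model.classify(Literal[\"spam\", \"not spam\"],
                       \"You have won a free cruise! Click here now!\")
print(label)  # one of the given labels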
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON generation, optionally constrained by a JSON schema, returning a Python dict of values. Raises GenError if unable to get valid JSON (schema-validated, when a schema is given).
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
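A sketch of free JSON output, without a schema (the model name is an example):
from sibila import Models

model = Models.create(\"anthropic:claude-3-haiku-20240307\")  # example model entry

# json_schema=None (the default): free JSON output, parsed into a Python dict
dic = model.json(\"Return a JSON object with keys 'language' and 'creator' for Python.\")
print(dic)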
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
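A minimal sketch of calling the model object directly (the model name is an example):
from sibila import Models

model = Models.create(\"anthropic:claude-3-haiku-20240307\")  # example model entry

# calling the model object directly is the same as model.call()
text = model(\"Summarize the plot of Hamlet in one sentence.\")
print(text)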
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form: list[type]
  For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or nested dataclasses).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]. For example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
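A sketch of label classification with a plain list of strings (placeholder entry name, as in the earlier sketch):
Example
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nasync def main():\n labels = [\"positive\", \"negative\", \"neutral\"]\n # the result is always one of the given labels\n print(await model.classify_async(labels, \"What a great day!\"))\n\nasyncio.run(main())\n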
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
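A sketch of schema-constrained JSON generation - the schema fields and entry name are illustrative only:
Example
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"},\n \"population\": {\"type\": \"integer\"}},\n \"required\": [\"city\", \"population\"]}\n\nasync def main():\n # returns a dict following the given JSON schema\n info = await model.json_async(\"What is the largest city in Japan, and its population?\",\n json_schema=schema)\n print(info)\n\nasyncio.run(main())\n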
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
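A sketch with a small dataclass (placeholder entry name):
Example
import asyncio\nfrom dataclasses import dataclass\n\nfrom sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\nasync def main():\n movie = await model.dataclass_async(Movie, \"Name a classic 1980s science fiction film.\")\n print(movie) # a Movie instance built from the model's JSON output\n\nasyncio.run(main())\n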
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
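A sketch with a small BaseModel class (placeholder entry name):
Example
import asyncio\n\nfrom pydantic import BaseModel\nfrom sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nclass Country(BaseModel):\n name: str\n capital: str\n\nasync def main():\n country = await model.pydantic_async(Country, \"Tell me about France.\")\n print(country) # e.g. Country(name='France', capital='Paris')\n\nasyncio.run(main())\n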
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
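A sketch of plain async text generation with an instruction message (placeholder entry name):
Example
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nasync def main():\n text = await model.call_async(\"Write a haiku about the sea.\",\n inst=\"You are a concise poet.\")\n print(text)\n\nasyncio.run(main())\n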
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs; it always returns a GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output text is in GenOut.text.
Source code in sibila/anthropic.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n if \"tools\" in kwargs:\n response = self._client.beta.tools.messages.create(**kwargs) # type: ignore[attr-defined]\n else:\n response = self._client.messages.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
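Unlike call(), gen() takes an already-built Thread and reports errors in the returned GenOut instead of raising. A sketch, assuming Thread and GenError can be imported from the sibila package:
Example
from sibila import Models, Thread, GenError # import paths assumed\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nthread = Thread.make_IN(\"List three primary colors.\")\nout = model.gen(thread)\n\n# optionally convert an error result into an exception\nGenError.raise_if_error(out, ok_length_is_error=False)\nprint(out.text)\n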
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs; it always returns a GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
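A sketch of free (schema-less) JSON generation, reading the parsed dict from GenOut.dic (Thread import path assumed):
Example
from sibila import Models, Thread # import path assumed\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nthread = Thread.make_IN(\"Return a JSON object describing a fictional person, with name and age fields.\")\nout = model.gen_json(thread, json_schema=None) # None: free JSON output\nprint(out.dic) # parsed dict - check the GenOut result for errors before relying on it\n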
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. The initialized dataclass object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs; it always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
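A sketch (Thread import path assumed, placeholder entry name):
Example
from dataclasses import dataclass\n\nfrom sibila import Models, Thread # import path assumed\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\n@dataclass\nclass Book:\n title: str\n author: str\n\nout = model.gen_dataclass(Book, Thread.make_IN(\"Recommend one classic novel.\"))\nprint(out.value) # the initialized Book instance, if generation succeeded\n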
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. The initialized Pydantic BaseModel object is returned in the value field of the returned GenOut. Doesn't raise an exception if an error occurs; it always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
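A sketch (Thread import path assumed, placeholder entry name):
Example
from pydantic import BaseModel\nfrom sibila import Models, Thread # import path assumed\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nclass City(BaseModel):\n name: str\n country: str\n\nout = model.gen_pydantic(City, Thread.make_IN(\"Name a city famous for its canals.\"))\nprint(out.value) # e.g. City(name='Venice', country='Italy')\n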
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like a JSON Schema, but it must reflect the model's context token accounting. The thread or text must be the final text that will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens occupied.
Source code in sibila/anthropic.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens occupied.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n OVERHEAD_PER_MSG = 3\n num_tokens = 0\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n # print(num_tokens)\n return num_tokens\n
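Because this implementation estimates from text length, treat the result as an approximate upper bound. A usage sketch:
Example
from sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-haiku\") # placeholder entry name\n\nn = model.token_len(\"How many tokens does this text occupy?\")\nprint(n) # conservative estimate - no exact tokenizer is available for this model\n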
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, in which case token length is estimated.
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = True\n
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the provider only supports a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/model.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
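A sketch of checking for a fixed model list (AnthropicModel import path assumed):
Example
from sibila import AnthropicModel # import path assumed\n\nnames = AnthropicModel.known_models()\nif names is None:\n print(\"No fixed model list available - try any valid provider model name\")\nelse:\n print(names)\n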
"},{"location":"api-reference/remote_model/#sibila.AnthropicModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/anthropic.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"AnthropicModel: {self._model_name}\"\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel","title":"FireworksModel","text":"FireworksModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a Fireworks AI model through its OpenAI-compatible API. Supports constrained JSON output via the response_format JSON Schema mechanism.
Refs: https://readme.fireworks.ai/docs/structured-response-formatting
https://readme.fireworks.ai/reference/createchatcompletion
Create a Fireworks AI remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An externally initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable FIREWORKS_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable FIREWORKS_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a Fireworks AI remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable FIREWORKS_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable FIREWORKS_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"FIREWORKS_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"FIREWORKS_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no Fireworks models currently support image input - always check model specs\n
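A construction sketch - the model name follows Fireworks AI's account/model path convention but is illustrative only; check the provider's catalog. The FireworksModel import path and desc() availability are assumed:
Example
import os\n\nfrom sibila import FireworksModel # import path assumed\n\n# api_key can be omitted if the FIREWORKS_API_KEY env variable is set\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-8b-instruct\", # illustrative name\n api_key=os.environ.get(\"FIREWORKS_API_KEY\"))\nprint(model.desc()) # desc() assumed available as on the other model classes\n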
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or another dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
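A sketch extracting an annotated list (placeholder entry name):
Example
from typing import Annotated\n\nfrom sibila import Models\n\nmodel = Models.create(\"fireworks:llama-v3-8b-instruct\") # placeholder entry name\n\nheights = model.extract(list[Annotated[float, \"Height in meters\"]],\n \"List the heights of the three tallest mountains on Earth.\")\nprint(heights) # e.g. [8848.86, 8611.0, 8586.0]\n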
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
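A sketch using a Literal type for the labels (placeholder entry name):
Example
from typing import Literal\n\nfrom sibila import Models\n\nmodel = Models.create(\"fireworks:llama-v3-8b-instruct\") # placeholder entry name\n\nlabel = model.classify(Literal['spam', 'not spam'],\n \"Congratulations, you won a free cruise! Click here.\")\nprint(label) # one of the two Literal values\n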
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
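A sketch of calling the model object directly (placeholder entry name):
Example
from sibila import Models\n\nmodel = Models.create(\"fireworks:llama-v3-8b-instruct\") # placeholder entry name\n\n# calling the model object is equivalent to model.call()\nreply = model(\"Summarize the plot of Romeo and Juliet in one sentence.\")\nprint(reply)\n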
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or another dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
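Example - a sketch of schema-constrained JSON generation (model name illustrative):
import asyncio\nfrom sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"},\n \"country\": {\"type\": \"string\"}},\n \"required\": [\"city\", \"country\"]}\n\nasync def main():\n dic = await model.json_async(\"Which city hosted the 1900 Summer Olympics?\",\n json_schema=schema)\n print(dic) # e.g. {'city': 'Paris', 'country': 'France'}\n\nasyncio.run(main())\n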
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
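Example - a sketch using a dataclass target (model name illustrative):
import asyncio\nfrom dataclasses import dataclass\nfrom sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\nasync def main():\n movie = await model.dataclass_async(Movie, \"Name the first feature film directed by Stanley Kubrick.\")\n print(movie) # an initialized Movie instance\n\nasyncio.run(main())\n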
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
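Example - a sketch using a Pydantic BaseModel target (model name illustrative):
import asyncio\nfrom pydantic import BaseModel\nfrom sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nclass Country(BaseModel):\n name: str\n capital: str\n\nasync def main():\n country = await model.pydantic_async(Country, \"Give information about Japan.\")\n print(country.capital) # 'Tokyo'\n\nasyncio.run(main())\n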
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
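Example - a sketch of plain async text generation with an instruction message (model name illustrative):
import asyncio\nfrom sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nasync def main():\n text = await model.call_async(\"Explain what a tokenizer does.\",\n inst=\"Answer in one sentence.\")\n print(text)\n\nasyncio.run(main())\n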
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
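Example - a lower-level sketch: gen() takes a Thread and returns a GenOut instead of raising on generation errors. The model name is illustrative, and Thread and GenError are assumed importable from the sibila package, as used elsewhere in this reference:
from sibila import FireworksModel, Thread, GenError\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nout = model.gen(Thread.make_IN(\"List three prime numbers.\"))\nGenError.raise_if_error(out, ok_length_is_error=False) # optionally promote errors to exceptions\nprint(out.text)\n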
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
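Example - a sketch of the lower-level JSON generation call, returning a GenOut with the parsed dict in GenOut.dic (model name illustrative):
from sibila import FireworksModel, Thread\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"answer\": {\"type\": \"string\"}},\n \"required\": [\"answer\"]}\n\nout = model.gen_json(Thread.make_IN(\"What is the capital of Portugal?\"), schema)\nprint(out.dic) # e.g. {'answer': 'Lisbon'}\n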
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
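Example - a sketch of the lower-level dataclass generation call (model name illustrative):
from dataclasses import dataclass\nfrom sibila import FireworksModel, Thread\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\n@dataclass\nclass Element:\n name: str\n symbol: str\n atomic_number: int\n\nout = model.gen_dataclass(Element, Thread.make_IN(\"Describe the element helium.\"))\nprint(out.value) # the initialized Element instance, in GenOut.value\n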
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns a GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseModel subclass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
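Example - a sketch of the lower-level Pydantic generation call (model name illustrative):
from pydantic import BaseModel\nfrom sibila import FireworksModel, Thread\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nclass Book(BaseModel):\n title: str\n author: str\n\nout = model.gen_pydantic(Book, Thread.make_IN(\"Name a novel by Jane Austen.\"))\nprint(out.value.title, \"-\", out.value.author)\n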
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, its string's token length is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
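Example - a sketch of estimating the prompt's token length before generation (model name illustrative):
from sibila import FireworksModel, Thread\n\nmodel = FireworksModel(\"accounts/fireworks/models/llama-v3-70b-instruct\") # illustrative name\n\nthread = Thread.make_IN(\"How many tokens does this text use?\")\nprint(model.token_len(thread)) # estimated token count for the final prompt\n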
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
Returns a list of model names that can be used. Some of these are not chat models and cannot be used, for example embedding models.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
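Example - a sketch; for providers without a listing endpoint (as in the base implementation above) this returns None:
from sibila import FireworksModel\n\nprint(FireworksModel.known_models()) # a list of model names, or None if unavailable\n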
"},{"location":"api-reference/remote_model/#sibila.FireworksModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel","title":"GroqModel","text":"GroqModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a Groq model through its OpenAI-compatible API. Supports constrained JSON output via the response_format JSON Schema mechanism.
Ref https://console.groq.com/docs/tool-use
https://github.com/groq/groq-api-cookbook/blob/main/parallel-tool-use/parallel-tool-use.ipynb
Create a Groq remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An externally initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable GROQ_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable GROQ_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available: a multiplication factor applied to the total text length to estimate the token count.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a Groq remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable GROQ_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable GROQ_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"GROQ_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"GROQ_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no Groq models currently support image input - always check model specs\n
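Example - a sketch of creating a Groq model; the model name is illustrative, so check Groq's documentation for currently available models:
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # api_key=None: read from the GROQ_API_KEY env variable\nprint(model.desc())\n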
"},{"location":"api-reference/remote_model/#sibila.GroqModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
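Example - a sketch of type-constrained extraction into a typed list (model name illustrative):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\nnames = model.extract(list[str], \"Generate 5 first names common in Italy.\")\nprint(names) # a plain Python list of strings\n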
"},{"location":"api-reference/remote_model/#sibila.GroqModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
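Example - a sketch of classification with string labels (model name illustrative):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\nlabel = model.classify([\"urgent\", \"normal\", \"low\"],\n \"The server room is on fire!\")\nprint(label) # 'urgent'\n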
"},{"location":"api-reference/remote_model/#sibila.GroqModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
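Example - a sketch of free JSON output, since no json_schema is given (model name illustrative):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\ndic = model.json(\"List the three primary colors as a JSON object with a 'colors' array.\")\nprint(dic) # a Python dict parsed from the model's JSON\n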
"},{"location":"api-reference/remote_model/#sibila.GroqModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
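Example - a sketch with a dataclass target (model name illustrative):
from dataclasses import dataclass\nfrom sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\n@dataclass\nclass River:\n name: str\n length_km: int\n\nriver = model.dataclass(River, \"Which is the longest river in Europe?\")\nprint(river) # e.g. River(name='Volga', length_km=3531)\n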
"},{"location":"api-reference/remote_model/#sibila.GroqModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
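Example - a sketch with Field descriptions to guide the model (model name illustrative):
from pydantic import BaseModel, Field\nfrom sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\nclass Quote(BaseModel):\n text: str = Field(description=\"The quote itself\")\n author: str\n\nquote = model.pydantic(Quote, \"Give a famous quote about science.\")\nprint(quote.author)\n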
"},{"location":"api-reference/remote_model/#sibila.GroqModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
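Example - a sketch of plain text generation with an instruction message (model name illustrative):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\ntext = model.call(\"Summarize the rules of chess.\",\n inst=\"Answer in two sentences.\")\nprint(text)\n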
"},{"location":"api-reference/remote_model/#sibila.GroqModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
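Example - calling the model object directly is equivalent to call() (model name illustrative):
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\nprint(model(\"Say hello in French.\")) # same as model.call(...)\n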
"},{"location":"api-reference/remote_model/#sibila.GroqModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
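Example - a sketch running two type-constrained extractions concurrently (model name illustrative):
import asyncio\nfrom sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\") # illustrative name\n\nasync def main():\n years, pi = await asyncio.gather(\n model.extract_async(list[int], \"List the years of the first three modern Olympiads.\"),\n model.extract_async(float, \"What is the value of pi to 4 decimal places?\"))\n print(years, pi)\n\nasyncio.run(main())\n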
"},{"location":"api-reference/remote_model/#sibila.GroqModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
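For example, a sketch of schema-constrained async JSON generation (the model name is illustrative):
import asyncio\nfrom sibila import Models\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"},\n \"country\": {\"type\": \"string\"}},\n \"required\": [\"city\", \"country\"]}\n\nasync def main():\n # returns a dict following the given JSON schema\n return await model.json_async(\"Where is the Eiffel Tower?\",\n json_schema=schema)\n\nprint(asyncio.run(main()))\n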
"},{"location":"api-reference/remote_model/#sibila.GroqModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
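A minimal sketch of async dataclass extraction (the model name is illustrative):
import asyncio\nfrom dataclasses import dataclass\nfrom sibila import Models\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nasync def main():\n # returns a Movie instance initialized from the model's output\n return await model.dataclass_async(Movie,\n \"Which movie won the Best Picture Oscar for 1997?\")\n\nprint(asyncio.run(main()))\n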
"},{"location":"api-reference/remote_model/#sibila.GroqModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
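A minimal sketch of async Pydantic extraction (the model name is illustrative):
import asyncio\nfrom pydantic import BaseModel\nfrom sibila import Models\n\nclass Book(BaseModel):\n title: str\n author: str\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nasync def main():\n # returns a Book instance validated by Pydantic\n return await model.pydantic_async(Book,\n \"Name a famous novel by Herman Melville\")\n\nprint(asyncio.run(main()))\n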
"},{"location":"api-reference/remote_model/#sibila.GroqModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
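A minimal sketch of plain async text generation (the model name is illustrative):
import asyncio\nfrom sibila import Models\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nasync def main():\n # plain text generation, no output constraints\n return await model.call_async(\"Write a haiku about autumn\",\n inst=\"You are a poet\")\n\nprint(asyncio.run(main()))\n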
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
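A minimal sketch of the lower-level gen() call (the model name is illustrative, and Thread is assumed to be importable from the sibila package):
from sibila import Models, Thread # Thread assumed exported at package level\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nthread = Thread.ensure(\"Tell me a joke\", \"Be brief\") # query + inst\nout = model.gen(thread) # doesn't raise GenError: inspect the returned GenOut\nprint(out.text)\n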
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
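A minimal sketch of gen_json() (the model name is illustrative, and Thread is assumed to be importable from the sibila package):
from sibila import Models, Thread # Thread assumed exported at package level\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nschema = {\"type\": \"object\",\n \"properties\": {\"answer\": {\"type\": \"string\"}},\n \"required\": [\"answer\"]}\n\nthread = Thread.ensure(\"What's the capital of Japan?\", None)\nout = model.gen_json(thread, schema)\nprint(out.dic) # the output dict, e.g. {'answer': 'Tokyo'}\n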
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
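A minimal sketch of gen_dataclass() (the model name is illustrative, and Thread is assumed to be importable from the sibila package):
from dataclasses import dataclass\nfrom sibila import Models, Thread # Thread assumed exported at package level\n\n@dataclass\nclass Answer:\n value: int\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nout = model.gen_dataclass(Answer,\n Thread.ensure(\"How many days are in a leap year?\", None))\nprint(out.value) # the initialized Answer object is in GenOut.value\n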
"},{"location":"api-reference/remote_model/#sibila.GroqModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
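A minimal sketch of gen_pydantic() (the model name is illustrative, and Thread is assumed to be importable from the sibila package):
from pydantic import BaseModel\nfrom sibila import Models, Thread # Thread assumed exported at package level\n\nclass Person(BaseModel):\n name: str\n birth_year: int\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\nout = model.gen_pydantic(Person,\n Thread.ensure(\"Who painted the Mona Lisa?\", None))\nprint(out.value) # the initialized Person object is in GenOut.value\n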
"},{"location":"api-reference/remote_model/#sibila.GroqModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, the token length of its string representation is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will passed to model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
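For example, a sketch of checking prompt size against the context window (the model name is illustrative):
from sibila import Models\n\nmodel = Models.create(\"groq:llama3-70b\") # hypothetical model name\n\n# estimate input tokens before generating, e.g. to verify the prompt fits\nn = model.token_len(\"Summarize the plot of Hamlet in one paragraph.\")\nprint(n, \"tokens of a\", model.ctx_len, \"token context\")\n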
"},{"location":"api-reference/remote_model/#sibila.GroqModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.GroqModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names known to this provider. Note that some of the listed models are not chat models (for example, embedding models) and cannot be used for generation.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
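A minimal usage sketch; as the source shows, this provider may simply return None:
from sibila import GroqModel\n\nnames = GroqModel.known_models()\nif names is None:\n print(\"Model list not available for this provider\")\nelse:\n print(names)\n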
"},{"location":"api-reference/remote_model/#sibila.GroqModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel","title":"MistralModel","text":"MistralModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n api_key=None,\n token_estimation_factor=None,\n mistral_init_kwargs={}\n)\n
Access a Mistral AI model. Supports constrained JSON output via the Mistral API function-calling mechanism.
Ref https://docs.mistral.ai/guides/function-calling/
Create a Mistral AI remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
api_key
Optional[str]
Mistral API key. Defaults to None, which will use env variable MISTRAL_API_KEY.
None
token_estimation_factor
Optional[float]
Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
mistral_init_kwargs
dict
Extra args for mistral.MistralClient() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If Mistral API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/mistral.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n\n # most important Mistral-specific args\n api_key: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other Mistral-specific args\n mistral_init_kwargs: dict = {},\n ):\n \"\"\"Create a Mistral AI remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n api_key: Mistral API key. Defaults to None, which will use env variable MISTRAL_API_KEY.\n token_estimation_factor: Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n mistral_init_kwargs: Extra args for mistral.MistralClient() initialization. Defaults to {}.\n\n Raises:\n ImportError: If Mistral API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n\n if not has_mistral:\n raise ImportError(\"Please install mistral by running: pip install mistralai\")\n\n self._client = self._client_async = None\n\n\n # also accept \"provider:name\" for ease of use\n provider_name = self.PROVIDER_NAME + \":\"\n if name.startswith(provider_name):\n name = name[len(provider_name):]\n\n super().__init__(False,\n genconf,\n schemaconf,\n None\n )\n\n if (ctx_len is not None and\n max_tokens_limit is not None and\n token_estimation_factor is not None): # all elements given: probably created via Models.create()\n\n self._model_name = name\n default_ctx_len = ctx_len\n default_max_tokens_limit = max_tokens_limit\n default_token_estimation_factor = token_estimation_factor\n\n else: # need to resolve\n settings = self.resolve_settings(self.PROVIDER_NAME,\n name,\n [\"name\", \n \"ctx_len\", \n \"max_tokens_limit\", \n \"token_estimation_factor\"])\n self._model_name = settings.get(\"name\") or name\n default_ctx_len = settings.get(\"ctx_len\") # type: ignore[assignment]\n default_max_tokens_limit = settings.get(\"max_tokens_limit\") or default_ctx_len\n default_token_estimation_factor = settings.get(\"token_estimation_factor\") # type: ignore[assignment]\n\n # all defaults are conservative values\n if default_ctx_len is None:\n default_ctx_len = 32768\n logger.warning(f\"Model '{self._model_name}': unknown ctx_len, assuming {default_ctx_len}\")\n if default_max_tokens_limit is None:\n default_max_tokens_limit = default_ctx_len\n logger.warning(f\"Model '{self._model_name}': unknown max_tokens_limit, assuming {default_max_tokens_limit}\")\n if default_token_estimation_factor is None:\n default_token_estimation_factor = self.DEFAULT_TOKEN_ESTIMATION_FACTOR\n logger.warning(f\"Model '{self._model_name}': unknown token_estimation_factor, assuming {default_token_estimation_factor}\")\n\n\n self.ctx_len = ctx_len or default_ctx_len\n\n self.max_tokens_limit = max_tokens_limit or default_max_tokens_limit\n\n self.max_tokens_limit = min(self.max_tokens_limit, self.ctx_len)\n\n self._token_estimation_factor = token_estimation_factor or default_token_estimation_factor\n\n self.maybe_image_input = False # no Mistral models currently 
support image input - always check model specs\n\n # only check for \"json\" text presence as json schema (including field descriptions) is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n self._client_init_kwargs = mistral_init_kwargs\n\n if api_key is not None:\n self._client_init_kwargs[\"api_key\"] = api_key \n elif \"api_key\" not in self._client_init_kwargs and \"MISTRAL_API_KEY\" in os.environ:\n # \"MISTRAL_API_KEY\" env key is ignored in pytest?\n self._client_init_kwargs[\"api_key\"] = os.environ[\"MISTRAL_API_KEY\"]\n
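For example, a minimal construction sketch (the model name is illustrative; api_key may be omitted if the MISTRAL_API_KEY env variable is set):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\", # hypothetical model name\n ctx_len=32768)\n\nprint(model.ctx_len)\n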
"},{"location":"api-reference/remote_model/#sibila.MistralModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
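For example, a minimal sketch extracting a primitive type (the model name is illustrative):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\n# the model's answer is coerced to a float\nprice = model.extract(float, \"What is 15% of 80?\")\nprint(price) # 12.0\n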
"},{"location":"api-reference/remote_model/#sibila.MistralModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
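A minimal classification sketch (the model name is illustrative):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\nlabel = model.classify([\"spam\", \"not spam\"],\n \"Click here to claim your free prize!\")\nprint(label) # expected: 'spam'\n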
"},{"location":"api-reference/remote_model/#sibila.MistralModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
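A minimal sketch of free (schema-less) JSON output (the model name is illustrative):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\n# json_schema defaults to None, so any valid JSON dict is accepted\ninfo = model.json(\"List the three primary colors in a JSON object with a 'colors' array\")\nprint(info)\n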
"},{"location":"api-reference/remote_model/#sibila.MistralModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
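A minimal dataclass extraction sketch (the model name is illustrative):
from dataclasses import dataclass\nfrom sibila import MistralModel\n\n@dataclass\nclass Country:\n name: str\n capital: str\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\nres = model.dataclass(Country, \"Tell me about France\")\nprint(res) # e.g. Country(name='France', capital='Paris')\n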
"},{"location":"api-reference/remote_model/#sibila.MistralModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model's response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
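A minimal Pydantic extraction sketch (the model name is illustrative):
from pydantic import BaseModel, Field\nfrom sibila import MistralModel\n\nclass Summary(BaseModel):\n title: str = Field(description=\"A short title\")\n points: list[str] = Field(description=\"Main points\")\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\nres = model.pydantic(Summary, \"Summarize the benefits of regular exercise\")\nprint(res.title, res.points)\n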
"},{"location":"api-reference/remote_model/#sibila.MistralModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
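A minimal plain-text generation sketch (the model name is illustrative):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # hypothetical model name\n\ntext = model.call(\"Explain what an LLM is\",\n inst=\"Answer in one sentence\")\nprint(text)\n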
"},{"location":"api-reference/remote_model/#sibila.MistralModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
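A sketch of async schema-constrained JSON generation, passing the schema as a plain dict (the model name is an assumption):
import asyncio\nfrom sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nschema = {\"type\": \"object\",\n \"properties\": {\"city\": {\"type\": \"string\"},\n \"country\": {\"type\": \"string\"}},\n \"required\": [\"city\", \"country\"]}\n\nasync def main():\n return await model.json_async(\"Where is the Eiffel Tower located?\",\n json_schema=schema)\n\nprint(asyncio.run(main())) # e.g. {'city': 'Paris', 'country': 'France'}\n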
"},{"location":"api-reference/remote_model/#sibila.MistralModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
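A sketch using a plain dataclass as the output definition (the model name is an assumption):
import asyncio\nfrom dataclasses import dataclass\nfrom sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\nasync def main():\n return await model.dataclass_async(Movie,\n \"Which movie won the Best Picture Oscar at the 1998 ceremony?\")\n\nprint(asyncio.run(main())) # e.g. Movie(title='Titanic', year=1998)\n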
"},{"location":"api-reference/remote_model/#sibila.MistralModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
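And the Pydantic equivalent, a sketch assuming the same hypothetical model name:
import asyncio\nfrom pydantic import BaseModel\nfrom sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nclass Planet(BaseModel):\n name: str\n moon_count: int\n\nasync def main():\n return await model.pydantic_async(Planet, \"Tell me about Mars.\")\n\nprint(asyncio.run(main())) # e.g. Planet(name='Mars', moon_count=2)\n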
"},{"location":"api-reference/remote_model/#sibila.MistralModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
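A sketch of plain async text generation - no schema, the raw model text is returned (the model name is an assumption):
import asyncio\nfrom sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nasync def main():\n return await model.call_async(\"Summarize the rules of chess in one sentence.\",\n inst=\"Be very concise.\")\n\nprint(asyncio.run(main()))\n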
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/mistral.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n response = self._client.chat(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
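Unlike call(), gen() takes an explicit Thread and returns a GenOut instead of raising GenError. A minimal sketch (the model name is an assumption):
from sibila import MistralModel, Thread\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nthread = Thread.make_IN(\"List three uses for a paper clip.\")\nout = model.gen(thread)\n\n# out is a GenOut: the result status travels alongside the generated text\nprint(out.text)\n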
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
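A sketch showing the no-exception contract: gen_json() always returns a GenOut, which can then be checked with GenError.raise_if_error() (the model name and the top-level GenError import are assumptions):
from sibila import GenError, MistralModel, Thread\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nschema = {\"type\": \"object\",\n \"properties\": {\"order_id\": {\"type\": \"integer\"},\n \"shipped_on\": {\"type\": \"string\"}}}\n\nout = model.gen_json(Thread.make_IN(\"Order 1234 was shipped on 2024-05-02.\"),\n schema)\n\n# raise only if generation actually failed; valid JSON may still come with OK_LENGTH\nGenError.raise_if_error(out, ok_length_is_error=False)\nprint(out.dic) # e.g. {'order_id': 1234, 'shipped_on': '2024-05-02'}\n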
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
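A sketch of the lower-level Pydantic path; the initialized object travels in GenOut.value (the model name is an assumption):
from pydantic import BaseModel\nfrom sibila import MistralModel, Thread\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nclass Fact(BaseModel):\n statement: str\n is_true: bool\n\nout = model.gen_pydantic(Fact,\n Thread.make_IN(\"The Great Wall of China is visible from the Moon.\"))\n\n# no GenError is raised on bad model output: inspect the returned GenOut\nprint(out.value) # the initialized Fact object\n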
"},{"location":"api-reference/remote_model/#sibila.MistralModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. Thread or text must be the final text which will be passed to the model.
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens occupied.
Source code in sibila/mistral.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens occupied.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n OVERHEAD_PER_MSG = 3\n num_tokens = 0\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n # print(num_tokens)\n return num_tokens\n
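For example, a sketch of estimating whether a prompt fits the model's context window (the model name is an assumption):
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\") # model name is an assumption\n\nprompt = \"Summarize the following contract: ...\"\n\n# an estimated (upper bound) count, since MistralModel has no local tokenizer\nif model.token_len(prompt) > model.ctx_len:\n print(\"Prompt too long for the model's context\")\n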
"},{"location":"api-reference/remote_model/#sibila.MistralModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = tokenizer\n
Tokenizer used to encode text. Some remote models don't have a tokenizer, so token length is estimated.
"},{"location":"api-reference/remote_model/#sibila.MistralModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.MistralModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/mistral.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n\n args = {}\n if api_key is not None:\n args[\"api_key\"] = api_key\n model = MistralClient(**args) # type: ignore[arg-type]\n\n model_list = model.list_models()\n del model\n\n out = []\n for mod in model_list.data:\n out.append(mod.id)\n\n return sorted(out)\n
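A sketch of listing the provider's models; it assumes the Mistral AI key is set in its environment variable, otherwise pass it as api_key:
from sibila import MistralModel\n\n# returns a sorted list of model names, or None if it cannot be fetched\nfor name in MistralModel.known_models() or []:\n print(name)\n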
"},{"location":"api-reference/remote_model/#sibila.MistralModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/mistral.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"MistralModel: {self._model_name}\"\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel","title":"TogetherModel","text":"TogetherModel(\n name,\n *,\n genconf=None,\n schemaconf=None,\n ctx_len=None,\n max_tokens_limit=None,\n tokenizer=None,\n api_key=None,\n base_url=None,\n token_estimation_factor=None,\n other_init_kwargs={}\n)\n
Access a together.ai model with the OpenAI API. Supports constrained JSON output, via the response_format JSON Schema mechanism.
Ref https://docs.together.ai/docs/json-mode
https://docs.together.ai/reference/chat-completions
Create a together.ai remote model.
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
schemaconf
Optional[JSchemaConf]
Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used (shared for input and output). None for model's default.
None
max_tokens_limit
Optional[int]
Maximum output tokens limit. None for model's default.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.
None
api_key
Optional[str]
API key. Defaults to None, which will use env variable TOGETHER_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable TOGETHER_BASE_URL or a default.
None
token_estimation_factor
Optional[float]
Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.
None
other_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
NameError
If model name was not found or there's an API or authentication problem.
Source code in sibila/schema_format_openai.py
def __init__(self,\n name: str,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n ctx_len: Optional[int] = None,\n max_tokens_limit: Optional[int] = None,\n tokenizer: Optional[Tokenizer] = None,\n\n # most important OpenAI API specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n token_estimation_factor: Optional[float] = None,\n\n # other OpenAI API specific args\n other_init_kwargs: dict = {},\n ):\n \"\"\"Create a together.ai remote model.\n\n Args:\n name: Model name to resolve into an existing model.\n genconf: Model generation configuration. Defaults to None.\n schemaconf: Default configuration for JSON schema validation, used if generation call doesn't supply one. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). None for model's default.\n max_tokens_limit: Maximum output tokens limit. None for model's default.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n api_key: API key. Defaults to None, which will use env variable TOGETHER_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable TOGETHER_BASE_URL or a default.\n token_estimation_factor: Used when no tokenizer is available. Multiplication factor to estimate token usage: multiplies total text length to obtain token length.\n other_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n NameError: If model name was not found or there's an API or authentication problem.\n \"\"\"\n\n if api_key is None:\n api_key = os.environ.get(\"TOGETHER_API_KEY\")\n if base_url is None:\n base_url = os.environ.get(\"TOGETHER_BASE_URL\", self.DEFAULT_BASE_URL)\n\n super().__init__(name,\n # common base model args\n genconf=genconf,\n schemaconf=schemaconf,\n ctx_len=ctx_len,\n max_tokens_limit=max_tokens_limit,\n tokenizer=tokenizer,\n\n # most important OpenAI API specific args\n api_key=api_key,\n base_url=base_url,\n token_estimation_factor=token_estimation_factor,\n\n # other OpenAI API specific args\n other_init_kwargs=other_init_kwargs)\n\n self.maybe_image_input = False # no together.ai models currently support image input - always check model specs\n
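A minimal creation sketch. The together.ai model name is an assumption - check the provider's catalog - and TOGETHER_API_KEY must be set in the environment or passed as api_key; desc() is assumed to be available as on the other remote model classes:
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\", # name is an assumption\n ctx_len=4096) # optional cap on shared input+output context\n\nprint(model.desc())\n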
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int Can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
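Besides classes, simple targets work too - a sketch extracting a datetime.date (the model name is an assumption):
import datetime\nfrom sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\") # name is an assumption\n\nwhen = model.extract(datetime.date,\n \"On what date did the Apollo 11 Moon landing occur?\")\nprint(when) # e.g. datetime.date(1969, 7, 20)\n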
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
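Labels can also be given as an Enum class - a sketch with string-valued members (the model name is an assumption):
from enum import Enum\nfrom sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\") # name is an assumption\n\nclass Priority(str, Enum):\n LOW = \"low\"\n MEDIUM = \"medium\"\n HIGH = \"high\"\n\nprint(model.classify(Priority, \"The server is down and customers cannot pay!\"))\n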
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.json","title":"json","text":"json(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(thread,\n json_schema, \n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
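With json_schema=None the output is free-form JSON - a sketch (the model name is an assumption):
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\") # name is an assumption\n\n# no schema: any valid JSON object the model produces is returned as a dict\nprint(model.json(\"Return a JSON object describing a random book, with title and year.\"))\n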
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: F811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.call","title":"call","text":"call(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def call(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods. Same as call().
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods. Same as call().\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n return self.call(query,\n inst=inst,\n genconf=genconf,\n ok_length_is_error=ok_length_is_error)\n
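Because of __call__, the model object is directly callable - a sketch (the model name is an assumption):
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\") # name is an assumption\n\n# model(...) is exactly model.call(...)\nprint(model(\"Write a haiku about structured output.\"))\n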
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.extract_async","title":"extract_async async
","text":"extract_async(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Async type-constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type]. For example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"] And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int Can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
async def extract_async(self,\n target: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: \n \"\"\"Async type-constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_extract_async(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.classify_async","title":"classify_async async
","text":"classify_async(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
async def classify_async(self,\n labels: Any,\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return await self.extract_async(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.json_async","title":"json_async async
","text":"json_async(\n query,\n *,\n json_schema=None,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
None
inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
async def json_async(self, \n query: Union[Thread,Msg,tuple,str],\n *,\n json_schema: Union[dict,str,None] = None,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_json_async(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.dataclass_async","title":"dataclass_async async
","text":"dataclass_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
async def dataclass_async(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Async constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_dataclass_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
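As a usage sketch (the model name is an assumption), dataclass_async can be awaited with a plain dataclass definition:
import asyncio\nfrom dataclasses import dataclass\n\nfrom sibila import Models\n\n@dataclass\nclass Book:\n title: str\n year: int\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nasync def get_book():\n # returns a Book instance initialized from the constrained JSON output\n return await model.dataclass_async(Book, \"Name a famous dystopian novel.\")\n\nprint(asyncio.run(get_book()))\n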
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.pydantic_async","title":"pydantic_async async
","text":"pydantic_async(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Async constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
RuntimeError
If unable to generate.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
async def pydantic_async(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Async constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n RuntimeError: If unable to generate.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_pydantic_async(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
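A minimal sketch of awaiting pydantic_async (model name assumed):
import asyncio\n\nfrom pydantic import BaseModel\nfrom sibila import Models\n\nclass Country(BaseModel):\n name: str\n capital: str\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nasync def get_country():\n # returns a validated Country instance built from the model's JSON output\n return await model.pydantic_async(Country,\n \"Give one European country and its capital.\")\n\nprint(asyncio.run(get_country()))\n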
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.call_async","title":"call_async async
","text":"call_async(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[Thread, Msg, tuple, str]
A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
RuntimeError
If unable to generate.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
async def call_async(self,\n query: Union[Thread,Msg,tuple,str],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: A Thread or a single IN message given as Msg, list, tuple or str. List and tuple should contain the same args as for creating Msg.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n RuntimeError: If unable to generate.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = await self.gen_async(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
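For plain text generation, a short sketch (model name assumed):
import asyncio\n\nfrom sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nasync def greet():\n # returns the model's text response, raising GenError on failure\n return await model.call_async(\"Write a one-line greeting.\")\n\nprint(asyncio.run(greet()))\n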
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n genconf2: GenConf\n kwargs, genconf2 = self._gen_pre(thread, genconf)\n\n self._ensure_client(False)\n\n try:\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(**kwargs) # type: ignore[attr-defined]\n\n except Exception as e:\n raise RuntimeError(f\"Cannot generate. Internal error: {e}\")\n\n\n return self._gen_post(response,\n kwargs,\n genconf2)\n
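Unlike the higher-level methods, gen() takes a Thread directly and reports errors through the returned GenOut instead of raising. A minimal sketch (the model name and the GenConf field used are assumptions):
from sibila import GenConf, Models, Thread\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nthread = Thread.make_IN(\"List three prime numbers.\")\n\n# gen() never raises on generation errors: inspect the returned GenOut\nout = model.gen(thread, genconf=GenConf(temperature=0)) # temperature: assumed GenConf field\n\nprint(out.text)\n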
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_json","title":"gen_json","text":"gen_json(\n thread,\n json_schema,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n thread: Thread,\n json_schema: Union[dict,str,None],\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n args = self._gen_json_pre(thread,\n json_schema,\n genconf,\n massage_schema,\n schemaconf)\n return self.gen(*args)\n
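A sketch of this no-raise variant: gen_json() returns a GenOut whose dic field holds the output dict (model name and schema are assumptions):
from sibila import Models, Thread\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nthread = Thread.make_IN(\"Extract: John is 30 years old.\")\n\nschema = {\"type\": \"object\",\n \"properties\": {\"name\": {\"type\": \"string\"},\n \"age\": {\"type\": \"integer\"}}}\n\n# doesn't raise on generation errors: check the returned GenOut instead\nout = model.gen_json(thread, schema)\n\nif out.dic is not None:\n print(out.dic)\n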
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned GenOut.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_dataclass_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_dataclass_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description RuntimeError
If unable to generate.
TypeError
When cls is not a Pydantic BaseClass.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n RuntimeError: If unable to generate.\n TypeError: When cls is not a Pydantic BaseClass.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n schema = self._gen_pydantic_pre(cls)\n\n out = self.gen_json(thread,\n schema,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n return self._gen_pydantic_post(out,\n cls,\n schemaconf)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.token_len","title":"token_len","text":"token_len(thread_or_text, genconf=None)\n
Calculate or estimate the token length for a Thread or a plain text string. In some cases where it's not possible to calculate the exact token count, this function should give a conservative (upper bound) estimate. It's up to the implementation whether to account for side information like JSON Schema, but it must reflect the model's context token accounting. The thread or text must be the final text which will be passed to the model.
If a json_schema is provided in genconf, the token length of its string representation is used as an upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread_or_text
Union[Thread, str]
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens used.
Source code in sibila/openai.py
def token_len(self,\n thread_or_text: Union[Thread,str],\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate or estimate the token length for a Thread or a plain text string.\n In some cases where it's not possible to calculate the exact token count, \n this function should give a conservative (upper bound) estimate.\n It's up to the implementation whether to account for side information like JSON Schema,\n but it must reflect the model's context token accounting.\n Thread or text must be the final text which will be passed to the model.\n\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread_or_text: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens used.\n \"\"\"\n\n if isinstance(thread_or_text, Thread):\n thread = thread_or_text \n else:\n thread = Thread.make_IN(thread_or_text)\n\n num_tokens = 0\n\n if self.tokenizer is None: # no tokenizer was found, so we'll have to do a conservative estimate\n\n OVERHEAD_PER_MSG = 3\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n msg_tokens = len(str(message[\"content\"])) * self._token_estimation_factor + OVERHEAD_PER_MSG\n # str(message[\"content\"]): hacky way to deal with dict \"content\" key\n num_tokens += int(msg_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = len(js_str) * self._token_estimation_factor\n num_tokens += int(tools_num_tokens)\n # print(\"tools_num_tokens\", tools_num_tokens)\n\n else: # do an \"informed\" token estimation from what is known of the OpenAI model's tokenization\n\n for msg in thread.get_iter(True): # True for system message\n message = msg.as_chatml()\n # print(message)\n num_tokens += self._overhead_per_msg\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(str(value))) # str(value): hacky way to deal with dict \"content\" key\n\n # add extras + every reply is primed with <|start|>assistant<|message|>\n num_tokens += 32\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n TOOLS_TOKEN_LEN_FACTOR = 1.2\n\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * TOOLS_TOKEN_LEN_FACTOR)\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n\n return num_tokens\n
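A short sketch of estimating prompt size before calling the model (model name assumed):
from sibila import Models, Thread\n\nmodel = Models.create(\"openai:gpt-4\") # assumed model name\n\nthread = Thread.make_IN(\"How long is this prompt?\")\n\n# conservative (upper bound) estimate of the tokens this thread will use\nprint(model.token_len(thread))\n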
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.ctx_len","title":"ctx_len instance-attribute
","text":"ctx_len = ctx_len or default_ctx_len\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.maybe_image_input","title":"maybe_image_input instance-attribute
","text":"maybe_image_input = False\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.known_models","title":"known_models classmethod
","text":"known_models(api_key=None)\n
List of model names that can be used. Some of the models are not chat models and cannot be used, for example embedding models.
Parameters:
Name Type Description Default api_key
Optional[str]
If the model provider requires an API key, pass it here or set it in the respective env variable.
None
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if unable to fetch it.
Source code in sibila/schema_format_openai.py
@classmethod\ndef known_models(cls,\n api_key: Optional[str] = None) -> Union[list[str], None]:\n \"\"\"List of model names that can be used. Some of the models are not chat models and cannot be used,\n for example embedding models.\n\n Args:\n api_key: If the model provider requires an API key, pass it here or set it in the respective env variable.\n\n Returns:\n Returns a list of known models or None if unable to fetch it.\n \"\"\"\n return None\n
"},{"location":"api-reference/remote_model/#sibila.TogetherModel.desc","title":"desc","text":"desc()\n
Model description.
Source code in sibila/openai.py
def desc(self) -> str:\n \"\"\"Model description.\"\"\"\n return f\"{type(self).__name__}: '{self._model_name}'\"\n
"},{"location":"api-reference/thread/","title":"Threads and messages","text":""},{"location":"api-reference/thread/#sibila.Thread","title":"Thread","text":"Thread(t=None, inst='', join_sep='\\n')\n
A sequence of messages alternating between IN (\"user\" role) and OUT (\"assistant\" role).
Stores a special initial INST message (known as the \"system\" role in ChatML) providing instructions to the model. Some models don't use system instructions - in those cases the INST text is prepended to the first IN message.
Messages are kept in a strict IN,OUT,IN,OUT,... order. To enforce this, if two IN messages are added, the second just appends to the text of the first or to its image list.
Parameters:
Name Type Description Default t
Optional[Union[Self, list, Msg, dict, tuple, str]]
Optionally initialize from a Thread, list[Msg], list[ChatML format dict], list[tuple], list[str], Msg, ChatML format dict, tuple or str.
None
inst
str
Instructions text. If inst arg is not set and t is a Thread, its inst will be used.
''
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Raises:
Type Description TypeError
On invalid args passed.
Source code in sibila/thread.py
def __init__(self,\n t: Optional[Union[Self,list,Msg,dict,tuple,str]] = None,\n inst: str = \"\",\n join_sep: str = \"\\n\"):\n \"\"\"\n Args:\n t: Optionally initialize from a Thread, list[Msg], list[ChatML format dict], list[tuple], list[str], Msg, ChatML format dict, tuple or str.\n inst: Instructions text. If inst arg is not set and t is a Thread, its inst will be used.\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n\n Raises:\n TypeError: On invalid args passed.\n \"\"\"\n\n self._msgs = []\n self.inst = Msg.make_INST(inst)\n self.join_sep = join_sep\n\n if t is not None:\n self.concat(t)\n
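A minimal sketch of building a Thread from a list of strings, which alternate IN, OUT, IN, ... - note how a second consecutive IN message is joined to the previous one with join_sep:
from sibila import Thread\n\n# strings alternate kinds: IN, OUT, IN\nth = Thread([\"Hello!\", \"Hi, how can I help?\", \"What is a thread?\"],\n inst=\"Be brief.\")\n\n# adding another IN message joins it to the previous IN with join_sep\nth.add_IN(\"Please answer in one sentence.\")\n\nprint(th.as_chatml())\n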
"},{"location":"api-reference/thread/#sibila.Thread.inst","title":"inst instance-attribute
","text":"inst = make_INST(inst)\n
System instructions in a Msg of kind INST; defaults to empty text.
"},{"location":"api-reference/thread/#sibila.Thread.init_INST_IN","title":"init_INST_IN","text":"init_INST_IN(inst_text, in_text, in_images=None)\n
Initialize Thread with instructions and an IN message.
Parameters:
Name Type Description Default inst_text
str
Instructions text.
required in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def init_INST_IN(self,\n inst_text: str,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Initialize Thread with instructions and an IN message.\n\n Args:\n inst_text: Instructions text.\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.clear()\n self.inst.text = inst_text\n self.add_IN(in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_IN","title":"add_IN","text":"add_IN(in_text, in_images=None)\n
Appends an IN message to Thread.
Parameters:
Name Type Description Default in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add_IN(self,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an IN message to Thread.\n\n Args:\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.add(Msg.Kind.IN, in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_OUT","title":"add_OUT","text":"add_OUT(out_text, out_images=None)\n
Appends an OUT message to Thread.
Parameters:
Name Type Description Default out_text
str
Text for OUT message.
required out_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add_OUT(self,\n out_text: str,\n out_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an OUT message to Thread.\n\n Args:\n out_text: Text for OUT message.\n out_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n self.add(Msg.Kind.OUT, out_text, out_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.add_OUT_IN","title":"add_OUT_IN","text":"add_OUT_IN(\n out_text, in_text, *, out_images=None, in_images=None\n)\n
Appends an OUT message followed by an IN message.
Parameters:
Name Type Description Default out_text
str
Text for OUT message.
required in_text
str
Text for IN message.
required out_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
in_images
Optional[Union[list, str, dict]]
Optional list of IN message images.
None
Source code in sibila/thread.py
def add_OUT_IN(self,\n out_text: str,\n in_text: str,\n *,\n out_images: Optional[Union[list,str,dict]] = None,\n in_images: Optional[Union[list,str,dict]] = None):\n \"\"\"Appends an OUT message followed by an IN message.\n\n Args:\n out_text: Text for OUT message.\n in_text: Text for IN message.\n out_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n in_images: Optional list of IN message images.\n \"\"\" \n self.add(Msg.Kind.OUT, out_text, out_images)\n self.add(Msg.Kind.IN, in_text, in_images)\n
"},{"location":"api-reference/thread/#sibila.Thread.make_INST_IN","title":"make_INST_IN staticmethod
","text":"make_INST_IN(inst_text, in_text, in_images=None)\n
Return an initialized Thread with instructions and an IN message.
Parameters:
Name Type Description Default inst_text
str
Instructions text.
required in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
@staticmethod\ndef make_INST_IN(inst_text: str,\n in_text: str,\n in_images: Optional[Union[list,str,dict]] = None) -> 'Thread':\n \"\"\"Return an initialized Thread with instructions and an IN message.\n\n Args:\n inst_text: Instructions text.\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n\n thread = Thread(inst=inst_text)\n thread.add_IN(in_text, in_images)\n return thread\n
"},{"location":"api-reference/thread/#sibila.Thread.make_IN","title":"make_IN staticmethod
","text":"make_IN(in_text, in_images=None)\n
Return an initialized Thread with an IN message.
Parameters:
Name Type Description Default in_text
str
Text for IN message.
required in_images
Optional[Union[list, str, dict]]
An array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with a \"url\" key and others. If the url arg is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
@staticmethod\ndef make_IN(in_text: str,\n in_images: Optional[Union[list,str,dict]] = None) -> 'Thread':\n \"\"\"Return an initialized Thread with an IN message.\n\n Args:\n in_text: Text for IN message.\n in_images: An array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with \"url\" key and others. If url arg is not a valid URL, it will be loaded and converted to a data: URL.\n \"\"\"\n\n thread = Thread()\n thread.add_IN(in_text, in_images)\n return thread\n
"},{"location":"api-reference/thread/#sibila.Thread.clone","title":"clone","text":"clone()\n
Return a copy of current Thread.
Returns:
Type Description Self
A copy of this Thread.
Source code in sibila/thread.py
def clone(self) -> Self:\n \"\"\"Return a copy of current Thread.\n\n Returns:\n A copy of this Thread.\n \"\"\"\n return Thread(self)\n
"},{"location":"api-reference/thread/#sibila.Thread.clear","title":"clear","text":"clear(clear_inst=True)\n
Delete all messages and clear inst.
Source code in sibila/thread.py
def clear(self,\n clear_inst: bool = True):\n \"\"\"Delete all messages and clear inst.\"\"\"\n self._msgs = []\n if clear_inst:\n self.inst.text = \"\"\n
"},{"location":"api-reference/thread/#sibila.Thread.load","title":"load","text":"load(path, clear)\n
Load this Thread from a JSON file.
Parameters:
Name Type Description Default path
str
Path of file to load.
required clear
bool
Should the thread be cleared of messages, including INST? If not, loaded messages will be concatenated with the existing ones.
required Source code in sibila/thread.py
def load(self,\n path: str,\n clear: bool):\n \"\"\"Load this Thread from a JSON file.\n\n Args:\n path: Path of file to load.\n clear: Should thread be cleared of messages, including INST? If not will concatenate with existing ones.\n \"\"\"\n\n with open(path, 'r', encoding='utf-8') as f:\n js = f.read()\n state = json.loads(js)\n\n if clear:\n self.clear()\n\n th = self.from_dict(state)\n self.concat(th)\n
"},{"location":"api-reference/thread/#sibila.Thread.save","title":"save","text":"save(path)\n
Serialize this Thread to a JSON file.
Parameters:
Name Type Description Default path
str
Path of file to save into.
required Source code in sibila/thread.py
def save(self,\n path: str):\n \"\"\"Serialize this Thread to a JSON file.\n\n Args:\n path: Path of file to save into.\n \"\"\"\n\n state = self.as_dict()\n\n json_str = json.dumps(state, indent=2, default=vars)\n\n with open(path, 'w', encoding='utf-8') as f:\n f.write(json_str)\n
"},{"location":"api-reference/thread/#sibila.Thread.from_dict","title":"from_dict staticmethod
","text":"from_dict(state)\n
Deserialize a Thread from a dict.
Source code in sibila/thread.py
@staticmethod\ndef from_dict(state: dict) -> 'Thread':\n \"\"\"Deserialize a Thread from a dict.\"\"\"\n\n th = Thread()\n for dic in state[\"_msgs\"]:\n th.add(Msg.from_dict(dic))\n th.inst = Msg.from_dict(state[\"inst\"])\n th.join_sep = state[\"join_sep\"]\n\n return th\n
"},{"location":"api-reference/thread/#sibila.Thread.as_dict","title":"as_dict","text":"as_dict()\n
Serialize this Thread to a dict.
Source code in sibila/thread.py
def as_dict(self) -> dict:\n \"\"\"Serialize this Thread to a dict.\"\"\"\n\n state = {\"_msgs\": [],\n \"inst\": self.inst.as_dict(),\n \"join_sep\": self.join_sep}\n\n for msg in self._msgs:\n state[\"_msgs\"].append(msg.as_dict()) # type: ignore[attr-defined]\n\n return state\n
"},{"location":"api-reference/thread/#sibila.Thread.as_chatml","title":"as_chatml","text":"as_chatml(include_INST=True)\n
Returns Thread as a list of ChatML messages.
Returns:
Type Description list[dict]
A list of ChatML dict elements with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self,\n include_INST: bool = True) -> list[dict]:\n \"\"\"Returns Thread as a list of ChatML messages.\n\n Returns:\n A list of ChatML dict elements with \"role\" and \"content\" keys.\n \"\"\"\n msgs = []\n\n if self.inst.text and include_INST:\n msgs.append(self.inst.as_chatml())\n\n for msg in self._msgs:\n msgs.append(msg.as_chatml())\n\n return msgs\n
"},{"location":"api-reference/thread/#sibila.Thread.Trim","title":"Trim","text":"Flags for Thread trimming.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.NONE","title":"NONE class-attribute
instance-attribute
","text":"NONE = 0\n
No trimming.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 1\n
Can remove INST message.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 2\n
Can remove IN messages.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 4\n
Can remove OUT messages.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.KEEP_FIRST_IN","title":"KEEP_FIRST_IN class-attribute
instance-attribute
","text":"KEEP_FIRST_IN = 1024\n
If trimming IN messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Thread.Trim.KEEP_FIRST_OUT","title":"KEEP_FIRST_OUT class-attribute
instance-attribute
","text":"KEEP_FIRST_OUT = 2048\n
If trimming OUT messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Thread.trim","title":"trim","text":"trim(trim_flags, max_token_len, thread_token_len_fn)\n
Trim context by selectively removing older messages until thread fits max_token_len.
Parameters:
Name Type Description Default trim_flags
Trim
Flags to guide selection of which messages to remove.
required max_token_len
int
Cut messages until size is lower than this number.
required thread_token_len_fn
Callable
A function that returns token count for a passed Thread.
required Example of a thread_token_len_fn that counts 1 char = 1 token: def thread_token_len_fn(thread: Thread) -> int:\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n
Returns:
Type Description int
Trimming result: 1 = trimmed messages to fit max_token_len, 0 = no trimming was needed, -1 = unable to trim to max_token_len.
Source code in sibila/thread.py
def trim(self,\n trim_flags: Trim,\n max_token_len: int,\n thread_token_len_fn: Callable\n ) -> int:\n \"\"\"Trim context by selectively removing older messages until thread fits max_token_len.\n\n Args:\n trim_flags: Flags to guide selection of which messages to remove.\n max_token_len: Cut messages until size is lower than this number.\n thread_token_len_fn: A function that returns token count for a passed Thread.\n\n Example of a thread_token_len_fn that counts 1 char = 1 token:\n def thread_token_len_fn(thread: Thread) -> int:\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n\n Returns:\n Trimming result: 1 = trimmed messages to fit max_token_len, 0 = no trimming was needed, -1 = unable to trim to max_token_len.\n \"\"\"\n\n if trim_flags == Thread.Trim.NONE: # no trimming\n return 0\n\n thread = self.clone()\n\n any_trim = False\n\n while True:\n\n curr_len = thread_token_len_fn(thread)\n\n if curr_len <= max_token_len:\n break\n\n logger.debug(f\"len={curr_len} / max={max_token_len}\")\n\n if thread.inst.text and trim_flags & Thread.Trim.INST:\n logger.debug(f\"Cutting INST {thread.inst.text[:40]}\")\n thread.inst.text = \"\"\n any_trim = True\n continue\n\n # cut first possible message, starting from the oldest ones\n trimmed = False\n in_index = out_index = 0\n\n for index,msg in enumerate(thread):\n\n if msg.kind == Msg.Kind.IN:\n if trim_flags & Thread.Trim.IN:\n if not (trim_flags & Thread.Trim.KEEP_FIRST_IN and in_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting IN {msg.text[:40]}\")\n break\n in_index += 1\n\n elif msg.kind == Msg.Kind.OUT:\n if trim_flags & Thread.Trim.OUT: \n if not (trim_flags & Thread.Trim.KEEP_FIRST_OUT and out_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting OUT {msg.text[:40]}\")\n break\n out_index += 1\n\n if not trimmed:\n # all thread messages were cycled but not a single one could be cut, so size remains the same\n # arriving here we did all we could for trim_flags but could not remove any more\n return -1\n else:\n any_trim = True\n\n # while end\n\n\n if any_trim:\n self._msgs = thread._msgs\n self.inst = thread.inst\n\n return int(any_trim)\n
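A usage sketch, reusing the 1 char = 1 token length function from the docstring above (the max_token_len value is arbitrary):
from sibila import Thread\n\nth = Thread([\"first question\", \"first answer\", \"second question\"])\n\ndef thread_token_len_fn(thread: Thread) -> int:\n # crude count: 1 char = 1 token, as in the docstring example\n total = len(thread.inst.text)\n for msg in thread:\n total += len(msg.text)\n if msg.images:\n total += len(str(msg.images))\n return total\n\n# allow cutting IN and OUT messages, but never the first IN\nflags = Thread.Trim.IN | Thread.Trim.OUT | Thread.Trim.KEEP_FIRST_IN\n\nres = th.trim(flags, 24, thread_token_len_fn)\nprint(res, th.as_chatml()) # 1 = trimmed to fit\n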
"},{"location":"api-reference/thread/#sibila.Thread.add","title":"add","text":"add(t, text=None, images=None)\n
Add a message to Thread.
Accepts any of these argument combinations: t=Msg, ChatML format dict, tuple or str --or-- t=kind, text[, images]
Parameters:
Name Type Description Default t
Union[Msg, dict, tuple, str, Kind]
One of Msg, ChatML format dict, tuple or str, or Msg.Kind.
required text
Optional[str]
Message text, only if t=Msg.Kind.
None
images
Optional[Union[list, str, dict]]
Only used if t is a Msg.Kind or str: an array (or its first element) of either a str (a file path, which will be loaded and converted to a data: URL) or a dict with keys \"url\" and any other keys like \"detail\". If the url is not a valid URL, it will be loaded and converted to a data: URL.
None
Source code in sibila/thread.py
def add(self, \n t: Union[Msg,dict,tuple,str,Msg.Kind],\n text: Optional[str] = None,\n images: Optional[Union[list,str,dict]] = None):\n\n \"\"\"Add a message to Thread.\n\n Accepts any of these argument combinations:\n t=Msg, ChatML format dict, tuple or str\n --or--\n t=kind, text[, images]\n\n Args:\n t: One of Msg, ChatML format dict, tuple or str, or Msg.Kind.\n text: Message text, only if t=Msg.Kind.\n images: only if t=Msg.Kind or t=str-> an array (or its first element) of either an str (a file path, will be loaded and converted to a data: URL) or a dict with keys \"url\" and any other keys like \"detail\". If url arg is not a valid URL, it will be loaded and converted to a data URL.\n \"\"\"\n\n if text is not None:\n if not isinstance(t, Msg.Kind):\n raise TypeError(\"When arg 'text' is given, first arg must be of type Msg.Kind\")\n\n msg = Msg(t, text, images)\n\n else: # add from t arg\n if isinstance(t, dict): # ChatML formatted dict\n msg = Msg.from_chatml(t)\n\n\n elif isinstance(t, tuple):\n msg = Msg(self.next_kind,\n *t)\n\n elif isinstance(t, str): # simple text\n msg = Msg(self.next_kind,\n t,\n images)\n\n elif isinstance(t, Msg):\n msg = t.clone()\n\n else:\n raise TypeError(\"Arg 't' must be one of: Msg, ChatML format dict, tuple or str\")\n\n\n # now append to list\n if msg.kind == Msg.Kind.INST:\n self.inst.join_same_kind(msg, self.join_sep)\n\n else:\n if not len(self._msgs) or msg.kind == self.next_kind: # next different kind or empty\n self._msgs.append(msg)\n else: # new msg is of same kind as last existing message: join/append to it\n last = self._msgs[-1]\n last.join_same_kind(msg, self.join_sep)\n
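A sketch of the accepted argument combinations for add():
from sibila import Msg, Thread\n\nth = Thread(inst=\"Be helpful.\")\n\nth.add(Msg.Kind.IN, \"Hello!\") # kind + text\nth.add({\"role\": \"assistant\", \"content\": \"Hi there.\"}) # ChatML format dict\nth.add(\"Another user message\") # plain str: kind inferred from next_kind\n\nprint(len(th.as_chatml())) # 4: INST + IN + OUT + IN\n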
"},{"location":"api-reference/thread/#sibila.Thread.concat","title":"concat","text":"concat(t)\n
Concatenate to current Thread: another Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.
If the last message in self is of the same kind as the first in t, their text, images, etc. will be joined.
Parameters:
Name Type Description Default t
Union[Self, list, Msg, dict, tuple, str]
A Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.
required Source code in sibila/thread.py
def concat(self,\n t: Union[Self,list,Msg,dict,tuple,str]):\n \"\"\"Concatenate to current Thread: another Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.\n\n if last message in self is the same kind of first in t, their text, images, etc will be joined.\n\n Args:\n t: A Thread, list[Msg], list[ChatML format dict], list[str], Msg, ChatML format dict or str.\n \"\"\"\n if isinstance(t, Thread):\n for msg in t:\n self.add(msg)\n\n self.inst.join_same_kind(t.inst, self.join_sep)\n\n else:\n if not isinstance(t, list):\n t = [t]\n for msg in t:\n self.add(msg)\n
"},{"location":"api-reference/thread/#sibila.Thread.get_iter","title":"get_iter","text":"get_iter(include_set_inst)\n
Return an iterator that can be used to cycle over messages. If include_set_inst is True and the inst message is set, it is included before all others.
Source code in sibila/thread.py
def get_iter(self,\n include_set_inst: bool):\n \"\"\"Return an iterator that can be used to cycle over messages.\n include_set_inst: If inst message is set, include it before all others.\n \"\"\"\n class MsgIter:\n def __init__(self, \n thread: Thread,\n include_inst: bool):\n self.thread = thread\n self.curr = -1 - int(include_inst)\n\n def __iter__(self):\n return self\n\n def __next__(self):\n self.curr += 1\n if self.curr == -1:\n return self.thread.inst\n elif self.curr < len(self.thread):\n return self.thread[self.curr]\n else:\n raise StopIteration\n\n return MsgIter(self,\n include_set_inst and bool(self.inst.text))\n
"},{"location":"api-reference/thread/#sibila.Thread.has_images","title":"has_images property
","text":"has_images\n
"},{"location":"api-reference/thread/#sibila.Thread.next_kind","title":"next_kind property
","text":"next_kind\n
Get the kind of the next message that can be added to the thread.
Returns:
Type Description Kind
Kind of last message or Msg.Kind.IN if empty.
"},{"location":"api-reference/thread/#sibila.Thread.has_text_lower","title":"has_text_lower","text":"has_text_lower(text_lower)\n
Can the lowercase text be found in one of the messages?
Parameters:
Name Type Description Default text_lower
str
The lowercase text to search for in messages.
required Returns:
Type Description bool
True if such text was found.
Source code in sibila/thread.py
def has_text_lower(self,\n text_lower: str) -> bool:\n \"\"\"Can the lowercase text be found in one of the messages?\n\n Args:\n text_lower: The lowercase text to search for in messages.\n\n Returns:\n True if such text was found.\n \"\"\"\n for msg in self._msgs:\n if text_lower in msg.text.lower():\n return True\n\n return False \n
"},{"location":"api-reference/thread/#sibila.Thread.join_sep","title":"join_sep instance-attribute
","text":"join_sep = join_sep\n
Separator used when message text needs to be joined. Defaults to '\\n'
"},{"location":"api-reference/thread/#sibila.Thread.__add__","title":"__add__","text":"__add__(other)\n
Source code in sibila/thread.py
def __add__(self,\n other: Union[Self,list, Msg, dict, str]) -> Self:\n out = self.clone()\n out.concat(other)\n return out\n
"},{"location":"api-reference/thread/#sibila.Thread.__iter__","title":"__iter__","text":"__iter__()\n
Source code in sibila/thread.py
def __iter__(self):\n # Default iterator doesn't include inst message.\n return self.get_iter(False)\n
"},{"location":"api-reference/thread/#sibila.Msg","title":"Msg dataclass
","text":""},{"location":"api-reference/thread/#sibila.Msg.Kind","title":"Kind","text":"Enumeration for kinds of messages in a Thread.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 'IN'\n
Input message, from user.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 'OUT'\n
Model output message.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 'INST'\n
Initial model instructions.
"},{"location":"api-reference/thread/#sibila.Msg.Kind.as_chatml_role","title":"as_chatml_role","text":"as_chatml_role()\n
Source code in sibila/thread.py
def as_chatml_role(self: Self) -> str:\n CHATML_FROM_KIND: dict = {Msg.Kind.IN: \"user\", Msg.Kind.OUT: \"assistant\", Msg.Kind.INST: \"system\"}\n return CHATML_FROM_KIND.get(self) # type: ignore[return-value]\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.from_chatml_role","title":"from_chatml_role staticmethod
","text":"from_chatml_role(role)\n
Source code in sibila/thread.py
@staticmethod\ndef from_chatml_role(role: str) -> 'Msg.Kind':\n KIND_FROM_CHATML: dict = {\"user\": Msg.Kind.IN, \"assistant\": Msg.Kind.OUT, \"system\": Msg.Kind.INST}\n kind = KIND_FROM_CHATML.get(role)\n if kind is None:\n raise ValueError(f\"Unknown ChatML role '{role}'\")\n else:\n return kind\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.flip","title":"flip staticmethod
","text":"flip(kind)\n
Source code in sibila/thread.py
@staticmethod\ndef flip(kind: 'Msg.Kind') -> 'Msg.Kind':\n return Msg.Kind.OUT if kind is Msg.Kind.IN else Msg.Kind.IN\n
"},{"location":"api-reference/thread/#sibila.Msg.Kind.__repr__","title":"__repr__","text":"__repr__()\n
Source code in sibila/thread.py
def __repr__(self):\n return repr(self.value)\n
"},{"location":"api-reference/thread/#sibila.Msg.kind","title":"kind instance-attribute
","text":"kind\n
Message kind.
"},{"location":"api-reference/thread/#sibila.Msg.text","title":"text instance-attribute
","text":"text\n
Message text (mandatory).
"},{"location":"api-reference/thread/#sibila.Msg.images","title":"images class-attribute
instance-attribute
","text":"images = None\n
List of images in the message. An entry must have a 'url' key, but any other keys can be added. The 'url' value must be a remote URL (https, http) or a 'data:' base64-encoded URL.
"},{"location":"api-reference/thread/#sibila.Msg.make_IN","title":"make_IN staticmethod
","text":"make_IN(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_IN(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.IN,\n text,\n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.make_OUT","title":"make_OUT staticmethod
","text":"make_OUT(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_OUT(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.OUT, \n text, \n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.make_INST","title":"make_INST staticmethod
","text":"make_INST(text, images=None)\n
Source code in sibila/thread.py
@staticmethod\ndef make_INST(text: str,\n images: Optional[Union[list,str,dict]] = None) -> 'Msg':\n return Msg(Msg.Kind.INST, \n text,\n images)\n
"},{"location":"api-reference/thread/#sibila.Msg.clone","title":"clone","text":"clone()\n
Source code in sibila/thread.py
def clone(self) -> 'Msg':\n return Msg(self.kind, self.text, self.images)\n
"},{"location":"api-reference/thread/#sibila.Msg.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/thread.py
@staticmethod\ndef from_dict(dic: dict) -> 'Msg':\n return Msg(kind=Msg.Kind(dic[\"kind\"]),\n text=dic[\"text\"],\n images=dic[\"images\"])\n
"},{"location":"api-reference/thread/#sibila.Msg.as_dict","title":"as_dict","text":"as_dict()\n
Return Msg as a dict.
Source code in sibila/thread.py
def as_dict(self) -> dict:\n \"\"\"Return Msg as a dict.\"\"\"\n return {\"kind\": self.kind.value, # kind as string\n \"text\": self.text,\n \"images\": self.images}\n
"},{"location":"api-reference/thread/#sibila.Msg.from_chatml","title":"from_chatml staticmethod
","text":"from_chatml(dic, join_sep='\\n')\n
Source code in sibila/thread.py
@staticmethod\ndef from_chatml(dic: dict,\n join_sep:str = \"\\n\") -> 'Msg':\n\n role = dic.get(\"role\")\n if role is None:\n raise ValueError(f\"Key 'role' not found in {dic}\")\n\n kind = Msg.Kind.from_chatml_role(role)\n\n content = dic.get(\"content\")\n if content is None:\n raise ValueError(f\"Bad 'content' key in {dic}\")\n\n text = ''\n images = []\n if isinstance(content, list):\n for cont in content:\n if not isinstance(cont, dict) or \"type\" not in cont:\n raise TypeError(f\"ChatML list entries must be of type dict and include a 'type' key in {cont}\")\n\n if cont[\"type\"] == \"text\":\n text = join_text(text, cont[\"text\"], join_sep)\n\n elif cont[\"type\"] == \"image_url\":\n image = cont[\"image_url\"]\n if \"url\" not in image:\n raise TypeError(f\"ChatML image_url entries must include a 'url' key in {cont}\")\n images.append(image)\n\n elif isinstance(content, str):\n text = content\n\n else:\n raise TypeError(f\"ChatML content must have str or dict type in {content}\")\n\n return Msg(kind, \n text,\n images if images else None)\n
"},{"location":"api-reference/thread/#sibila.Msg.as_chatml","title":"as_chatml","text":"as_chatml()\n
Returns message in a ChatML dict.
Returns:
Type Description dict
A ChatML dict with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self) -> dict:\n \"\"\"Returns message in a ChatML dict.\n\n Returns:\n A ChatML dict with \"role\" and \"content\" keys.\n \"\"\"\n\n role = self.kind.as_chatml_role()\n\n if self.images:\n chatml_msg = {\n \"role\": role, \n \"content\": [\n {\"type\": \"text\", \"text\": self.text},\n ]}\n\n for image in self.images:\n if \"url\" not in image:\n raise ValueError(f\"Image without 'url' key at {image}\")\n\n image_url = {\"url\": image[\"url\"]}\n if \"detail\" in image:\n image_url[\"detail\"] = image[\"detail\"]\n\n chatml_msg[\"content\"].append( # type: ignore[attr-defined]\n {\"type\": \"image_url\", \"image_url\": image_url}\n )\n return chatml_msg\n else:\n return {\"role\": role, \"content\": self.text}\n
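A sketch of building a message with an image entry and converting it to ChatML (the URL is a placeholder):
from sibila import Msg\n\n# image entries must have a 'url' key; extra keys like 'detail' are kept\nmsg = Msg.make_IN(\"What's in this picture?\",\n [{\"url\": \"https://example.com/photo.jpg\", \"detail\": \"low\"}])\n\nprint(msg.as_chatml())\n# {'role': 'user', 'content': [{'type': 'text', ...}, {'type': 'image_url', ...}]}\n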
"},{"location":"api-reference/tokenizer/","title":"Model tokenizers","text":""},{"location":"api-reference/tokenizer/#llamacpp","title":"LlamaCpp","text":""},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer","title":"LlamaCppTokenizer","text":"LlamaCppTokenizer(llama)\n
Tokenizer for llama.cpp loaded GGUF models.
Source code in sibila/llamacpp.py
def __init__(self, \n llama: Llama):\n self._llama = llama\n\n self.vocab_size = self._llama.n_vocab()\n\n self.bos_token_id = self._llama.token_bos()\n self.bos_token = llama_token_get_text(self._llama.model, self.bos_token_id).decode(\"utf-8\")\n\n self.eos_token_id = self._llama.token_eos()\n self.eos_token = llama_token_get_text(self._llama.model, self.eos_token_id).decode(\"utf-8\")\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None # ? fill by taking a look at id 0?\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/llamacpp.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n\n # str -> bytes\n btext = text.encode(\"utf-8\", errors=\"ignore\")\n\n return self._llama.tokenize(btext, add_bos=False, special=True)\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Used instead of llama-cpp-python's decode to fix an error: the first character after a BOS token is removed only if it's a space.
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/llamacpp.py
def decode(self,\n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Used instead of llama-cpp-python's decode to fix an error: remove the first character after a BOS token only if it's a space.\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n\n if not len(token_ids):\n return \"\"\n\n output = b\"\"\n size = 32\n buffer = (ctypes.c_char * size)()\n\n if not skip_special:\n special_toks = {self.bos_token_id: self.bos_token.encode(\"utf-8\"), # type: ignore[union-attr]\n self.eos_token_id: self.eos_token.encode(\"utf-8\")} # type: ignore[union-attr]\n\n for token in token_ids:\n if token == self.bos_token_id:\n output += special_toks[token]\n elif token == self.eos_token_id:\n output += special_toks[token]\n else:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n else: # skip special\n for token in token_ids:\n if token != self.bos_token_id and token != self.eos_token_id:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n\n # \"User code is responsible for removing the leading whitespace of the first non-BOS token when decoding multiple tokens.\"\n if (# token_ids[0] != self.bos_token_id and # we also try cutting if first is bos to approximate HF tokenizer\n len(output) and output[0] <= 32 # 32 = ord(' ')\n ):\n output = output[1:]\n\n return output.decode(\"utf-8\", errors=\"ignore\")\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
Name Type Description Default text
str
Text to be measured.
required Returns:
Type Description int
Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
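A round-trip sketch, assuming a llama.cpp model is configured under the name used below:
from sibila import Models\n\nmodel = Models.create(\"llamacpp:openchat\") # assumes this model is configured\n\ntok = model.tokenizer\n\nids = tok.encode(\"Hello world\")\nprint(ids) # token ids\nprint(tok.decode(ids)) # round-trip back to text\nprint(tok.token_len(\"Hello world\")) # same as len(ids)\n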
"},{"location":"api-reference/tokenizer/#openai","title":"OpenAI","text":""},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer","title":"OpenAITokenizer","text":"OpenAITokenizer(model)\n
Tokenizer for OpenAI models.
Source code in sibila/openai.py
def __init__(self, \n model: str\n ):\n\n if not has_tiktoken:\n raise Exception(\"Please install tiktoken by running: pip install tiktoken\")\n\n self._tok = tiktoken.encoding_for_model(model)\n\n self.vocab_size = self._tok.n_vocab\n\n self.bos_token_id = None\n self.bos_token = None\n\n self.eos_token_id = None\n self.eos_token = None\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
- text (str): Text to be encoded. Required.

Returns:
- list[int]: A list of ints with the encoded tokens.
Source code in sibila/openai.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n return self._tok.encode(text)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Parameters:
- token_ids (list[int]): List of model tokens. Required.
- skip_special (bool): Don't decode special tokens like bos and eos. Defaults to True.

Returns:
- str: Decoded text.
Source code in sibila/openai.py
def decode(self, \n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n assert skip_special, \"OpenAITokenizer only supports skip_special=True\"\n\n return self._tok.decode(token_ids)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
- text (str): Text to be measured. Required.

Returns:
- int: Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
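As a quick sanity check of how encode(), decode() and token_len() fit together, here's a minimal round-trip sketch (assuming tiktoken is installed; the sample text is arbitrary):
from sibila import OpenAITokenizer\n\ntok = OpenAITokenizer(\"gpt-4\")\n\nids = tok.encode(\"Hello there!\") # text -> token ids\nassert tok.token_len(\"Hello there!\") == len(ids) # token_len() is just len(encode())\nprint(tok.decode(ids)) # token ids -> text: \"Hello there!\"\n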
"},{"location":"api-reference/tools/","title":"Tools","text":""},{"location":"api-reference/tools/#sibila.tools","title":"tools","text":"Tools for model interaction, summarization, etc.
- interact(): Interact with model as in a chat, using input().
- loop(): Iteratively append inputs and generate model outputs.
- recursive_summarize(): Recursively summarize a (large) text or text file.
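For instance, a minimal chat session with interact() could look like this (a sketch; \"llamacpp:openchat\" is an assumed entry in the Models factory):
from sibila import Models\nfrom sibila.tools import interact\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\") # assumed model entry\n\n# opens an input() loop; type !? for commands, empty line + enter to quit\nth = interact(model, inst_text=\"Be helpful and concise.\")\n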
"},{"location":"api-reference/tools/#sibila.tools.interact","title":"interact","text":"interact(\n model,\n *,\n th=None,\n inst_text=None,\n trim_flags=TRIM_DEFAULT,\n genconf=None,\n max_tokens_default=-20\n)\n
Interact with model as in a chat, using input().
Includes a list of commands: type !? to see help.
Parameters:
- model (Model): Model to use for generating. Required.
- th (Optional[Thread]): Optional input Thread. Defaults to None.
- inst_text (Optional[str]): Text for Thread instructions. Defaults to None.
- trim_flags (Trim): Thread trimming flags, used when the Thread is too long. Defaults to TRIM_DEFAULT.
- genconf (Optional[GenConf]): Model generation configuration. Defaults to None, which uses the model's genconf.
- max_tokens_default (int): Used if a non-zero genconf.max_tokens is not found. Defaults to -20.

Returns:
- Thread: Thread after all the interactions.
Source code in sibila/tools.py
def interact(model: Model,\n *,\n th: Optional[Thread] = None,\n inst_text: Optional[str] = None,\n trim_flags: Thread.Trim = TRIM_DEFAULT,\n\n genconf: Optional[GenConf] = None,\n max_tokens_default: int = -20\n ) -> Thread:\n \"\"\"Interact with model as in a chat, using input().\n\n Includes a list of commands: type !? to see help.\n\n Args:\n model: Model to use for generating.\n th: Optional input Thread. Defaults to None.\n inst_text: text for Thread instructions. Defaults to None.\n trim_flags: Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n genconf: Model generation configuration. Defaults to None, which uses to model's genconf.\n max_tokens_default: Used if a non-zero genconf.max_tokens is not found.\n\n Returns:\n Thread after all the interactions.\n \"\"\"\n\n def callback(out: Union[GenOut,None], \n th: Thread, \n model: Model,\n genconf: GenConf,\n max_token_len: int) -> bool:\n\n if out is not None:\n if out.res != GenRes.OK_STOP:\n print(f\"***Result={GenRes.as_text(out.res)}***\")\n\n if out.text:\n text = out.text\n else:\n text = \"***No text out***\"\n\n th.add_OUT(text)\n print(text)\n print()\n\n\n def print_thread_info():\n length = model.token_len(th, genconf)\n print(f\"Thread token len={length}, max len before next gen={max_token_len}\")\n\n\n\n # input loop ===============================================\n MARKER: str = '\"\"\"'\n multiline: str = \"\"\n\n while True:\n\n user = input('>').strip()\n\n if multiline:\n if user.endswith(MARKER):\n user = multiline + \"\\n\" + user[:-3]\n multiline = \"\"\n else:\n multiline += \"\\n\" + user\n continue\n\n else:\n if not user:\n return False # terminate loop\n\n elif user.startswith(MARKER):\n multiline = user[3:]\n continue\n\n elif user.endswith(\"\\\\\"):\n user = user[:-1]\n user = user.replace(\"\\\\n\", \"\\n\")\n th.add_IN(user)\n continue\n\n elif user.startswith(\"!\"): # a command\n params = user[1:].split(\"=\")\n cmd = params[0]\n params = params[1:]\n\n if cmd == \"inst\":\n th.clear()\n if params:\n text = params[0].replace(\"\\\\n\", \"\\n\")\n th.inst.text = text\n\n elif cmd == \"add\" or cmd == \"a\":\n if params:\n try:\n path = params[0]\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n th.add_IN(text)\n print(text[:500])\n except FileNotFoundError:\n print(f\"Could not load '{path}'\")\n else:\n print(\"Path needed\")\n\n elif cmd == 'c':\n print_thread_info()\n print(th)\n\n elif cmd == 'cl':\n if not params:\n params.append(\"thread.json\")\n try:\n th.load(params[0], \n clear=True)\n print(f\"Loaded context from {params[0]}\")\n except FileNotFoundError:\n print(f\"Could not load '{params[0]}'\")\n\n elif cmd == 'cs':\n if not params:\n params.append(\"thread.json\")\n th.save(params[0])\n print(f\"Saved context to {params[0]}\")\n\n elif cmd == 'image':\n if not params:\n print(\"No image given, using a remote photo of two cats\")\n params.append(\"https://upload.wikimedia.org/wikipedia/commons/thumb/3/3b/Singapura_Cats.jpg/320px-Singapura_Cats.jpg\")\n try:\n th.add_IN(\"\", params[0])\n print(f\"Added image '{params[0]}'.\\nPlease enter your question:\")\n except FileNotFoundError:\n print(f\"Could not local image '{params[0]}'\")\n\n elif cmd == 'tl':\n print_thread_info()\n\n elif cmd == 'info':\n print(f\"Model:\\n{model.info()}\")\n print(f\"GenConf:\\n{genconf}\\n\")\n\n print_thread_info()\n\n\n else:\n print(f\"Unknown command '!{cmd}' - known commands:\\n\"\n \" !inst[=text] - clear messages and add inst (system) message\\n\"\n \" 
!add=path - load file and add to last msg\\n\"\n \" !image=path/url - include a local or remote image. Local images must fit the context!\\n\"\n \" !c - list context msgs\\n\"\n \" !cl=path - load context (default=thread.json)\\n\"\n \" !cs=path - save context (default=thread.json)\\n\"\n \" !tl - thread's token length\\n\"\n \" !info - model and genconf info\\n\"\n ' Delimit with \"\"\" for multiline begin/end or terminate line with \\\\ to continue into a new line\\n'\n \" Empty line + enter to quit\"\n )\n # \" !p - show formatted prompt (if model supports it)\\n\"\n # \" !to - prompt's tokens\\n\"\n\n print()\n continue\n\n # we have a user prompt\n user = user.replace(\"\\\\n\", \"\\n\")\n break\n\n\n th.add_IN(user)\n\n return True # continue looping\n\n\n\n if genconf is None:\n genconf = model.genconf\n\n if genconf.max_tokens == 0:\n genconf = genconf(max_tokens=max_tokens_default)\n\n # start prompt loop\n th = loop(callback,\n model,\n\n thread=th,\n inst_text=inst_text,\n in_text=None, # call callback for first prompt\n trim_flags=trim_flags,\n genconf=genconf)\n\n return th\n
"},{"location":"api-reference/tools/#sibila.tools.loop","title":"loop","text":"loop(\n callback,\n model,\n *,\n inst_text=None,\n in_text=None,\n trim_flags=TRIM_DEFAULT,\n max_token_len=None,\n thread=None,\n genconf=None\n)\n
Iteratively append inputs and generate model outputs.
The callback should call add_OUT() / add_IN() on the passed Thread and return a bool indicating whether to continue looping.
If the last Thread message is not Msg.Kind.IN, callback() will be called with out=None.
Parameters:
- callback (Callable[[Union[GenOut, None], Thread, Model, GenConf, int], bool]): A function(out, thread, model, genconf, max_token_len) that will be iteratively called with the model's output. Required.
- model (Model): Model to use for generating. Required.
- inst_text (Optional[str]): Text for Thread instructions. Defaults to None.
- in_text (Optional[str]): Text for Thread's initial Msg.Kind.IN. Defaults to None.
- trim_flags (Trim): Thread trimming flags, used when the Thread is too long. Defaults to TRIM_DEFAULT.
- max_token_len (Optional[int]): Maximum token count to use when trimming. Defaults to None.
- thread (Optional[Thread]): Optional input Thread. Defaults to None.
- genconf (Optional[GenConf]): Model generation configuration. Defaults to None, which uses the model's genconf.
Source code in sibila/tools.py
def loop(callback: Callable[[Union[GenOut,None], Thread, Model, GenConf, int], bool],\n model: Model,\n *,\n inst_text: Optional[str] = None,\n in_text: Optional[str] = None,\n\n trim_flags: Thread.Trim = TRIM_DEFAULT,\n max_token_len: Optional[int] = None,\n thread: Optional[Thread] = None,\n\n genconf: Optional[GenConf] = None,\n ) -> Thread:\n \"\"\"Iteratively append inputs and generate model outputs.\n\n Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.\n\n If last Thread msg is not Msg.Kind.IN, callback() will be called with out_text=None.\n\n Args:\n callback: A function(out, ctx, model) that will be iteratively called with model's output.\n model: Model to use for generating.\n inst_text: text for Thread instructions. Defaults to None.\n in_text: Text for Thread's initial Msg.Kind.IN. Defaults to None.\n trim_flags: Thread trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n max_token_len: Maximum token count to use when trimming. Defaults to None.\n thread: Optional input Thread. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses to model's genconf.\n \"\"\"\n\n if thread is None:\n thread = Thread()\n else:\n thread = thread\n\n if inst_text is not None:\n thread.inst.text = inst_text\n if in_text is not None:\n thread.add_IN(in_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if max_token_len is None:\n resolved_max_tokens = genconf.resolve_max_tokens(model.ctx_len, model.max_tokens_limit)\n max_token_len = model.ctx_len - resolved_max_tokens\n if max_token_len == 0:\n raise ValueError(\"Unable to calc max_token_len: either pass the value to this function or set GenConf.max_tokens to a non-zero value\")\n\n while True:\n\n if len(thread) and thread[-1].kind == Msg.Kind.IN:\n # last is an IN message: we can trim and generate\n\n thread.trim(trim_flags,\n max_token_len,\n model.token_len_lambda)\n\n out = model.gen(thread, genconf)\n else:\n out = None # first call\n\n res = callback(out, \n thread, \n model,\n genconf,\n max_token_len)\n\n if not res:\n break\n\n\n return thread\n
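To make the callback contract concrete, here's a minimal sketch of a loop() callback (the model entry and import paths are assumptions):
from typing import Union\n\nfrom sibila import Models, Model, Thread, GenConf, GenOut\nfrom sibila.tools import loop\n\nmodel = Models.create(\"llamacpp:openchat\") # assumed model entry\n\ndef callback(out: Union[GenOut, None],\n th: Thread,\n model: Model,\n genconf: GenConf,\n max_token_len: int) -> bool:\n if out is not None: # None on the first call, before any generation\n th.add_OUT(out.text) # append the model's answer to the thread\n print(out.text)\n user = input(\">\")\n if not user:\n return False # empty input: stop looping\n th.add_IN(user) # queue the next input message\n return True # generate again\n\nth = loop(callback, model, inst_text=\"Be helpful.\")\n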
"},{"location":"api-reference/tools/#sibila.tools.recursive_summarize","title":"recursive_summarize","text":"recursive_summarize(\n model,\n text=None,\n path=None,\n overlap_size=20,\n max_token_len=None,\n genconf=None,\n)\n
Recursively summarize a large text or text file, to fit in a Thread context.
Works by:
1. Breaking the text into chunks that fit the model's context.
2. Running the model to summarize each chunk.
3. Joining the generated summaries and repeating from step 1, until the text size no longer decreases.
Parameters:
- model (Model): Model to use for summarizing. Required.
- text (Optional[str]): Initial text. Defaults to None.
- path (Optional[str]): Or a path to a UTF-8 text file. Defaults to None.
- overlap_size (int): Size in model tokens of the overlapping portions at the beginning and end of chunks. Defaults to 20.

Returns:
- str: The summarized text.
Source code in sibila/tools.py
def recursive_summarize(model: Model,\n text: Optional[str] = None,\n path: Optional[str] = None,\n overlap_size: int = 20,\n max_token_len: Optional[int] = None,\n genconf: Optional[GenConf] = None) -> str:\n\n \"\"\"Recursively summarize a large text or text file, to fit in a Thread context.\n\n Works by:\n\n 1. Breaking text into chunks that fit models context.\n 2. Run model to summarize chunks.\n 3. Join generated summaries and jump to 1. - do this until text size no longer decreases.\n\n Args:\n model: Model to use for summarizing.\n text: Initial text.\n path: --Or-- A path to an UTF-8 text file.\n overlap_size: Size in model tokens of the overlapping portions at beginning and end of chunks.\n\n Returns:\n The summarized text.\n \"\"\"\n\n if (text is not None) + (path is not None) != 1:\n raise ValueError(\"Only one of text or path can be given\")\n\n if path is not None:\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n\n inst_text = \"\"\"Your task is to do short summaries of text.\"\"\"\n in_text = \"Summarize the following text:\\n\"\n th = Thread(inst=inst_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if max_token_len is None:\n if model.genconf.max_tokens == 0:\n raise ValueError(\"Unable to calc max_token_len: make sure genconf.max_tokens is not zero\")\n\n resolved_max_tokens = genconf.resolve_max_tokens(model.ctx_len, model.max_tokens_limit)\n\n thread = Thread.make_INST_IN(inst_text, in_text)\n token_len = model.token_len(thread)\n max_token_len = model.ctx_len - resolved_max_tokens - (token_len + 16) \n\n\n # split initial text\n logger.debug(f\"Max token len {max_token_len}\")\n\n token_len_fn = model.token_len_lambda\n logger.debug(f\"Initial text token_len {token_len_fn(text)}\") # type: ignore[arg-type,call-arg]\n\n spl = RecursiveTextSplitter(max_token_len, overlap_size, len_fn=token_len_fn) # type: ignore[arg-type]\n\n round = 0\n while True: # summarization rounds\n logger.debug(f\"Round {round} {'='*60}\")\n\n in_list = spl.split(text=text)\n in_len = sum([len(t) for t in in_list])\n\n logger.debug(f\"Split in {len(in_list)} parts, total len {in_len} chars\")\n\n out_list = []\n for i,t in enumerate(in_list):\n\n logger.debug(f\"{round}>{i} {'='*30}\")\n\n th.clear(clear_inst=False)\n th.add_IN(in_text)\n th.add_IN(t)\n\n out = model.gen(th) \n logger.debug(out)\n\n out_list.append(out.text)\n\n text = \"\\n\".join(out_list)\n\n out_len = len(text) # sum([len(t) for t in out_list])\n if out_len >= in_len:\n break\n elif len(out_list) == 1:\n break\n else:\n round += 1\n\n return text\n
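A minimal usage sketch (the model entry and file name are assumptions):
from sibila import Models\nfrom sibila.tools import recursive_summarize\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\") # assumed model entry\n\n# recursively summarize a large UTF-8 text file\nsummary = recursive_summarize(model, path=\"big_report.txt\") # hypothetical file\nprint(summary)\n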
"},{"location":"examples/","title":"Examples","text":"Example Description Hello model Introductory pirate arrr-example: create local or remote models, use the Models class to simplify. From text to object Keypoint extractor, showing progressively better ways to query a model, from plain text, JSON, to Pydantic classes. Extract information Extract information about all persons mentioned in a text. Also available in a dataclass version. Tag customer queries Summarize and classify customer queries into tags. Quick meeting Extracting participants, action items and priorities from a simple meeting transcript. Tough meeting Extracting information from a long and complex transcript. Compare model output Compare sentiment analyses of customer reviews done by two models. Chat interaction Interact with the model as in a back-and-forth chat session. Receipt Extract structured data from a photo of a receipt. Simple tools Have the model decide and use tools. Async Run tasks in parallel with the *_async methods. Model management with CLI Download and manage models with the command-line sibila. Each example is explained in a Read Me and usually include a Jupyter notebook and/or a .py script version.
Most of the examples use a local model but you can quickly change to using OpenAI models by uncommenting one or two lines.
"},{"location":"examples/async/","title":"Async","text":"In this example we'll look at how to do multiple parallel requests to remote models by using Python's asyncio capabilities.
Generating from local llama.cpp models does not benefit from async functionality: the local model is already loaded in memory, so there is no asynchronous IO to overlap. When the async methods are used with a LlamaCppModel, inference ends up running sequentially.
So we'll be using a remote OpenAI model. Make sure you have defined the env variable OPENAI_API_KEY with a valid token.
This example is available as a Jupyter notebook or a Python script in this folder.
As usual, let's start by creating the model:
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nimport time, asyncio\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\")\n\n# convenience time-counting functions:\nstart_time = None\ndef start_secs():\n global start_time\n start_time = time.time()\ndef secs(): \n return f\"{time.time() - start_time:.1f}\"\n
We'll create two tasks that will run in parallel:
1. Ask the model to generate 20 names.
2. Classify a phrase as spam.
This example is running in a Jupyter notebook, so we can directly call the functions with an await. In a Python script we'd use asyncio.run() instead, as sketched below.
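For reference, a script version would drive the event loop explicitly, a sketch:
import asyncio\n\n# in a .py script there is no top-level await, so wrap the coroutine:\nnames = asyncio.run(extract_names()) # extract_names() is defined just below\nprint(names)\n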
Note that we're using the _async suffix methods: extract_async() and classify_async(), instead of the normal functions.
The first task, generate 20 names:
async def extract_names(): \n print(\"extract_names begin...\", secs())\n\n names = await model.extract_async(list[str],\n \"Generate 20 English names with first name and surname\")\n\n print(\"...extract_names done\", secs())\n\n return names\n\nstart_secs()\nawait extract_names()\n
extract_names begin... 0.0\n...extract_names done 4.4\n\n\n\n\n\n['James Smith',\n 'Michael Johnson',\n 'Robert Williams',\n 'Maria Garcia',\n 'David Jones',\n 'Jennifer Miller',\n 'John Davis',\n 'Patricia Wilson',\n 'Daniel Anderson',\n 'Elizabeth Taylor',\n 'William Brown',\n 'Barbara Moore',\n 'Joseph Thompson',\n 'Susan Martinez',\n 'Charles Jackson',\n 'Linda Harris',\n 'Thomas Clark',\n 'Jessica Lewis',\n 'Christopher Walker',\n 'Sarah Robinson']\n
The second task will classify a phrase as \"spam\"/\"not spam\":
async def classify_spam():\n print(\"classify_spam begin...\", secs())\n\n classification = await model.classify_async([\"spam\", \"not spam\"],\n \"I am a Nigerian prince and will make you very rich!\")\n\n print(\"...classify_spam done\", secs())\n\n return classification\n\nstart_secs()\nawait classify_spam()\n
classify_spam begin... 0.0\n...classify_spam done 1.4\n\n\n\n\n\n'spam'\n
Let's use asyncio.as_completed() to receive each task's output as soon as it's ready:
async def run_tasks():\n print(\"as_complete begin---\", secs())\n\n tasks = [extract_names(), classify_spam()]\n for task in asyncio.as_completed(tasks):\n res = await task\n print(\"Result:\", res)\n\n print(\"---as_complete done\", secs())\n\nstart_secs()\nawait run_tasks()\n
as_complete begin--- 0.0\nextract_names begin... 0.0\nclassify_spam begin... 0.0\n...classify_spam done 1.0\nResult: spam\n...extract_names done 5.8\nResult: ['James Smith', 'Emma Johnson', 'Olivia Williams', 'Liam Brown', 'Ava Jones', 'Noah Garcia', 'Sophia Miller', 'Mason Davis', 'Isabella Rodriguez', 'Ethan Martinez', 'Mia Hernandez', 'Logan Wilson', 'Charlotte Anderson', 'Aiden Moore', 'Harper Thomas', 'Lucas Jackson', 'Ella White', 'Benjamin Taylor', 'Amelia Harris', 'Alexander Clark']\n---as_complete done 5.8\n
Follow the begin/done print statements above and the times (in seconds) at which they were printed.
Both tasks were started at the same time, and classify_spam() terminated first (at the 1.0s mark), because it's a shorter task that simply outputs \"spam\"/\"not spam\".
Meanwhile, the model worked on generating the 20 names we requested with extract_names(), a longer operation which terminates later (at the 5.8s mark).
In the same manner, any other tasks can be run in parallel using the *_async() methods of the model classes.
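If instead the results are wanted all at once and in argument order, asyncio.gather() is an alternative to as_completed() (a sketch, reusing the task functions defined above):
async def run_tasks_gathered():\n # gather() awaits all tasks, returning results in argument order\n names, classification = await asyncio.gather(extract_names(), classify_spam())\n print(names)\n print(classification)\n\nstart_secs()\nawait run_tasks_gathered() # or in a script: asyncio.run(run_tasks_gathered())\n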
"},{"location":"examples/cli/","title":"Sibila CLI","text":"In this example we'll see how to use the sibila Command-Line Interface (CLI) to download a GGUF model from the Hugging Face model hub.
We'll then register it in the Models factory, so that it can be easily used with Models.create(). The Models factory is based on a folder where GGUF model files are stored, along with two configuration files: \"models.json\" and \"formats.json\".
After doing the above, we'll be able to use this model in Python with two lines:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:rocket\")\n
Let's run sibila CLI to get help:
> sibila --help\n\nusage: sibila [-h] [--version] {models,formats,hub} ...\n\nSibila CLI tool for managing models and formats.\n\noptions:\n -h, --help show this help message and exit\n --version show program's version number and exit\n\nactions:\n Use 'models' and 'formats' to manage, 'hub' to search and download models.\n\n {models,formats,hub} Run 'sibila {command} --help' for specific help.\n\nFor help, check https://jndiogo.github.io/sibila/\n
Sibila CLI has three modes:
- models: to edit a 'models.json' file, create model entries, set their format, etc.
- formats: to edit a 'formats.json' file, add new formats, etc.
- hub: search and download models from Hugging Face model hub.
Specific help for each mode is available by running: sibila mode --help
Let's download the Rocket 3B model, a small but capable model, fine-tuned for chat/instruct prompts:
https://huggingface.co/TheBloke/rocket-3B-GGUF
We'll use a \"sibila hub -d\" command to download to \"../../models\" folder. We'll get the 4-bit quantization (Q4_K_M):
> sibila hub -d 'TheBloke/rocket-3B-GGUF' -f Q4_K_M -m '../../models'\n\nSearching...\nDownloading model 'TheBloke/rocket-3B-GGUF' file 'rocket-3b.Q4_K_M.gguf' to '../../models/rocket-3b.Q4_K_M.gguf'\n\nDownload complete.\nFor information about this and other models, please visit https://huggingface.co\n
After this command, the \"rocket-3b.Q4_K_M.gguf\" file has now been downloaded to the \"../../models\" folder.
We'll now register it with the Models factory, which is located in the folder where we downloaded the model.
This can be done by editing the \"models.json\" file directly or even simpler, with a \"sibila models -s\" command:
> sibila models -s llamacpp:rocket rocket-3b.Q4_K_M.gguf -m '../../models'\n\nUsing models directory '../../models'\nSet model 'llamacpp:rocket' with name='rocket-3b.Q4_K_M.gguf' at '/home/jorge/ai/sibila/models/models.json'.\n
An entry has now been created in \"models.json\" for this model.
However, we did not set the chat template format. Before doing so, let's test whether the downloaded GGUF file already includes it in its metadata.
This is done with \"sibila models -t\":
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nError: Could not find a suitable chat template format for this model. Without a format, fine-tuned models cannot function properly. See the docs on how you can fix this: either setup the format in Models factory, or provide the chat template in the 'format' arg.\n
Error. Looks like we need to set the chat template format!
Checking the model's page, we find that it uses the ChatML prompt/chat template, which is great because it's one of the base formats included with Sibila.
So let's set the template format in the \"llamacpp:rocket\" entry we've just created:
> sibila models -f llamacpp:rocket chatml -m '../../models'\n\nUsing models directory '/home/jorge/ai/sibila/models'\nUpdated model 'llamacpp:rocket' with format 'chatml' at '/home/jorge/ai/sibila/models/models.json'.\n
Let's now test again:
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nModel 'llamacpp:rocket' was properly created and should run fine.\n
Great - the model passed the test and should be ready for use.
Let's try using it from Python:
from sibila import Models\n\nModels.setup(\"../../models\") # the folder with models and configs\n\nmodel = Models.create(\"llamacpp:rocket\") # model name in provider:name format\n\nmodel(\"Hello there!\")\n
\"Hello! I am an AI language model and I'm here to assist you with your questions or provide information if you have any. I am programmed to be polite and respectful, so please let me know how I can help you today.\"\n
Seems to be working - and politely too!
"},{"location":"examples/compare/","title":"Compare","text":"In this example we'll use an utility function from the multigen module that builds a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
This function generates a 2-D table of [ input, model ], where each row holds the outputs from the different models for the same question or input. Such a table can be printed or saved as a CSV file.
For the local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the local_name variable below, after the text \"llamacpp:\".
Jupyter notebook and Python script versions are available in the example's folder.
Instead of directly creating models as we've seen in previous examples, multigen will create the models via the Models class factory.
We'll start by choosing a local and a remote model that we'll compare.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nlocal_name = \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\n# to use an OpenAI model:\nremote_name = \"openai:gpt-4\"\n
Now let's define a list of reviews that we'll ask the two models to do sentiment analysis upon.
These are generic product reviews, that you could find in an online store.
reviews = [\n\"The user manual was confusing, but once I figured it out, the product more or less worked.\",\n\"This widget changed my life! It's sleek, efficient, and worth every penny.\",\n\"I'm disappointed with the product quality. It broke after just a week of use.\",\n\"The customer service team was incredibly helpful in resolving my issue with the device.\",\n\"I'm blown away by the functionality of this gadget. It exceeded my expectations.\",\n\"The packaging was damaged upon arrival, but the product itself works great.\",\n\"I've been using this tool for months, and it's still as good as new. Highly recommended!\",\n\"I regret purchasing this item. It doesn't perform as advertised.\",\n\"I've never had so much trouble with a product before. It's been a headache from day one.\",\n\"I bought this as a gift for my friend, and they absolutely love it!\",\n\"The price seemed steep at first, but after using it, I understand why. Quality product.\",\n\"This gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\"\n]\n\n# model instructions text, also known as system message\ninst_text = \"You are a helpful assistant that analyses text sentiment.\"\n
Since we just want to obtain a sentiment classification, we'll use a convenient enumeration: a list with three values: positive, negative or neutral.
Let's try the first review on a local model:
sentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nin_text = \"Each line is a product review. Extract the sentiment associated with each review:\\n\\n\" + reviews[0]\n\nprint(reviews[0])\n\nlocal_model = Models.create(local_name)\n\nout = local_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\n# to clear memory\ndel local_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
Definitely, 'neutral' is a good answer for this one.
Let's now try the remote model:
print(reviews[0])\n\nremote_model = Models.create(remote_name)\n\nout = remote_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\ndel remote_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
And the remote model (GPT-4) seems to agree on neutrality.
By using the query_multigen() function that we'll import from sibila.multigen, we'll be able to compare what multiple models generate in response to each input.
In our case the inputs will be the list of reviews. This function accepts these interesting arguments:
- text: type of text output, which can be the word \"print\" or a text filename to which it will save.
- csv: type of CSV output, which can also be \"print\" or a text filename to save into.
- out_keys: what we want listed: the generated raw text (\"text\"), a Python dict (\"dict\") or a Pydantic object (\"obj\"). For our case \"dict\" is the right one.
- gencall: we need to pass a function that will actually call the model for each input. We use a convenient predefined function and provide it with the sentiment_type definition.
Let's run it with our two models:
from sibila.multigen import (\n query_multigen,\n make_extract_gencall\n)\n\nsentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nout = query_multigen(reviews,\n inst_text,\n model_names = [local_name, remote_name],\n text=\"print\",\n csv=\"sentiment.csv\",\n out_keys = [\"value\"],\n gencall = make_extract_gencall(sentiment_enum)\n )\n
////////////////////////////////////////////////////////////\nThe user manual was confusing, but once I figured it out, the product more or less worked.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-4 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nThis widget changed my life! It's sleek, efficient, and worth every penny.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm disappointed with the product quality. It broke after just a week of use.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nThe customer service team was incredibly helpful in resolving my issue with the device.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm blown away by the functionality of this gadget. It exceeded my expectations.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe packaging was damaged upon arrival, but the product itself works great.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nI've been using this tool for months, and it's still as good as new. Highly recommended!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI regret purchasing this item. It doesn't perform as advertised.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI've never had so much trouble with a product before. 
It's been a headache from day one.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-4 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI bought this as a gift for my friend, and they absolutely love it!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe price seemed steep at first, but after using it, I understand why. Quality product.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-4 -> OK_STOP\n'positive'\n
The output format is as follows - see the comments near the -----> arrows:
//////////////////////////////////////////////////////////// -----> This is the model input, a review text:\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP <----- Local model name and result\n'positive' <----- What the local model output\n==================== openai:gpt-4 -> OK_STOP <----- Remote model name and result\n'positive' <----- Remote model output\n
We also requested the creation of a CSV file with the results: sentiment.csv.
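To inspect the saved file programmatically, here's a quick sketch using Python's csv module (the exact column layout depends on query_multigen()'s output):
import csv\n\nwith open(\"sentiment.csv\", newline=\"\", encoding=\"utf-8\") as f:\n for row in csv.reader(f):\n print(row)\n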
Example's assets at GitHub.
"},{"location":"examples/extract/","title":"Extract Pydantic","text":"In this example we'll extract information about all persons mentioned in a text. This example is also available in a dataclass version.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you have defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' source_location='London'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' source_location='Barcelona'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' source_location='Tokyo'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' source_location='Mumbai'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' source_location='Paris'\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure from the source location) and a \"details_about_person\" field, which the model should quote from the info in the source text about each person.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident in the thoughtful lines that adorned his face' source_location='Barcelona' source_country='Spain'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories of love and longing' source_location='Tokyo' source_country='Japan'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology with ease, intellect shining through the chaos of urban life' source_location='Mumbai' source_country='India'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace reflected the timeless elegance of her beloved city' source_location='Paris' source_country='France'\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
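If the field names alone weren't enough, Pydantic's Field(description=...) can give the model extra guidance. A sketch of what that could look like here (the description texts are assumptions, not part of the original example):
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int = Field(description=\"Age in years, at the time of the text\")\n occupation: str\n details_about_person: str = Field(description=\"Short quote from the source text about this person\")\n source_location: str\n source_country: str = Field(description=\"Country inferred from source_location\")\n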
Let's now query an attribute that only one of the persons has: being married. We'll add the \"is_married: bool\" field to the Person class.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.' source_location='London' source_country='United Kingdom' is_married=False\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.' source_location='Barcelona' source_country='Spain' is_married=False\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.' source_location='Tokyo' source_country='Japan' is_married=False\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.' source_location='Mumbai' source_country='India' is_married=True\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='her impeccable style and effortless grace reflected the timeless elegance of her beloved city.' source_location='Paris' source_country='France' is_married=False\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/extract_dataclass/","title":"Extract dataclass","text":"This is the Python dataclass version of of the Pydantic extraction example.
We'll extract information about all persons mentioned in a text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you have defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from dataclasses import dataclass\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', source_location='London')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', source_location='Barcelona')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', source_location='Tokyo')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', source_location='Mumbai')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', source_location='Paris')\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure from the source location) and a \"details_about_person\" field, which the model should quote from the info in the source text about each person.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='exuded creativity, passion for design evident in the thoughtful lines that adorned his face', source_location='Barcelona', source_country='Spain')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='navigated the complexities of technology with ease, intellect shining through the chaos of urban life', source_location='Mumbai', source_country='India')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France')\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the persons has: being married. We'll add the \"is_married\" field to the Person dataclass.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.', source_location='London', source_country='United Kingdom', is_married=False)\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.', source_location='Barcelona', source_country='Spain', is_married=False)\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.', source_location='Tokyo', source_country='Japan', is_married=False)\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.', source_location='Mumbai', source_country='India', is_married=True)\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='her impeccable style and effortless grace reflected the timeless elegance of her beloved city.', source_location='Paris', source_country='France', is_married=False)\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/from_text_to_object/","title":"From text to object","text":"In this example we'll ask the model to extract keypoints from a text: - First in plain text format - Then free JSON output (with fields selected by the model) - Later constrained by a JSON schema (so that we can specify which fields) - And finally by generating to a Pydantic object (from a class definition)
All the queries will be made at temperature=0, which is the default GenConf setting. This means that the model gives its best (as in most probable) answer and that it will always output the same results, given the same inputs.
Also available as a Jupyter notebook or a Python script in the example's folder.
We'll start by creating either a local model or a GPT-4 model.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you have defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Let's use this fragment from Wikipedia's entry on the Fiji islands: https://en.wikipedia.org/wiki/
doc = \"\"\"\\\nFiji, officially the Republic of Fiji,[n 2] is an island country in Melanesia,\npart of Oceania in the South Pacific Ocean. It lies about 1,100 nautical miles \n(2,000 km; 1,300 mi) north-northeast of New Zealand. Fiji consists of \nan archipelago of more than 330 islands\u2014of which about 110 are permanently \ninhabited\u2014and more than 500 islets, amounting to a total land area of about \n18,300 square kilometres (7,100 sq mi). The most outlying island group is \nOno-i-Lau. About 87% of the total population of 924,610 live on the two major \nislands, Viti Levu and Vanua Levu. About three-quarters of Fijians live on \nViti Levu's coasts, either in the capital city of Suva, or in smaller \nurban centres such as Nadi (where tourism is the major local industry) or \nLautoka (where the sugar-cane industry is dominant). The interior of Viti Levu \nis sparsely inhabited because of its terrain.[13]\n\nThe majority of Fiji's islands were formed by volcanic activity starting around \n150 million years ago. Some geothermal activity still occurs today on the islands \nof Vanua Levu and Taveuni.[14] The geothermal systems on Viti Levu are \nnon-volcanic in origin and have low-temperature surface discharges (of between \nroughly 35 and 60 degrees Celsius (95 and 140 \u00b0F)).\n\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and \nlater Melanesians, with some Polynesian influences. Europeans first visited Fiji \nin the 17th century.[15] In 1874, after a brief period in which Fiji was an \nindependent kingdom, the British established the Colony of Fiji. Fiji operated as \na Crown colony until 1970, when it gained independence and became known as \nthe Dominion of Fiji. In 1987, following a series of coups d'\u00e9tat, the military \ngovernment that had taken power declared it a republic. In a 2006 coup, Commodore \nFrank Bainimarama seized power. In 2009, the Fijian High Court ruled that the \nmilitary leadership was unlawful. At that point, President Ratu Josefa Iloilo, \nwhom the military had retained as the nominal head of state, formally abrogated \nthe 1997 Constitution and re-appointed Bainimarama as interim prime minister. \nLater in 2009, Ratu Epeli Nailatikau succeeded Iloilo as president.[16] On 17 \nSeptember 2014, after years of delays, a democratic election took place. \nBainimarama's FijiFirst party won 59.2% of the vote, and international observers \ndeemed the election credible.[17] \n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Be helpful and provide concise answers.\"\n
Let's start with a free text query by calling model().
in_text = \"Extract 5 keypoints of the following text:\\n\" + doc\n\nout = model(in_text, inst=inst_text)\nprint(out)\n
1. Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\n2. It consists of more than 330 islands with a total land area of about 18,300 square kilometres (7,100 sq mi).\n3. The majority of Fiji's population lives on the two major islands, Viti Levu and Vanua Levu.\n4. Most of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\n5. Fiji has a history of political instability, with multiple coups and changes in government since gaining independence in 1970.\n
These are quite reasonable keypoints.
Let's now ask for JSON output, taking care to explicitly request it in the query (in_text variable).
Instead of model(), we now use json(), which returns a Python dict.
import pprint\npp = pprint.PrettyPrinter(width=300, sort_dicts=False)\n\nin_text = \"Extract 5 keypoints of the following text in JSON format:\\n\\n\" + doc\n\nout = model.json(in_text,\n inst=inst_text)\npp.pprint(out)\n
{'keypoints': [{'point': 'Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'location': 'Melanesia, Oceania'},\n {'point': 'Fiji consists of more than 330 islands with about 110 permanently inhabited islands and over 500 islets.', 'total_land_area': 'about 18,300 square kilometres (7,100 sq mi)'},\n {'point': 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', 'population_statistics': '87%, 924,610'},\n {'point': \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'island_formation': 'volcanic activity, 150 million years ago'},\n {'point': 'Fiji has a history of political instability with multiple coups and changes in government.', 'political_history': 'multiple coups, changes in government'}]}\n
Note how the model chose to return different fields like \"point\" or \"island_formation\".
Because we didn't specify which fields we want, each model will generate different ones.
To specify a fixed format, let's now constrain generation with a JSON schema that defines exactly which fields and types we want:
json_schema = {\n \"properties\": {\n \"keypoint_list\": {\n \"description\": \"Keypoint list\",\n \"items\": {\n \"type\": \"string\",\n \"description\": \"Keypoint\"\n },\n \"type\": \"array\"\n }\n },\n \"required\": [\n \"keypoint_list\"\n ],\n \"type\": \"object\"\n}\n
This JSON schema specifies that the generated dict must contain a \"keypoint_list\" field holding a list of strings.
We'll also use json(), now passing the json_schema:
out = model.json(in_text,\n inst=inst_text,\n json_schema=json_schema)\n\nprint(out)\n
{'keypoint_list': ['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', \"About 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\", \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]}\n
for kpoint in out[\"keypoint_list\"]:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
It has generated a string list in the \"keypoint_list\" field, as we specified in the JSON schema.
This is better, but the problem with JSON schemas is that they can be quite hard to work with.
Let's use an easier way to specify the fields we want returned: Pydantic classes derived from BaseModel, which are much simpler to work with than raw JSON schemas.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Keypoints(BaseModel):\n keypoint_list: list[str]\n\nout = model.pydantic(Keypoints,\n in_text,\n inst=inst_text)\nprint(out)\n
keypoint_list=['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', \"About 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\", \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]\n
for kpoint in out.keypoint_list:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of Fiji's total population live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC, first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
The pydantic() method returns an object of class Keypoints, instantiated with the model output.
This is a much simpler way to extract structured data from the model.
Please see other examples for more interesting objects. In particular, we did not add descriptions to the fields, which are important clues to help the model understand what we want.
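As a quick illustration, here's a hedged sketch of adding a description to the Keypoints field (the description text below is just an example, tune it to your use case):
from pydantic import BaseModel, Field\n\nclass Keypoints(BaseModel):\n # illustrative description - adjust to what you actually want extracted\n keypoint_list: list[str] = Field(description=\"List of the 5 most important keypoints\")\n\nout = model.pydantic(Keypoints,\n in_text,\n inst=inst_text)\nprint(out.keypoint_list)\n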
Besides Pydantic classes, Sibila can also use Python's dataclass to extract structured data. This is a lighter and easier alternative to using Pydantic.
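For instance, a minimal sketch of the same extraction using a dataclass (assuming the model and in_text from the cells above):
from dataclasses import dataclass\n\n@dataclass\nclass Keypoints:\n keypoint_list: list[str]\n\nout = model.extract(Keypoints,\n in_text,\n inst=inst_text)\nprint(out.keypoint_list)\n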
Example's assets at GitHub.
"},{"location":"examples/hello_model/","title":"Hello model","text":"In this example we see how to directly create local or remote model objects and later to do that more easily with the Models class.
"},{"location":"examples/hello_model/#using-a-local-model","title":"Using a local model","text":"To use a local model, make sure you download its GGUF format file and save it into the \"../../models\" folder.
In these examples, we'll use a 4-bit quantization of the 7-billion-parameter OpenChat-3.5 model, which at the time of writing is quite a good model for its size.
The file is named \"openchat-3.5-1210.Q4_K_M.gguf\" and was downloaded from the above link. Make sure to save it into the \"../../models\" folder.
See here for more information about setting up your local models.
With the model file in the \"../../models\" folder, we can run the following script:
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Run the script above and, after a few seconds (the model has to be loaded from disk), it answers back something like:
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
"},{"location":"examples/hello_model/#using-an-openai-model","title":"Using an OpenAI model","text":"To use a remote model like GPT-4 you'll need a paid OpenAI account: https://openai.com/pricing
With an OpenAI account, you'll be able to generate an access token that you should set into the OPENAI_API_KEY env variable.
(An even better way is to use .env files with your variables, and use the dotenv library to read them.)
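For example, a minimal sketch of that approach (assuming a .env file exists in the working directory):
# .env file in the working directory, containing a line like:\n# OPENAI_API_KEY=sk-...\n\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nimport os\nassert os.environ.get(\"OPENAI_API_KEY\"), \"OPENAI_API_KEY env variable is not set\"\n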
Once a valid OPENAI_API_KEY env variable is set, you can run this script:
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
We get back the usual funny pirate answer:
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
"},{"location":"examples/hello_model/#using-the-models-directory","title":"Using the Models directory","text":"In these two scripts we created different objects to access the LLM model: LlamaCppModel and OpenAIModel.
This was done for simplicity, but a better way is to use the Models class and its model directory.
Models is a singleton class that implements a directory of models where you can store file locations, configurations, aliases, etc.
After setting up a JSON configuration file, you can have the Models class create models from names like \"llamacpp:openchat\" or \"openai:gpt-4\", together with their predefined settings. This permits easily switching models, comparing model outputs, etc.
In the scripts above, instead of instantiating different classes for different models, we could use the Models class to create the model from a name, by setting the model_name variable:
from sibila import Models, GenConf\n\n# Using a local llama.cpp model: we first setup the ../../models directory:\n# Models.setup(\"../../models\")\n# model_name = \"llamacpp:openchat\"\n\n# OpenAI: make sure you set the environment variable named OPENAI_API_KEY with your API key.\nmodel_name = \"openai:gpt-4\"\n\nmodel = Models.create(model_name,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
The magic happens in the line:
model = Models.create(model_name, ...)\n
The Models class will take care of initializing the model based on the name you provide.
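As a quick illustration of comparing outputs across models, here's a hedged sketch (both names must already resolve in your Models directory):
from sibila import Models\n\nModels.setup(\"../../models\")\n\n# both names must be configured in your Models directory - adjust to your setup\nfor model_name in [\"llamacpp:openchat\", \"openai:gpt-4\"]:\n model = Models.create(model_name)\n print(model_name, \"->\", model(\"Hello there?\", inst=\"You speak like a pirate.\"))\n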
Example's assets at GitHub.
"},{"location":"examples/interact/","title":"Interact","text":"In this example we look at the interact() function, which allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done by using a Context class object, which can manage thread messages and delete older ones when the context maximum length is reached.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
So, let's create a local model.
from sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Import interact() from the tools module and run it.
What follows is the actual chat, where the user prompts start with a \">\" and the model's answers appear immediately below.
Read on, it does get funny at times... : )
from sibila.tools import (\n interact\n)\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9, max_tokens=600))\n
> Hello there!\n\n\nHello! How can I help you today?\n\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\n\nYes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.\n\n\n\n> No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\n\n\nMy apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\n\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\n\n\nGot it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\n\n\n\n> That's right - I can list the context messages by entering \"!c\"\n\n\nI see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.\n\n\n\n> !c\n\n\nThread token len=542, max len before next gen=1448\nThread inst='Be helpful.', join_sep='\\n', len=10\n0: IN='Hello there!'\n1: OUT='Hello! How can I help you today?'\n2: IN=\"I'm doing an example of this LLM access package called Sibila - have you heard about it?\"\n3: OUT='Yes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.'\n4: IN=\"No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\"\n5: OUT=\"My apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\"\n6: IN=\"No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\"\n7: OUT=\"Got it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. 
This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\"\n8: IN='That\\'s right - I can list the context messages by entering \"!c\"'\n9: OUT='I see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.'\n\n\n\n> Info about the model: \"!i\"\n\n\nThe `!i` command likely provides information about the AI model being used within the Sibila package. This could include details such as the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please let me know.\n\n\n\n> Sorry, it's actually \"!info\"\n\n\nMy apologies for the confusion in my previous response. The correct command to get information about the AI model within the Sibila package is `!info`. This command should provide you with details about the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please feel free to ask.\n\n\n\n> !info\n\n\nModel:\ndesc='LlamaCppModel: '/home/jorge/ai/sibila/models/openchat-3.5-1210.Q4_K_M.gguf' - 'llama 7B Q4_K - Medium'',\nctx_len=2048, max_tokens_limit=2048,\ngenconf=GenConf(max_tokens=0, stop=[], temperature=0.0, top_p=0.9, format='text', json_schema=None, special=None)\nGenConf:\nGenConf(max_tokens=600, stop=[], temperature=0.9, top_p=0.9, format='text', json_schema=None, special=None)\n\nThread token len=736, max len before next gen=1448\n\n\n\n> For help on al lcommands: \"!\"\n\n\nWithin the Sibila package, using an exclamation mark (`!`) by itself typically displays a list of available commands and their usage. This command helps you understand the different options and functionalities offered by the package so that you can make the most of it during your interactions with the AI model. If you need further assistance or clarification on any specific commands or features, please don't hesitate to ask.\n\n\n\n> !\n\n\nUnknown command '!' - known commands:\n !inst[=text] - clear messages and add inst (system) message\n !add=path - load file and add to last msg\n !image=path/url - include a local or remote image. Local images must fit the context!\n !c - list context msgs\n !cl=path - load context (default=thread.json)\n !cs=path - save context (default=thread.json)\n !tl - thread's token length\n !info - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n\n\n\n> The Thread class can manage total token length: it will delete older messages if the thread becomes larger than the content max length.\n\n\nI understand that the `sibila.Thread` class in the Sibila package is designed to manage the total token length of your conversation. 
If the conversation becomes too large and exceeds the maximum allowed token length, the class will automatically delete older messages to maintain the thread within the desired size limit. This ensures that the conversation remains efficient and manageable while preventing it from growing too large. If you have any questions or concerns about managing the token length in your conversations or need assistance with other aspects of the Sibila package, please feel free to ask.\n\n\n\n> That's it - to see token length: \"!tl\"\n\n\nThe `!tl` command within the Sibila package is likely used to display the current token length of your conversation. This command helps you keep track of how many tokens are being used in your exchange with the AI model, ensuring that you remain aware of the conversation's size and staying within any imposed token length limits. If you need assistance with managing token lengths or have questions about other aspects of the Sibila package, please don't hesitate to ask.\n\n\n\n> !tl\n\n\nThread token len=1128, max len before next gen=1448\n\n\n\n> To save the thread to a JSON file: \"!cs\"\n\n\nThe `!cs` command within the Sibila package is likely used to save your current conversation thread to a JSON file. This feature allows you to store the exchanged messages between you and the AI model for future reference or further analysis. To use this command, simply enter `!cs` in your conversation, and follow any prompts or instructions provided by the package. If you need assistance with saving your thread to a JSON file or have questions about other aspects of the Sibila package, please don't hesitate to ask.\n\n\n\n> !cs\n\n\nSaved context to thread.json\n\n\n\n>\n\n\n\n\n\nThread inst='Be helpful.', join_sep='\\n', len=22\n0: IN='Hello there!'\n1: OUT='Hello! How can I help you today?'\n2: IN=\"I'm doing an example of this LLM access package called Sibila - have you heard about it?\"\n3: OUT='Yes, I am familiar with the Sibila AI platform. Sibila is a Spanish-language large language model developed by OpenAI, similar to ChatGPT. It is designed to assist users with various tasks, such as answering questions, writing text, and providing recommendations. Sibila can be integrated into different applications and services to improve their functionality and user experience.'\n4: IN=\"No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\"\n5: OUT=\"My apologies for the confusion in my previous response. It appears that I made an error in my information. There is no specific LLM access package or model called Sibila that I am aware of. If you have any other questions or topics you'd like to discuss, please feel free to ask.\"\n6: IN=\"No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Thread class.\"\n7: OUT=\"Got it! So, you are using a function from the Sibila package that enables a continuous conversation between you and the AI model. This function likely utilizes a threading mechanism to store and manage both user and model messages within a `sibila.Thread` class. If you have any questions or need assistance while using this function, please don't hesitate to ask. I'm here to help!\"\n8: IN='That\\'s right - I can list the context messages by entering \"!c\"'\n9: OUT='I see! It seems that within the Sibila package, you can use the command `!c` to list the context messages in your conversation. 
This feature allows you to review and manage the messages exchanged between you and the AI model, which can be useful for understanding the conversation history and making sure you have the necessary information at hand. If you need any assistance or clarification regarding any of these messages or other aspects of the Sibila package, feel free to ask.'\n10: IN='Info about the model: \"!i\"'\n11: OUT=\"The `!i` command likely provides information about the AI model being used within the Sibila package. This could include details such as the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please let me know.\"\n12: IN='Sorry, it\\'s actually \"!info\"'\n13: OUT=\"My apologies for the confusion in my previous response. The correct command to get information about the AI model within the Sibila package is `!info`. This command should provide you with details about the model's architecture, training data, capabilities, and other relevant specifications. If you need any further information or assistance regarding the model or its features, please feel free to ask.\"\n14: IN='For help on al lcommands: \"!\"'\n15: OUT=\"Within the Sibila package, using an exclamation mark (`!`) by itself typically displays a list of available commands and their usage. This command helps you understand the different options and functionalities offered by the package so that you can make the most of it during your interactions with the AI model. If you need further assistance or clarification on any specific commands or features, please don't hesitate to ask.\"\n16: IN='The Thread class can manage total token length: it will delete older messages if the thread becomes larger than the content max length.'\n17: OUT='I understand that the `sibila.Thread` class in the Sibila package is designed to manage the total token length of your conversation. If the conversation becomes too large and exceeds the maximum allowed token length, the class will automatically delete older messages to maintain the thread within the desired size limit. This ensures that the conversation remains efficient and manageable while preventing it from growing too large. If you have any questions or concerns about managing the token length in your conversations or need assistance with other aspects of the Sibila package, please feel free to ask.'\n18: IN='That\\'s it - to see token length: \"!tl\"'\n19: OUT=\"The `!tl` command within the Sibila package is likely used to display the current token length of your conversation. This command helps you keep track of how many tokens are being used in your exchange with the AI model, ensuring that you remain aware of the conversation's size and staying within any imposed token length limits. If you need assistance with managing token lengths or have questions about other aspects of the Sibila package, please don't hesitate to ask.\"\n20: IN='To save the thread to a JSON file: \"!cs\"'\n21: OUT=\"The `!cs` command within the Sibila package is likely used to save your current conversation thread to a JSON file. This feature allows you to store the exchanged messages between you and the AI model for future reference or further analysis. To use this command, simply enter `!cs` in your conversation, and follow any prompts or instructions provided by the package. 
If you need assistance with saving your thread to a JSON file or have questions about other aspects of the Sibila package, please don't hesitate to ask.\"\n
These are the \"!\" commands that you can use in the interact() inputs:
!inst[=text] - clear messages and add inst (system) message\n !add=path - load file and add to last msg\n !image=path/url - include a local or remote image. Local images must fit the context!\n !c - list context msgs\n !cl=path - load context (default=thread.json)\n !cs=path - save context (default=thread.json)\n !tl - thread's token length\n !info - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n
Example's assets at GitHub.
"},{"location":"examples/quick_meeting/","title":"Quick meeting","text":"Let's extract structured data from a meeting transcript, like attendees, action items and their priorities.
This is a quick meeting whose transcript is not very large, so a small local model should work well. See the Tough meeting example for a larger and more complex transcription text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
If you prefer to use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Here's the transcript we'll be using as source:
transcript = \"\"\"\\\nDate: 10th April 2024\nTime: 10:30 AM\nLocation: Conference Room A\n\nAttendees:\n Arthur: Logistics Supervisor\n Bianca: Operations Manager\n Chris: Fleet Coordinator\n\nArthur: Good morning, team. Thanks for making it. We've got three matters to address quickly today.\n\nBianca: Morning, Arthur. Let's dive in.\n\nChris: Ready when you are.\n\nArthur: First off, we've been having complaints about late deliveries. This is very important, we're getting some bad reputation out there.\n\nBianca: Chris, I think you're the right person to take care of this. Can you investigate and report back by end of day? \n\nChris: Absolutely, Bianca. I'll look into the reasons and propose solutions.\n\nArthur: Great. Second, Bianca, we need to update our driver training manual. Can you take the lead and have a draft by Friday?\n\nBianca: Sure thing, Arthur. I'll get started on that right away.\n\nArthur: Lastly, we need to schedule a meeting with our software vendor to discuss updates to our tracking system. This is a low-priority task but still important. I'll handle that. Any input on timing?\n\nBianca: How about next Wednesday afternoon?\n\nChris: Works for me.\n\nArthur: Sounds good. I'll arrange it. Thanks, Bianca, Chris. Let's keep the momentum going.\n\nBianca: Absolutely, Arthur.\n\nChris: Will do.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
Let's define two Pydantic BaseModel classes whose instances will receive the extracted information:
- Attendee: to store information about each meeting attendee
- Meeting: to keep the meeting's date and location, the list of participants and other info we'll see below
And let's ask the model to create objects that are instances of these classes:
from pydantic import BaseModel, Field\n\n# class definitions will be used to constrain the model output and initialize an instance object\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
meeting_date='10th April 2024' meeting_location='Conference Room A' attendees=[Attendee(name='Arthur', occupation='Logistics Supervisor'), Attendee(name='Bianca', occupation='Operations Manager'), Attendee(name='Chris', occupation='Fleet Coordinator')]\n
For a nicer display:
print(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\n
This information was correctly extracted.
Let's now request the action items mentioned in the meeting. We'll create a new class ActionItem with an index and a name for the item. Note that we're annotating each field with Field(description=...) information to help the model understand what we're looking to extract.
We'll also add an action_items field to the Meeting class to hold the items list.
class Attendee(BaseModel):\n name: str\n occupation: str\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate and report late deliveries'\nindex=2 name='Update driver training manual'\nindex=3 name='Schedule meeting with software vendor to discuss tracking system updates'\n
The extracted action items also look good.
Let's now extract more action item information:
- Priority of each item
- Due by... information
- Name of the attendee assigned to each item
So, we create a Priority enum class holding the three priority levels, from low to high.
We also add three fields to the ActionItem class, to hold the new information: priority, due_by and assigned_attendee.
from enum import Enum\n\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Priority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: Priority = Field(description=\"Action item priority\")\n due_by: str = Field(description=\"When should the item be complete\")\n assigned_attendee: str = Field(description=\"Name of the attendee to which action item was assigned\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate late deliveries' priority=<Priority.HIGH: 'high'> due_by='end of day' assigned_attendee='Chris'\nindex=2 name='Update driver training manual' priority=<Priority.MEDIUM: 'medium'> due_by='Friday' assigned_attendee='Bianca'\nindex=3 name='Schedule meeting with software vendor' priority=<Priority.LOW: 'low'> due_by='next Wednesday afternoon' assigned_attendee='Arthur'\n
The new information was correctly extracted: priorities, due by and assigned attendees for each action item.
For an example of a harder, more complex transcript, see the \"Tough meeting\" example.
Example's assets at GitHub.
"},{"location":"examples/receipt/","title":"Receipt","text":"In this example we'll look at extracting structured information from a photo of a receipt.
Sibila supports local models with image input, but here we'll use OpenAI's GPT-4o, which works quite well. Make sure to set your OPENAI_API_KEY env variable.
You can still use a local model by uncommenting the commented lines below. See the docs for some suggestions about local vision models.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# model = Models.create(\"llamacpp:llava-llama-3-8b-v1_1-int4.gguf*llava-llama-3-8b-v1_1-mmproj-f16.gguf\")\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4o\")\n
Let's use this photo of an Italian receipt:
To see if the model can handle it, let's try a free text query for the total. We'll pass a tuple of (text_prompt, image_url) -
model((\"How much is the total?\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\n
'The total amount on the receipt is \u20ac5.88.'\n
Good. Can the model extract the receipt item lines?
model((\"List the lines of paid items in the receipt?\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\n
'The lines of paid items in the receipt are:\\n\\n1. BIS BORSE TERM. S - \u20ac3.90\\n2. GHIACCIO 2X400 G - \u20ac0.99\\n3. GHIACCIO 2X400 G - \u20ac0.99'\n
It did extract them well.
Let's wrap this in a Pydantic object to get structured data from the model. We'll also add a field for the date listed in the receipt:
from pydantic import BaseModel, Field\nfrom datetime import datetime\n\nclass ReceiptLine(BaseModel):\n \"\"\"Receipt line data\"\"\"\n description: str\n cost: float\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n date: datetime = Field(description=\"Listed date\")\n\ninfo = model.extract(Receipt,\n (\"Extract receipt information.\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\ninfo\n
Receipt(total=5.88, lines=[ReceiptLine(description='BIS BORSE TERM.S', cost=3.9), ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99), ReceiptLine(description='GHIACCIO 2X400 G', cost=0.99)], date=datetime.datetime(2014, 8, 27, 19, 51, tzinfo=TzInfo(UTC)))\n
for line in info.lines:\n print(line)\nprint(\"total:\", info.total)\n
description='BIS BORSE TERM.S' cost=3.9\ndescription='GHIACCIO 2X400 G' cost=0.99\ndescription='GHIACCIO 2X400 G' cost=0.99\ntotal: 5.88\n
All the information is correct and structured in an object that we can use as needed.
From here we could expand the Pydantic object with more fields to extract other information present in the receipt like merchant name, VAT number, etc.
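A hedged sketch of such an expansion (the extra fields below are illustrative and reuse ReceiptLine, Field and datetime from the cell above; whether the model can fill them depends on what's visible in the photo):
from typing import Optional\n\nclass Receipt(BaseModel):\n \"\"\"Receipt information\"\"\"\n total: float = Field(description=\"Total value\")\n lines: list[ReceiptLine] = Field(description=\"List of lines of paid items\")\n date: datetime = Field(description=\"Listed date\")\n # illustrative extra fields - adjust to what your receipts actually contain:\n merchant_name: Optional[str] = Field(None, description=\"Name of the merchant, if visible\")\n vat_number: Optional[str] = Field(None, description=\"VAT number, if printed on the receipt\")\n\ninfo = model.extract(Receipt,\n (\"Extract receipt information.\", \n \"https://upload.wikimedia.org/wikipedia/commons/6/6a/Receipts_in_Italy_13.jpg\"))\n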
Example's assets at GitHub.
"},{"location":"examples/simple_tools/","title":"Simple tools","text":"In this example we'll look at a simple way to have the model choose among tools or give a straight answer, using structured data extraction. This can be advantageous to keep tool usage independent of model provider.
We'll use a Llama-3 8B model. Please make sure you have its file in the folder \"../../models\". You can use any GGUF format model; just don't forget to set its filename in the name variable below, after the text \"llamacpp:\". You could likewise use any other remote model from any provider.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
from sibila import Models\n\n# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\n# delete any live model\ntry: model.close(); del model\nexcept: pass\n\nModels.setup(\"../../models\")\nname = \"llamacpp:Meta-Llama-3-8B-Instruct-Q4_K_M.gguf\"\n# name = \"openai:gpt-4o\"\n\nmodel = Models.create(name)\n
We'll use a Literal type to choose which tool to use, plus a generic field for the arguments of the tool call.
A special value of NO_TOOL signals that the model is giving a straight answer and no tool is being used.
It's important to explain what each tool does in the instructions text, so the model \"knows\" what to do.
from typing import Literal\n\nfrom pydantic import (\n BaseModel,\n Field\n)\n\n# which tool to use?\nAnswerType = Literal[\"NO_TOOL\", \"WEB_SEARCH\", \"CALCULATOR\", \"NEW_NOTE\"]\n\nclass AnswerOrTool(BaseModel):\n answer_type: AnswerType\n argument: str\n\ninst = \"\"\"\\\nIf user requests live information, answer_type should be WEB_SEARCH and the argument field should be the query.\nIf the user requests a calculation, don't do the math, instead answer_type should be CALCULATOR and the argument field should be the math expression.\nIf the user asks to create a new note, answer_type should be NEW_NOTE and the argument field should be note's subject.\nOtherwise, answer_type should be \"NO_TOOL\" and the answer should be given in the argument field.\n\"\"\"\n
Let's now try a few queries to see how the model behaves:
queries = [\n \"Can you write a simple poem?\",\n \"What's the current NVIDIA stock market value?\",\n \"How much is 78*891?\",\n \"Create a new note to call Manuel to invite him to come over and visit me\",\n]\n\nfor q in queries:\n res = model.extract(AnswerOrTool, q, inst=inst)\n print(q)\n print(res)\n print()\n
Can you write a simple poem?\nanswer_type='NO_TOOL' argument='Here is a simple poem, with words so sweet,\\nA gentle breeze that whispers at your feet.\\nThe sun shines bright, the birds sing their song,\\nAnd all around, life is strong.'\n\nWhat's the current NVIDIA stock market value?\nanswer_type='WEB_SEARCH' argument='NVIDIA stock market value'\n\nHow much is 78*891?\nanswer_type='CALCULATOR' argument='78*891'\n\nCreate a new note to call Manuel to invite him to come over and visit me\nanswer_type='NEW_NOTE' argument='Call Manuel to invite him to come over and visit me'\n
Some of these tools should have their output fed back into the model for further queries. In that case we'd be better off using a message Thread, where the next IN message would contain the tool output and a query for the results we're looking for.
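As a minimal sketch of the dispatch side, without using a Thread (the calculator below is a naive eval, for illustration only):
q = \"How much is 78*891?\"\nres = model.extract(AnswerOrTool, q, inst=inst)\n\nif res.answer_type == \"CALCULATOR\":\n # naive evaluation for illustration only - never eval untrusted input in real code\n tool_output = str(eval(res.argument))\n answer = model(f\"The calculator returned {tool_output}. Answer the user query: {q}\")\nelif res.answer_type == \"NO_TOOL\":\n answer = res.argument\nelse:\n answer = f\"(would call {res.answer_type} with argument: {res.argument})\"\n\nprint(answer)\n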
Example's assets at GitHub.
"},{"location":"examples/tag/","title":"Tag","text":"In this example we'll summarize and classify customer queries with tags. We'll use dataclasses to specify the structure of the information we want extracted (we could also use Pydantic BaseModel classes).
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Available as a Jupyter notebook or a Python script in the example's folder.
Let's start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
These will be our queries, ten typical customer support questions:
queries = \"\"\"\\\n1. Do you offer a trial period for your software before purchasing?\n2. I'm experiencing a glitch with your app, it keeps freezing after the latest update.\n3. What are the different pricing plans available for your subscription service?\n4. Can you provide instructions on how to reset my account password?\n5. I'm unsure about the compatibility of your product with my device, can you advise?\n6. How can I track my recent order and estimate its delivery date?\n7. Is there a customer loyalty program or rewards system for frequent buyers?\n8. I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\n9. Could you clarify the coverage and limitations of your product warranty?\n10. What are your customer support hours and how can I reach your team in case of emergencies?\n\"\"\"\n
We'll start by summarizing each query.
Let's try just using field names (without descriptions); perhaps they are enough to tell the model what we want.
from dataclasses import dataclass\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_text: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information from customer queries.\"\n\n# the input query, including the above text\nin_text = \"Each line is a customer query. Extract information about each query:\\n\\n\" + queries\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Trial period inquiry', query_text='Do you offer a trial period for your software before purchasing?')\nQuery(id=2, query_summary='Technical issue', query_text=\"I'm experiencing a glitch with your app, it keeps freezing after the latest update.\")\nQuery(id=3, query_summary='Pricing inquiry', query_text='What are the different pricing plans available for your subscription service?')\nQuery(id=4, query_summary='Password reset request', query_text='Can you provide instructions on how to reset my account password?')\nQuery(id=5, query_summary='Compatibility inquiry', query_text=\"I'm unsure about the compatibility of your product with my device, can you advise?\")\nQuery(id=6, query_summary='Order tracking', query_text='How can I track my recent order and estimate its delivery date?')\nQuery(id=7, query_summary='Loyalty program inquiry', query_text='Is there a customer loyalty program or rewards system for frequent buyers?')\nQuery(id=8, query_summary='Refund policy inquiry', query_text=\"I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\")\nQuery(id=9, query_summary='Warranty inquiry', query_text='Could you clarify the coverage and limitations of your product warranty?')\nQuery(id=10, query_summary='Customer support inquiry', query_text='What are your customer support hours and how can I reach your team in case of emergencies?')\n
The summaries look good.
Let's now define tags and ask the model to classify each query into a tag. In the Tag class, we set its docstring to the rules we want for the classification. This is done in the docstring because Tag is not a dataclass, but derived from Enum.
We'll no longer ask for the query_text in the Query class, to keep the output shorter.
from enum import Enum\n\nclass Tag(str, Enum):\n \"\"\"Queries can be classified into the following tags:\ntech_support: queries related with technical problems.\nbilling: post-sale queries about billing cycle, or subscription termination.\naccount: queries about user account problems.\npre_sales: queries from prospective customers (who have not yet purchased).\nother: all other query topics.\"\"\" \n TECH_SUPPORT = \"tech_support\"\n BILLING = \"billing\"\n PRE_SALES = \"pre_sales\"\n ACCOUNT = \"account\"\n OTHER = \"other\"\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_tag: Tag\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Asking about trial period', query_tag='pre_sales')\nQuery(id=2, query_summary='Reporting app issue', query_tag='tech_support')\nQuery(id=3, query_summary='Inquiring about pricing plans', query_tag='billing')\nQuery(id=4, query_summary='Requesting password reset instructions', query_tag='account')\nQuery(id=5, query_summary='Seeking device compatibility advice', query_tag='pre_sales')\nQuery(id=6, query_summary='Tracking order and delivery date', query_tag='other')\nQuery(id=7, query_summary='Inquiring about loyalty program', query_tag='billing')\nQuery(id=8, query_summary='Asking about refund policy', query_tag='pre_sales')\nQuery(id=9, query_summary='Seeking warranty information', query_tag='other')\nQuery(id=10, query_summary='Inquiring about customer support hours', query_tag='other')\n
The applied tags appear mostly reasonable.
Of course, pre-sales tagging could be done automatically from a database of existing customer contacts, but the model is doing a good job of identifying questions likely to be pre-sales, like ids 1, 5 and 8 which are questions typically asked before buying/subscribing.
Also, note that classification is being done from a single phrase. More information in each customer query would certainly allow for fine-grained classification.
Example's assets at GitHub.
"},{"location":"examples/tough_meeting/","title":"Tough meeting","text":"In this example we'll look at extracting participants and action items from a meeting transcript.
Start by creating the model. As you'll see below, the transcript is large, with complex language, so we'll use OpenAI's GPT-4 this time. You can still use a local model by uncommenting the commented lines below.
Make sure to set your OPENAI_API_KEY env variable.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# the transcript is large, so we'll create the model with a context length of 3072, which should be enough.\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\", ctx_len=3072)\n
We'll use a sample meeting transcript from https://www.ctas.tennessee.edu/eli/sample-meeting-transcript
transcript = \"\"\"\\\nChairman Wormsley (at the proper time and place, after taking the chair and striking the gavel on the table): This meeting of the CTAS County Commission will come to order. Clerk please call the role. (Ensure that a majority of the members are present.)\n\nChairman Wormsley: Each of you has received the agenda. I will entertain a motion that the agenda be approved.\n\nCommissioner Brown: So moved.\n\nCommissioner Hobbs: Seconded\n\nChairman Wormsley: It has been moved and seconded that the agenda be approved as received by the members. All those in favor signify by saying \"Aye\"?...Opposed by saying \"No\"?...The agenda is approved. You have received a copy of the minutes of the last meeting. Are there any corrections or additions to the meeting?\n\nCommissioner McCroskey: Mister Chairman, my name has been omitted from the Special Committee on Indigent Care.\n\nChairman Wormsley: Thank you. If there are no objections, the minutes will be corrected to include the name of Commissioner McCroskey. Will the clerk please make this correction. Any further corrections? Seeing none, without objection the minutes will stand approved as read. (This is sort of a short cut way that is commonly used for approval of minutes and/or the agenda rather than requiring a motion and second.)\n\nChairman Wormsley: Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Mister Chairman, I would like to make a motion to approve the resolution taking money from the Data Processing Reserve Account in the County Clerk's office and moving it to the equipment line to purchase a laptop computer.\n\nCommissioner Carmical: I second the motion.\n\nChairman Wormsley: This resolution has a motion and second. Will the clerk please take the vote.\n\nChairman Wormsley: The resolution passes. We will now take up old business. At our last meeting, Commissioner McKee, your motion to sell property near the airport was deferred to this meeting. You are recognized.\n\nCommissioner McKee: I move to withdraw that motion.\n\nChairman Wormsley: Commissioner McKee has moved to withdraw his motion to sell property near the airport. Seeing no objection, this motion is withdrawn. The next item on the agenda is Commissioner Rodgers'.\n\nCommissioner Rodgers: I move adopton of the resolution previously provided to each of you to increase the state match local litigation tax in circuit, chancery, and criminal courts to the maximum amounts permissible. This resolution calls for the increases to go to the general fund.\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: The sheriff is opposed to this increase.\n\nChairman Wormsley: Commissioner, you are out of order because this motion has not been seconded as needed before the floor is open for discussion or debate. Discussion will begin after we have a second. 
Is there a second?\n\nCommissioner Reinhart: For purposes of discussion, I second the motion.\n\nChairman Wormsley: Commissioner Rodgers is recognized.\n\nCommissioner Rodgers: (Speaks about the data on collections, handing out all sorts of numerical figures regarding the litigation tax, and the county's need for additional revenue.)\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: I move an amendment to the motion to require 25 percent of the proceeds from the increase in the tax on criminal cases go to fund the sheriff's department.\n\nChairman Wormsley: Commissioner Malone\n\nCommissioner Malone: I second the amendment.\n\nChairman Wormsley: A motion has been made and seconded to amend the motion to increase the state match local litigation taxes to the maximum amounts to require 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Any discussion? Will all those in favor please raise your hand? All those opposed please raise your hand. The amendment carries 17-2. We are now on the motion as amended. Any further discussion?\n\nCommissioner Headrick: Does this require a two-thirds vote?\n\nChairman Wormsley: Will the county attorney answer that question?\n\nCounty Attorney Fults: Since these are only courts of record, a majority vote will pass it. The two-thirds requirement is for the general sessions taxes.\n\nChairman Wormsley: Other questions or discussion? Commissioner Adams.\n\nCommissioner Adams: Move for a roll call vote.\n\nCommissioner Crenshaw: Second\n\nChairman Wormsley: The motion has been made and seconded that the state match local litigation taxes be increased to the maximum amounts allowed by law with 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Will all those in favor please vote as the clerk calls your name, those in favor vote \"aye,\" those against vote \"no.\" Nine votes for, nine votes against, one not voting. The increase fails. We are now on new business. Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Each of you has previously received a copy of a resolution to increase the wheel tax by $10 to make up the state cut in education funding. I move adoption of this resolution.\n\nChairman Wormsley: Commissioner Thompson\n\nCommissioner Thompson: I second.\n\nChairman Wormsley: It has been properly moved and seconded that a resolution increasing the wheel tax by $10 to make up the state cut in education funding be passed. Any discussion? (At this point numerous county commissioners speak for and against increasing the wheel tax and making up the education cuts. This is the first time this resolution is under consideration.) Commissioner Hayes is recognized.\n\nCommissioner Hayes: I move previous question.\n\nCommisioner Crenshaw: Second.\n\nChairman Wormsley: Previous question has been moved and seconded. As you know, a motion for previous question, if passed by a two-thirds vote, will cut off further debate and require us to vote yes or no on the resolution before us. You should vote for this motion if you wish to cut off further debate of the wheel tax increase at this point. Will all those in favor of previous question please raise your hand? Will all those against please raise your hand? The vote is 17-2. Previous question passes. We are now on the motion to increase the wheel tax by $10 to make up the state cut in education funding. 
Will all those in favor please raise your hand? Will all those against please raise your hand? The vote is 17-2. This increase passes on first passage. Is there any other new business? Since no member is seeking recognition, are there announcements? Commissioner Hailey.\n\nCommissioner Hailey: There will be a meeting of the Budget Committee to look at solid waste funding recommendations on Tuesday, July 16 at noon here in this room.\n\nChairman Wormsley: Any other announcements? The next meeting of this body will be Monday, August 19 at 7 p.m., here in this room. Commissioner Carmical.\n\nCommissioner Carmical: There will be a chili supper at County Elementary School on August 16 at 6:30 p.m. Everyone is invited.\n\nChairman Wormsley: Commissioner Austin.\n\nCommissioner Austin: Move adjournment.\n\nCommissioner Garland: Second.\n\nChairman Wormsley: Without objection, the meeting will stand adjourned.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information and output in JSON format.\"\n
As you can see, this is a quite large transcript, filled with long names and complex phrases. Let's see how the model will handle it...
Let's start by extracting the names of the participants in the meeting.
We'll create the Meeting class with a list of strings, to receive the names of mentioned participants.
The model will take clues from the variable names as well as from the description Field we set. In this case we name the string list \"participants\" and add a description of what we're looking to receive.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
participants=['Chairman Wormsley', 'Clerk', 'Commissioner Brown', 'Commissioner Hobbs', 'Commissioner McCroskey', 'Commissioner Adkins', 'Commissioner Carmical', 'Commissioner McKee', 'Commissioner Rodgers', 'Commissioner Duckett', 'Commissioner Reinhart', 'Commissioner Malone', 'Commissioner Headrick', 'County Attorney Fults', 'Commissioner Adams', 'Commissioner Crenshaw', 'Commissioner Thompson', 'Commissioner Hayes', 'Commissioner Hailey', 'Commissioner Carmical', 'Commissioner Austin', 'Commissioner Garland']\n
# print the generated participants list:\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nClerk\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
A name appears twice - \"Commissioner Carmical\".
It's a matter of opinion whether the clerk is an active participant, but let's try to fix the repeated names.
Let's try asking for a list of participants \"without repeated entries\", in the field's description:
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Wormsley\nBrown\nHobbs\nMcCroskey\nAdkins\nCarmical\nMcKee\nRodgers\nDuckett\nReinhart\nMalone\nHeadrick\nFults\nAdams\nCrenshaw\nThompson\nHayes\nHailey\nAustin\nGarland\n
That didn't work as expected: the repetition is gone, but the titles were dropped and only surnames appear.
Let's try asking for \"names and titles\":
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of names and titles of participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
And now \"Commissioner Carmical\" is repeating again!
Let's move on - the point is that you can also do some prompt engineering in the description field. This model shortcoming could also be dealt with by post-processing the received list.
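For instance, a minimal post-processing sketch that removes the repeated names while keeping their original order (using the out.participants list received above):
# dict.fromkeys keeps only the first occurrence of each name, preserving order\nunique_participants = list(dict.fromkeys(out.participants))\nprint(unique_participants)\n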
Let's now also request a list of action items mentioned in the transcript:
class ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda'\nindex=2 name='Correct the minutes to include Commissioner McCroskey in the Special Committee on Indigent Care'\nindex=3 name='Approve the resolution to transfer funds from the Data Processing Reserve Account to purchase a laptop'\nindex=4 name='Withdraw the motion to sell property near the airport'\nindex=5 name='Adopt the resolution to increase the state match local litigation tax'\nindex=6 name=\"Amend the motion to allocate 25 percent of the increased tax on criminal cases to the sheriff's department\"\nindex=7 name='Vote on the state match local litigation taxes increase with the amendment'\nindex=8 name='Adopt the resolution to increase the wheel tax by $10 for education funding'\nindex=9 name='Budget Committee meeting on solid waste funding recommendations'\nindex=10 name='Chili supper at County Elementary School'\n
These are reasonable action items.
Let's now also request a priority for each ActionItem - we'll create a string Enum class with three priority levels.
from enum import Enum\n\nclass ActionPriority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: ActionPriority = Field(description=\"Action item priority\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda' priority=<ActionPriority.HIGH: 'high'>\nindex=2 name='Correct the minutes to include Commissioner McCroskey' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=3 name='Approve the resolution for funds transfer for laptop purchase' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=4 name='Withdraw motion to sell property near the airport' priority=<ActionPriority.LOW: 'low'>\nindex=5 name='Adopt resolution to increase state match local litigation tax' priority=<ActionPriority.HIGH: 'high'>\nindex=6 name=\"Amend resolution to allocate funds to sheriff's department\" priority=<ActionPriority.MEDIUM: 'medium'>\nindex=7 name='Roll call vote on litigation tax increase' priority=<ActionPriority.HIGH: 'high'>\nindex=8 name='Adopt resolution to increase wheel tax for education funding' priority=<ActionPriority.HIGH: 'high'>\nindex=9 name='Budget Committee meeting on solid waste funding' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=10 name='Chili supper at County Elementary School announcement' priority=<ActionPriority.LOW: 'low'>\nindex=11 name='Adjournment' priority=<ActionPriority.MEDIUM: 'medium'>\n
It's not clear from the meeting transcript whether these priorities are correct, but tax-related items are receiving high priorities, which, from the context, looks reasonable. : )
Example's assets at GitHub.
"},{"location":"extract/dataclass/","title":"Dataclass","text":"We can also extract objects whose structure is given by a dataclass definition:
Example
from sibila import Models\nfrom dataclasses import dataclass\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the Pydantic version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/dataclass/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For dataclasses this is done with Annotated[type, \"description\"] - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Annotated, Literal, Optional, Union\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\n@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
"},{"location":"extract/dataclass/#optional-default-and-union-fields","title":"Optional, default and Union fields","text":"A field can be marked as optional by annotating with Optional[Type] and setting a default value, as in the \"person_name\" field:
Example
@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n person_name: Annotated[Optional[str], \"Person name if any\"] = None\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None)\n
Due to dataclass rules, fields with default values must appear after all other fields.
A field can also be marked as a union of alternative types with Union[Type1,Type2,...] - see the \"bags\" field below:
Example
@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n person_name: Annotated[Optional[str], \"Person name if any\"] = None\n bags: Annotated[Union[int, str, None], \"Number of bags, bag voucher or none\"] = None\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None, bags=None)\n
Check the Extract dataclass example to see a more sophisticated example of structured data extraction.
"},{"location":"extract/enums/","title":"Enums","text":"Enumerations are important for classification tasks or in any situation where you need a choice to be made from a list of options.
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract([\"red\", \"blue\", \"green\", \"yellow\"], \n \"The car color was a shade of indigo\")\n
Result
'blue'\n
You can pass a list of items in any of the supported native types: str, float, int or bool.
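For instance, a hedged sketch with an int list (the prompt is invented for illustration):
# choosing from a list of int options\nmodel.extract([1, 2, 5, 10],\n \"We'll need tent space for about half a dozen people\")\n# a capable model would likely return: 5\n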
"},{"location":"extract/enums/#literals","title":"Literals","text":"We can also use Literals:
Example
from typing import Literal\n\nmodel.extract(Literal[\"SPAM\", \"NOT_SPAM\", \"UNSURE\"], \n \"Hello my dear friend, I'm contacting you because I want to give you a million dollars\",\n inst=\"Classify this text on the likelihood of being spam\")\n
Result
'SPAM'\n
Extracting to a Literal type returns one of its possible options in its native type (str, float, int or bool).
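For instance, a hedged sketch with an int Literal (again, an invented prompt):
model.extract(Literal[1, 2, 3],\n \"Make that a double espresso, please\",\n inst=\"Return the number of espresso shots\")\n# expected result: 2 (an int, not a string)\n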
"},{"location":"extract/enums/#enum-classes","title":"Enum classes","text":"Or Enum classes of native types. An example of extracting to Enum classes:
Example
from enum import IntEnum\n\nclass Heads(IntEnum):\n SINGLE = 1\n DOUBLE = 2\n TRIPLE = 3\n\nmodel.extract(Heads,\n \"The Two-Headed Monster from The Muppets.\")\n
Result
<Heads.DOUBLE: 2>\n
For the model, the important information is actually the value of each enum member, not its name. For example, in this enum, the model would only see the strings to the right of each member (the enum values), not \"RED\", \"YELLOW\" nor \"GREEN\":
from enum import Enum\n\nclass Light(Enum):\n RED = 'stop'\n YELLOW = 'slow down'\n GREEN = 'go'\n
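A quick hedged sketch of extracting with this enum (invented prompt), where the model matches on the member values:
model.extract(Light,\n \"The traffic light turned to the color that tells drivers to halt\")\n# a capable model should return: <Light.RED: 'stop'>\n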
See the Tag classification example to see how Enum is used to tag support queries.
"},{"location":"extract/enums/#classify","title":"Classify","text":"You can also use the classify() method to extract enumerations, which accepts the enum types we've seen above. It calls extract() internally and its only justification is to make things more readable:
Example
model.classify([\"mouse\", \"cat\", \"dog\", \"bird\"],\n \"Snoopy\")\n
Result
'dog'\n
"},{"location":"extract/free_json/","title":"Free JSON","text":"Methods like extract() will generate JSON format constrained to a certain JSON Schema: this is needed or the model might not generate the fields or data types we're looking for.
You can generate schema-free JSON with the json() method. In this case, the model will pick the field names and data types for you.
For example:
Example
from sibila import Models\nfrom pprint import pprint\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n# json() returns a Python dict built from the model's free JSON output\nresponse = model.json(\"How to build a brick wall?\")\n\npprint(response, sort_dicts=False)\n
Result
{'steps': [{'step_number': 1,\n 'description': 'Gather all necessary materials and tools including '\n 'bricks, mortar, trowel, spirit level, tape '\n 'measure, bricklaying line, and safety equipment.'},\n {'step_number': 2,\n 'description': 'Prepare the foundation for the wall. Ensure it is '\n 'solid, level, and has the correct dimensions for '\n 'the wall you are building.'},\n {'step_number': 3,\n 'description': \"Mix the mortar according to the manufacturer's \"\n 'instructions, ensuring a consistent and workable '\n 'consistency.'},\n {'step_number': 4,\n 'description': 'Lay a bed of mortar where the first row of bricks '\n 'will be placed. Use the trowel to spread the '\n 'mortar evenly.'},\n {'step_number': 5,\n 'description': 'Start laying the bricks from one end, applying '\n 'mortar to the end of each brick before placing it '\n 'down to bond with the next brick.'},\n {'step_number': 6,\n 'description': 'Use the spirit level to check that the bricks are '\n 'level both horizontally and vertically. Adjust as '\n 'necessary.'},\n {'step_number': 7,\n 'description': 'Continue laying bricks, ensuring that you stagger '\n 'the joints in each row (running bond pattern). '\n 'This adds strength to the wall.'},\n {'step_number': 8,\n 'description': 'Periodically check that the wall is straight and '\n 'level by using the spirit level and the '\n 'bricklaying line.'},\n {'step_number': 9,\n 'description': 'Remove any excess mortar with the trowel as you '\n 'work. Keep the work area clean.'},\n {'step_number': 10,\n 'description': 'As you reach the end of each row, you may need to '\n 'cut bricks to fit. Use a brick hammer or a brick '\n 'cutter to do this.'},\n {'step_number': 11,\n 'description': 'Once the wall reaches the desired height, finish '\n 'the top with a row of solid bricks or capping '\n 'stones to protect the wall from weather.'},\n {'step_number': 12,\n 'description': 'Cure the mortar by protecting the wall from '\n 'extreme weather conditions for at least 24-48 '\n 'hours.'},\n {'step_number': 13,\n 'description': 'Clean the finished wall with a brush and water to '\n 'remove any remaining mortar residue.'},\n {'step_number': 14,\n 'description': 'Dispose of any waste material responsibly and '\n 'clean your tools.'}],\n'safety_tips': ['Wear safety glasses to protect your eyes from flying debris.',\n 'Use gloves to protect your hands from sharp edges and wet '\n 'mortar.',\n 'Wear a dust mask when mixing mortar to avoid inhaling dust '\n 'particles.',\n 'Keep the work area clear to prevent tripping hazards.'],\n'tools_required': ['Bricks',\n 'Mortar',\n 'Trowel',\n 'Spirit level',\n 'Tape measure',\n 'Bricklaying line',\n 'Safety glasses',\n 'Gloves',\n 'Dust mask',\n 'Brick hammer or cutter']}\n
The model returned a Python dictionary with fields and data types of its own choice. We could instead provide a JSON Schema that defines a structure for the response.
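To get a feel for what such a schema looks like, you can inspect the JSON Schema that a Pydantic class produces - this is the kind of structure that constrained methods like extract() rely on. The class below is just an illustration, assuming Pydantic v2:
from pydantic import BaseModel\n\nclass Wall(BaseModel):\n steps: list[str]\n tools_required: list[str]\n\n# prints the JSON Schema dict that describes this class\nprint(Wall.model_json_schema())\n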
See the From text to object example for a related use.
"},{"location":"extract/free_text/","title":"Free text","text":"You can also generate plain text by calling Model() or Model.call():
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nresponse = model(\"Explain in a few lines how to build a brick wall?\")\nprint(response)\n
Result
To build a brick wall, follow these steps:\n\n1. Prepare the site by excavating and leveling the ground, then install a damp-proof \nmembrane and create a solid base with concrete footings.\n2. Lay a foundation of concrete blocks or bricks, ensuring it is level and square.\n3. Build the wall using bricks or blocks, starting with a corner or bonding pattern \nto ensure stability. Use mortar to bond each course (row) of bricks or blocks, \nfollowing the recommended mortar mix ratio.\n4. Use a spirit level to ensure each course is level, and insert metal dowels or use \nbrick ties to connect adjacent walls or floors.\n5. Allow the mortar to dry for the recommended time before applying a damp-proof \ncourse (DPC) at the base of the wall.\n6. Finish the wall with capping bricks or coping stones, and apply any desired \nrender or finish.\n
"},{"location":"extract/pydantic/","title":"Pydantic","text":"Besides simple types and enums, we can also extract objects whose structure is given by a class derived from Pydantic's BaseModel definition:
Example
from sibila import Models\nfrom pydantic import BaseModel\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the dataclass version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/pydantic/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For Pydantic this is done with Field(description=\"description\") - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Literal, Optional, Union\nfrom pydantic import BaseModel, Field\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\nclass Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
"},{"location":"extract/pydantic/#optional-default-and-union-fields","title":"Optional, default and Union fields","text":"A field can be marked as optional by annotating with Optional[Type] and setting a default value, as in the \"person_name\" field:
Example
class Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n person_name: Optional[str] = Field(default=None, description=\"Person name if any\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None)\n
A field can also be marked as a union of alternative types with Union[Type1,Type2,...] as in the \"bags\" field below:
Example
class Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n person_name: Optional[str] = Field(default=None, description=\"Person name if any\")\n bags: Union[int, str, None] = Field(description=\"Number of bags, bag voucher or none\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday', person_name=None, bags=None)\n
Check the Extract Pydantic example to see an interesting example of structured extraction.
"},{"location":"extract/simple_types/","title":"Simple types","text":"Sibila can constrain model generation to output simple python types. This is helpful for situations where you want to extract a specific data type.
To get a response from the model in a certain type, you can use the extract() method:
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(bool, \n \"Certainly, I'd like to subscribe.\")\n
Result
True\n
"},{"location":"extract/simple_types/#instructions-to-help-the-model","title":"Instructions to help the model","text":"You may need to provide more extra information to the model, so that it understands what you want. This is done with the inst argument - inst is a shorter name for instructions:
Example
model.extract(str, \n \"I don't quite remember the product's name, I think it was called Cornaca\",\n inst=\"Extract the product name\")\n
Result
Cornaca\n
"},{"location":"extract/simple_types/#supported-types","title":"Supported types","text":"The following simple types are supported:
- bool
- int
- float
- str
- datetime
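For instance, a quick hedged sketch extracting a float (invented prompt):
model.extract(float,\n \"The total came to twelve euros and fifty cents\")\n# expected result: 12.5\n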
About datetime type
A special note about extracting to datetime: the datetime type is expecting an ISO 8601 formatted string. Because some models are less capable than others at correctly formatting dates/times, it helps to mention in the instructions that you want the output in \"ISO 8601\" format.
from datetime import datetime\nmodel.extract(datetime, \n \"Sure, glad to help, it all happened at December the 10th, 2023, around 3PM, I think\",\n inst=\"Output in ISO 8601 format\")\n
Result
datetime.datetime(2023, 12, 10, 15, 0)\n
"},{"location":"extract/simple_types/#lists","title":"Lists","text":"You can extract lists of any of the supported types (simple types, enum, dataclass, Pydantic).
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\")\n
Result
['Naples', 'Genoa', 'Florence', 'Rome']\n
As in all extractions, you may need to set the instructions text to specify what you want from the model. As an example of the power of instructions, let's ask for the country instead: the model will still output a list, but with a single element - 'Italy':
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\",\n inst=\"Output the country\")\n
Result
['Italy']\n
"},{"location":"models/anthropic/","title":"Anthropic","text":"With Sibila you can access Anthropic remote models, for which you'll need an API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac: export ANTHROPIC_API_KEY=\"...\"\n
Windows: setx ANTHROPIC_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/anthropic/#creating-models","title":"Creating models","text":"Anthropic models can be used by Sibila through the AnthropicModel class.
Example
from sibila import AnthropicModel\n\nmodel = AnthropicModel(\"claude-3-opus-20240229\")\n\nmodel(\"I think that I shall never see.\")\n
Result
It sounds like you may be quoting the opening line of the poem \"Trees\" by Joyce Kilmer, \nwhich begins \"I think that I shall never see / A poem lovely as a tree.\" \nHowever, to avoid potentially reproducing copyrighted material, I won't quote or \ncomplete the poem. The poem is a short lyrical one from the early 20th century \nthat expresses the author's love and appreciation for the beauty of trees. \nIt's a well-known poem that reflects on the magnificence of nature. \nLet me know if you would like me to provide any other information about \nthe poem or poet that doesn't involve directly quoting the copyrighted work.\n
You can also create an Anthropic model in the Models factory by using the \"anthropic:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"anthropic:claude-3-opus-20240229\")\n
"},{"location":"models/anthropic/#model-list","title":"Model list","text":"The models made available by Anthropic are listed here.
Anthropic doesn't provide an API to list the models, so AnthropicModel.known_models() will return None.
At the time of writing, these are the available models, all supporting JSON Schema extraction:
- claude-3-opus-20240229
- claude-3-sonnet-20240229
- claude-3-haiku-20240307
- claude-2.1
- claude-2.0
"},{"location":"models/find_local_models/","title":"Finding new models","text":""},{"location":"models/find_local_models/#chat-or-instruct-types-only","title":"Chat or instruct types only","text":"Sibila can use models that were fine-tuned for chat or instruct purposes. These models work in user - assistant turns or messages and use a chat template to properly compose those messages to the format that the model was fine-tuned to.
For example, the Llama2 model was released in two editions: a simple Llama2 text completion model and a Llama2-instruct model that was fine tuned for user-assistant turns. For Sibila you should always select chat or instruct versions of a model.
But which model to choose? You can look at model benchmark scores in popular listing sites:
- https://llm.extractum.io/list/
- https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
"},{"location":"models/find_local_models/#find-a-quantized-version-of-the-model","title":"Find a quantized version of the model","text":"Since Large Language Models are quite big, they are usually quantized so that each parameter occupies a little more than 4 bits or half a byte.
Without quantization, a 7 billion parameter model would require 14Gb of memory (with each parameter taking 16 bits) to load, and a bit more during inference.
But with quantization techniques, a 7 billion parameter model can have a file size of only 4.4Gb (using about 50% more when loaded in memory - 6.8Gb), which makes it possible to run on common GPUs or even in ordinary RAM (albeit slower).
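As a rough back-of-the-envelope sketch of these figures (illustrative arithmetic only - actual memory use also depends on the quantization method, context size and runtime overhead):
params = 7e9 # 7 billion parameters\nfp16_gb = params * 2 / 1e9 # 16 bits = 2 bytes per parameter: ~14 Gb to load\nq4_gb = params * 5 / 8 / 1e9 # Q4_K_M stores roughly 5 bits per parameter: ~4.4 Gb file\nprint(f\"fp16: {fp16_gb:.1f} Gb, Q4: {q4_gb:.1f} Gb\")\n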
Quantized models are stored in a file format popularized by llama.cpp, the GGUF format (which means GPT-Generated Unified Format). We're using llama.cpp to run local models, so we'll be needing GGUF files.
A good place to find quantized models is HuggingFace's model hub, particularly the well-known area of TheBloke (Tom Jobbins):
https://huggingface.co/TheBloke
TheBloke is very prolific in producing quality quantized versions of models, usually shortly after they are released.
And a good model that we'll be using for the examples is a 4 bit quantization of the OpenChat-3.5 model, which itself is a fine-tuning of Mistral-7b:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
"},{"location":"models/find_local_models/#download-the-file-into-the-models-folder","title":"Download the file into the \"models\" folder","text":"See the OpenChat model section on how to download models with the sibila CLI tool or manually in your browser.
The OpenChat model already includes the chat template format in its metadata, but for some other models we'll need to set the format - see the Setup chat template format section on how to handle this.
"},{"location":"models/fireworks/","title":"Fireworks AI","text":"With Sibila you can use the models hosted by Fireworks AI, for which you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac: export FIREWORKS_API_KEY=\"...\"\n
Windows: setx FIREWORKS_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/fireworks/#creating-models","title":"Creating models","text":"Models served by Fireworks AI can be used by Sibila through the FireworksModel class.
Example
from sibila import FireworksModel\n\nmodel = FireworksModel(\"accounts/fireworks/models/gemma-7b-it\")\n\nmodel(\"I think that I shall never see.\")\n
Result (model is hallucinating)
The poem \"I think that I shall never see\" is a poem by William Blake. \nIt is a poem about the loss of sight. The speaker is saying that they \nwill never be able to see again. The poem is a reflection on the beauty \nof sight and the sadness of blindness.\n
You can also create a Fireworks AI model in the Models factory by using the \"fireworks:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"fireworks:accounts/fireworks/models/gemma-7b-it\")\n
"},{"location":"models/fireworks/#model-list","title":"Model list","text":"The available Fireworks text inference models models are listed here.
Unfortunately Fireworks AI doesn't provide an API to list the models, so FireworksModel.known_models() will return None.
"},{"location":"models/fireworks/#json-schema-models","title":"JSON Schema models","text":"All the Fireworks AI models should support JSON Schema generation, which is required for structured data extraction.
"},{"location":"models/formats_json/","title":"Managing formats","text":"A \"formats.json\" file stores the chat template definitions used in models. This allows for models that don't have a chat template in their metadata to be detected and get the right format so they can function well.
If you downloaded the GitHub repository, you'll find a file named \"sibila/res/base_formats.json\", which is the default base configuration that will be used, with many known chat template formats.
When you call Models.setup(), any \"formats.json\" file found in the folder will be loaded and its definitions will be merged with the ones from \"base_formats.json\" which are loaded on initialization. Any entries with the same name will be replaced by freshly loaded ones.
How do you add a new format entry that can be used when creating a model? You can do it with the sibila CLI tool or by manually editing the \"formats.json\" file.
"},{"location":"models/formats_json/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s openchat openchat \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'openchat' with match='openchat', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (to identify the model filename) and the third is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/formats_json/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative, we can edit the \"formats.json\" file in the \"Models\" folder, and add the entry:
\"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n},\n
In the \"openchat\" key value we have a dictionary with the following keys:
- match: a regular expression that will be used to match the model name or filename
- template: the chat template definition, in Jinja format

The \"openchat\" format name we are defining here is the name you can use when creating a model, by setting the format argument:
model = LlamaCppModel.create(\"openchat-3.5-1210.Q4_K_M.gguf\",\n format=\"openchat\")\n
or to be more practical: \"openchat\" is also the format name you would use when creating a \"models.json\" entry for a model, in the \"format\" key:
\"openchat\": {\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
See the \"base_formats.json\" file for all the default base formats.
"},{"location":"models/groq/","title":"Groq","text":"To use the models hosted by Groq, you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac: export GROQ_API_KEY=\"...\"\n
Windows: setx GROQ_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/groq/#creating-models","title":"Creating models","text":"Models served by Groq can be used by Sibila through the GroqModel class.
Example
from sibila import GroqModel\n\nmodel = GroqModel(\"llama3-70b-8192\")\n\nmodel(\"I think that I shall never see.\")\n
Result (model is hallucinating)
A poem!\n\n\"I think that I shall never see\nA poem lovely as a tree.\nA tree whose hungry mouth is prest\nAgainst the earth's sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\nPoems are made by fools like me,\nBut only God can make a tree.\"\n\n\u2014 Alfred Joyce Kilmer\u2588\n
You can also create a Groq model in the Models factory by using the \"groq:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"groq:llama3-70b-8192\")\n
"},{"location":"models/groq/#model-list","title":"Model list","text":"The available Groq text inference models models are listed here. You should use the listed \"Model ID\" names as the model name, when creating a GroqModel object.
Unfortunately Groq doesn't provide an API to list the models, so GroqModel.known_models() will return None.
"},{"location":"models/groq/#json-schema-models","title":"JSON Schema models","text":"All the Groq AI models should support JSON Schema generation, which is required for structured data extraction.
"},{"location":"models/local_model/","title":"Using a local model","text":"Sibila uses llama.cpp to run local models, which are ordinary files in the GGUF format. You can download local models from places like the Hugging Face model hub.
Most current 7B quantized models are very capable for common data extraction tasks (and they're getting better all the time). We'll see how to find and set up local models for use with Sibila. If you only plan to use OpenAI remote models, you can skip this section.
"},{"location":"models/local_model/#openchat-model","title":"OpenChat model","text":"By default, most of the examples included with Sibila use OpenChat, a very good 7B parameters quantized model: https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
You can download this model with the sibila CLI tool or manually in your browser.
"},{"location":"models/local_model/#download-with-sibila-hub","title":"Download with \"sibila hub\"","text":"Open a command line prompt in the \"models\" folder if you downloaded the GitHub repository, or create a folder named \"models\".
Run this command:
sibila hub -d TheBloke/openchat-3.5-1210-GGUF -f openchat-3.5-1210.Q4_K_M.gguf\n
After downloading the 4.4Gb, the file \"openchat-3.5-1210.Q4_K_M.gguf\" will be available in your \"models\" folder and you can run the examples. You can do the same to download any other GGUF models.
"},{"location":"models/local_model/#manual-download","title":"Manual download","text":"Alternatively, you can download in your browser from this URL:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF/blob/main/openchat-3.5-1210.Q4_K_M.gguf
In the linked page, click \"download\" and save this file into a \"models\" folder. If you downloaded the Sibila GitHub repository it already includes a \"models\" folder which you can use. Otherwise, just create a \"models\" folder, where you'll store your local model files.
Once the file \"openchat-3.5-1210.Q4_K_M.gguf\" is placed in the \"models\" folder, you should be able to run the examples.
"},{"location":"models/local_model/#llamacppmodel-class","title":"LlamaCppModel class","text":"Local llama.cpp models can be used with the LlamaCppModel class. Let's generate text after our prompt:
Example
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"../../models/openchat-3.5-1210.Q4_K_M.gguf\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
It worked: the model answered with the continuation of the famous poem.
You'll notice that the first time you create the model object and run a query, it will take longer, because the model must load all of its parameters into memory. Subsequent queries will work much faster.
"},{"location":"models/local_model/#a-note-about-out-of-memory-errors","title":"A note about out of memory errors","text":"An important thing to know if you'll be using local models is about \"Out of memory\" errors.
A 7B model like OpenChat-3.5, when quantized to 4 bits will occupy about 6.8 Gb of memory, in either GPU's VRAM or common RAM. If you try to run a second model at the same time, you might get an out of memory error and/or llama.cpp may crash: it all depends on the memory available in your computer.
This is less of a problem when running scripts from the command line, but in environments like Jupyter where you can have multiple open notebooks, you may get \"out of memory\" errors or python kernel errors like:
Error
Kernel Restarting\nThe kernel for sibila/examples/name.ipynb appears to have died.\nIt will restart automatically.\n
If you get an error like this in JupyterLab, open the Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any out-of-memory stuck models.
A good practice is to delete any local model after you no longer need it or right before loading a new one. A simple \"del model\" works fine, or you can add these two lines before creating a model:
try: model.close(); del model\nexcept: ...\n\nmodel = LlamaCppModel(...)\n
This way, any existing model in the current notebook is deleted before creating a new one.
However this won't work across multiple notebooks. In those cases, open JupyterLab's Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any models currently in memory.
"},{"location":"models/mistral/","title":"Mistral AI","text":"With Sibila you can access Mistral AI remote models, for which you'll need an API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac: export MISTRAL_API_KEY=\"...\"\n
Windows: setx MISTRAL_API_KEY \"...\"\n
Another possibility is to store your API key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/mistral/#creating-models","title":"Creating models","text":"Mistral AI models can be used by Sibila through the MistralModel class.
Example
from sibila import MistralModel\n\nmodel = MistralModel(\"mistral-large-latest\")\n\nmodel(\"I think that I shall never see.\")\n
Result
A poem as lovely as a tree.\n\nThis is a line from the poem \"Trees\" by Joyce Kilmer. The full poem is:\n\nI think that I shall never see\nA poem lovely as a tree.\n\nA tree whose hungry mouth is prest\nAgainst the earth\u2019s sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\n\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\n\nPoems are made by fools like me,\nBut only God can make a tree.\n\nDo you have any other questions or is there something else you'd like to talk about?\nI'm here to help!\n
You can also create a Mistral model in the Models factory by using the \"mistral:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"mistral:mistral-large-latest\")\n
"},{"location":"models/mistral/#model-list","title":"Model list","text":"The models made available by Mistral AI are listed here. You can also get a list of models with known_models():
Example
MistralModel.known_models()\n
Result
['mistral-embed',\n 'mistral-large-2402',\n 'mistral-large-latest',\n 'mistral-medium',\n 'mistral-medium-2312',\n 'mistral-medium-latest',\n 'mistral-small',\n 'mistral-small-2312',\n 'mistral-small-2402',\n 'mistral-small-latest',\n 'mistral-tiny',\n 'mistral-tiny-2312',\n 'open-mistral-7b',\n 'open-mixtral-8x7b']\n
At the time of writing, all Mistral AI models support JSON Schema extraction.
"},{"location":"models/models_factory/","title":"Models factory","text":"The Models factory class is a flexible way to create and manage a model configuration with its initialization settings.
After defining it in a JSON file named 'models.json', you can create the model by simply referring to its name, for example:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"openai:gpt-4-development\")\n
The first line calls Models.setup() to initialize the Models factory with the folder where model files and configs (\"models.json\" and \"formats.json\") are located.
The second line calls Models.create() to create a model from the name \"openai:gpt-4-development\". In this case we created a remote model, but we could as well create a local model based on a GGUF file.
The names should be in the format \"provider:model_name\" and Sibila currently supports the following providers:
- llamacpp (local GGUF model) - creates a LlamaCppModel object
- anthropic (remote model) - creates an AnthropicModel object
- fireworks (remote model) - creates a FireworksModel object
- groq (remote model) - creates a GroqModel object
- mistral (remote model) - creates a MistralModel object
- openai (remote model) - creates an OpenAIModel object
- together (remote model) - creates a TogetherModel object

The name part, after the \"provider:\", must either be:
- A remote model name, like \"gpt-4\": \"openai:gpt-4\"
- For llamacpp, a local model name, defined in a models.json file, like \"openchat\": \"llamacpp:openchat\"
- Also for llamacpp, the name can be the actual filename of a model in the \"models\" folder: \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", as shown below.
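For example, the three naming forms side by side (model names taken from elsewhere in these docs):
from sibila import Models\n\nm1 = Models.create(\"openai:gpt-4\") # remote model name\nm2 = Models.create(\"llamacpp:openchat\") # local name registered in \"models.json\"\nm3 = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\") # actual GGUF filename\n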
Although you can use filenames as model names, it's generally a better idea, for continued use, to create an entry in the \"models.json\" file - this allows future model replacement to be much easier.
See Managing models to learn how to register these model names.
"},{"location":"models/models_json/","title":"Managing models","text":"Model names are stored in a file named \"models.json\", in your \"models\" folder. Models registered in this file can then be used when calling Models.create() to create an instance of the model.
Registering a name is not strictly needed, as you can create models from their filenames or remote model names; in most examples you'll find models created with:
model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n
However, it's a good idea to register a name, especially if you'll be using a model for some time, or if there's the possibility you'll need to replace it later. If you register a name, only that entry will later need to be changed.
There are two ways of registering names: by using the sibila CLI tool or by directly editing the \"models.json\" file.
"},{"location":"models/models_json/#use-the-sibila-models-cli-tool","title":"Use the \"sibila models\" CLI tool","text":"To register a model with the Models factory you can use the \"sibila models\" tool. Run in the \"models\" folder:
> sibila models -s \"llamacpp:openchat openchat-3.5-1210.Q4_K_M.gguf\" openchat\n\nUsing models directory '.'\nSet model 'llamacpp:openchat' with name='openchat-3.5-1210.Q4_K_M.gguf', \nformat='formatx' at './models.json'.\n
The first argument after -s is the new entry name (including the \"llamacpp:\" provider), followed by the filename and then the chat template format, if needed.
This will create an \"openchat\" entry in \"models.json\", exactly like the manually created below.
"},{"location":"models/models_json/#manually-edit-modelsjson","title":"Manually edit \"models.json\"","text":"In alternative, you can manually register a model name by editing the \"models.json\" file located in you \"models\" folder.
A \"models.json\" file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Looking at the above structure, we have two top-level entries for the providers \"llamacpp\" and \"openai\", and also an \"alias\" entry.
Inside each provider entry, we have a \"_defaults\" key, which can store a base GenConf or other arguments passed during model creation. The default values defined in \"_default\" entries can later be overridden by any keys of the same name specified in each model definition. You can see this in the \"phi2\" entry, which overrides the genconf entry given in the above \"_default\", setting temperature to 2.0. Keys are merged element-wise from any specified in the \"_defaults\" entry for the provider: keys with the same name are overridden, all other keys are inherited.
In the above \"model.json\" example, let's look at the \"openchat\" model entry:
\"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
The \"openchat\" key name is the name you'll use to create the model as \"llamacpp:openchat\":
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:openchat\")\n
You can have the following keys in a model entry:
- name: the filename to use when loading a model (or the remote model name)
- format: identifies the chat template format that the model should use, from the \"formats.json\" file. Some local models include the chat template format in their metadata, so this key is optional.
- genconf: default GenConf (generation config settings) used to create the model, which it will then use by default in each generation. These config settings are merged element-wise with any specified in the \"_default\" entry for the provider.
- other keys: any other keys will be passed during model creation as its arguments. You can learn which arguments are possible in the API reference for LlamaCppModel or OpenAIModel. For example, you can pass \"ctx_len\": 2048 to define the context length to use. As with genconf, these keys are merged element-wise with any specified in the \"_default\" entry for the provider.

The \"alias\" entry is a handy way to keep names that point to actual model entries (independent of provider). Note the two alias entries \"develop\" and \"production\" in the above \"models.json\" - you could then create the production model by doing:
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"production\")\n
Alias entries can be used as \"alias:production\" or without the \"alias:\" provider, just as \"production\" as in the example above. For an example of a JSON file with many models defined, see the \"models/models.json\" file.
"},{"location":"models/openai/","title":"OpenAI","text":"Sibila can use OpenAI remote models, for which you'll need a paid OpenAI account and its API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac: export OPENAI_API_KEY=\"...\"\n
Windows: setx OPENAI_API_KEY \"...\"\n
Another possibility is to store your OpenAI key in .env files, which has many advantages: see the dotenv-python package.
"},{"location":"models/openai/#creating-models","title":"Creating models","text":"OpenAI models can be used by Sibila through the OpenAIModel class.
Example
from sibila import OpenAIModel\n\nmodel = OpenAIModel(\"gpt-3.5-turbo-0125\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
You can also create an OpenAI model in the Models factory by using the \"openai:\" provider prefix like this:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-3.5-turbo-0125\")\n
"},{"location":"models/openai/#model-list","title":"Model list","text":"The available OpenAI models are listed here. You can also fetch a list of known model names by calling OpenAIModel.known_models():
Example
OpenAIModel.known_models()\n
Result
['babbage-002',\n 'dall-e-2',\n 'dall-e-3',\n 'davinci-002',\n 'gpt-3.5-turbo',\n 'gpt-3.5-turbo-0125',\n 'gpt-3.5-turbo-0301',\n 'gpt-3.5-turbo-0613',\n 'gpt-3.5-turbo-1106',\n 'gpt-3.5-turbo-16k',\n 'gpt-3.5-turbo-16k-0613',\n 'gpt-3.5-turbo-instruct',\n 'gpt-3.5-turbo-instruct-0914',\n 'gpt-4',\n 'gpt-4-0125-preview',\n 'gpt-4-0613',\n 'gpt-4-1106-preview',\n 'gpt-4-1106-vision-preview',\n 'gpt-4-turbo-preview',\n 'gpt-4-vision-preview',\n 'text-embedding-3-large',\n 'text-embedding-3-small',\n 'text-embedding-ada-002',\n 'tts-1',\n 'tts-1-1106',\n 'tts-1-hd',\n 'tts-1-hd-1106',\n 'whisper-1']\n
Not all of these models are for text inference, but the names that start with \"gpt\" are (excluding the \"vision\" models), and you can use those model names to create an OpenAI model.
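For instance, a small sketch that filters the text inference names from the list above:
from sibila import OpenAIModel\n\nnames = OpenAIModel.known_models()\n# keep the \"gpt\" names, excluding the vision models\ntext_models = [n for n in names if n.startswith(\"gpt\") and \"vision\" not in n]\nprint(text_models)\n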
"},{"location":"models/openai/#json-schema-models","title":"JSON Schema models","text":"At the time of writing, not all OpenAI inference models support JSON Schema generation via the Tools functionality, which is required for structured data extraction. The following models (and later versions) allow JSON extraction:
- gpt-3.5-turbo-1106 and later
- gpt-4-1106-preview, gpt-4-turbo-preview and later
"},{"location":"models/openai/#using-for-other-providers","title":"Using for other providers","text":"You can also use the OpenAIModel class to access any provider that uses the OpenAI API by setting the base_url and api_key arguments. For example to use the Together.ai service with the OpenAIModel class:
model_name = \"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n\nclient = OpenAIModel(\n model_name,\n base_url=\"https://api.together.xyz/v1\",\n api_key=os.environ[\"TOGETHER_API_KEY\"],\n)\n
This is just an example, as Together.ai has a dedicated Sibila class, but you can access any other OpenAI-compatible servers with the OpenAIModel class.
"},{"location":"models/remote_model/","title":"Providers","text":"Sibila can use remote models from these providers, see each section for details:
- Anthropic - Models factory prefix \"anthropic:\" - Claude-3, Claude-2 models
- Fireworks AI - Models factory prefix \"fireworks:\" - large and small open source models
- Groq - Models factory prefix \"groq:\" - open source models
- Mistral AI - Models factory prefix \"mistral:\" - Mixtral, Mistral-large/medium/small
- OpenAI - Models factory prefix \"openai:\" - GPT-4, GPT-3.5 models
- Together.ai - Models factory prefix \"together:\" - many open source hosted models"},{"location":"models/setup_format/","title":"Chat template format","text":""},{"location":"models/setup_format/#what-are-chat-templates","title":"What are chat templates?","text":"Because these models were fine-tuned for chat or instruct interaction, they use a chat template, which is a Jinja template that converts a list of messages into a text prompt. This template must follow the original format that the model was trained on - this is very important or you won't get good results.
Chat template definitions are Jinja templates like the following one, which is in ChatML format:
{% for message in messages %}\n {{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n'}}\n{% endfor %}\n
When ran over a list of messages with system, user and model messages, the template produces text like the following:
<|im_start|>system\nYou speak like a pirate.<|im_end|>\n<|im_start|>user\nHello there?<|im_end|>\n<|im_start|>assistant\nAhoy there matey! How can I assist ye today on this here ship o' mine?<|im_end|>\n
Only by using the model's specific chat template can we get the best results.
Sibila tries to automatically detect which template to use with a model, either from the model name or from embedded metadata, if available.
"},{"location":"models/setup_format/#does-the-model-have-a-built-in-chat-template-format","title":"Does the model have a built-in chat template format?","text":"Some GGUF models include the chat template in their metadata, unfortunately this is not standard.
You can quickly check if the model has a chat template by running the sibila CLI in the same folder as the model file:
> sibila models -t \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\nUsing models directory '.'\nTesting model 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf'...\nModel 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf' was properly created and should run fine.\n
In this case the chat template format is included with the model and nothing else is needed.
Another way to test this is to try creating the model in python. If no exception is raised, the model GGUF file contains the template definition and should work fine.
Example of model creation error
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Error
...\n\nValueError: Could not find a suitable format (chat template) for this model.\nWithout a format, fine-tuned models cannot function properly.\nSee the docs on how you can fix this: pass the template in the format arg or \ncreate a 'formats.json' file.\n
But if you get an error such as the one above, you'll need to provide a chat template. It's quite easy - let's see how to do it.
"},{"location":"models/setup_format/#find-the-chat-template-format","title":"Find the chat template format","text":"So, how to find the chat template for a new model that you intend to use?
This is normally listed on the model's page: search that page for \"template\" and copy the listed Jinja template text.
If the template isn't directly listed on the model's page, you can look for a file named \"tokenizer_config.json\" among the main model files. This file should include an entry named \"chat_template\", which is what we want.
Example of a tokenizer_config.json file
For example, in OpenChat's file \"tokenizer_config.json\":
https://huggingface.co/openchat/openchat-3.5-1210/blob/main/tokenizer_config.json
You'll find this line with the template:
{\n \"...\": \"...\",\n\n \"chat_template\": \"{{ bos_token }}...{% endif %}\",\n\n \"...\": \"...\"\n}\n
The value in the \"chat_template\" key is the Jinja template that we're looking for.
Another alternative is to search online for the name of the model and \"chat template\".
Either way, once you know the template used by the model, you can set and use it.
"},{"location":"models/setup_format/#option-1-pass-the-chat-template-format-when-creating-the-model","title":"Option 1: Pass the chat template format when creating the model","text":"Once you know the chat template definition you can create the model and pass it in the format argument. Let's assume you have a model file named \"peculiar-model-7b.gguf\":
chat_template = \"{{ bos_token }}...{% endif %}\"\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\",\n format=chat_template)\n
And the model should now work without problems.
"},{"location":"models/setup_format/#option-2-add-the-chat-template-to-the-models-factory","title":"Option 2: Add the chat template to the Models factory","text":"If you plan to use the model many times, a more convenient solution is to create an entry in the \"formats.json\" file so that all further models with this name will use the template.
"},{"location":"models/setup_format/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s peculiar peculiar-model \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'peculiar' with match='peculiar-model', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (used to identify the model filename), and the third is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/setup_format/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative to using the sibila CLI tool, you can add the chat template format by creating an entry in a \"formats.json\" file, in the same folder as the model, with these fields:
{\n \"peculiar\": {\n \"match\": \"peculiar-model\",\n \"template\": \"{{ bos_token }}...{% endif %}\"\n }\n}\n
The \"match\" field is regular expression that will be used to match the model name or filename. Field \"template\" is the chat template in Jinja format.
After configuring the template as we've seen above, all you need to do is to create a LlamaCppModel object and pass the model file path.
model = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Note that we're not passing the format argument anymore when creating the model. The \"match\" regular expression we defined above will recognize the model from the filename and use the given chat template format.
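To get a feel for how the matching works, it behaves roughly like a regular expression search over the filename - a sketch of the idea, not Sibila's actual implementation:
import re\n\n# the \"match\" regexp is searched for within the model name or filename\nif re.search(\"peculiar-model\", \"peculiar-model-7b.gguf\"):\n print(\"Entry 'peculiar' matches: its chat template will be used\")\n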
Base format definitions
Sibila includes by default the definitions of several well-known chat template formats. These definitions are available in \"sibila/base_formats.json\" and are automatically loaded when the Models factory is created.
You can add any chat template formats into your own \"formats.json\" files, but please never change the \"sibila/base_formats.json\" file, to avoid potential errors.
"},{"location":"models/sibila_cli/","title":"Sibila CLI tool","text":"The Sibila Command-Line Interface tool simplifies managing the Models factory and is useful to download models from Hugging Face model hub.
The Models factory is based on a \"models\" folder that contains two configuration files, \"models.json\" and \"formats.json\", as well as the actual files for local models.
The CLI tool is divided into three areas, or actions:
| Action | Description |\n|---|---|\n| models | Manage model entries in \"models.json\" |\n| formats | Manage format entries in \"formats.json\" |\n| hub | Search and download models from the Hugging Face model hub |\n\nIn all commands you should pass the option \"-m models_folder\" with the path to the \"models\" folder, or alternatively run the commands inside the \"models\" folder.
The following argument names are used below (other unlisted names should be descriptive enough):
| Name | Description |\n|---|---|\n| res_name | Model entry name in the form \"provider:name\", for example \"llamacpp:openchat\". |\n| format_name | Name of a format entry in \"formats.json\", for example \"chatml\". |\n| query | Case-insensitive query that will be matched by a substring search. |\n\nUsage help is available by running \"sibila --help\" for general help, or \"sibila action --help\", where action is one of \"models\", \"formats\" or \"hub\".
"},{"location":"models/sibila_cli/#sibila-models","title":"Sibila models","text":"To register a model entry pointing to a model name or filename, and optional format_name is a format name:
sibila models -s res_name model_name_or_filename [format_name]\n
To set the format_name for an existing model entry:
sibila models -f res_name format_name\n
To test if a model can run (for example to check if it has the chat template format defined):
sibila models -t res_name\n
List all models with optional case-insensitive substring query:
sibila models -l [query]\n
Delete a model entry:
sibila models -d res_name\n
"},{"location":"models/sibila_cli/#sibila-formats","title":"Sibila formats","text":"Check if a model filename has any format defined in the Models factory:
sibila formats -q filename\n
To register a chat template format, where template is the Jinja chat template and the optional match is a regexp that matches the model filename:
sibila formats -s format_name template [match_regex]\n
List all formats with optional case-insensitive substring query:
sibila formats -l [query]\n
Delete a format entry:
sibila formats -d format_name\n
"},{"location":"models/sibila_cli/#sibila-hub","title":"Sibila hub","text":"List models in the Hugging Face model hub that match the given queries. Argument query can be a list of strings to match, separated by a space character.
Arg Filename is case-insensitive for substring matching.
Arg exact_author is an exact and case-sensitive author name from Hugging Face model hub.
sibila hub -l query [-f filename] [-a exact_author]\n
To download a model, where model_id is a string like \"TheBloke/openchat-3.5-1210-GGUF\", and filename and exact_author are as above:
sibila hub -d model_id -f filename -a exact_author -s set_name\n
"},{"location":"models/together/","title":"Together.ai","text":"With Sibila you can use the models hosted by Together.ai, for which you'll need an API key (which is initially free). As in other providers, although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and Mac:\nexport TOGETHER_API_KEY=\"...\"\n
Windows:\nsetx TOGETHER_API_KEY \"...\"\n
Another possibility is to store your API key in a .env file, which has many advantages: see the python-dotenv package.
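For example, with a .env file containing the line TOGETHER_API_KEY=..., you could load it at startup - a minimal sketch, assuming the python-dotenv package is installed:
from dotenv import load_dotenv\n\n# loads the variables defined in .env into the process environment\nload_dotenv()\n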
"},{"location":"models/together/#creating-models","title":"Creating models","text":"Models served by Together.ai can be used by Sibila through the TogetherModel class.
Example
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\")\n\nmodel(\"I think that I shall never see.\")\n
Result
A poem lovely as a tree. These are the beginning lines of a famous poem called \"Trees\" written by Joyce Kilmer. The full poem goes as follows:\n\nI think that I shall never see\nA poem lovely as a tree.\n\nA tree whose hungry mouth is prest\nAgainst the earth\u2019s sweet flowing breast;\n\nA tree that looks at God all day,\nAnd lifts her leafy arms to pray;\n\nA tree that may in Summer wear\nA nest of robins in her hair;\n\nUpon whose bosom snow has lain;\nWho intimately lives with rain.\n\nPoems are made by fools like me,\nBut only God can make a tree.\n
You can also create a Together.ai model in the Models factory by using the \"together:\" provider prefix:
from sibila import Models\n\nmodel = Models.create(\"together:mistralai/Mixtral-8x7B-Instruct-v0.1\")\n
"},{"location":"models/together/#model-list","title":"Model list","text":"The available Together.ai text inference models models are listed here.
Unfortunately Together.ai doesn't provide an API to list the models, so TogetherModel.known_models() will return None.
"},{"location":"models/together/#json-schema-models","title":"JSON Schema models","text":"At the time of writing, only the following Together.ai models support JSON Schema generation, which is required for structured data extraction:
- mistralai/Mixtral-8x7B-Instruct-v0.1
- mistralai/Mistral-7B-Instruct-v0.1
- togethercomputer/CodeLlama-34b-Instruct
You can still use any of the other models for plain text or schema-free JSON generation, for example with the Model.call() or Model.json() methods.
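For instance, a quick sketch of both uses with one of the hosted models (the prompts here are just illustrations):
from sibila import TogetherModel\n\nmodel = TogetherModel(\"mistralai/Mixtral-8x7B-Instruct-v0.1\")\n\n# plain text generation\nprint(model(\"Write a haiku about the sea.\"))\n\n# schema-free JSON generation\nprint(model.json(\"List three European capitals with their countries.\"))\n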
"},{"location":"models/vision/","title":"Vision models","text":"Vision models allow you to provide an image alongside your text query. Elements in this image can be referenced and its data extracted with normal methods like Model.extract() or Model.classify().
Example
The photo variable below references this image, but a local image file path could also be provided:
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4o\")\n\nphoto = \"https://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg/640px-Hohenloher_Freilandmuseum_-_Baugruppe_Hohenloher_Dorf_-_Bauerngarten_-_Ansicht_von_Osten_im_Juni.jpg\"\n\nmodel.extract(list[str],\n (\"Extract up to five of the most important elements in this photo.\",\n photo))\n
Result
['House with red roof and beige walls',\n 'Large tree with green leaves',\n 'Garden with various plants and flowers',\n 'Clear blue sky',\n 'Wooden fence']\n
To pass an image location, we provide a tuple of (text, image_location), as in the example above. This tuple is a shortcut to create a Msg with the text prompt and the image location. See Threads and messages for more information.
"},{"location":"models/vision/#remote-models","title":"Remote models","text":"At the time of writing (June 2024), the following remote vision models can be used in Sibila:
| Provider | Models |\n|---|---|\n| OpenAI | gpt-4o |\n| Anthropic | all models |\n\nOf these, the OpenAI model is currently the most capable one with regard to images.
"},{"location":"models/vision/#local-models","title":"Local models","text":"Local models are supported via Llama.cpp and its Llava engine. This means that two models have to be loaded: the text model and a projector model. The two models are passed by separating the GGUF filenames with a \"*\". For example:
from sibila import LlamaCppModel, Models\n\n# note the * separating the two GGUF files:\nname = \"moondream2-text-model-f16.gguf*moondream2-mmproj-f16.gguf\"\n\nmodel = LlamaCppModel(name,\n ctx_len=2048)\n\n# or via Models.create()\nmodel = Models.create(\"llamacpp:\" + name,\n ctx_len=2048)\n
In the example above, the context length argument ctx_len is set explicitly: image inputs consume many tokens, so a larger context is a good idea.
A list of small (up to 8B params) open source models available as of June 2024:
| Model and HuggingFace link | GGUF filenames |\n|---|---|\n| Llava-v1.5 | llava-v1.5-ggml-model-q4_k.gguf*llava-v1.5-mmproj-model-f16.gguf |\n| Llava-v1.6 mistral | llava-v1.6-mistral-7b.Q4_K_M.gguf*llava-v1.6-mistral-mmproj-model-f16.gguf |\n| Llava-v1.6 vicuna 7B | llava-v1.6-vicuna-7b.Q4_K_M.gguf*llava-v1.6-vicuna-mmproj-model-f16.gguf |\n| Moondream2 | moondream2-text-model-f16.gguf*moondream2-mmproj-f16.gguf |\n| Llava-phi-3 | llava-phi-3-mini-int4.gguf*llava-phi-3-mini-mmproj-f16.gguf |\n| Llava-llama-3 | llava-llama-3-8b-v1_1-int4.gguf*llava-llama-3-8b-v1_1-mmproj-f16.gguf |\n| Llama3-vision | Meta-Llama-3-8B-Instruct-Q4_K_M.gguf*llama-3-vision-alpha-mmproj-f16.gguf |\n\nSome of the filenames might have been renamed from the original downloaded names to avoid name collisions.
At the current time, these small models are mostly capable only of description tasks. Some larger 34B variants are also available.
"}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index 0616e6f..a200a72 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,262 +2,262 @@
https://jndiogo.github.io/sibila/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/async/
- 2024-06-05
- daily
-
-
- https://jndiogo.github.io/sibila/chat_templates/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/first_run/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/installing/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/thread/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/tips/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/tools/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/what/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/changelog/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/generation/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/local_model/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/models/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/multigen/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/remote_model/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/thread/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/tokenizer/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/api-reference/tools/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/async/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/cli/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/compare/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/extract/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/extract_dataclass/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/from_text_to_object/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/hello_model/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/interact/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/quick_meeting/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/receipt/
- 2024-06-05
+ 2024-06-21
+ daily
+
+
+ https://jndiogo.github.io/sibila/examples/simple_tools/
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/tag/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/examples/tough_meeting/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/dataclass/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/enums/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/free_json/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/free_text/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/pydantic/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/extract/simple_types/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/anthropic/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/find_local_models/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/fireworks/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/formats_json/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/groq/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/local_model/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/mistral/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/models_factory/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/models_json/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/openai/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/remote_model/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/setup_format/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/sibila_cli/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/together/
- 2024-06-05
+ 2024-06-21
daily
https://jndiogo.github.io/sibila/models/vision/
- 2024-06-05
+ 2024-06-21
daily
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 408ea90..01c6b74 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ