sibila hub -d model_id -f filename -a exact_author -s set name
+sibila hub -d model_id -f filename -a exact_author -s set name
diff --git a/objects.inv b/objects.inv
index 93f7835..a2e6915 100644
Binary files a/objects.inv and b/objects.inv differ
diff --git a/search/search_index.json b/search/search_index.json
index b9e3220..458860a 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Sibila","text":"Extract structured data from remote or local file LLM models.
- Extract data into Pydantic objects, dataclasses or simple types.
- Same API for local file models and remote OpenAI models.
- Model management: download models, manage configuration and quickly switch between models.
- Tools for evaluating output across local/remote models, for chat-like interaction and more.
See What can you do with Sibila?
To extract structured data from a local model:
from sibila import Models\nfrom pydantic import BaseModel\n\nclass Info(BaseModel):\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(Info, \"Who was the first man on the moon?\")\n
Returns an instance of class Info, created from the model's output:
Info(event_year=1969,\n first_name='Neil',\n last_name='Armstrong',\n age_at_the_time=38,\n nationality='American')\n
Or to use OpenAI's GPT-4, we would simply replace the model's name:
model = Models.create(\"openai:gpt-4\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
If Pydantic BaseModel objects are too much for your project, Sibila supports similar functionality with Python dataclass.
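As a minimal sketch of the dataclass alternative (assuming the same openchat model configured as above), the Info extraction becomes:
from dataclasses import dataclass\nfrom sibila import Models\n\n# sketch: dataclass equivalent of the Pydantic example above\n@dataclass\nclass Info:\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(Info, \"Who was the first man on the moon?\")\n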
"},{"location":"first_run/","title":"First run","text":""},{"location":"first_run/#with-a-remote-model","title":"With a remote model","text":"To use an OpenAI remote model, you'll need a paid OpenAI account and its API key. You can explicitly pass this key in your script but this is a poor security practice.
A better way is to define an environment variable which the OpenAI API will use when needed:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Having set this variable with your OpenAI API key, you can run a \"Hello Model\" like this:
Example
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
You're all set if you only plan to use remote OpenAI models.
"},{"location":"first_run/#with-a-local-model","title":"With a local model","text":"Local models run from files in GGUF format which are loaded run by the llama.cpp component.
You'll need to download a GGUF model file: we suggest OpenChat 3.5 - an excellent 7B-parameter quantized model that will run in less than 7 GB of memory.
To download the OpenChat model file, please see Download OpenChat model.
After downloading the file, you can run this \"Hello Model\" script:
Example
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder - change if different:\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
If the above scripts output similar pirate talk, Sibila should be working fine.
"},{"location":"installing/","title":"Installing","text":""},{"location":"installing/#installation","title":"Installation","text":"Sibila requires Python 3.9+ and uses the llama-cpp-python package for local models and OpenAI's API to access remote models like GPT-4.
Install Sibila from PyPI by running:
pip install sibila\n
If you only plan to use remote models (OpenAI), there's nothing else you need to do. See First Run to get it going.
Installation in edit mode Alternatively you can install Sibila in edit mode by downloading the GitHub repository and running the following in the base folder of the repository:
pip install -e .\n
"},{"location":"installing/#enabling-llamacpp-hardware-acceleration","title":"Enabling llama.cpp hardware acceleration","text":"Local models will run faster with hardware acceleration enabled. Sibila uses llama-cpp-python, a python wrapper for llama.cpp and it's a good idea to make sure it was installed with the best optimization your computer can offer.
See the following sections: depending on which hardware you have, you can run the listed command which will reinstall llama-cpp-python with the selected optimization. If any error occurs you can always install the non-accelerated version, as listed at the end.
"},{"location":"installing/#for-cuda-nvidia-gpus","title":"For CUDA - NVIDIA GPUs","text":"For CUDA acceleration in NVIDA GPUs, you'll need to Install the NVIDIA CUDA Toolkit.
LinuxWindows CMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_CUBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
Installing llama-cpp-python with NVIDIA GPU Acceleration on Windows: A Short Guide. More info: Installing llama-cpp-python with GPU Support.
"},{"location":"installing/#for-metal-apple-silicon-macs","title":"For Metal - Apple silicon macs","text":"Mac CMAKE_ARGS=\"-DLLAMA_METAL=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-rocm-amd-gpus","title":"For ROCm AMD GPUS","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_HIPBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_HIPBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-vulkan-supporting-gpus","title":"For Vulkan supporting GPUs","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_VULKAN=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_VULKAN=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#cpu-acceleration-if-none-of-the-above","title":"CPU acceleration (if none of the above)","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
If you get an error running the above commands, please see llama-cpp-python's Installation configuration.
"},{"location":"installing/#non-accelerated","title":"Non-accelerated","text":"In any case, you can always install llama-cpp-python without acceleration by running:
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"tips/","title":"Tips and Tricks","text":"Some general tips from experience with constrained model output in Sibila.
"},{"location":"tips/#temperature","title":"Temperature","text":"Sibila aims at exact results, so generation temperature defaults to 0. You should get the same results from the same model at all times.
For \"creative\" outputs, you can set the temperature to a non-zero value. This is done in GenConf, which can be passed in many places, for example during actual generation/extraction:
Example
from sibila import (Models, GenConf)\n\nModels.setup(\"../models\")\n\nmodel = Models.create(\"llamacpp:openchat\") # default GenConf could be passed here\n\nfor i in range(10):\n print(model.extract(int,\n \"Think of a random number between 1 and 100\",\n genconf=GenConf(temperature=2.)))\n
Result
72\n78\n75\n68\n39\n47\n53\n82\n72\n63\n
"},{"location":"tips/#split-entities-into-separate-classes","title":"Split entities into separate classes","text":"Suppose you want to extract a list of person names from a group. You could use the following class:
class Group(BaseModel):\n persons: list[str] = Field(description=\"List of persons\")\n group_info: str\n\nout = model.extract(Group, in_text)\n
But it tends to work better to separate the Person entity into its own class and leave the list in Group:
class Person(BaseModel):\n name: str\n\nclass Group(BaseModel):\n persons: list[Person]\n group_info: str\n\nout = model.extract(Group, in_text)\n
The same applies to the equivalent dataclass definitions.
Adding descriptions seems to always help, especially for non-trivial extraction. Without descriptions, the model can only look at field names for clues about what's wanted, so it's important to tell it what we want by adding field descriptions.
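For example, a sketch of the Group/Person classes above with field descriptions added (the description texts are only illustrative):
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n name: str = Field(description=\"Person's full name\")\n\nclass Group(BaseModel):\n persons: list[Person] = Field(description=\"List of persons belonging to the group\")\n group_info: str = Field(description=\"Short description of the group\")\n\nout = model.extract(Group, in_text)\n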
"},{"location":"tools/","title":"Tools","text":"The tools module includes some utilities to simplify common tasks.
"},{"location":"tools/#interact","title":"Interact","text":"The interact() function allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done inside interact(), which can manage thread messages and delete older ones when the context maximum length is reached.
Example
from sibila import Models, GenConf\nfrom sibila.tools import interact\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nprint(\"Help available by typing '!'. Enter an empty line to quit.\")\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
Result
Help available by typing '!'. Enter an empty line to quit.\n\n> Hello there!\n\nHello! How can I help you today?\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\nYes, I am familiar with the AI language model named Sibila. It is a large language \nmodel developed by a Spanish company called Wall Street Horus. Sibila is designed to \nunderstand and generate text in multiple languages, particularly those spoken in \nSpain and Latin America. It is similar to other language models like OpenAI's GPT-3 \nand Google's BERT, but it has been specifically fine-tuned for the Spanish language.\nIf you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n(NOTE: ignore the above model-generated paragraph, it's total hallucination)\n\n\n> No way! I've just searched and there's no such company nor model. \nYou are hallucinating, Mr. Model!\n\nApologies for the confusion in my previous response. It appears that Sibila is not \nan existing AI language model or a known company. I may have mixed up information \nfrom different sources. If you have any other questions or need assistance with a \ndifferent topic, please feel free to ask!\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining \na chat with you. It stores all user and model messages in a sibila.Context class.\n\nGot it! So, you are using a function from the Sibila package to have a conversation \nwith me, and it's storing our messages in a sibila.Context object. If you need any help \nor information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n(Continues...)\n
Check the Interact example to see a chat created with this function.
"},{"location":"tools/#multigen","title":"Multigen","text":"The multigen module includes functions that build a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
The query_multigen() function generates a 2-D table of [ input , model ], where each row holds the outputs of the different models for the same question or input. Such a table can be printed or saved as a CSV file.
See the Compare example for a side-by-side comparison of a local and a remote model answering the same queries.
"},{"location":"what/","title":"What can you do with Sibila?","text":"LLM models answer your questions in the best way their training allows, but they always answer back in plain text (or tokens).
With Sibila, you can extract structured data from LLM models. Not whatever the model chose to output (even if you asked it to answer in a certain format), but the exact fields and types that you need.
This not only simplifies handling the model responses but can also open new possibilities: you can now deal with the model in a structured way.
"},{"location":"what/#extract-pydantic-dataclasses-or-simple-types","title":"Extract Pydantic, dataclasses or simple types","text":"To specify the structured output that you want from the model, you can use Pydantic's BaseModel derived classes, or the lightweight Python dataclasses, if you don't need the whole Pydantic.
With Sibila, you can also use simple data types like bool, int, str, enumerations or lists. For example, need to classify something?
Example
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nmodel.classify([\"good\", \"neutral\", \"bad\"], \n \"Running with scissors\")\n
Result
'bad'\n
How does it work? Extraction to the given data types is guaranteed by automatic JSON Schema grammars in local models, or by the Tools functionality of OpenAI API remote models.
"},{"location":"what/#from-your-models-or-openais","title":"From your models or OpenAI's","text":"Small downloadable 7B parameter models are getting better every month and they have reached a level where they are competent enough for most common data extraction or summarization tasks.
With 8 GB or more of RAM or GPU memory, you can get good structured output from models like OpenChat, Zephyr, Mistral 7B, or any other GGUF file.
You can use any paid OpenAI model, as well as any model that llama.cpp can run, with the same API, so you're free to choose the best model for each use.
"},{"location":"what/#with-model-management","title":"With model management","text":"Includes a Models factory that creates models from simple names instead of having to track model configurations, filenames or chat templates.
local_model = Models.create(\"llamacpp:openchat\")\n\nremote_model = Models.create(\"openai:gpt-4\") \n
This makes the switch to newer models much easier, and makes it simpler to compare model outputs.
Sibila includes a CLI tool to download GGUF models from the Hugging Face model hub and to manage its Models factory.
"},{"location":"api-reference/generation/","title":"Generation configs, results and errors","text":""},{"location":"api-reference/generation/#generation-configs","title":"Generation Configs","text":""},{"location":"api-reference/generation/#sibila.GenConf","title":"GenConf dataclass
","text":"Model generation configuration, used in Model.gen() and variants.
"},{"location":"api-reference/generation/#sibila.GenConf.max_tokens","title":"max_tokens class-attribute
instance-attribute
","text":"max_tokens = 0\n
Max generated token length. 0 means all available up to output context size (which equals: model.ctx_len - in_prompt_len)
"},{"location":"api-reference/generation/#sibila.GenConf.stop","title":"stop class-attribute
instance-attribute
","text":"stop = field(default_factory=list)\n
List of generation stop text sequences
"},{"location":"api-reference/generation/#sibila.GenConf.temperature","title":"temperature class-attribute
instance-attribute
","text":"temperature = 0.0\n
Generation temperature. Use 0 to always pick the most probable output, without random sampling. Larger positive values will produce more random outputs.
"},{"location":"api-reference/generation/#sibila.GenConf.top_p","title":"top_p class-attribute
instance-attribute
","text":"top_p = 0.9\n
Nucleus sampling top_p value. Only applies if temperature > 0.
"},{"location":"api-reference/generation/#sibila.GenConf.format","title":"format class-attribute
instance-attribute
","text":"format = 'text'\n
Output format: \"text\" or \"json\". For JSON output, text is validated as in json.loads(). Thread msgs must explicitly request JSON output or a warning will be emitted if string json not present (this is automatically done in Model.json() and related calls).
"},{"location":"api-reference/generation/#sibila.GenConf.json_schema","title":"json_schema class-attribute
instance-attribute
","text":"json_schema = None\n
A JSON schema to validate the JSON output. Thread messages must list the JSON schema and request its use; format must also be set to \"json\".
"},{"location":"api-reference/generation/#sibila.GenConf.__call__","title":"__call__","text":"__call__(**kwargs)\n
Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.
Parameters:
Name Type Description Default **kwargs
Any
update settings of the same names in the returned copy.
{}
Raises:
Type Description KeyError
If key does not exist.
Returns:
Type Description Self
A copy of the current object with kwargs values updated. Doesn't modify object.
Source code in sibila/gen.py
def __call__(self,\n **kwargs: Any) -> Self:\n \"\"\"Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.\n\n Args:\n **kwargs: update settings of the same names in the returned copy.\n\n Raises:\n KeyError: If key does not exist.\n\n Returns:\n A copy of the current object with kwargs values updated. Doesn't modify object.\n \"\"\"\n\n ret = copy(self)\n\n for k,v in kwargs.items():\n if not hasattr(ret, k):\n raise KeyError(f\"No such key '{k}'\")\n setattr(ret, k,v)\n\n return ret\n
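For instance, a short usage sketch (field names taken from the attributes documented above):
base = GenConf(temperature=0.0)\n\n# returns an updated copy; base itself is not modified\njson_conf = base(format=\"json\", temperature=0.2)\n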
"},{"location":"api-reference/generation/#sibila.GenConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/gen.py
def clone(self) -> Self:\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.as_dict","title":"as_dict","text":"as_dict()\n
Return GenConf as a dict.
Source code in sibila/gen.py
def as_dict(self) -> dict:\n \"\"\"Return GenConf as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/gen.py
@staticmethod\ndef from_dict(dic: dict) -> Any: # Any = GenConf\n return GenConf(**dic)\n
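Together with as_dict(), this allows a simple round trip, for example:
conf = GenConf(temperature=0.7, max_tokens=256)\nd = conf.as_dict()\nsame_conf = GenConf.from_dict(d)\n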
"},{"location":"api-reference/generation/#sibila.JSchemaConf","title":"JSchemaConf dataclass
","text":"Configuration for JSON schema massaging and validation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.resolve_refs","title":"resolve_refs class-attribute
instance-attribute
","text":"resolve_refs = True\n
Set for $ref references to be resolved and replaced with actual definition.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.collapse_single_combines","title":"collapse_single_combines class-attribute
instance-attribute
","text":"collapse_single_combines = True\n
Any single-valued \"oneOf\"/\"anyOf\" is replaced with the actual value.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.description_from_title","title":"description_from_title class-attribute
instance-attribute
","text":"description_from_title = 0\n
If a value doesn't have a description entry, make one from its title or name.
- 0: don't make description from name
- 1: copy title or name to description
- 2: as in 1, plus capitalize the first letter and convert _ to space: class_label -> \"class label\".
"},{"location":"api-reference/generation/#sibila.JSchemaConf.force_all_required","title":"force_all_required class-attribute
instance-attribute
","text":"force_all_required = False\n
Force all entries in an object to be required (except removed defaults if remove_with_default=True).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.remove_with_default","title":"remove_with_default class-attribute
instance-attribute
","text":"remove_with_default = False\n
Delete any values that have a \"default\" annotation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.default_to_last","title":"default_to_last class-attribute
instance-attribute
","text":"default_to_last = True\n
Move any default value entry into the last position of properties dict.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.additional_allowed_root_keys","title":"additional_allowed_root_keys class-attribute
instance-attribute
","text":"additional_allowed_root_keys = field(default_factory=list)\n
By default, only the following properties are allowed in the schema's root: description, properties, type, required, additionalProperties, allOf, anyOf, oneOf, not. Add to this list to allow additional root properties.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.pydantic_strict_validation","title":"pydantic_strict_validation class-attribute
instance-attribute
","text":"pydantic_strict_validation = None\n
Validate JSON values in a strict manner or not. None means validate individually per each value in the object (for example, in Pydantic with Field(strict=True)).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/json_schema.py
def clone(self):\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
"},{"location":"api-reference/generation/#results","title":"Results","text":""},{"location":"api-reference/generation/#sibila.GenRes","title":"GenRes","text":"Model generation result.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_STOP","title":"OK_STOP class-attribute
instance-attribute
","text":"OK_STOP = 1\n
Generation complete without errors.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_LENGTH","title":"OK_LENGTH class-attribute
instance-attribute
","text":"OK_LENGTH = 0\n
Generation stopped due to reaching max_tokens.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON","title":"ERROR_JSON class-attribute
instance-attribute
","text":"ERROR_JSON = -1\n
Invalid JSON: this is often due to the model returning OK_LENGTH (finished due to max_tokens reached), which cuts off the JSON text.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_VAL","title":"ERROR_JSON_SCHEMA_VAL class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_VAL = -2\n
Failed JSON schema validation.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_ERROR","title":"ERROR_JSON_SCHEMA_ERROR class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_ERROR = -2\n
JSON schema itself is not valid.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_MODEL","title":"ERROR_MODEL class-attribute
instance-attribute
","text":"ERROR_MODEL = -3\n
Other model internal error.
"},{"location":"api-reference/generation/#sibila.GenRes.from_finish_reason","title":"from_finish_reason staticmethod
","text":"from_finish_reason(finish)\n
Convert a ChatCompletion finish result into a GenRes.
Parameters:
Name Type Description Default finish
str
ChatCompletion finish result.
required Returns:
Type Description Any
A GenRes result.
Source code in sibila/gen.py
@staticmethod\ndef from_finish_reason(finish: str) -> Any: # Any=GenRes\n \"\"\"Convert a ChatCompletion finish result into a GenRes.\n\n Args:\n finish: ChatCompletion finish result.\n\n Returns:\n A GenRes result.\n \"\"\"\n if finish == 'stop':\n return GenRes.OK_STOP\n elif finish == 'length':\n return GenRes.OK_LENGTH\n elif finish == '!json':\n return GenRes.ERROR_JSON\n elif finish == '!json_schema_val':\n return GenRes.ERROR_JSON_SCHEMA_VAL\n elif finish == '!json_schema_error':\n return GenRes.ERROR_JSON_SCHEMA_ERROR\n else:\n return GenRes.ERROR_MODEL\n
"},{"location":"api-reference/generation/#sibila.GenRes.as_text","title":"as_text staticmethod
","text":"as_text(res)\n
Returns a friendlier description of the result.
Parameters:
Name Type Description Default res
Any
Model output result.
required Raises:
Type Description ValueError
If unknown GenRes.
Returns:
Type Description str
A friendlier description of the GenRes.
Source code in sibila/gen.py
@staticmethod\ndef as_text(res: Any) -> str: # Any=GenRes\n \"\"\"Returns a friendlier description of the result.\n\n Args:\n res: Model output result.\n\n Raises:\n ValueError: If unknown GenRes.\n\n Returns:\n A friendlier description of the GenRes.\n \"\"\"\n\n if res == GenRes.OK_STOP:\n return \"Stop\"\n elif res == GenRes.OK_LENGTH:\n return \"Length (output cut)\"\n elif res == GenRes.ERROR_JSON:\n return \"JSON decoding error\"\n\n elif res == GenRes.ERROR_JSON_SCHEMA_VAL:\n return \"JSON SCHEMA validation error\"\n elif res == GenRes.ERROR_JSON_SCHEMA_ERROR:\n return \"Error in JSON SCHEMA\"\n\n elif res == GenRes.ERROR_MODEL:\n return \"Model internal error\"\n else:\n raise ValueError(\"Bad/unknow GenRes\")\n
"},{"location":"api-reference/generation/#errors","title":"Errors","text":""},{"location":"api-reference/generation/#sibila.GenError","title":"GenError","text":"GenError(out)\n
Model generation exception, raised when the model was unable to return a response.
An error has happened during model generation.
Parameters:
Name Type Description Default out
GenOut
Model output
required Source code in sibila/gen.py
def __init__(self, \n out: GenOut):\n \"\"\"An error has happened during model generation.\n\n Args:\n out: Model output\n \"\"\"\n\n assert out.res != GenRes.OK_STOP, \"OK_STOP is not an error\" \n\n super().__init__()\n\n self.res = out.res\n self.text = out.text\n self.dic = out.dic\n self.value = out.value\n
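A sketch of catching this exception around an extraction call (assuming GenError is importable from the sibila package root):
from sibila import GenError\n\ntry:\n info = model.extract(Info, \"Who was the first man on the moon?\")\nexcept GenError as err:\n # err carries the fields copied from the model's GenOut\n print(err.res, err.text)\n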
"},{"location":"api-reference/generation/#sibila.GenError.raise_if_error","title":"raise_if_error staticmethod
","text":"raise_if_error(out, ok_length_is_error)\n
Raise an exception if the model returned an error
Parameters:
Name Type Description Default out
GenOut
Model returned info.
required ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error?
required Raises:
Type Description GenError
If an error was returned by model.
Source code in sibila/gen.py
@staticmethod\ndef raise_if_error(out: GenOut,\n ok_length_is_error: bool):\n \"\"\"Raise an exception if the model returned an error\n\n Args:\n out: Model returned info.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error?\n\n Raises:\n GenError: If an error was returned by model.\n \"\"\"\n\n if out.res != GenRes.OK_STOP:\n if out.res == GenRes.OK_LENGTH and not ok_length_is_error:\n return # OK_LENGTH to not be considered an error\n\n raise GenError(out)\n
"},{"location":"api-reference/generation/#sibila.GenOut","title":"GenOut dataclass
","text":"Model output, returned by gen_extract(), gen_json() and other model calls that don't raise exceptions.
"},{"location":"api-reference/generation/#sibila.GenOut.res","title":"res instance-attribute
","text":"res\n
Result of model generation.
"},{"location":"api-reference/generation/#sibila.GenOut.text","title":"text instance-attribute
","text":"text\n
Text generated by model.
"},{"location":"api-reference/generation/#sibila.GenOut.dic","title":"dic class-attribute
instance-attribute
","text":"dic = None\n
Python dictionary, output by the structured calls like gen_json().
"},{"location":"api-reference/generation/#sibila.GenOut.value","title":"value class-attribute
instance-attribute
","text":"value = None\n
Initialized instance value, dataclass or Pydantic BaseModel object, as returned in calls like extract().
"},{"location":"api-reference/generation/#sibila.GenOut.as_dict","title":"as_dict","text":"as_dict()\n
Return GenOut as a dict.
Source code in sibila/gen.py
def as_dict(self):\n \"\"\"Return GenOut as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenOut.__str__","title":"__str__","text":"__str__()\n
Source code in sibila/gen.py
def __str__(self):\n out = f\"Error={self.res.as_text(self.res)} text=\u2588{self.text}\u2588\"\n if self.dic is not None:\n out += f\" dic={self.dic}\"\n if self.value is not None:\n out += f\" value={self.value}\"\n return out\n
"},{"location":"api-reference/model/","title":"Model classes","text":""},{"location":"api-reference/model/#local-models","title":"Local models","text":""},{"location":"api-reference/model/#sibila.LlamaCppModel","title":"LlamaCppModel","text":"LlamaCppModel(\n path,\n format=None,\n format_search_order=[\n \"name\",\n \"meta_template\",\n \"models_json\",\n \"formats_json\",\n ],\n *,\n genconf=None,\n schemaconf=None,\n tokenizer=None,\n ctx_len=2048,\n n_gpu_layers=-1,\n main_gpu=0,\n n_batch=512,\n seed=4294967295,\n verbose=False,\n **llamacpp_kwargs\n)\n
Use local GGUF format models via llama.cpp engine.
Supports grammar-constrained JSON output following a JSON schema.
Parameters:
Name Type Description Default path
str
File path to the GGUF file.
required format
Optional[str]
Chat template format to use with model. Leave as None for auto-detection.
None
format_search_order
list[str]
Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"models_json\", \"formats_json\" looks for these configs in file's folder. Defaults to [\"name\",\"meta_template\", \"models_json\", \"formats_json\"].
['name', 'meta_template', 'models_json', 'formats_json']
genconf
Optional[GenConf]
Default generation configuration, which can be used in gen() and related. Defaults to None.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.
None
ctx_len
int
Maximum context length to be used (shared for input and output). Defaults to 2048.
2048
n_gpu_layers
int
Number of model layers to run in a GPU. Defaults to -1 for all.
-1
main_gpu
int
Index of the GPU to use. Defaults to 0.
0
n_batch
int
Prompt processing batch size. Defaults to 512.
512
seed
int
Random number generation seed, for non zero temperature inference. Defaults to 4294967295.
4294967295
verbose
bool
Emit (very) verbose output. Defaults to False.
False
Raises:
Type Description ImportError
If llama-cpp-python is not installed.
ValueError
If ctx_len is 0 or larger than the values supported by model.
Source code in sibila/llamacpp.py
def __init__(self,\n path: str,\n\n format: Optional[str] = None, \n format_search_order: list[str] = [\"name\",\"meta_template\", \"models_json\", \"formats_json\"],\n\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n tokenizer: Optional[Tokenizer] = None,\n ctx_len: int = 2048,\n\n # important LlamaCpp-specific args\n n_gpu_layers: int = -1,\n main_gpu: int = 0,\n n_batch: int = 512,\n seed: int = 4294967295,\n verbose: bool = False,\n\n # other LlamaCpp-specific args\n **llamacpp_kwargs\n ):\n \"\"\"\n Args:\n path: File path to the GGUF file.\n format: Chat template format to use with model. Leave as None for auto-detection.\n format_search_order: Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"models_json\", \"formats_json\" looks for these configs in file's folder. Defaults to [\"name\",\"meta_template\", \"models_json\", \"formats_json\"].\n genconf: Default generation configuration, which can be used in gen() and related. Defaults to None.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). Defaults to 2048.\n n_gpu_layers: Number of model layers to run in a GPU. Defaults to -1 for all.\n main_gpu: Index of the GPU to use. Defaults to 0.\n n_batch: Prompt processing batch size. Defaults to 512.\n seed: Random number generation seed, for non zero temperature inference. Defaults to 4294967295.\n verbose: Emit (very) verbose output. Defaults to False.\n\n Raises:\n ImportError: If llama-cpp-python is not installed.\n ValueError: If ctx_len is 0 or larger than the values supported by model.\n \"\"\"\n\n self._llama = None # type: ignore[assignment]\n self.tokenizer = None # type: ignore[assignment]\n\n if not has_llama_cpp:\n raise ImportError(\"Please install llama-cpp-python by running: pip install llama-cpp-python\")\n\n if ctx_len == 0:\n raise ValueError(\"LlamaCppModel doesn't support ctx_len=0\")\n\n super().__init__(True,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # update kwargs from important args\n llamacpp_kwargs.update(n_ctx=ctx_len,\n n_batch=n_batch,\n n_gpu_layers=n_gpu_layers,\n main_gpu=main_gpu,\n seed=seed,\n verbose=verbose\n )\n\n logger.debug(f\"Creating inner Llama with model_path='{path}', llamacpp_kwargs={llamacpp_kwargs}\")\n\n with normalize_notebook_stdout_stderr(not verbose):\n self._llama = Llama(model_path=path, **llamacpp_kwargs)\n\n self._model_path = path\n\n # correct super __init__ values\n self._ctx_len = self._llama.n_ctx()\n\n n_ctx_train = self._llama._model.n_ctx_train() \n if self.ctx_len > n_ctx_train:\n raise ValueError(f\"ctx_len ({self.ctx_len}) is greater than n_ctx_train ({n_ctx_train})\")\n\n\n if self.tokenizer is None:\n self.tokenizer = LlamaCppTokenizer(self._llama)\n\n try:\n self.init_format(format,\n format_search_order,\n {\"name\": os.path.basename(self._model_path),\n \"path\": self._model_path,\n \"meta_template_name\": \"tokenizer.chat_template\"}\n )\n except Exception as e:\n del self.tokenizer\n del self._llama\n raise e\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Free type constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type:
  - bool
  - int
  - float
  - str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"], for example: count: int can be annotated as: count: Annotated[int, \"How many units?\"]
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n\n \"\"\"Free type constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
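Besides classes, the simple types listed above also work; for example, a sketch of a date extraction:
from datetime import date\n\n# sketch: extraction into a simple datetime.date value\nwhen = model.extract(date, \"When did the Apollo 11 mission land on the Moon?\")\n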
"},{"location":"api-reference/model/#sibila.LlamaCppModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.json","title":"json","text":"json(\n json_schema,\n query,\n *,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self, \n json_schema: Union[dict,str,None],\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(json_schema, \n thread,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
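A sketch of a direct call with a hand-written schema (the schema contents are only illustrative):
# illustrative schema: a dict following JSON schema conventions\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"title\": {\"type\": \"string\"},\n \"year\": {\"type\": \"integer\"}\n },\n \"required\": [\"title\", \"year\"]\n}\n\ndic = model.json(schema, \"Name a 1980s science fiction movie and its release year.\")\n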
"},{"location":"api-reference/model/#sibila.LlamaCppModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
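For example, a minimal sketch with a two-field dataclass:
from dataclasses import dataclass\n\n@dataclass\nclass Movie:\n title: str\n year: int\n\n# sketch: the returned object is a Movie instance built from the model's JSON output\nmovie = model.dataclass(Movie, \"Name a 1980s science fiction movie and its release year.\")\n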
"},{"location":"api-reference/model/#sibila.LlamaCppModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
Source code in sibila/model.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. \n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n thread = self._prepare_gen_in(thread, genconf)\n\n prompt = self.text_from_thread(thread)\n\n logger.debug(f\"Prompt: \u2588{prompt}\u2588\")\n\n text,finish = self._gen_text(prompt, genconf)\n\n out = self._prepare_gen_out(text, finish, genconf)\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_json","title":"gen_json","text":"gen_json(\n json_schema,\n thread,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n json_schema: Union[dict,str,None],\n\n thread: Thread,\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n if genconf.json_schema is not None and json_schema is not None:\n logger.warn(\"Both arg json_schema and genconf.json_schema are set: using json_schema arg\")\n\n if json_schema is not None:\n if schemaconf is None:\n schemaconf = self.schemaconf\n\n logger.debug(\"JSON schema conf:\\n\" + pformat(schemaconf))\n\n if massage_schema:\n if not isinstance(json_schema, dict):\n json_schema = json.loads(json_schema)\n\n json_schema = json_schema_massage(json_schema, schemaconf) # type: ignore[arg-type]\n logger.debug(\"Massaged JSON schema:\\n\" + pformat(json_schema))\n\n out = self.gen(thread, \n genconf(format=\"json\", \n json_schema=json_schema))\n\n return out \n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n if is_dataclass(cls):\n schema = build_dataclass_object_json_schema(cls)\n else:\n raise TypeError(\"Only dataclass allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = create_final_instance(cls, \n is_list=False,\n val=out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error initializing object from JSON\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n if is_subclass_of(cls, BaseModel):\n schema = json_schema_from_pydantic(cls)\n else:\n raise TypeError(\"Only pydantic BaseModel allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = pydantic_obj_from_json(cls, \n out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error validating for object (by Pydantic), but JSON is valid for its schema\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.token_len","title":"token_len","text":"token_len(thread, _=None)\n
Calculate token length for a Thread.
Parameters:
Name Type Description Default thread
Thread
For token length calculation.
required Returns:
Type Description int
Number of tokens the thread will use.
Source code in sibila/model.py
def token_len(self,\n thread: Thread,\n _: Optional[GenConf] = None) -> int:\n \"\"\"Calculate token length for a Thread.\n\n Args:\n thread: For token length calculation.\n\n Returns:\n Number of tokens the thread will use.\n \"\"\"\n\n text = self.text_from_thread(thread)\n return self.tokenizer.token_len(text)\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = None\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.ctx_len","title":"ctx_len property
","text":"ctx_len\n
Maximum context length, shared for input + output. We assume a common in+out context where total token length must always be less than this number.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.known_models","title":"known_models classmethod
","text":"known_models()\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if it can accept any model.
Source code in sibila/model.py
@classmethod\ndef known_models(cls) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Returns:\n Returns a list of known models or None if it can accept any model.\n \"\"\"\n return None\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.desc","title":"desc property
","text":"desc\n
Model description.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.n_embd","title":"n_embd property
","text":"n_embd\n
Embedding size of model.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.n_params","title":"n_params property
","text":"n_params\n
Total number of model parameters.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.get_metadata","title":"get_metadata","text":"get_metadata()\n
Returns model metadata.
Source code in sibila/llamacpp.py
def get_metadata(self):\n \"\"\"Returns model metadata.\"\"\"\n out = {}\n buf = bytes(16 * 1024)\n lmodel = self._llama.model\n count = llama_cpp.llama_model_meta_count(lmodel)\n for i in range(count):\n res = llama_cpp.llama_model_meta_key_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n key = buf[:res].decode('utf-8')\n res = llama_cpp.llama_model_meta_val_str_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n value = buf[:res].decode('utf-8')\n out[key] = value\n return out\n
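A small usage sketch (the model path below is an assumption):

from sibila import LlamaCppModel

model = LlamaCppModel("../../models/openchat-3.5-1210.Q4_K_M.gguf")  # assumed path

meta = model.get_metadata()
for key, value in meta.items():
    print(f"{key} = {value}")   # e.g. general.architecture = llama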
"},{"location":"api-reference/model/#remote-models","title":"Remote models","text":""},{"location":"api-reference/model/#sibila.OpenAIModel","title":"OpenAIModel","text":"OpenAIModel(\n name,\n unknown_name_mask=2,\n *,\n genconf=None,\n schemaconf=None,\n tokenizer=None,\n ctx_len=0,\n api_key=None,\n base_url=None,\n openai_init_kwargs={}\n)\n
Access an OpenAI model.
Supports constrained JSON output, via the OpenAI API tools mechanism. Ref: https://platform.openai.com/docs/api-reference/chat/create
Create an OpenAI remote model. Name resolution depends on unknown_name_mask and will keep removing letters from the end of name and searching existing entries in OpenAIModel.known_models().
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required unknown_name_mask
int
How to deal with unmatched names in name resolution, a mask of:
- 2: Raise NameError if exact name not found (no generics)
- 1: Only allow versioned names - raise NameError if generic non-versioned model name used
- 0: Accept any name that can be resolved from OpenAIModel.known_models()
2
genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
tokenizer
Optional[Tokenizer]
An externally initialized tokenizer to use instead of the one created from the GGUF file. Defaults to None.
None
ctx_len
int
Maximum context length to be used (shared for input and output). Defaults to 0 which means model's maximum.
0
api_key
Optional[str]
OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL.
None
openai_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
Source code in sibila/openai.py
def __init__(self,\n name: str,\n unknown_name_mask: int = 2,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n tokenizer: Optional[Tokenizer] = None,\n ctx_len: int = 0,\n\n # most important OpenAI-specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n\n # OpenAI-specific args\n openai_init_kwargs: dict = {},\n ):\n \"\"\"Create an OpenAI remote model.\n Name resolution depends on unknown_name_mask and will keep removing letters \n from the end of name and searching existing entries in penAIModel.known_models().\n\n Args:\n name: Model name to resolve into an existing model.\n unknown_name_mask: How to deal with unmatched names in name resolution, a mask of:\n\n - 2: Raise NameError if exact name not found (no generics)\n - 1: Only allow versioned names - raise NameError if generic non-versioned model name used\n - 0: Accept any name that can be resolved from OpenAIModel.known_models()\n\n genconf: Model generation configuration. Defaults to None.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). Defaults to 0 which means model's maximum.\n api_key: OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL.\n openai_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n \"\"\"\n\n\n if not has_openai:\n raise ImportError(\"Please install openai by running: pip install openai\")\n\n self._model_name, max_ctx_len, self._tokens_per_message, self._tokens_per_name = resolve_model(\n name,\n unknown_name_mask\n )\n\n\n super().__init__(False,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # only check for \"json\" text presence as json schema is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n logger.debug(f\"Creating inner OpenAI with base_url={base_url}, openai_init_kwargs={openai_init_kwargs}\")\n\n self._client = openai.OpenAI(api_key=api_key,\n base_url=base_url,\n\n **openai_init_kwargs\n )\n\n\n # correct super __init__ values\n if self.tokenizer is None:\n self.tokenizer = OpenAITokenizer(self._model_name)\n\n if ctx_len == 0:\n self._ctx_len = max_ctx_len\n else:\n self._ctx_len = ctx_len\n\n\n self.TOOLS_TOKEN_LEN_FACTOR = self.DEFAULT_TOOLS_TOKEN_LEN_FACTOR\n
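For example, a hedged sketch of creating an OpenAIModel with an explicit generation configuration (assumes the OPENAI_API_KEY env variable is set):

from sibila import OpenAIModel, GenConf

# api_key defaults to None, so the OPENAI_API_KEY env variable is used
model = OpenAIModel("gpt-4",
                    genconf=GenConf(temperature=0.0),
                    ctx_len=0)   # 0 = use the model's maximum context length

print(model.desc)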
"},{"location":"api-reference/model/#sibila.OpenAIModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Free type constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float, str
- enums:
  - [1, 2, 3] or ["a","b"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], "List desc"]. And/or the list item type can be annotated: list[Annotated[T, "Item desc"]]
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, "Desc"]; for example, count: int can be annotated as count: Annotated[int, "How many units?"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n\n \"\"\"Free type constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
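A short sketch of extracting different target types (the model entry names are assumptions about your local directory setup):

from typing import Annotated
from sibila import Models

model = Models.create("openai:gpt-4")   # or a local "llamacpp:..." entry

# an annotated primitive type
year = model.extract(Annotated[int, "Year of the event"],
                     "When did the Apollo 11 Moon landing happen?")
print(year)    # e.g. 1969

# a list of a primitive type
crew = model.extract(list[str],
                     "Who were the three Apollo 11 crew members?")
print(crew)    # e.g. ['Neil Armstrong', 'Buzz Aldrin', 'Michael Collins']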
"},{"location":"api-reference/model/#sibila.OpenAIModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
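A minimal sketch (the "llamacpp:openchat" entry is an assumption about your model directory):

from sibila import Models

model = Models.create("llamacpp:openchat")   # assumed directory entry

label = model.classify(["positive", "negative", "neutral"],
                       "The movie was a complete waste of time.")
print(label)   # e.g. 'negative'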
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description NotImplementedError
If method was not defined by a derived class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n NotImplementedError: If method was not defined by a derived class.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n thread = self._prepare_gen_in(thread, genconf)\n\n token_len = self.token_len(thread, genconf)\n if genconf.max_tokens == 0:\n max_tokens = self.ctx_len - token_len \n genconf = genconf(max_tokens=max_tokens)\n\n elif token_len + genconf.max_tokens > self.ctx_len:\n # this is not true for all models: 1106 models have 128k max input and 4k max output (in and out ctx are not shared)\n # so we assume the smaller max ctx length for the model\n logger.warn(f\"Token length + genconf.max_tokens ({token_len + genconf.max_tokens}) is greater than model's context window length ({self.ctx_len})\")\n\n fn_name = \"json_out\"\n\n json_kwargs: dict = {}\n format = genconf.format\n if format == \"json\":\n\n if genconf.json_schema is None:\n json_kwargs[\"response_format\"] = {\"type\": \"json_object\"}\n\n else:\n # use json_schema in OpenAi's tool API\n json_kwargs[\"tool_choice\"] = {\n \"type\": \"function\",\n \"function\": {\"name\": fn_name},\n }\n\n if isinstance(genconf.json_schema, str):\n params = json.loads(genconf.json_schema)\n else:\n params = genconf.json_schema\n\n json_kwargs[\"tools\"] = [\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": fn_name,\n \"parameters\": params\n }\n }\n ]\n\n logger.debug(f\"OpenAI json args: {json_kwargs}\")\n\n msgs = thread.as_chatml()\n\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(model=self._model_name,\n messages=msgs, # type: ignore[arg-type]\n\n max_tokens=genconf.max_tokens,\n stop=genconf.stop,\n temperature=genconf.temperature,\n top_p=genconf.top_p,\n **json_kwargs,\n\n n=1\n )\n\n logger.debug(f\"OpenAI response: {response}\")\n\n choice = response.choices[0]\n finish = choice.finish_reason\n message = choice.message\n\n if \"tool_choice\" in json_kwargs:\n\n # json schema generation via the tools API:\n if message.tool_calls is not None:\n fn = message.tool_calls[0].function\n if fn.name != fn_name:\n logger.debug(f\"OpenAIModel: different returned JSON function name ({fn.name})\")\n\n text = fn.arguments\n else: # use content instead\n text = message.content # type: ignore[assignment]\n\n else:\n # text or simple json format\n text = message.content # type: ignore[assignment]\n\n out = self._prepare_gen_out(text, finish, genconf)\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.json","title":"json","text":"json(\n json_schema,\n query,\n *,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self, \n json_schema: Union[dict,str,None],\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(json_schema, \n thread,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
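For example, a hedged sketch passing a hand-written JSON schema (the model entry name is an assumption):

from sibila import Models

model = Models.create("openai:gpt-4")   # assumed directory entry

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "year": {"type": "integer"}
    },
    "required": ["title", "year"]
}

dic = model.json(schema, "Which movie won the Best Picture Oscar at the 1998 ceremony?")
print(dic)   # e.g. {'title': 'Titanic', 'year': 1997}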
"},{"location":"api-reference/model/#sibila.OpenAIModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
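A minimal sketch with a hypothetical dataclass (the model entry name is an assumption):

from dataclasses import dataclass
from sibila import Models

@dataclass
class Weather:
    city: str
    temperature_celsius: float
    conditions: str

model = Models.create("openai:gpt-4")   # assumed directory entry

report = model.dataclass(Weather,
                         "Lisbon, 21.5 degrees Celsius, partly cloudy skies.")
print(report)   # e.g. Weather(city='Lisbon', temperature_celsius=21.5, conditions='partly cloudy')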
"},{"location":"api-reference/model/#sibila.OpenAIModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
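A minimal sketch with a hypothetical BaseModel class (the model entry name is an assumption):

from pydantic import BaseModel
from sibila import Models

class Book(BaseModel):
    title: str
    author: str
    publication_year: int

model = Models.create("openai:gpt-4")   # assumed directory entry

book = model.pydantic(Book, "Tell me about the novel Dune.")
print(book)   # e.g. title='Dune' author='Frank Herbert' publication_year=1965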
"},{"location":"api-reference/model/#sibila.OpenAIModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
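A minimal sketch of plain text generation via __call__ (the model entry name is an assumption):

from sibila import Models

model = Models.create("llamacpp:openchat")   # assumed directory entry

text = model("What is the capital of France?",
             inst="Answer in a single short sentence.")
print(text)   # e.g. "The capital of France is Paris."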
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_json","title":"gen_json","text":"gen_json(\n json_schema,\n thread,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n json_schema: Union[dict,str,None],\n\n thread: Thread,\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n if genconf.json_schema is not None and json_schema is not None:\n logger.warn(\"Both arg json_schema and genconf.json_schema are set: using json_schema arg\")\n\n if json_schema is not None:\n if schemaconf is None:\n schemaconf = self.schemaconf\n\n logger.debug(\"JSON schema conf:\\n\" + pformat(schemaconf))\n\n if massage_schema:\n if not isinstance(json_schema, dict):\n json_schema = json.loads(json_schema)\n\n json_schema = json_schema_massage(json_schema, schemaconf) # type: ignore[arg-type]\n logger.debug(\"Massaged JSON schema:\\n\" + pformat(json_schema))\n\n out = self.gen(thread, \n genconf(format=\"json\", \n json_schema=json_schema))\n\n return out \n
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n if is_dataclass(cls):\n schema = build_dataclass_object_json_schema(cls)\n else:\n raise TypeError(\"Only dataclass allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = create_final_instance(cls, \n is_list=False,\n val=out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error initializing object from JSON\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the "value" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n if is_subclass_of(cls, BaseModel):\n schema = json_schema_from_pydantic(cls)\n else:\n raise TypeError(\"Only pydantic BaseModel allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = pydantic_obj_from_json(cls, \n out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error validating for object (by Pydantic), but JSON is valid for its schema\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.token_len","title":"token_len","text":"token_len(thread, genconf=None)\n
Calculate the number of tokens used by a list of messages. If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread
Thread
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens the thread will use.
Source code in sibila/openai.py
def token_len(self,\n thread: Thread,\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate the number of tokens used by a list of messages.\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens the thread will use.\n \"\"\"\n\n # name = self._model_name\n\n num_tokens = 0\n for index in range(-1, len(thread)): # -1 for system message\n message = thread.msg_as_chatml(index)\n # print(message)\n num_tokens += self._tokens_per_message\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(value))\n # if key == \"name\":\n # num_tokens += self._tokens_per_name\n\n num_tokens += 3 # every reply is primed with <|start|>assistant<|message|>\n num_tokens += 10 # match API return counts\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * self.TOOLS_TOKEN_LEN_FACTOR)\n\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n return num_tokens\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.ctx_len","title":"ctx_len property
","text":"ctx_len\n
Maximum context length, shared for input + output. We assume a common in+out context where total token length must always be less than this number.
"},{"location":"api-reference/model/#sibila.OpenAIModel.known_models","title":"known_models classmethod
","text":"known_models()\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if it can accept any model.
Source code in sibila/openai.py
@classmethod\ndef known_models(cls) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Returns:\n Returns a list of known models or None if it can accept any model.\n \"\"\"\n return list(KNOWN_MODELS.keys())\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.desc","title":"desc property
","text":"desc\n
Model description.
"},{"location":"api-reference/models/","title":"Models factory","text":""},{"location":"api-reference/models/#sibila.Models","title":"Models","text":"Model and template format directory that unifies (and simplifies) model access and configuration.
The following env variable is checked and used during initialization: SIBILA_MODELS, a ';'-delimited list of folders in which to find models.json, formats.json and model files.
= Model Directory =
Useful to create models from resource names like "llamacpp:openchat" or "openai:gpt-4". This makes it simple to change models, store model settings, compare model outputs, etc.
Users can add new entries from a script or from JSON configuration files, via the add() call. New directory entries with the same name are merged into existing ones for each added config.
Uses the file "sibila/res/base_models.json" for the initial defaults, which the user can augment by calling setup() with their own config files or by adding model configs directly with set_model().
An example of a model directory JSON config file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
= Format Directory =
Detects chat templates from the model name/filename, or uses the template from the model's metadata if possible.
This directory can be setup from a JSON file or by calling set_format().
Any new entries with the same name replace previous ones on each new call.
Initializes from file \"sibila/res/base_formats.json\".
Example of a \"formats.json\" file:
{\n \"chatml\": {\n # template is a Jinja template for this model\n \"template\": \"{% for message in messages %}...\"\n },\n\n \"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n }, \n\n \"phi\": {\n \"match\": \"phi\",\n \"template\": \"...\"\n },\n\n \"phi2\": \"phi\",\n # this is a link: \"phi2\" -> \"phi\"\n}\n
Jinja2 templates receive a standard ChatML messages list (created from a Thread) and must deal with the following:
- In models that don't use a system message, the template must take care of prepending it to the first user message.
- The add_generation_prompt template variable is always set to True.
"},{"location":"api-reference/models/#sibila.Models.setup","title":"setup classmethod
","text":"setup(\n path=None,\n clear=False,\n add_cwd=True,\n load_base=True,\n load_from_env=True,\n)\n
Initialize the models and formats directory from a given model files folder and/or the configuration files it contains. Paths can start with "~/", meaning the current account's home directory.
Parameters:
Name Type Description Default path
Optional[Union[str, list[str]]]
Path to a folder or to "models.json" or "formats.json" configuration files. Defaults to None, which tries to initialize from the defaults and the env variable.
None
clear
bool
Set to clear existing directories before loading from path arg.
False
add_cwd
bool
Add current working directory to search path.
True
load_base
bool
Whether to load \"base_models.json\" and \"base_formats.json\" from \"sibila/res\" folder.
True
load_from_env
bool
Load from SIBILA_MODELS env variable?
True
Source code in sibila/models.py
@classmethod\ndef setup(cls,\n path: Optional[Union[str,list[str]]] = None,\n clear: bool = False,\n add_cwd: bool = True,\n load_base: bool = True,\n load_from_env: bool = True):\n \"\"\"Initialize models and formats directory from given model files folder and/or contained configuration files.\n Path can start with \"~/\" current account's home directory.\n\n Args:\n path: Path to a folder or to \"models.json\" or \"formats.json\" configuration files. Defaults to None which tries to initialize from defaults and env variable.\n clear: Set to clear existing directories before loading from path arg.\n add_cwd: Add current working directory to search path.\n load_base: Whether to load \"base_models.json\" and \"base_formats.json\" from \"sibila/res\" folder.\n load_from_env: Load from SIBILA_MODELS env variable?\n \"\"\"\n\n if clear:\n cls.clear()\n\n cls._ensure(add_cwd, \n load_base,\n load_from_env)\n\n if path is not None:\n if isinstance(path, str):\n path_list = [path]\n else:\n path_list = path\n\n cls._read_any(path_list)\n
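A minimal setup sketch (the folder path is an assumption about where your GGUF files live):

from sibila import Models

# point the directory at a folder holding GGUF files and optional
# models.json / formats.json configuration files (path is an assumption)
Models.setup("../../models")

model = Models.create("llamacpp:openchat")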
"},{"location":"api-reference/models/#sibila.Models.create","title":"create classmethod
","text":"create(res_name, genconf=None, ctx_len=None, **over_args)\n
Create a model.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required genconf
Optional[GenConf]
Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Overrides directory defaults. Defaults to None.
None
over_args
Union[Any]
Model-specific creation args, which will override default args set in model directory.
{}
Returns:
Name Type Description Model
Model
the initialized model.
Source code in sibila/models.py
@classmethod\ndef create(cls,\n res_name: str,\n\n # common to all providers\n genconf: Optional[GenConf] = None,\n ctx_len: Optional[int] = None,\n\n # model-specific overriding:\n **over_args: Union[Any]) -> Model:\n \"\"\"Create a model.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n genconf: Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.\n ctx_len: Maximum context length to be used. Overrides directory defaults. Defaults to None.\n over_args: Model-specific creation args, which will override default args set in model directory.\n\n Returns:\n Model: the initialized model.\n \"\"\"\n\n cls._ensure() \n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(cls.models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = cls.models_dir[provider]\n\n if name in prov:\n model_args = prov[name]\n\n # _default(if any) <- model_args <- over_args\n args = (prov.get(cls.DEFAULT_ENTRY_NAME)).copy() or {}\n args.update(model_args) \n args.update(over_args)\n\n else: \n prov_conf = cls.PROVIDER_CONF[provider] \n\n if \"name_passthrough\" in prov_conf[\"flags\"]:\n model_args = {\n \"name\": name \n }\n else:\n raise ValueError(f\"Model '{name}' not found in provider '{provider}'\")\n\n args = {}\n args.update(model_args)\n args.update(over_args)\n\n\n # override genconf, ctx_len\n if genconf is None:\n genconf = cls.genconf\n\n if genconf is not None:\n args[\"genconf\"] = genconf\n\n elif \"genconf\" in args and isinstance(args[\"genconf\"], dict):\n # transform dict into a GenConf instance:\n args[\"genconf\"] = GenConf.from_dict(args[\"genconf\"])\n\n if ctx_len is not None:\n args[\"ctx_len\"] = ctx_len\n\n logger.debug(f\"Creating model '{provider}:{name}' with resolved args: {args}\")\n\n\n model: Model\n if provider == \"llamacpp\":\n\n # resolve filename -> path\n path = cls._locate_file(args[\"name\"])\n if path is None:\n raise FileNotFoundError(f\"File not found in '{res_name}' while looking for file '{args['name']}'. Make sure you called Models.setup() with a path to the file's folder\")\n\n logger.debug(f\"Resolved llamacpp model '{args['name']}' to '{path}'\")\n\n del args[\"name\"]\n args[\"path\"] = path\n\n from .llamacpp import LlamaCppModel\n\n model = LlamaCppModel(**args)\n\n\n elif provider == \"openai\":\n\n from .openai import OpenAIModel\n\n model = OpenAIModel(**args)\n\n \"\"\"\n elif provider == \"hf\":\n from .hf import HFModel\n\n model = HFModel(**args)\n \"\"\"\n\n return model\n
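For example, a sketch of overriding the directory defaults at creation time (paths and entry names are assumptions):

from sibila import Models, GenConf

Models.setup("../../models")   # assumed models folder

# genconf and ctx_len override any defaults stored in the model directory
model = Models.create("llamacpp:openchat",
                      genconf=GenConf(temperature=0.2),
                      ctx_len=2048)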
"},{"location":"api-reference/models/#sibila.Models.add_models_search_path","title":"add_models_search_path classmethod
","text":"add_models_search_path(path)\n
Prepends new paths to the model files search path.
Parameters:
Name Type Description Default path
Union[str, list[str]]
A path or list of paths to add to model search path.
required Source code in sibila/models.py
@classmethod\ndef add_models_search_path(cls,\n path: Union[str,list[str]]):\n \"\"\"Prepends new paths to model files search path.\n\n Args:\n path: A path or list of paths to add to model search path.\n \"\"\"\n\n cls._ensure()\n\n prepend_path(cls.models_search_path, path)\n\n logger.debug(f\"Adding '{path}' to search_path\")\n
"},{"location":"api-reference/models/#sibila.Models.set_genconf","title":"set_genconf classmethod
","text":"set_genconf(genconf)\n
Set the GenConf to use as default for model creation.
Parameters:
Name Type Description Default genconf
GenConf
Model generation configuration.
required Source code in sibila/models.py
@classmethod\ndef set_genconf(cls,\n genconf: GenConf):\n \"\"\"Set the GenConf to use as default for model creation.\n\n Args:\n genconf: Model generation configuration.\n \"\"\"\n cls.genconf = genconf\n
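A minimal sketch (the model entry name is an assumption):

from sibila import Models, GenConf

# models created afterwards default to deterministic generation
Models.set_genconf(GenConf(temperature=0.0))

model = Models.create("openai:gpt-4")   # assumed directory entry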
"},{"location":"api-reference/models/#sibila.Models.list_models","title":"list_models classmethod
","text":"list_models(name_query, providers, resolved_values)\n
List model entries matching the query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match model names. Empty string for all.
required providers
list[str]
Filter by these exact provider names. Empty list for all.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are model res_names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_models(cls,\n name_query: str,\n providers: list[str],\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match model names. Empty string for all.\n providers: Filter by these exact provider names. Empty list for all.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are model res_names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n out = {}\n\n name_query = name_query.lower()\n\n for prov_name in cls.models_dir:\n\n if providers and prov_name not in providers:\n continue\n\n prov_dic = cls.models_dir[prov_name]\n\n for name in prov_dic:\n\n if name == cls.DEFAULT_ENTRY_NAME:\n continue\n\n if name_query and name_query not in name.lower():\n continue\n\n entry_res_name = prov_name + \":\" + name\n\n if resolved_values:\n res = cls.get_model_entry(entry_res_name) # type: ignore[assignment]\n if res is None:\n continue\n else:\n val = res[1]\n else:\n val = prov_dic[name]\n\n out[entry_res_name] = val\n\n return out\n
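A minimal sketch of querying the directory (which entries appear depends on your setup):

from sibila import Models

# all llamacpp entries whose name contains "chat", with resolved values
entries = Models.list_models(name_query="chat",
                             providers=["llamacpp"],
                             resolved_values=True)
for res_name, entry in entries.items():
    print(res_name, entry)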
"},{"location":"api-reference/models/#sibila.Models.get_model_entry","title":"get_model_entry classmethod
","text":"get_model_entry(res_name)\n
Get a resolved model entry. Resolved means following any links.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required Returns:
Type Description Union[tuple[str, dict], None]
Resolved entry (res_name,dict) or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_model_entry(cls,\n res_name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved model entry. Resolved means following any links.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n\n Returns:\n Resolved entry (res_name,dict) or None if not found.\n \"\"\"\n\n cls._ensure() \n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(cls.models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = cls.models_dir[provider]\n\n if name in prov:\n return provider + \":\" + name, prov[name]\n else:\n return None\n
"},{"location":"api-reference/models/#sibila.Models.has_model_entry","title":"has_model_entry classmethod
","text":"has_model_entry(res_name)\n
Source code in sibila/models.py
@classmethod\ndef has_model_entry(cls,\n res_name: str) -> bool:\n return cls.get_model_entry(res_name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.set_model","title":"set_model classmethod
","text":"set_model(\n res_name, model_name, format_name=None, genconf=None\n)\n
Add model configuration for given res_name.
Parameters:
Name Type Description Default res_name
str
A name in the form "provider:model_name", for example "openai:gpt-4".
required model_name
str
Model name or filename identifier.
required format_name
Optional[str]
Format name used by model. Defaults to None.
None
genconf
Optional[GenConf]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model(cls,\n res_name: str,\n model_name: str,\n format_name: Optional[str] = None,\n genconf: Optional[GenConf] = None):\n \"\"\"Add model configuration for given res_name.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier.\n format_name: Format name used by model. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry: dict = {\n \"name\": model_name\n }\n\n if format_name:\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n\n if genconf:\n entry[\"genconf\"] = genconf.as_dict()\n\n cls.models_dir[provider][name] = entry\n
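A minimal sketch registering a hypothetical entry (the entry name and GGUF filename are assumptions):

from sibila import Models

# register a new local model entry using the "openchat" chat template format;
# the entry name and GGUF filename below are hypothetical
Models.set_model("llamacpp:mymodel",
                 "my-model.Q4_K_M.gguf",
                 format_name="openchat")

model = Models.create("llamacpp:mymodel")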
"},{"location":"api-reference/models/#sibila.Models.update_model","title":"update_model classmethod
","text":"update_model(\n res_name,\n model_name=None,\n format_name=None,\n genconf=None,\n)\n
Update model fields.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gtp-4\".
required model_name
Optional[str]
Model name or filename identifier. Defaults to None.
None
format_name
Optional[str]
Format name used by model. Use \"\" to delete. Defaults to None.
None
genconf
Union[GenConf, str, None]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef update_model(cls,\n res_name: str,\n model_name: Optional[str] = None,\n format_name: Optional[str] = None,\n genconf: Union[GenConf,str,None] = None):\n\n \"\"\"update model fields\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier. Defaults to None.\n format_name: Format name used by model. Use \"\" to delete. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry = cls.models_dir[provider][name]\n\n if model_name:\n entry[\"name\"] = model_name\n\n if format_name is not None:\n if format_name != \"\":\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n else:\n del entry[\"format\"]\n\n if genconf is not None:\n if genconf != \"\":\n entry[\"genconf\"] = genconf\n else:\n del entry[\"genconf\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_model_link","title":"set_model_link classmethod
","text":"set_model_link(res_name, link_name)\n
Create a model link into another model.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gtp-4\".
required link_name
str
Name of model this entry links to.
required Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model_link(cls,\n res_name: str,\n link_name: str):\n \"\"\"Create a model link into another model.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n link_name: Name of model this entry links to.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, True)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n # first: ensure link_name is a res_name\n if ':' not in link_name:\n link_name = provider + \":\" + link_name\n\n if not cls.has_model_entry(link_name):\n raise ValueError(f\"Could not find linked model '{link_name}'\")\n\n # second: check link name is without provider if same\n link_split = link_name.split(\":\")\n if len(link_split) == 2:\n if link_split[0] == provider: # remove same \"provider:\"\n link_name = link_split[1]\n\n cls.models_dir[provider][name] = link_name\n
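For example, a sketch that creates an alias pointing to an existing entry - it assumes \"llamacpp:openchat\" is already configured:
from sibila import Models\n\n# assumption: \"llamacpp:openchat\" already exists in the models directory\nModels.set_model_link(\"llamacpp:default_chat\", \"openchat\")\n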
"},{"location":"api-reference/models/#sibila.Models.delete_model","title":"delete_model classmethod
","text":"delete_model(res_name)\n
Delete a model entry.
Parameters:
Name Type Description Default res_name
str
Model entry in the form \"provider:name\".
required Source code in sibila/models.py
@classmethod\ndef delete_model(cls,\n res_name: str):\n \"\"\"Delete a model entry.\n\n Args:\n res_name: Model entry in the form \"provider:name\".\n \"\"\"\n\n cls._ensure()\n\n provider,name = resolve_model(cls.models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n\n prov = cls.models_dir[provider]\n\n del prov[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_models","title":"save_models classmethod
","text":"save_models(path=None)\n
Source code in sibila/models.py
@classmethod\ndef save_models(cls,\n path: Optional[str] = None):\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"models.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n json.dump(cls.models_dir, f, indent=4)\n\n return path\n
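save_models() writes the current models directory to a JSON file. A sketch with an explicit path (the path itself is only an illustration - with no argument, the single folder in the models search path is used):
from sibila import Models\n\n# hypothetical output path - adjust to your setup\nsaved_path = Models.save_models(\"models/models.json\")\nprint(\"Saved to\", saved_path)\n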
"},{"location":"api-reference/models/#sibila.Models.list_formats","title":"list_formats classmethod
","text":"list_formats(name_query, resolved_values)\n
List format entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match format names. Empty string for all.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are format names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_formats(cls,\n name_query: str,\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match format names. Empty string for all.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are format names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n out = {}\n\n name_query = name_query.lower()\n\n for name in cls.formats_dir.keys():\n\n if name_query and name_query not in name.lower():\n continue\n\n val = cls.formats_dir[name]\n\n if resolved_values:\n res = cls.get_format_entry(name)\n if res is None:\n continue\n else:\n val = res[1]\n\n out[name] = val\n\n return out\n
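For example, a sketch that lists every format entry whose name contains \"chat\" (the query string is only an illustration - an empty string would list all entries):
from sibila import Models\n\nfor name, entry in Models.list_formats(\"chat\", resolved_values=True).items():\n print(name, entry)\n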
"},{"location":"api-reference/models/#sibila.Models.get_format_entry","title":"get_format_entry classmethod
","text":"get_format_entry(name)\n
Get a resolved format entry by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple of (resolved_name, format_entry), or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved format entry by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Tuple of (resolved_name, format_entry).\n \"\"\"\n\n cls._ensure()\n\n return get_format_entry(cls.formats_dir, name)\n
"},{"location":"api-reference/models/#sibila.Models.has_format_entry","title":"has_format_entry classmethod
","text":"has_format_entry(name)\n
Source code in sibila/models.py
@classmethod\ndef has_format_entry(cls,\n name: str) -> bool:\n return cls.get_format_entry(name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.get_format_template","title":"get_format_template classmethod
","text":"get_format_template(name)\n
Get a format template by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[str, None]
Resolved format template str, or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Get a format template by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Resolved format template str.\n \"\"\"\n\n res = cls.get_format_entry(name)\n return None if res is None else res[1][\"template\"]\n
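A sketch that fetches a template by name - the \"chatml\" format name is an assumption and may differ in your formats directory:
from sibila import Models\n\n# assumption: a \"chatml\" format entry exists\ntemplate = Models.get_format_template(\"chatml\")\nif template is not None:\n print(template[:80])\n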
"},{"location":"api-reference/models/#sibila.Models.match_format_entry","title":"match_format_entry classmethod
","text":"match_format_entry(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple (name, format_entry) where name is a resolved name. Or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Tuple (name, format_entry) where name is a resolved name. Or None if none found.\n \"\"\"\n\n cls._ensure()\n\n return search_format(cls.formats_dir, name)\n
"},{"location":"api-reference/models/#sibila.Models.match_format_template","title":"match_format_template classmethod
","text":"match_format_template(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[str, None]
Format template or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Format template or None if none found.\n \"\"\"\n\n res = cls.match_format_entry(name)\n\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.is_format_supported","title":"is_format_supported classmethod
","text":"is_format_supported(model_id)\n
Checks if there's template support for a model with this name.
Parameters:
Name Type Description Default model_id
str
Model filename or general name.
required Returns:
Type Description bool
True if Models knows the format.
Source code in sibila/models.py
@classmethod\ndef is_format_supported(cls,\n model_id: str) -> bool:\n \"\"\"Checks if there's template support for a model with this name.\n\n Args:\n model_id: Model filename or general name.\n\n Returns:\n True if Models knows the format.\n \"\"\"\n\n return cls.match_format_entry(model_id) is not None\n
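For example, a quick sketch checking a model filename before downloading it (the filename is the one used earlier in this documentation):
from sibila import Models\n\nprint(Models.is_format_supported(\"openchat-3.5-1210.Q4_K_M.gguf\"))\n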
"},{"location":"api-reference/models/#sibila.Models.set_format","title":"set_format classmethod
","text":"set_format(name, match, template)\n
Add a format entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required match
str
Regex that matches names/filenames that use this format.
required template
str
The chat template, in Jinja2 format.
required Source code in sibila/models.py
@classmethod\ndef set_format(cls,\n name: str,\n match: str,\n template: str):\n \"\"\"Add a format entry to the format directory.\n\n Args:\n name: Format entry name.\n match: Regex that matches names/filenames that use this format.\n template: The Chat template format in Jinja2 format\n \"\"\"\n\n cls._ensure()\n\n if \"{{\" not in template: # a link_name for the template\n if not cls.has_format_entry(template):\n raise ValueError(f\"Could not find linked template entry '{template}'.\")\n\n entry = {\n \"match\": match,\n \"template\": template\n }\n cls.formats_dir[name] = entry \n
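For example, a sketch that registers a hypothetical format entry - the entry name, the match regex and the Jinja2 template below are illustrative only:
from sibila import Models\n\n# hypothetical format entry for models whose filename matches \"my-model.*gguf\"\ntemplate = \"{{ bos_token }}{% for m in messages %}{{ m['role'] }}: {{ m['content'] }} {% endfor %}\"\nModels.set_format(\"myformat\", \"my-model.*gguf\", template)\n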
"},{"location":"api-reference/models/#sibila.Models.set_format_link","title":"set_format_link classmethod
","text":"set_format_link(name, link_name)\n
Add a format link entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required link_name
str
Name of format that this entry links to.
required Source code in sibila/models.py
@classmethod\ndef set_format_link(cls,\n name: str,\n link_name: str):\n \"\"\"Add a format link entry to the format directory.\n\n Args:\n name: Format entry name.\n link_name: Name of format that this entry links to.\n \"\"\"\n\n cls._ensure()\n\n if not cls.has_format_entry(link_name):\n raise ValueError(f\"Could not find linked entry '{link_name}'.\")\n\n cls.formats_dir[name] = link_name\n
"},{"location":"api-reference/models/#sibila.Models.delete_format","title":"delete_format classmethod
","text":"delete_format(name)\n
Delete a format entry.
Parameters:
Name Type Description Default name
str
Format entry name.
required Source code in sibila/models.py
@classmethod\ndef delete_format(cls,\n name: str):\n \"\"\"Delete a format entry.\n\n Args:\n name: Format entry name.\n \"\"\"\n\n cls._ensure()\n\n if not cls.has_format_entry(name):\n raise ValueError(f\"Format name '{name}' not found.\")\n\n del cls.formats_dir[name]\n
"},{"location":"api-reference/models/#sibila.Models.merge_from","title":"merge_from classmethod
","text":"merge_from(path, preserve_current=True)\n
Source code in sibila/models.py
@classmethod\ndef merge_from(cls,\n path: str,\n preserve_current: bool = True):\n path = expand_path(path)\n\n if preserve_current:\n with open(path, \"r\", encoding=\"utf-8\") as f:\n new_dir = json.load(f)\n new_dir.update(cls.formats_dir)\n cls.formats_dir = new_dir\n\n else: # normal update: new with the same name will override current\n update_dir_json(cls.formats_dir, path)\n\n sanity_check_formats(cls.formats_dir)\n
"},{"location":"api-reference/models/#sibila.Models.save_formats","title":"save_formats classmethod
","text":"save_formats(path=None)\n
Source code in sibila/models.py
@classmethod\ndef save_formats(cls,\n path: Optional[str] = None):\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"formats.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n json.dump(cls.formats_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.info","title":"info classmethod
","text":"info(verbose=False)\n
Return information about current setup.
Parameters:
Name Type Description Default verbose
bool
If False, formats directory values are abbreviated. Defaults to False.
False
Returns:
Type Description str
Textual information about the current setup.
Source code in sibila/models.py
@classmethod\ndef info(cls,\n verbose: bool = False) -> str:\n \"\"\"Return information about current setup.\n\n Args:\n verbose: If False, formats directory values are abbreviated. Defaults to False.\n\n Returns:\n Textual information about the current setup.\n \"\"\"\n\n cls._ensure()\n\n out = \"\"\n\n out += f\"Models search path: {cls.models_search_path}\\n\"\n out += f\"Models directory:\\n{pformat(cls.models_dir, sort_dicts=False)}\\n\"\n out += f\"Model Genconf:\\n{cls.genconf}\\n\"\n\n if not verbose:\n fordir = {}\n for key in cls.formats_dir:\n fordir[key] = copy(cls.formats_dir[key])\n if isinstance(fordir[key], dict) and \"template\" in fordir[key]:\n fordir[key][\"template\"] = fordir[key][\"template\"][:14] + \"...\"\n else:\n fordir = cls.formats_dir\n\n out += f\"Formats directory:\\n{pformat(fordir)}\"\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.clear","title":"clear classmethod
","text":"clear()\n
Clear directories. Member genconf is not cleared.
Source code in sibila/models.py
@classmethod\ndef clear(cls):\n \"\"\"Clear directories. Member genconf is not cleared.\"\"\"\n cls.models_dir = None\n cls.models_search_path = []\n cls.formats_dir = None\n
"},{"location":"api-reference/multigen/","title":"Multigen","text":""},{"location":"api-reference/multigen/#sibila.multigen","title":"multigen","text":"Functions for comparing output across models.
- thread_multigen(), query_multigen() and multigen(): Compare outputs across models.
- cycle_gen_print(): For a list of models, sequentially grow a Thread with model responses to given IN messages.
"},{"location":"api-reference/multigen/#sibila.multigen.thread_multigen","title":"thread_multigen","text":"thread_multigen(\n threads,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n thread_titles=None,\n)\n
Generate a single thread on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.
None
csv
Union[str, list[str], None]
An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
thread_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def thread_multigen(threads: list[Thread],\n model_names: list[str],\n\n text: Union[str,list[str],None] = None,\n csv: Union[str,list[str],None] = None,\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n thread_titles: Optional[list[str]] = None \n ) -> list[list[GenOut]]:\n \"\"\"Generate a single thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n thread_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n table = multigen(threads,\n model_names=model_names, \n gencall=gencall,\n genconf=genconf)\n\n # table[threads,models]\n\n if thread_titles is None:\n thread_titles = [str(th) for th in threads]\n\n def format(format_fn, cmds):\n if cmds is None or not cmds:\n return\n\n f = StringIO(newline='')\n\n format_fn(f,\n table, \n title_list=thread_titles,\n model_names=model_names,\n out_keys=out_keys)\n fmtd = f.getvalue()\n\n if not isinstance(cmds, list):\n cmds = [cmds]\n for c in cmds:\n if c == 'print':\n print(fmtd)\n else: # path\n with open(c, \"w\", encoding=\"utf-8\") as f:\n f.write(fmtd)\n\n format(format_text, text)\n format(format_csv, csv)\n\n return table\n
"},{"location":"api-reference/multigen/#sibila.multigen.query_multigen","title":"query_multigen","text":"query_multigen(\n in_list,\n inst_text,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n in_titles=None,\n)\n
Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.
None
csv
Union[str, list[str], None]
An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
in_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)).
list[list[GenOut]]
For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def query_multigen(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n text: Union[str,list[str],None] = None, # \"print\", path\n csv: Union[str,list[str],None] = None, # \"print\", path\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n in_titles: Optional[list[str]] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n in_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). \n For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\" \n\n th_list = []\n for in_text in in_list:\n th = Thread.make_INST_IN(inst_text, in_text)\n th_list.append(th)\n\n if in_titles is None:\n in_titles = in_list\n\n out = thread_multigen(th_list, \n model_names=model_names, \n text=text,\n csv=csv,\n gencall=gencall,\n genconf=genconf,\n out_keys=out_keys,\n thread_titles=in_titles)\n\n return out\n
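A minimal sketch comparing two models on the same inputs - the model names are assumptions, any entries configured in your models directory will work:
from sibila.multigen import query_multigen\n\n# assumption: both model entries are configured in the models directory\nquery_multigen([\"Tell me briefly about oranges.\", \"Tell me briefly about lemons.\"],\n inst_text=\"Be helpful.\",\n model_names=[\"llamacpp:openchat\", \"openai:gpt-4\"],\n text=\"print\")\n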
"},{"location":"api-reference/multigen/#sibila.multigen.multigen","title":"multigen","text":"multigen(\n threads,\n *,\n models=None,\n model_names=None,\n model_names_del_after=True,\n gencall=None,\n genconf=None\n)\n
Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.
Actual generation for each model is implemented by the gencall arg Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required models
Optional[list[Model]]
A list of initialized models. Defaults to None.
None
model_names
Optional[list[str]]
--Or-- A list of Models names. Defaults to None.
None
model_names_del_after
bool
Delete model_names models after using them: important, otherwise an out-of-memory error will eventually happen. Defaults to True.
True
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
Raises:
Type Description ValueError
Only one of models or model_names can be given.
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def multigen(threads: list[Thread],\n *,\n models: Optional[list[Model]] = None, # existing models\n\n model_names: Optional[list[str]] = None,\n model_names_del_after: bool = True,\n\n gencall: Optional[Callable] = None,\n genconf: Optional[GenConf] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.\n\n Actual generation for each model is implemented by the gencall arg Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n models: A list of initialized models. Defaults to None.\n model_names: --Or-- A list of Models names. Defaults to None.\n model_names_del_after: Delete model_names models after using them: important or an out-of-memory error will eventually happen. Defaults to True.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n\n Raises:\n ValueError: Only one of models or model_names can be given.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n if not ((models is None) ^ ((model_names is None))):\n raise ValueError(\"Only one of models or model_names can be given\")\n\n if gencall is None:\n gencall = _default_gencall_text\n\n mod_count = len(models) if models is not None else len(model_names) # type: ignore[arg-type]\n\n all_out = []\n\n for i in range(mod_count):\n if models is not None:\n model = models[i]\n logger.debug(f\"Model: {model.desc}\")\n else:\n name = model_names[i] # type: ignore[index]\n model = Models.create(name)\n logger.info(f\"Model: {name} -> {model.desc}\")\n\n mod_out = []\n for th in threads:\n out = gencall(model, th, genconf)\n\n mod_out.append(out)\n\n all_out.append(mod_out)\n\n if model_names_del_after and models is None:\n del model\n\n # all_out is currently shaped (M,T) -> transpose to (T,M), so that each row contains thread t for all models\n tout = []\n for t in range(len(threads)):\n tmout = [] # thread t for all models\n for m in range(mod_count):\n tmout.append(all_out[m][t])\n\n tout.append(tmout)\n\n return tout\n
"},{"location":"api-reference/multigen/#sibila.multigen.cycle_gen_print","title":"cycle_gen_print","text":"cycle_gen_print(\n in_list,\n inst_text,\n model_names,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n json_kwargs={\n \"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False,\n },\n)\n
For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.
Works by doing:
1. Generate an INST+IN prompt for a list of models. (Same INST for all).
2. Append the output of each model to its own Thread.
3. Append the next IN prompt and generate again. Back to 2.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
json_kwargs
dict
JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.
{'indent': 2, 'sort_keys': False, 'ensure_ascii': False}
Source code in sibila/multigen.py
def cycle_gen_print(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n json_kwargs: dict = {\"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False}\n ):\n \"\"\"For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.\n\n Works by doing:\n\n 1. Generate an INST+IN prompt for a list of models. (Same INST for all).\n 2. Append the output of each model to its own Thread.\n 3. Append the next IN prompt and generate again. Back to 2.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n json_kwargs: JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n if gencall is None:\n gencall = _default_gencall_text\n\n\n n_model = len(model_names)\n n_ins = len(in_list)\n\n for m in range(n_model):\n\n name = model_names[m]\n model = Models.create(name)\n\n print('=' * 80)\n print(f\"Model: {name} -> {model.desc}\")\n\n th = Thread(inst=inst_text)\n\n for i in range(n_ins):\n in_text = in_list[i]\n print(f\"IN: {in_text}\")\n\n th += (MsgKind.IN, in_text)\n\n out = gencall(model, th, genconf)\n\n out_dict = out.as_dict()\n\n print(\"OUT\")\n\n for k in out_keys:\n\n if k in out_dict and out_dict[k] is not None:\n\n if k != out_keys[0]: # not first\n print(\"-\" * 20)\n\n val = nice_print(k, out_dict[k], json_kwargs)\n print(val)\n\n th += (MsgKind.OUT, out.text)\n\n del model\n
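A minimal sketch follows - again, the model names are assumptions and should match entries in your models directory:
from sibila.multigen import cycle_gen_print\n\ncycle_gen_print([\"Hello there!\", \"What's your favourite color?\"],\n inst_text=\"You speak like a pirate.\",\n model_names=[\"llamacpp:openchat\", \"openai:gpt-4\"])\n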
"},{"location":"api-reference/thread/","title":"Threads, messages, context","text":""},{"location":"api-reference/thread/#sibila.Thread","title":"Thread","text":"Thread(t=None, inst='', join_sep='\\n')\n
A sequence of messages alternating between IN (\"user\" role) and OUT (\"assistant\" role).
Stores special initial INST information (known as the \"system\" role in ChatML) providing instructions to the model. Some models don't use system instructions - in those cases it's prepended to the first IN message.
Messages are kept in a strict IN,OUT,IN,OUT,... order. To enforce this, if two IN messages are added, the second just appends to the text of the first.
Examples:
Creation with a list of messages
>>> from sibila import Thread, MsgKind\n>>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")],\n... inst=\"Be helpful.\")\n>>> print(th)\ninst=\u2588Be helpful.\u2588, sep='\\n', len=2\n0: IN=\u2588Hello model!\u2588\n1: OUT=\u2588Hello there human!\u2588\n
Adding messages
>>> from sibila import Thread, MsgKind\n>>> th = Thread(inst=\"Be helpful.\")\n>>> th.add(MsgKind.IN, \"Can you teach me how to cook?\")\n>>> th.add_IN(\"I mean really cook as a chef?\") # gets appended\n>>> print(th)\ninst=\u2588Be helpful.\u2588, sep='\\n', len=1\n0: IN=\u2588Can you teach me how to cook?\\nI mean really cook as a chef?\u2588\n
Another way to add a message
>>> from sibila import Thread, MsgKind\n>>> th = Thread(inst=\"Be informative.\")\n>>> th.add_IN(\"Tell me about kangaroos, please?\")\n>>> th += \"They are so impressive.\" # appends text to last message\n>>> print(th)\ninst=\u2588Be informative.\u2588, sep='\\n', len=1\n0: IN=\u2588Tell me about kangaroos, please?\\nThey are so impressive.\u2588\n
Return thread as a ChatML message list
>>> from sibila import Thread, MsgKind\n>>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")], \n... inst=\"Be helpful.\")\n>>> th.as_chatml()\n[{'role': 'system', 'content': 'Be helpful.'},\n {'role': 'user', 'content': 'Hello model!'},\n {'role': 'assistant', 'content': 'Hello there human!'}]\n
Parameters:
Name Type Description Default t
Optional[Union[Any, list, str, dict, tuple]]
Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.
None
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Raises:
Type Description TypeError
On invalid args passed.
Source code in sibila/thread.py
def __init__(self,\n t: Optional[Union[Any,list,str,dict,tuple]] = None, # Any=Thread\n inst: str = '',\n join_sep: str = \"\\n\"):\n \"\"\"\n Examples:\n Creation with a list of messages\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")],\n ... inst=\"Be helpful.\")\n >>> print(th)\n inst=\u2588Be helpful.\u2588, sep='\\\\n', len=2\n 0: IN=\u2588Hello model!\u2588\n 1: OUT=\u2588Hello there human!\u2588\n\n Adding messages\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread(inst=\"Be helpful.\")\n >>> th.add(MsgKind.IN, \"Can you teach me how to cook?\")\n >>> th.add_IN(\"I mean really cook as a chef?\") # gets appended\n >>> print(th)\n inst=\u2588Be helpful.\u2588, sep='\\\\n', len=1\n 0: IN=\u2588Can you teach me how to cook?\\\\nI mean really cook as a chef?\u2588\n\n Another way to add a message\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread(inst=\"Be informative.\")\n >>> th.add_IN(\"Tell me about kangaroos, please?\")\n >>> th += \"They are so impressive.\" # appends text to last message\n >>> print(th)\n inst=\u2588Be informative.\u2588, sep='\\\\n', len=1\n 0: IN=\u2588Tell me about kangaroos, please?\\\\nThey are so impressive.\u2588\n\n Return thread as a ChatML message list\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")], \n ... inst=\"Be helpful.\")\n >>> th.as_chatml()\n [{'role': 'system', 'content': 'Be helpful.'},\n {'role': 'user', 'content': 'Hello model!'},\n {'role': 'assistant', 'content': 'Hello there human!'}]\n\n Args:\n t: Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n\n Raises:\n TypeError: On invalid args passed.\n \"\"\"\n\n if isinstance(t, Thread):\n self._msgs = t._msgs.copy()\n self.inst = t.inst\n self.join_sep = t.join_sep\n else:\n self._msgs = []\n self.inst = inst\n self.join_sep = join_sep\n\n if t is not None:\n self.concat(t)\n
"},{"location":"api-reference/thread/#sibila.Thread.clear","title":"clear","text":"clear()\n
Delete all messages and clear inst.
Source code in sibila/thread.py
def clear(self):\n \"\"\"Delete all messages and clear inst.\"\"\"\n self.inst = \"\"\n self._msgs = []\n
"},{"location":"api-reference/thread/#sibila.Thread.last_kind","title":"last_kind property
","text":"last_kind\n
Get the kind of the last message in the thread.
Returns:
Type Description MsgKind
Kind of last message or MsgKind.IN if empty.
"},{"location":"api-reference/thread/#sibila.Thread.last_text","title":"last_text property
","text":"last_text\n
Get the text of the last message in the thread.
Returns:
Type Description str
Last message text.
Raises:
Type Description IndexError
If thread is empty.
"},{"location":"api-reference/thread/#sibila.Thread.inst","title":"inst instance-attribute
","text":"inst\n
Text for system instructions, defaults to empty string
"},{"location":"api-reference/thread/#sibila.Thread.add","title":"add","text":"add(t, text=None)\n
Add a message to Thread by parsing a mix of types.
Accepts any of these argument combinations:
- t=MsgKind, text=str
- t=str, text=None -> uses last thread message's MsgKind
- (MsgKind, text)
- {\"kind\": \"...\", text: \"...\"}
- {\"role\": \"...\", content: \"...\"} - ChatML format
Parameters:
Name Type Description Default t
Union[str, tuple, dict, MsgKind]
One of the accepted types listed above.
required text
Optional[str]
Message text if first type is MsgKind. Defaults to None.
None
Source code in sibila/thread.py
def add(self, \n t: Union[str,tuple,dict,MsgKind],\n text: Optional[str] = None):\n \"\"\"Add a message to Thread by parsing a mix of types.\n\n Accepts any of these argument combinations:\n\n - t=MsgKind, text=str\n - t=str, text=None -> uses last thread message's MsgKind\n - (MsgKind, text)\n - {\"kind\": \"...\", text: \"...\"}\n - {\"role\": \"...\", content: \"...\"} - ChatML format\n\n Args:\n t: One of the accepted types listed above.\n text: Message text if first type is MsgKind. Defaults to None.\n \"\"\"\n\n kind, text = self._parse_msg(t, text)\n\n if kind == MsgKind.INST:\n self.inst = self.join_text(self.inst, text)\n else:\n if kind == self.last_kind and len(self._msgs):\n self._msgs[-1] = self.join_text(self._msgs[-1], text)\n else:\n self._msgs.append(text) # in new kind\n
"},{"location":"api-reference/thread/#sibila.Thread.addx","title":"addx","text":"addx(path=None, text=None, kind=None)\n
Add message with text from a supplied arg or loaded from a path.
Parameters:
Name Type Description Default path
Optional[str]
If given, text is loaded from an UTF-8 file in this path. Defaults to None.
None
text
Optional[str]
If given, text is added. Defaults to None.
None
kind
Optional[MsgKind]
MsgKind of message. If not given or the same as last thread message, it's appended to it. Defaults to None.
None
Source code in sibila/thread.py
def addx(self, \n path: Optional[str] = None, \n text: Optional[str] = None,\n kind: Optional[MsgKind] = None):\n \"\"\"Add message with text from a supplied arg or loaded from a path.\n\n Args:\n path: If given, text is loaded from an UTF-8 file in this path. Defaults to None.\n text: If given, text is added. Defaults to None.\n kind: MsgKind of message. If not given or the same as last thread message, it's appended to it. Defaults to None.\n \"\"\"\n\n assert (path is not None) ^ (text is not None), \"Only one of path or text\"\n\n if path is not None:\n with open(path, 'r', encoding=\"utf-8\") as f:\n text = f.read()\n\n if kind is None: # use last message role, so that it gets appended\n kind = self.last_kind\n\n self.add(kind, text)\n
"},{"location":"api-reference/thread/#sibila.Thread.get_text","title":"get_text","text":"get_text(index)\n
Return text for message at index.
Parameters:
Name Type Description Default index
int
Message index. Use -1 to get inst value.
required Returns:
Type Description str
Message text at index.
Source code in sibila/thread.py
def get_text(self,\n index: int) -> str:\n \"\"\"Return text for message at index.\n\n Args:\n index: Message index. Use -1 to get inst value.\n\n Returns:\n Message text at index.\n \"\"\" \n if index == -1:\n return self.inst\n else:\n return self._msgs[index]\n
"},{"location":"api-reference/thread/#sibila.Thread.set_text","title":"set_text","text":"set_text(index, text)\n
Set text for message at index.
Parameters:
Name Type Description Default index
int
Message index. Use -1 to set inst value.
required text
str
Text to replace in message at index.
required Source code in sibila/thread.py
def set_text(self,\n index: int,\n text: str): \n \"\"\"Set text for message at index.\n\n Args:\n index: Message index. Use -1 to set inst value.\n text: Text to replace in message at index.\n \"\"\"\n if index == -1:\n self.inst = text\n else:\n self._msgs[index] = text\n
"},{"location":"api-reference/thread/#sibila.Thread.concat","title":"concat","text":"concat(t)\n
Concatenate a Thread or list of messages to the current Thread.
Take care that the other list starts with an IN message, therefore, if last message in self is also an IN kind, their text will be joined as in add().
Parameters:
Name Type Description Default t
Optional[Union[Self, list, str, dict, tuple]]
A Thread or a list of messages. Otherwise a single message as in add().
required Raises:
Type Description TypeError
If bad arg types provided.
Source code in sibila/thread.py
def concat(self,\n t: Optional[Union[Self,list,str,dict,tuple]]):\n \"\"\"Concatenate a Thread or list of messages to the current Thread.\n\n Take care that the other list starts with an IN message, therefore, \n if last message in self is also an IN kind, their text will be joined as in add().\n\n Args:\n t: A Thread or a list of messages. Otherwise a single message as in add().\n\n Raises:\n TypeError: If bad arg types provided.\n \"\"\"\n if isinstance(t, Thread):\n for msg in t:\n self.add(msg)\n self.inst = self.join_text(self.inst, t.inst)\n\n elif isinstance(t, list): # message list\n for msg in t:\n self.add(msg)\n\n elif isinstance(t, str) or isinstance(t, dict) or isinstance(t, tuple): # single message\n self.add(t)\n\n else:\n raise TypeError(\"Arg t must be: Thread --or-- list[messages] --or-- an str, tuple or dict single message.\")\n
"},{"location":"api-reference/thread/#sibila.Thread.load","title":"load","text":"load(path)\n
Load this Thread from a JSON file.
Parameters:
Name Type Description Default path
str
Path of file to load.
required Source code in sibila/thread.py
def load(self,\n path: str):\n \"\"\"Load this Thread from a JSON file.\n\n Args:\n path: Path of file to load.\n \"\"\"\n\n with open(path, 'r', encoding='utf-8') as f:\n js = f.read()\n state = json.loads(js)\n\n self._msgs = state[\"_msgs\"]\n self.inst = state[\"inst\"]\n self.join_sep = state[\"join_sep\"]\n
"},{"location":"api-reference/thread/#sibila.Thread.save","title":"save","text":"save(path)\n
Serialize this Thread to JSON.
Parameters:
Name Type Description Default path
str
Path of file to save into.
required Source code in sibila/thread.py
def save(self,\n path: str):\n \"\"\"Serialize this Thread to JSON.\n\n Args:\n path: Path of file to save into.\n \"\"\"\n\n state = {\"_msgs\": self._msgs,\n \"inst\": self.inst,\n \"join_sep\": self.join_sep\n }\n\n json_str = json.dumps(state, indent=2, default=vars)\n\n with open(path, 'w', encoding='utf-8') as f:\n f.write(json_str)\n
"},{"location":"api-reference/thread/#sibila.Thread.msg_as_chatml","title":"msg_as_chatml","text":"msg_as_chatml(index)\n
Returns message in a ChatML dict.
Parameters:
Name Type Description Default index
int
Index of the message to return.
required Returns:
Type Description dict
A ChatML dict with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def msg_as_chatml(self,\n index: int) -> dict:\n \"\"\"Returns message in a ChatML dict.\n\n Args:\n index: Index of the message to return.\n\n Returns:\n A ChatML dict with \"role\" and \"content\" keys.\n \"\"\"\n\n kind = Thread._kind_from_pos(index)\n role = MsgKind.chatml_role_from_kind(kind)\n text = self._msgs[index] if index >= 0 else self.inst\n return {\"role\": role, \"content\": text}\n
"},{"location":"api-reference/thread/#sibila.Thread.as_chatml","title":"as_chatml","text":"as_chatml()\n
Returns Thread as a list of ChatML messages.
Returns:
Type Description list[dict]
A list of ChatML dict elements with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self) -> list[dict]:\n \"\"\"Returns Thread as a list of ChatML messages.\n\n Returns:\n A list of ChatML dict elements with \"role\" and \"content\" keys.\n \"\"\"\n msgs = []\n\n for index,msg in enumerate(self._msgs):\n if index == 0 and self.inst:\n msgs.append(self.msg_as_chatml(-1))\n msgs.append(self.msg_as_chatml(index))\n\n return msgs\n
"},{"location":"api-reference/thread/#sibila.Thread.has_text_lower","title":"has_text_lower","text":"has_text_lower(text_lower)\n
Can the lowercase text be found in one of the messages?
Parameters:
Name Type Description Default text_lower
str
The lowercase text to search for in messages.
required Returns:
Type Description bool
True if such text was found.
Source code in sibila/thread.py
def has_text_lower(self,\n text_lower: str) -> bool:\n \"\"\"Can the lowercase text be found in one of the messages?\n\n Args:\n text_lower: The lowercase text to search for in messages.\n\n Returns:\n True if such text was found.\n \"\"\"\n for msg in self._msgs:\n if text_lower in msg.lower():\n return True\n\n return False \n
"},{"location":"api-reference/thread/#sibila.MsgKind","title":"MsgKind","text":"Enumeration for kinds of messages in a Thread.
"},{"location":"api-reference/thread/#sibila.MsgKind.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 0\n
Input message, from user.
"},{"location":"api-reference/thread/#sibila.MsgKind.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 1\n
Model output message.
"},{"location":"api-reference/thread/#sibila.MsgKind.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 2\n
Initial instructions for model.
"},{"location":"api-reference/thread/#sibila.Context","title":"Context","text":"Context(\n t=None,\n max_token_len=None,\n pinned_inst_text=\"\",\n join_sep=\"\\n\",\n)\n
A class based on Thread that manages total token length, so that it's kept under a certain value. Also supports a persistent inst (instructions) text.
Parameters:
Name Type Description Default t
Optional[Union[Thread, list, str, dict, tuple]]
Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.
None
max_token_len
Optional[int]
Maximum token count to use when trimming. Defaults to None, which will use max model context length.
None
pinned_inst_text
str
Pinned inst text which survives clear(). Defaults to \"\".
''
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Source code in sibila/context.py
def __init__(self, \n t: Optional[Union[Thread,list,str,dict,tuple]] = None, \n max_token_len: Optional[int] = None, \n pinned_inst_text: str = \"\",\n join_sep: str = \"\\n\"):\n \"\"\"\n Args:\n t: Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.\n max_token_len: Maximum token count to use when trimming. Defaults to None, which will use max model context length.\n pinned_inst_text: Pinned inst text which survives clear(). Defaults to \"\".\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n \"\"\"\n\n super().__init__(t,\n inst=pinned_inst_text,\n join_sep=join_sep)\n\n self.max_token_len = max_token_len\n\n self.pinned_inst_text = pinned_inst_text\n
"},{"location":"api-reference/thread/#sibila.Context.clear","title":"clear","text":"clear()\n
Delete all messages but reset inst to a pinned text if any.
Source code in sibila/context.py
def clear(self):\n \"\"\"Delete all messages but reset inst to a pinned text if any.\"\"\"\n super().clear() \n if self.pinned_inst_text is not None:\n self.inst = self.pinned_inst_text\n
"},{"location":"api-reference/thread/#sibila.Context.trim","title":"trim","text":"trim(trim_flags, model, *, max_token_len=None)\n
Trim context by selectively removing older messages until thread fits max_token_len.
Parameters:
Name Type Description Default trim_flags
Trim
Flags to guide selection of which messages to remove.
required model
Model
Model that will process the thread.
required max_token_len
Optional[int]
Cut messages until size is lower than this number. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to trim anything.
Returns:
Type Description bool
True if any context trimming occurred.
Source code in sibila/context.py
def trim(self,\n trim_flags: Trim,\n model: Model,\n *,\n max_token_len: Optional[int] = None,\n ) -> bool:\n \"\"\"Trim context by selectively removing older messages until thread fits max_token_len.\n\n Args:\n trim_flags: Flags to guide selection of which messages to remove.\n model: Model that will process the thread.\n max_token_len: Cut messages until size is lower than this number. Defaults to None.\n\n Raises:\n RuntimeError: If unable to trim anything.\n\n Returns:\n True if any context trimming occurred.\n \"\"\"\n\n if max_token_len is None:\n max_token_len = self.max_token_len\n\n if max_token_len is None:\n max_token_len = model.ctx_len\n\n # if genconf is None:\n # genconf = model.genconf \n # assert max_token_len < model.ctx_len, f\"max_token_len ({max_token_len}) must be < model's context size ({model.ctx_len}) - genconf.max_new_tokens\"\n\n if trim_flags == Trim.NONE: # no trimming\n return False\n\n thread = self.clone()\n\n any_trim = False\n\n while True:\n\n curr_len = model.token_len(thread)\n\n if curr_len <= max_token_len:\n break\n\n logger.debug(f\"len={curr_len} / max={max_token_len}\")\n\n if self.inst and trim_flags & Trim.INST:\n self.inst = ''\n any_trim = True\n logger.debug(f\"Cutting INST {self.inst[:80]} (...)\")\n continue\n\n # cut first possible message, starting from oldest first ones\n trimmed = False\n in_index = out_index = 0\n\n for index,m in enumerate(thread):\n kind,text = m\n\n if kind == MsgKind.IN:\n if trim_flags & Trim.IN:\n if not (trim_flags & Trim.KEEP_FIRST_IN and in_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting IN {text[:80]} (...)\")\n break\n in_index += 1\n\n elif kind == MsgKind.OUT:\n if trim_flags & Trim.OUT: \n if not (trim_flags & Trim.KEEP_FIRST_OUT and out_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting OUT {text[:80]} (...)\")\n break\n out_index += 1\n\n if not trimmed:\n # all thread messages were cycled but not a single could be cut, so size remains the same\n # arriving here we did all we could for trim_flags but could not remove any more\n raise RuntimeError(\"Unable to trim anything out of thread\")\n else:\n any_trim = True\n\n # while end\n\n\n if any_trim:\n self._msgs = thread._msgs\n\n return any_trim\n
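A sketch of keeping a chat context under a token budget - the model creation, the 512-token limit and the combination of Trim flags are illustrative:
from sibila import Models, Context, Trim\n\nmodel = Models.create(\"llamacpp:openchat\")\nctx = Context(max_token_len=512, pinned_inst_text=\"Be helpful.\")\nctx.add_IN(\"Tell me about kangaroos, please?\")\n# remove older IN messages if needed, but keep the first one\nctx.trim(Trim.IN | Trim.KEEP_FIRST_IN, model)\n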
"},{"location":"api-reference/thread/#sibila.Trim","title":"Trim","text":"Flags for Thread trimming.
"},{"location":"api-reference/thread/#sibila.Trim.NONE","title":"NONE class-attribute
instance-attribute
","text":"NONE = 0\n
No trimming.
"},{"location":"api-reference/thread/#sibila.Trim.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 1\n
Can remove INST message.
"},{"location":"api-reference/thread/#sibila.Trim.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 2\n
Can remove IN messages.
"},{"location":"api-reference/thread/#sibila.Trim.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 4\n
Can remove OUT messages.
"},{"location":"api-reference/thread/#sibila.Trim.KEEP_FIRST_IN","title":"KEEP_FIRST_IN class-attribute
instance-attribute
","text":"KEEP_FIRST_IN = 1024\n
If trimming IN messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Trim.KEEP_FIRST_OUT","title":"KEEP_FIRST_OUT class-attribute
instance-attribute
","text":"KEEP_FIRST_OUT = 2048\n
If trimming OUT messages, never remove first one.
"},{"location":"api-reference/tokenizer/","title":"Model tokenizers","text":""},{"location":"api-reference/tokenizer/#llamacpp","title":"LlamaCpp","text":""},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer","title":"LlamaCppTokenizer","text":"LlamaCppTokenizer(llama, reg_flags=None)\n
Tokenizer for GGUF models loaded by llama.cpp.
Source code in sibila/llamacpp.py
def __init__(self, \n llama: Llama, \n reg_flags: Optional[str] = None):\n self._llama = llama\n\n self.vocab_size = self._llama.n_vocab()\n\n self.bos_token_id = self._llama.token_bos()\n self.bos_token = llama_token_get_text(self._llama.model, self.bos_token_id).decode(\"utf-8\")\n\n self.eos_token_id = self._llama.token_eos()\n self.eos_token = llama_token_get_text(self._llama.model, self.eos_token_id).decode(\"utf-8\")\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None # ? fill by taking a look at id 0?\n self.unk_token = None\n\n # workaround for https://github.com/ggerganov/llama.cpp/issues/4772\n self._workaround1 = reg_flags is not None and \"llamacpp1\" in reg_flags\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/llamacpp.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n\n if self._workaround1:\n # append a space after each bos and eos, so that llama's tokenizer matches HF\n def space_post(text, s):\n out = \"\"\n while (index := text.find(s)) != -1:\n after = index + len(s)\n out += text[:after]\n if text[after] != ' ':\n out += ' '\n text = text[after:]\n\n out += text\n return out\n\n text = space_post(text, self.bos_token)\n text = space_post(text, self.eos_token)\n # print(text)\n\n # str -> bytes\n btext = text.encode(\"utf-8\", errors=\"ignore\")\n\n return self._llama.tokenize(btext, add_bos=False, special=True)\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Used instead of llama-cpp-python's decode to fix an error: the first character after a bos token is removed only if it's a space.
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/llamacpp.py
def decode(self,\n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Using instead of llama-cpp-python's to fix error: remove first character after a bos only if it's a space.\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n\n if not len(token_ids):\n return \"\"\n\n output = b\"\"\n size = 32\n buffer = (ctypes.c_char * size)()\n\n if not skip_special:\n special_toks = {self.bos_token_id: self.bos_token.encode(\"utf-8\"), # type: ignore[union-attr]\n self.eos_token_id: self.eos_token.encode(\"utf-8\")} # type: ignore[union-attr]\n\n for token in token_ids:\n if token == self.bos_token_id:\n output += special_toks[token]\n elif token == self.eos_token_id:\n output += special_toks[token]\n else:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n else: # skip special\n for token in token_ids:\n if token != self.bos_token_id and token != self.eos_token_id:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n\n # \"User code is responsible for removing the leading whitespace of the first non-BOS token when decoding multiple tokens.\"\n if (# token_ids[0] != self.bos_token_id and # we also try cutting if first is bos to approximate HF tokenizer\n len(output) and output[0] <= 32 # 32 = ord(' ')\n ):\n output = output[1:]\n\n return output.decode(\"utf-8\", errors=\"ignore\")\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
- text (str): Text to be measured. Required.

Returns:
- int: Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
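As a minimal usage sketch (not part of the reference above; the model file path is an assumption), a LlamaCppModel's tokenizer can be used for an encode/decode round trip:
from sibila import LlamaCppModel\n\n# assumption: this GGUF file exists in the models folder\nmodel = LlamaCppModel(\"../../models/openchat-3.5-1210.Q4_K_M.gguf\")\ntok = model.tokenizer  # a LlamaCppTokenizer\n\nids = tok.encode(\"Hello there?\")  # list[int] of token ids\nprint(ids)\nprint(tok.token_len(\"Hello there?\"))  # same as len(ids)\nprint(tok.decode(ids))  # back to text, special tokens skipped by default\n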
"},{"location":"api-reference/tokenizer/#openai","title":"OpenAI","text":""},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer","title":"OpenAITokenizer","text":"OpenAITokenizer(model)\n
Tokenizer for OpenAI models.
Source code in sibila/openai.py
def __init__(self, \n model: str\n ):\n\n if not has_tiktoken:\n raise Exception(\"Please install tiktoken by running: pip install tiktoken\")\n\n self._tok = tiktoken.encoding_for_model(model)\n\n self.vocab_size = self._tok.n_vocab\n\n self.bos_token_id = None\n self.bos_token = None\n\n self.eos_token_id = None\n self.eos_token = None\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
- text (str): Text to be encoded. Required.

Returns:
- list[int]: A list of ints with the encoded tokens.
Source code in sibila/openai.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n return self._tok.encode(text)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Parameters:
- token_ids (list[int]): List of model tokens. Required.
- skip_special (bool): Don't decode special tokens like bos and eos. Defaults to True.

Returns:
- str: Decoded text.
Source code in sibila/openai.py
def decode(self, \n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n assert skip_special, \"OpenAITokenizer only supports skip_special=True\"\n\n return self._tok.decode(token_ids)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
- text (str): Text to be measured. Required.

Returns:
- int: Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
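A similar sketch for the OpenAI tokenizer (assuming the tiktoken package is installed and that OpenAITokenizer is importable from the sibila package, as the reference path above suggests):
from sibila import OpenAITokenizer\n\ntok = OpenAITokenizer(\"gpt-4\")  # requires the tiktoken package\n\nids = tok.encode(\"Hello there?\")\nprint(ids)\nprint(tok.token_len(\"Hello there?\"))\nprint(tok.decode(ids))  # only skip_special=True is supported\n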
"},{"location":"api-reference/tools/","title":"Tools","text":""},{"location":"api-reference/tools/#sibila.tools","title":"tools","text":"Tools for model interaction, summarization, etc.
- interact(): Interact with model as in a chat, using input().
- loop(): Iteratively append inputs and generate model outputs.
- recursive_summarize(): Recursively summarize a (large) text or text file.
"},{"location":"api-reference/tools/#sibila.tools.interact","title":"interact","text":"interact(\n model,\n *,\n ctx=None,\n inst_text=None,\n trim_flags=TRIM_DEFAULT,\n genconf=None\n)\n
Interact with model as in a chat, using input().
Includes a list of commands: type !? to see help.
Parameters:
- model (Model): Model to use for generating. Required.
- ctx (Optional[Context]): Optional input Context. Defaults to None.
- inst_text (Optional[str]): Text for Thread instructions. Defaults to None.
- trim_flags (Trim): Context trimming flags, used when the Thread is too long. Defaults to TRIM_DEFAULT.
- genconf (Optional[GenConf]): Model generation configuration. Defaults to None, which uses the model's own genconf.

Returns:
- Context: Context after all the interactions.
Source code in sibila/tools.py
def interact(model: Model,\n *,\n ctx: Optional[Context] = None,\n inst_text: Optional[str] = None,\n trim_flags: Trim = TRIM_DEFAULT,\n\n genconf: Optional[GenConf] = None,\n ) -> Context:\n \"\"\"Interact with model as in a chat, using input().\n\n Includes a list of commands: type !? to see help.\n\n Args:\n model: Model to use for generating.\n ctx: Optional input Context. Defaults to None.\n inst_text: text for Thread instructions. Defaults to None.\n trim_flags: Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n genconf: Model generation configuration. Defaults to None, defaults to model's. \n\n Returns:\n Context after all the interactions.\n \"\"\"\n\n def callback(out: Union[GenOut,None], \n ctx: Context, \n model: Model,\n genconf: GenConf) -> bool:\n\n if out is not None:\n if out.res != GenRes.OK_STOP:\n print(f\"***Result={GenRes.as_text(out.res)}***\")\n\n if out.text:\n text = out.text\n else:\n text = \"***No text out***\"\n\n ctx.add_OUT(text)\n print(text)\n print()\n\n\n def print_thread_info():\n if ctx.max_token_len is not None: # use from ctx\n max_token_len = ctx.max_token_len\n else: # assume max possible for model context and genconf\n max_token_len = model.ctx_len - genconf.max_tokens\n\n length = model.token_len(ctx, genconf)\n print(f\"Thread token len={length}, max len before next gen={max_token_len}\")\n\n\n\n # input loop ===============================================\n MARKER: str = '\"\"\"'\n multiline: str = \"\"\n\n while True:\n\n user = input('>').strip()\n\n if multiline:\n if user.endswith(MARKER):\n user = multiline + \"\\n\" + user[:-3]\n multiline = \"\"\n else:\n multiline += \"\\n\" + user\n continue\n\n else:\n if not user:\n return False # terminate loop\n\n elif user.startswith(MARKER):\n multiline = user[3:]\n continue\n\n elif user.endswith(\"\\\\\"):\n user = user[:-1]\n user = user.replace(\"\\\\n\", \"\\n\")\n ctx.add_IN(user)\n continue\n\n elif user.startswith(\"!\"): # a command\n params = user[1:].split(\"=\")\n cmd = params[0]\n params = params[1:]\n\n if cmd == \"inst\":\n ctx.clear()\n if params:\n text = params[0].replace(\"\\\\n\", \"\\n\")\n ctx.inst = text\n\n elif cmd == \"add\" or cmd == \"a\":\n if params:\n try:\n path = params[0]\n ctx.addx(path=path)\n ct = ctx.last_text\n print(ct[:500])\n except FileNotFoundError:\n print(f\"Could not load '{path}'\")\n else:\n print(\"Path needed\")\n\n elif cmd == 'c':\n print_thread_info()\n print(ctx)\n\n elif cmd == 'cl':\n if not params:\n params.append(\"ctx.json\")\n try:\n ctx.load(params[0])\n print(f\"Loaded context from {params[0]}\")\n except FileNotFoundError:\n print(f\"Could not load '{params[0]}'\")\n\n elif cmd == 'cs':\n if not params:\n params.append(\"ctx.json\")\n ctx.save(params[0])\n print(f\"Saved context to {params[0]}\")\n\n elif cmd == 'tl':\n print_thread_info()\n\n elif cmd == 'i':\n print(f\"Model:\\n{model.info()}\")\n print(f\"GenConf:\\n{genconf}\\n\")\n\n print_thread_info()\n\n # elif cmd == 'p':\n # print(model.text_from_turns(ctx.turns))\n\n # elif cmd == 'to':\n # token_ids = model.tokens_from_turns(ctx.turns)\n # print(f\"Prompt tokens={token_ids}\")\n\n\n else:\n print(f\"Unknown command '!{cmd}' - known commands:\\n\"\n \" !inst[=text] - clear messages and add inst (system) message\\n\"\n \" !add|!a=path - load file and add to last msg\\n\"\n \" !c - list context msgs\\n\"\n \" !cl=path - load context (default=ctx.json)\\n\"\n \" !cs=path - save context (default=ctx.json)\\n\"\n \" !tl - thread's token length\\n\"\n \" 
!i - model and genconf info\\n\"\n ' Delimit with \"\"\" for multiline begin/end or terminate line with \\\\ to continue into a new line\\n'\n \" Empty line + enter to quit\"\n )\n # \" !p - show formatted prompt (if model supports it)\\n\"\n # \" !to - prompt's tokens\\n\"\n\n print()\n continue\n\n # we have a user prompt\n user = user.replace(\"\\\\n\", \"\\n\")\n break\n\n\n ctx.add_IN(user)\n\n return True # continue looping\n\n\n\n # start prompt loop\n ctx = loop(callback,\n model,\n\n ctx=ctx,\n inst_text=inst_text,\n in_text=None, # call callback for first prompt\n trim_flags=trim_flags)\n\n return ctx\n
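As a minimal usage sketch (the models folder and the \"llamacpp:openchat\" entry are assumptions, following the setup used in the examples):
from sibila import Models\nfrom sibila.tools import interact\n\nModels.setup(\"../../models\")  # assumption: folder with models and configs\nmodel = Models.create(\"llamacpp:openchat\")  # assumption: registered model entry\n\n# chat with the model using input(); type !? for commands, empty line + enter to quit\nctx = interact(model,\n               inst_text=\"Be helpful and provide concise answers.\")\n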
"},{"location":"api-reference/tools/#sibila.tools.loop","title":"loop","text":"loop(\n callback,\n model,\n *,\n inst_text=None,\n in_text=None,\n trim_flags=TRIM_DEFAULT,\n ctx=None,\n genconf=None\n)\n
Iteratively append inputs and generate model outputs.
Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.
If last Thread msg is not MsgKind.IN, callback() will be called with out_text=None.
Parameters:
- callback (Callable[[Union[GenOut, None], Context, Model, GenConf], bool]): A function(out, ctx, model, genconf) that will be iteratively called with the model's output. Required.
- model (Model): Model to use for generating. Required.
- inst_text (Optional[str]): Text for Thread instructions. Defaults to None.
- in_text (Optional[str]): Text for Thread's initial MsgKind.IN. Defaults to None.
- trim_flags (Trim): Context trimming flags, used when the Thread is too long. Defaults to TRIM_DEFAULT.
- ctx (Optional[Context]): Optional input Context. Defaults to None.
- genconf (Optional[GenConf]): Model generation configuration. Defaults to None, which uses the model's own genconf.
Source code in sibila/tools.py
def loop(callback: Callable[[Union[GenOut,None], Context, Model, GenConf], bool],\n model: Model,\n *,\n inst_text: Optional[str] = None,\n in_text: Optional[str] = None,\n\n trim_flags: Trim = TRIM_DEFAULT,\n ctx: Optional[Context] = None,\n\n genconf: Optional[GenConf] = None,\n ) -> Context:\n \"\"\"Iteratively append inputs and generate model outputs.\n\n Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.\n\n If last Thread msg is not MsgKind.IN, callback() will be called with out_text=None.\n\n Args:\n callback: A function(out, ctx, model) that will be iteratively called with model's output.\n model: Model to use for generating.\n inst_text: text for Thread instructions. Defaults to None.\n in_text: Text for Thread's initial MsgKind.IN. Defaults to None.\n trim_flags: Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n ctx: Optional input Context. Defaults to None.\n genconf: Model generation configuration. Defaults to None, defaults to model's.\n \"\"\"\n\n if ctx is None:\n ctx = Context()\n else:\n ctx = ctx\n\n if inst_text is not None:\n ctx.inst = inst_text\n if in_text is not None:\n ctx.add_IN(in_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if ctx.max_token_len is not None: # use from ctx\n max_token_len = ctx.max_token_len\n else: # assume max possible for model context and genconf\n max_token_len = model.ctx_len - genconf.max_tokens\n\n\n while True:\n\n if len(ctx) and ctx.last_kind == MsgKind.IN:\n # last is an IN message: we can trim and generate\n\n ctx.trim(trim_flags,\n model,\n max_token_len=max_token_len\n )\n\n out = model.gen(ctx, genconf)\n else:\n out = None # first call\n\n res = callback(out, \n ctx, \n model,\n genconf)\n\n if not res:\n break\n\n\n return ctx\n
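As a sketch of the callback contract (the model entry name is an assumption), a loop that asks one question and stops after the first answer could look like this:
from sibila import Models\nfrom sibila.tools import loop\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")  # assumption: registered model entry\n\ndef callback(out, ctx, model, genconf):\n    if out is None:  # first call: last msg is not IN, so add one and continue\n        ctx.add_IN(\"Hello there?\")\n        return True\n    ctx.add_OUT(out.text)  # record the model's answer in the Context\n    print(out.text)\n    return False  # stop looping\n\nctx = loop(callback,\n           model,\n           inst_text=\"Be helpful and provide concise answers.\")\n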
"},{"location":"api-reference/tools/#sibila.tools.recursive_summarize","title":"recursive_summarize","text":"recursive_summarize(\n model, text=None, path=None, overlap_size=20\n)\n
Recursively summarize a (large) text or text file.
Works by:
1. Break the text into chunks that fit the model's context.
2. Run the model to summarize each chunk.
3. Join the generated summaries and jump to step 1 - repeat until the text size no longer decreases.
Parameters:
- model (Model): Model to use for summarizing. Required.
- text (Optional[str]): Initial text. Defaults to None.
- path (Optional[str]): Or, instead of text, a path to a UTF-8 text file. Defaults to None.
- overlap_size (int): Size in model tokens of the overlapping portions at the beginning and end of chunks. Defaults to 20.

Returns:
- str: The summarized text.
Source code in sibila/tools.py
def recursive_summarize(model: Model,\n text: Optional[str] = None,\n path: Optional[str] = None,\n overlap_size: int = 20) -> str:\n \"\"\"Recursively summarize a (large) text or text file.\n\n Works by:\n\n 1. Breaking text into chunks that fit models context.\n 2. Run model to summarize chunks.\n 3. Join generated summaries and jump to 1. - do this until text size no longer decreases.\n\n Args:\n model: Model to use for summarizing.\n text: Initial text.\n path: --Or-- A path to an UTF-8 text file.\n overlap_size: Size in model tokens of the overlapping portions at beginning and end of chunks.\n\n Returns:\n The summarized text.\n \"\"\"\n\n if (text is not None) + (path is not None) != 1:\n raise ValueError(\"Only one of text or path can be given\")\n\n if path is not None:\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n\n inst_text = \"\"\"Your task is to do short summaries of text.\"\"\"\n in_text = \"Summarize the following text:\\n\"\n ctx = Context(pinned_inst_text=inst_text)\n\n # split initial text\n max_token_len = model.ctx_len - model.genconf.max_tokens - (model.tokenizer.token_len(inst_text + in_text) + 16) \n logger.debug(f\"Max ctx token len {max_token_len}\")\n\n token_len_fn = model.tokenizer.token_len_lambda\n logger.debug(f\"Initial text token_len {token_len_fn(text)}\") # type: ignore[arg-type]\n\n spl = RecursiveTextSplitter(max_token_len, overlap_size, len_fn=token_len_fn)\n\n round = 0\n while True: # summarization rounds\n logger.debug(f\"Round {round} {'='*60}\")\n\n in_list = spl.split(text=text)\n in_len = sum([len(t) for t in in_list])\n\n logger.debug(f\"Split in {len(in_list)} parts, total len {in_len} chars\")\n\n out_list = []\n for i,t in enumerate(in_list):\n\n logger.debug(f\"{round}>{i} {'='*30}\")\n\n ctx.clear()\n ctx.add_IN(in_text)\n ctx.add_IN(t)\n\n out = model.gen(ctx) \n logger.debug(out)\n\n out_list.append(out.text)\n\n text = \"\\n\".join(out_list)\n\n out_len = len(text) # sum([len(t) for t in out_list])\n if out_len >= in_len:\n break\n elif len(out_list) == 1:\n break\n else:\n round += 1\n\n return text\n
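A minimal usage sketch (the file path and model entry name are assumptions):
from sibila import Models\nfrom sibila.tools import recursive_summarize\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")  # assumption: registered model entry\n\n# summarize a large UTF-8 text file; alternatively pass text=... instead of path=...\nsummary = recursive_summarize(model, path=\"large_text.txt\")  # hypothetical file\nprint(summary)\n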
"},{"location":"examples/","title":"Examples","text":"Example Description Hello model Introductory pirate arrr-example: create local or remote models, use the Models class to simplify. From text to object Keypoint extractor, showing progressively better ways to query a model, from plain text, JSON, to Pydantic classes. Extract information Extract information about all persons mentioned in a text. Also available in a dataclass version. Tag customer queries Summarize and classify customer queries into tags. Quick meeting Extracting participants, action items and priorities from a simple meeting transcript. Tough meeting Extracting information from a long and complex transcript. Compare model output Compare sentiment analyses of customer reviews done by two models. Chat interaction Interact with the model as in a back-and-forth chat session. Model management with CLI Download and manage models with the command-line sibila. Each example is explained in a Read Me and usually include a Jupyter notebook and/or a .py script version.
Most of the examples use a local model but you can quickly change to using OpenAI models by uncommenting one or two lines.
"},{"location":"examples/cli/","title":"Sibila CLI","text":"In this example we'll see how to use the sibila Command-Line Interface (CLI) to download a GGUF model from the Hugging Face model hub.
We'll then register it in the Models factory, so that it can be easily used with Models.create(). The Models factory is based on a folder where model GGUF format files are stored, together with two configuration files: \"models.json\" and \"formats.json\".
After doing the above, we'll be able to use this model in Python with two lines:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:rocket\")\n
Let's run sibila CLI to get help:
> sibila --help\n\nusage: sibila [-h] [--version] {models,formats,hub} ...\n\nSibila cli tool for managing models and formats.\n\noptions:\n -h, --help show this help message and exit\n --version show program's version number and exit\n\nactions:\n hf, models, formats\n\n {models,formats,hub} Run 'sibila {command} --help' for specific help.\n
Sibila CLI has three modes:
- models: to edit a 'models.json' file, create model entries set format, etc.
- formats: to edit a 'formats.json' file, add new formats, etc.
- hub: search and download models from Hugging Face model hub.
Specific help for each mode is available by doing: sibila mode --help
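For instance, to get help for the models mode:
> sibila models --help\n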
Let's download the Rocket 3B model, a small but capable model, fine-tuned for chat/instruct prompts:
https://huggingface.co/TheBloke/rocket-3B-GGUF
We'll use a \"sibila hub -d\" command to download to \"../../models\" folder. We'll get the 4-bit quantization (Q4_K_M):
> sibila hub -d 'TheBloke/rocket-3B-GGUF' -f Q4_K_M -m '../../models'\n\nSearching...\nDownloading model 'TheBloke/rocket-3B-GGUF' file 'rocket-3b.Q4_K_M.gguf' to '../../models/rocket-3b.Q4_K_M.gguf'\n\nDownload complete.\nFor information about this and other models, please visit https://huggingface.co\n
After this command, the \"rocket-3b.Q4_K_M.gguf\" file has now been downloaded to the \"../../models\" folder.
We'll now register it with the Models factory, which is located in the folder where we downloaded the model.
This can be done by editing the \"models.json\" file directly or, even simpler, with a \"sibila models -s\" command:
> sibila models -s llamacpp:rocket rocket-3b.Q4_K_M.gguf -m '../../models'\n\nUsing models directory '../../models'\nSet model 'llamacpp:rocket' with name='rocket-3b.Q4_K_M.gguf' at '/home/jorge/ai/sibila/models/models.json'.\n
An entry has now been created in \"models.json\" for this model.
However, we did not set the chat template format - but let's first test if the downloaded GGUF file already includes it in its metadata.
This is done with \"sibila models -t\":
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nError: Could not find a suitable chat template format for this model. Without a format, fine-tuned models cannot function properly. See the docs on how you can fix this: either setup the format in Models factory, or provide the chat template in the 'format' arg.\n
Error. Looks like we need to set the chat template format!
Checking the model's page, we find that it uses the ChatML prompt/chat template, which is great because it's one of the base formats included with Sibila.
So let's set the template format in the \"llamacpp:rocket\" entry we've just created:
> sibila models -f llamacpp:rocket chatml -m '../../models'\n\nUsing models directory '/home/jorge/ai/sibila/models'\nUpdated model 'llamacpp:rocket' with format 'chatml' at '/home/jorge/ai/sibila/models/models.json'.\n
Let's now test again:
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nModel 'llamacpp:rocket' was properly created and should run fine.\n
Great - the model passed the test and should be ready for use.
Let's try using it from Python:
from sibila import Models\n\nModels.setup(\"../../models\") # the folder with models and configs\n\nmodel = Models.create(\"llamacpp:rocket\") # model name in provider:name format\n\nmodel(\"Hello there!\")\n
\"Hello! I'm an AI language model here to assist you with your inquiries or generate content for you. I am programmed to be polite and respectful, so please let me know how I can help you today.\"\n
Seems to be working - and politely too!
"},{"location":"examples/compare/","title":"Compare","text":"In this example we'll use an utility function from the multigen module that builds a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
This function generates a 2-D table of [ input , model ], where each row is the output from different models to the same question or input. Such table can be printed or saved as a CSV file.
For the local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the local_name variable below, after the text \"llamacpp:\".
Jupyter notebook and Python script versions are available in the example's folder.
Instead of directly creating models as we've seen in previous examples, multigen will create the models via the Models class directory.
We'll start by choosing a local and a remote model that we'll compare.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nlocal_name = \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\n# to use an OpenAI model:\nremote_name = \"openai:gpt-3.5\"\n
Now let's define a list of reviews that we'll ask the two models to do sentiment analysis upon.
These are generic product reviews, that you could find in an online store.
reviews = [\n\"The user manual was confusing, but once I figured it out, the product more or less worked.\",\n\"This widget changed my life! It's sleek, efficient, and worth every penny.\",\n\"I'm disappointed with the product quality. It broke after just a week of use.\",\n\"The customer service team was incredibly helpful in resolving my issue with the device.\",\n\"I'm blown away by the functionality of this gadget. It exceeded my expectations.\",\n\"The packaging was damaged upon arrival, but the product itself works great.\",\n\"I've been using this tool for months, and it's still as good as new. Highly recommended!\",\n\"I regret purchasing this item. It doesn't perform as advertised.\",\n\"I've never had so much trouble with a product before. It's been a headache from day one.\",\n\"I bought this as a gift for my friend, and they absolutely love it!\",\n\"The price seemed steep at first, but after using it, I understand why. Quality product.\",\n\"This gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\"\n]\n\n# model instructions text, also known as system message\ninst_text = \"You are a helpful assistant that analyses text sentiment.\"\n
Since we just want to obtain a sentiment classification, we'll use a convenient enumeration: a list with three values: positive, negative or neutral.
Let's try the first review on a local model:
sentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nin_text = \"Each line is a product review. Extract the sentiment associated with each review:\\n\\n\" + reviews[0]\n\nprint(reviews[0])\n\nlocal_model = Models.create(local_name)\n\nout = local_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\n# to clear memory\ndel local_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
Definitely neutral is a good answer for this one.
Let's now try the remote model:
print(reviews[0])\n\nremote_model = Models.create(remote_name)\n\nout = remote_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\ndel remote_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
And the remote model (GPT-3.5) seems to agree on neutrality.
By using the query_multigen() function that we'll import from sibila.multigen, we'll be able to compare what multiple models generate in response to each input.
In our case the inputs will be the list of reviews. This function accepts these interesting arguments:
- text: type of text output, which can be the word \"print\" or a text filename to which it will save.
- csv: type of CSV output, which can also be \"print\" or a text filename to save into.
- out_keys: what we want listed: the generated raw text (\"text\"), a Python dict (\"dict\") or a Pydantic object (\"obj\"). For our case \"dict\" is the right one.
- gencall: a function that will actually call the model for each input. We use a convenient predefined function and provide it with the sentiment_enum definition.
Let's run it with our two models:
from sibila.multigen import (\n query_multigen,\n make_extract_gencall\n)\n\nsentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nout = query_multigen(reviews,\n inst_text,\n model_names = [local_name, remote_name],\n text=\"print\",\n csv=\"sentiment.csv\",\n out_keys = [\"value\"],\n gencall = make_extract_gencall(sentiment_enum)\n )\n
////////////////////////////////////////////////////////////\nThe user manual was confusing, but once I figured it out, the product more or less worked.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-3.5 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nThis widget changed my life! It's sleek, efficient, and worth every penny.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm disappointed with the product quality. It broke after just a week of use.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nThe customer service team was incredibly helpful in resolving my issue with the device.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm blown away by the functionality of this gadget. It exceeded my expectations.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe packaging was damaged upon arrival, but the product itself works great.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-3.5 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nI've been using this tool for months, and it's still as good as new. Highly recommended!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI regret purchasing this item. It doesn't perform as advertised.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI've never had so much trouble with a product before. 
It's been a headache from day one.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI bought this as a gift for my friend, and they absolutely love it!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe price seemed steep at first, but after using it, I understand why. Quality product.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n
The output format is as follows - see the comments next to the -----> arrows:
//////////////////////////////////////////////////////////// -----> This is the model input, a review text:\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP <----- Local model name and result\n'positive' <----- What the local model output\n==================== openai:gpt-3.5 -> OK_STOP <----- Remote model name and result\n'positive' <----- Remote model output\n
We also requested the creation of a CSV file with the results: sentiment.csv.
Example's assets at GitHub.
"},{"location":"examples/extract/","title":"Extract Pydantic","text":"In this example we'll extract information about all persons mentioned in a text. This example is also available in a dataclass version.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' source_location='London'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' source_location='Barcelona'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' source_location='Tokyo'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' source_location='Mumbai'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' source_location='Paris'\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to infer from the source location) and a \"details_about_person\" field, which the model should fill with a quote about each person from the source text.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident' source_location='Barcelona' source_country='Spain'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories' source_location='Tokyo' source_country='Japan'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology' source_location='Mumbai' source_country='India'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace' source_location='Paris' source_country='France'\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the persons has: being married. We'll add the \"is_married: bool\" field to the Person class.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom' is_married=False\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident' source_location='Barcelona' source_country='Spain' is_married=False\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories' source_location='Tokyo' source_country='Japan' is_married=False\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology' source_location='Mumbai' source_country='India' is_married=True\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace' source_location='Paris' source_country='France' is_married=False\n
Of the five characters only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/extract_dataclass/","title":"Extract dataclass","text":"This is the Python dataclass version of of the Pydantic extraction example.
We'll extract information about all persons mentioned in a text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from dataclasses import dataclass\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', source_location='London')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', source_location='Barcelona')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', source_location='Tokyo')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', source_location='Mumbai')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', source_location='Paris')\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to infer from the source location) and a \"details_about_person\" field, which the model should fill with a quote about each person from the source text.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='a 28-year-old journalist from London, her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='a 35-year-old architect from the sun-kissed streets of Barcelona, with a sketchbook in hand, he exuded creativity', source_location='Barcelona', source_country='Spain')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='a 23-year-old musician from the bustling streets of Tokyo, her fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='a married 40-year-old engineer from the bustling metropolis of Mumbai, with a laptop at his side, he navigated the complexities of technology with ease', source_location='Mumbai', source_country='India')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='a 32-year-old fashion designer from the romantic streets of Paris, her impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France')\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the persons has: being married. We'll add the \"is_married\" field to the Person dataclass.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='a 28-year-old journalist from London, her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom', is_married=False)\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='a 35-year-old architect from the sun-kissed streets of Barcelona, with a sketchbook in hand, he exuded creativity', source_location='Barcelona', source_country='Spain', is_married=False)\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='a 23-year-old musician from the bustling streets of Tokyo, her fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan', is_married=False)\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='a married 40-year-old engineer from the bustling metropolis of Mumbai, with a laptop at his side, he navigated the complexities of technology with ease', source_location='Mumbai', source_country='India', is_married=True)\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='a 32-year-old fashion designer from the romantic streets of Paris, her impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France', is_married=False)\n
Of the five characters only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/from_text_to_object/","title":"From text to object","text":"In this example we'll ask the model to extract keypoints from a text: - First in plain text format - Then free JSON output (with fields selected by the model) - Later constrained by a JSON schema (so that we can specify which fields) - And finally by generating to a Pydantic object (from a class definition)
All the queries will be made at temperature=0, which is the default GenConf setting. This means that the model is giving its best (as in most probable) answer and that it will always output the same results, given the same inputs.
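If you want to set the temperature explicitly, it can be passed when creating the model, as in other examples (a sketch; temperature=0 is already the default):
from sibila import Models, GenConf\n\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\",\n                      genconf=GenConf(temperature=0))\n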
Also available as a Jupyter notebook or a Python script in the example's folder.
We'll start by creating either a local model or a GPT-4 model.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Let's use this fragment from Wikipedia's entry on the Fiji islands: https://en.wikipedia.org/wiki/
doc = \"\"\"\\\nFiji, officially the Republic of Fiji,[n 2] is an island country in Melanesia,\npart of Oceania in the South Pacific Ocean. It lies about 1,100 nautical miles \n(2,000 km; 1,300 mi) north-northeast of New Zealand. Fiji consists of \nan archipelago of more than 330 islands\u2014of which about 110 are permanently \ninhabited\u2014and more than 500 islets, amounting to a total land area of about \n18,300 square kilometres (7,100 sq mi). The most outlying island group is \nOno-i-Lau. About 87% of the total population of 924,610 live on the two major \nislands, Viti Levu and Vanua Levu. About three-quarters of Fijians live on \nViti Levu's coasts, either in the capital city of Suva, or in smaller \nurban centres such as Nadi (where tourism is the major local industry) or \nLautoka (where the sugar-cane industry is dominant). The interior of Viti Levu \nis sparsely inhabited because of its terrain.[13]\n\nThe majority of Fiji's islands were formed by volcanic activity starting around \n150 million years ago. Some geothermal activity still occurs today on the islands \nof Vanua Levu and Taveuni.[14] The geothermal systems on Viti Levu are \nnon-volcanic in origin and have low-temperature surface discharges (of between \nroughly 35 and 60 degrees Celsius (95 and 140 \u00b0F)).\n\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and \nlater Melanesians, with some Polynesian influences. Europeans first visited Fiji \nin the 17th century.[15] In 1874, after a brief period in which Fiji was an \nindependent kingdom, the British established the Colony of Fiji. Fiji operated as \na Crown colony until 1970, when it gained independence and became known as \nthe Dominion of Fiji. In 1987, following a series of coups d'\u00e9tat, the military \ngovernment that had taken power declared it a republic. In a 2006 coup, Commodore \nFrank Bainimarama seized power. In 2009, the Fijian High Court ruled that the \nmilitary leadership was unlawful. At that point, President Ratu Josefa Iloilo, \nwhom the military had retained as the nominal head of state, formally abrogated \nthe 1997 Constitution and re-appointed Bainimarama as interim prime minister. \nLater in 2009, Ratu Epeli Nailatikau succeeded Iloilo as president.[16] On 17 \nSeptember 2014, after years of delays, a democratic election took place. \nBainimarama's FijiFirst party won 59.2% of the vote, and international observers \ndeemed the election credible.[17] \n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Be helpful and provide concise answers.\"\n
Let's start with a free text query by calling model().
in_text = \"Extract 5 keypoints of the following text:\\n\" + doc\n\nout = model(in_text, inst=inst_text)\nprint(out)\n
1. Fiji is an island country located in Melanesia, part of Oceania in the South Pacific Ocean. It lies approximately 1,100 nautical miles north-northeast of New Zealand.\n2. The country consists of more than 330 islands with about 110 permanently inhabited islands and over 500 islets, totaling a land area of about 18,300 square kilometers.\n3. Approximately 87% of Fiji's total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu, with a majority living on Viti Levu's coasts.\n4. The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago, with some geothermal activity still occurring on certain islands.\n5. Fiji has a complex history, transitioning from an independent kingdom to a British colony, then a Dominion, and finally a republic after a series of coups and constitutional changes. In 2014, a democratic election took place, marking a significant milestone in the country's political history.\n
These are quite reasonable keypoints!
Let's now ask for JSON output, taking care to explicitly request it in the query (in_text variable).
Instead of model() we now use json() which returns a Python dict. We'll pass None as the first parameter because we're not using a JSON schema.
import pprint\npp = pprint.PrettyPrinter(width=300, sort_dicts=False)\n\nin_text = \"Extract 5 keypoints of the following text in JSON format:\\n\\n\" + doc\n\nout = model.json(None,\n in_text,\n inst=inst_text)\npp.pprint(out)\n
{'keypoints': [{'title': 'Location', 'description': 'Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.'},\n {'title': 'Geography', 'description': 'Consists of more than 330 islands with about 110 permanently inhabited islands.'},\n {'title': 'Population', 'description': 'Total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.'},\n {'title': 'History', 'description': 'Humans have lived in Fiji since the second millennium BC with Austronesians, Melanesians, and Polynesian influences.'},\n {'title': 'Political Status', 'description': 'Officially known as the Republic of Fiji, gained independence from British rule in 1970.'}]}\n
Note how the model chose to return different fields like \"title\" or \"description\".
Because we didn't specify which fields we want, each model will generate different ones.
To specify a fixed format, let's now generate by setting a JSON schema that defines which fields and types we want:
json_schema = {\n \"properties\": {\n \"keypoint_list\": {\n \"description\": \"Keypoint list\",\n \"items\": {\n \"type\": \"string\",\n \"description\": \"Keypoint\"\n },\n \"type\": \"array\"\n }\n },\n \"required\": [\n \"keypoint_list\"\n ],\n \"type\": \"object\"\n}\n
This JSON schema requests that the generated dict contains a \"keypoint_list\" field with a list of strings.
We'll also use json(), now passing the json_schema as the first argument:
out = model.json(json_schema,\n in_text,\n inst=inst_text)\n\nprint(out)\n
{'keypoint_list': ['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]}\n
for kpoint in out[\"keypoint_list\"]:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
It has generated a string list in the \"keypoint_list\" field, as we specified in the JSON schema.
This is better, but the problem with JSON schemas is that they can be quite hard to work with.
Let's use an easier way to specify the fields we want returned: Pydantic classes derived from BaseModel. This is way simpler to use than JSON schemas.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Keypoints(BaseModel):\n keypoint_list: list[str]\n\nout = model.pydantic(Keypoints,\n in_text,\n inst=inst_text)\nprint(out)\n
keypoint_list=['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]\n
for kpoint in out.keypoint_list:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
The pydantic() method returns an object of class Keypoints, instantiated with the model output.
This is a much simpler way to extract structured data from a model.
Please see other examples for more interesting objects. In particular, we did not add descriptions to the fields, which are important clues to help the model understand what we want.
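For instance, a description could be added with Pydantic's Field (a sketch; the description text is just illustrative):
from pydantic import BaseModel, Field\n\nclass Keypoints(BaseModel):\n    keypoint_list: list[str] = Field(description=\"List of keypoints, one short sentence each\")\n\nout = model.pydantic(Keypoints,\n                     in_text,\n                     inst=inst_text)\n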
Besides Pydantic classes, Sibila can also use Python's dataclass to extract structured data. This is a lighter and easier alternative to using Pydantic.
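For example, the same keypoint extraction could be sketched with a dataclass (an assumption here is that extract() accepts a single dataclass type in the same way it accepts list[dataclass] in the dataclass extraction example):
from dataclasses import dataclass\n\n@dataclass\nclass Keypoints:\n    keypoint_list: list[str]\n\nout = model.extract(Keypoints,\n                    in_text,\n                    inst=inst_text)\nprint(out.keypoint_list)\n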
Example's assets at GitHub.
"},{"location":"examples/hello_model/","title":"Hello model","text":"In this example we see how to directly create local or remote model objects and later to do that more easily with the Models class.
"},{"location":"examples/hello_model/#using-a-local-model","title":"Using a local model","text":"To use a local model, make sure you download its GGUF format file and save it into the \"../../models\" folder.
In these examples, we'll use a 4-bit quantization of the OpenChat-3.5 7-billion-parameter model, which at the current time is quite a good model for its size.
The file is named \"openchat-3.5-1210.Q4_K_M.gguf\" and was downloaded from the above link. Make sure to save it into the \"../../models\" folder.
See here for more information about setting up your local models.
With the model file in the \"../../models\" folder, we can run the following script:
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Run the script above and, after a few seconds (the model has to be loaded from disk), it answers back something like:
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
"},{"location":"examples/hello_model/#using-an-openai-model","title":"Using an OpenAI model","text":"To use a remote model like GPT-4 you'll need a paid OpenAI account: https://openai.com/pricing
With an OpenAI account, you'll be able to generate an access token that you should set into the OPENAI_API_KEY env variable.
(An even better way is to use .env files with your variables, and use the dotenv library to read them.)
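For example, the two lines used in other examples of this documentation will load a .env file if one is present (this assumes the python-dotenv package is installed):
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n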
Once a valid OPENAI_API_KEY env variable is set, you can run this script:
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
We get back the usual funny pirate answer:
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
"},{"location":"examples/hello_model/#using-the-models-directory","title":"Using the Models directory","text":"In these two scripts we created different objects to access the LLM model: LlamaCppModel and OpenAIModel.
This was done for simplicity, but a better way is to use the model directory provided by the Models class.
Models is a singleton class that implements a directory of models where you can store file locations, configurations, aliases, etc.
After setting up a JSON configuration file, you can have the Models class create models from names like \"llamacpp:openchat\" or \"openai:gpt-4\", together with their predefined settings. This makes it easy to switch models, compare model outputs, etc.
In the scripts above, instead of instantiating different classes for different models, we could use the Models class to create the model from a name, by setting the model_name variable:
from sibila import Models, GenConf\n\n# Using a local llama.cpp model: we first setup the ../../models directory:\n# Models.setup(\"../../models\")\n# model_name = \"llamacpp:openchat\"\n\n# OpenAI: make sure you set the environment variable named OPENAI_API_KEY with your API key.\nmodel_name = \"openai:gpt-4\"\n\nmodel = Models.create(model_name,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
The magic happens in the line:
model = Models.create(model_name, ...)\n
The Models class will take care of initializing the model based on the name you provide.
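Because only the name changes, comparing outputs across models becomes a simple loop. A minimal sketch, assuming the \"llamacpp:openchat\" entry is available in the models folder and the OPENAI_API_KEY variable is set:
from sibila import Models, GenConf\n\nModels.setup(\"../../models\")\n\nfor model_name in [\"llamacpp:openchat\", \"openai:gpt-4\"]:\n # create each model from its name and query it with the same input\n model = Models.create(model_name,\n genconf=GenConf(temperature=1))\n text = model(\"Hello there?\",\n inst=\"You speak like a pirate.\")\n print(model_name, \"->\", text)\n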
Example's assets at GitHub.
"},{"location":"examples/interact/","title":"Interact","text":"In this example we look at the interact() function, which allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done by using a Context class object, which can manage thread messages and delete older ones when the context maximum length is reached.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
So, let's create a local model.
from sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Import interact() from the tools module and run it.
What follows is the actual chat: user prompts start with a \">\" and the model's answer appears immediately below.
Read on, it does get funny at times... : )
from sibila.tools import (\n interact\n)\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
> Hello there!\n\n\nHello! How can I help you today?\n\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\n\nYes, I am familiar with the AI language model named Sibila. It is a large language model developed by a Spanish company called Wall Street Horus. Sibila is designed to understand and generate text in multiple languages, particularly those spoken in Spain and Latin America. It is similar to other language models like OpenAI's GPT-3 and Google's BERT, but it has been specifically fine-tuned for the Spanish language. If you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n\n\n> No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\n\n\nApologies for the confusion in my previous response. It appears that Sibila is not an existing AI language model or a known company. I may have mixed up information from different sources. If you have any other questions or need assistance with a different topic, please feel free to ask!\n\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Context class.\n\n\nGot it! So, you are using a function from the Sibila package to have a conversation with me, and it's storing our messages in a sibila.Context object. If you need any help or information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n\n\n> That's right - I can list the context messages by entering \"!c\"\n\n\nUnderstood! If you ever need to view the messages that have been exchanged in the conversation, you can enter \"!c\" and I will provide you with a summary of the messages stored in the sibila.Context object. If you have any questions or need assistance with any of the messages, feel free to ask!\n\n\n\n> Not you - the interact() function!\n\n\nI apologize for the confusion. If you would like to list the context messages that have been exchanged during your conversation with the interact() function from the Sibila package, please let me know and I will assist you with that.\n\n\n\n> !c\n\n\nThread token len=596, max len before next gen=2048\ninst=\u2588Be helpful.\u2588, sep='\\n', len=12\n0: IN=\u2588Hello there!\u2588\n1: OUT=\u2588Hello! How can I help you today?\u2588\n2: IN=\u2588I'm doing an example of this LLM access package called Sibila - have you heard about it?\u2588\n3: OUT=\u2588Yes, I am familiar with the AI language model named Sibila. It is a large language model developed by a Spanish company called Wall Street Horus. Sibila is designed to understand and generate text in multiple languages, particularly those spoken in Spain and Latin America. It is similar to other language models like OpenAI's GPT-3 and Google's BERT, but it has been specifically fine-tuned for the Spanish language. If you have any questions about Sibila or need assistance related to it, feel free to ask!\u2588\n4: IN=\u2588No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\u2588\n5: OUT=\u2588Apologies for the confusion in my previous response. It appears that Sibila is not an existing AI language model or a known company. I may have mixed up information from different sources. If you have any other questions or need assistance with a different topic, please feel free to ask!\u2588\n6: IN=\u2588No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. 
It stores all user and model messages in a sibila.Context class.\u2588\n7: OUT=\u2588Got it! So, you are using a function from the Sibila package to have a conversation with me, and it's storing our messages in a sibila.Context object. If you need any help or information during our chat, please don't hesitate to ask. I'm here to assist you!\u2588\n8: IN=\u2588That's right - I can list the context messages by entering \"!c\"\u2588\n9: OUT=\u2588Understood! If you ever need to view the messages that have been exchanged in the conversation, you can enter \"!c\" and I will provide you with a summary of the messages stored in the sibila.Context object. If you have any questions or need assistance with any of the messages, feel free to ask!\u2588\n10: IN=\u2588Not you - the interact() function!\u2588\n11: OUT=\u2588I apologize for the confusion. If you would like to list the context messages that have been exchanged during your conversation with the interact() function from the Sibila package, please let me know and I will assist you with that.\u2588\n\n\n\n> Info about the model: \"!i\"\n\n\nAs an AI language model, my main purpose is to provide information, answer questions, and offer assistance. If you have any specific questions or topics related to models like GPT-3, BERT, or other AI-related subjects, please feel free to ask, and I'll be happy to help!\n\n\n\n> !i\n\n\nModel:\ndesc='LlamaCppModel: /home/jorge/ai/sibila/models/openchat-3.5-1210.Q4_K_M.gguf - 'llama 7B Q4_K - Medium'',ctx_len=2048,\ngenconf=GenConf(max_tokens=0,\n stop=[],\n temperature=0.0,\n top_p=0.9,\n format='text',\n json_schema=None)\nGenConf:\nGenConf(max_tokens=0,\n stop=[],\n temperature=0.9,\n top_p=0.9,\n format='text',\n json_schema=None)\n\nThread token len=684, max len before next gen=2048\n\n\n\n> For all commands: \"!\"\n\n\nIt seems that you can use the \"!\" command prefix to access various functions within the Sibila package. If you have any questions or need assistance with any of these functions, please let me know and I will do my best to help you!\n\n\n\n> !\n\n\nUnknown command '!' - known commands:\n !inst[=text] - clear messages and add inst (system) message\n !add|!a=path - load file and add to last msg\n !c - list context msgs\n !cl=path - load context (default=ctx.json)\n !cs=path - save context (default=ctx.json)\n !tl - thread's token length\n !i - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n\n\n\n> Sorry, I was not talking to you, just demoing the commands available with \"!\"\n\n\nApologies for the misunderstanding. If you have any questions or need assistance with the Sibila package or any other topic, please don't hesitate to ask. I'm here to help!\n\n\n\n> !cs\n\n\nSaved context to ctx.json\n\n\n\n> !tl\n\n\nThread token len=832, max len before next gen=2048\n\n\n\n> The Context class manages total token length: it will delete older messages if the thread becomes larger than the content max length.\n\n\nThank you for providing that information. It seems that the sibila.Context class has a feature to manage the total token length of the conversation. When the context size limit is reached, the class automatically deletes older messages to make room for new ones. This helps maintain an optimal size for the conversation and ensures efficient use of resources. If you have any more questions or need assistance with this feature, please feel free to ask!\n\n\n\n> Right. 
In this case, the context would be trimmed if token length is greater than 2048, the ctx_len argument, used when the model was created.\n\n\nI understand now. When the sibila.Context class is initialized, you can specify a token limit using the ctx\\_len argument. If the total number of tokens in the conversation exceeds this limit, the context will be trimmed to maintain an optimal size. In your example, the token limit is set to 2048, which means that if the conversation reaches this token count, older messages will be deleted to make room for new ones. This helps manage the conversation's size and ensures efficient use of resources. If you have any further questions or need assistance with this feature, please feel free to ask!\n\n\n\n> Right. Let's do the pirate number? Or do you prefer the Homer Simpson one?\n\n\nI can provide both versions for you: the pirate-themed \"Sibila\" song and the Homer Simpson-inspired version. Which one would you like to hear first?\n\n\n\n> Nevermind - let's do the pirate. I'll enter !inst=You are a pirate.\n\n\nAlright! Here is the pirate-themed \"Sibila\" song based on your instruction:\n\n```javascript\n!inst=You are a pirate.\n```\n\nYo ho, yo ho, it's a Sibila life for me!\n\n\n\n> You're just too much.\n\n\nThank you! I'm here to provide assistance and have some fun along the way. If you have any more questions or need help with anything else, feel free to ask!\n\n\n\n> !inst=You are a pirate.\n\n\n\n\n\n> Hey there!\n\n\nAhoy, me hearty! What brings ye to these seas of ours? Are ye lookin' for treasure or perhaps a bit o' adventure?\n\n\n\n> Right \"me hearty\", I'm here for great pirate adventures, and that just by talking to an LLM model!\n\n\nWell then, prepare to set sail on a thrilling journey with the most talkative mate aboard these digital waters! Let's dive right in and explore the wonders of this world with our trusty AI companion. Yarr!\n\nSo, me mate, what be yer name? And how did ye end up here, talkin' to a pirate like yours truly?\n\n\n\n> Arrrr!\n\n\nThat's the spirit! A proper pirate response right there. So, let's see, what be yer name, landlubber? And what brings ye to these pirate-infested waters? Are ye here by choice or did some devious captain force you into service? Speak up, me hearty, before I send out a search party for ye!\n\n\n\n>\n\n\n\n\n\n<sibila.context.Context at 0x7f27f2714650>\n
These are the \"!\" commands that you can use in the interact() inputs:
! - to show this help\n !inst[=text] - clear messages and add inst (system) message\n !add|!a=path - load file and add to last msg\n !c - list context msgs\n !cl=path - load context (default=ctx.json)\n !cs=path - save context (default=ctx.json)\n !tl - thread's token length\n !i - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n
Example's assets at GitHub.
"},{"location":"examples/quick_meeting/","title":"Quick meeting","text":"Let's extract structured data from a meeting transcript, like attendees, action items and their priorities.
This is a quick meeting whose transcript is not very large, so a small local model should work well. See the Tough meeting example for a larger and more complex transcription text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
If you prefer to use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Here's the transcript we'll be using as source:
transcript = \"\"\"\\\nDate: 10th April 2024\nTime: 10:30 AM\nLocation: Conference Room A\n\nAttendees:\n Arthur: Logistics Supervisor\n Bianca: Operations Manager\n Chris: Fleet Coordinator\n\nArthur: Good morning, team. Thanks for making it. We've got three matters to address quickly today.\n\nBianca: Morning, Arthur. Let's dive in.\n\nChris: Ready when you are.\n\nArthur: First off, we've been having complaints about late deliveries. This is very important, we're getting some bad reputation out there.\n\nBianca: Chris, I think you're the right person to take care of this. Can you investigate and report back by end of day? \n\nChris: Absolutely, Bianca. I'll look into the reasons and propose solutions.\n\nArthur: Great. Second, Bianca, we need to update our driver training manual. Can you take the lead and have a draft by Friday?\n\nBianca: Sure thing, Arthur. I'll get started on that right away.\n\nArthur: Lastly, we need to schedule a meeting with our software vendor to discuss updates to our tracking system. This is a low-priority task but still important. I'll handle that. Any input on timing?\n\nBianca: How about next Wednesday afternoon?\n\nChris: Works for me.\n\nArthur: Sounds good. I'll arrange it. Thanks, Bianca, Chris. Let's keep the momentum going.\n\nBianca: Absolutely, Arthur.\n\nChris: Will do.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
Let's define two Pydantic BaseModel classes whose instances will receive the extracted information:
- Attendee: to store information about each meeting attendee
- Meeting: to keep the meeting's date and location, the list of participants and other info we'll see below
And let's ask the model to create objects that are instances of these classes:
from pydantic import BaseModel, Field\n\n# class definitions will be used to constrain the model output and initialize an instance object\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
meeting_date='10th April 2024' meeting_location='Conference Room A' attendees=[Attendee(name='Arthur', occupation='Logistics Supervisor'), Attendee(name='Bianca', occupation='Operations Manager'), Attendee(name='Chris', occupation='Fleet Coordinator')]\n
A prettier display:
print(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\n
This information was correctly extracted.
Let's now request the action items mentioned in the meeting. We'll create a new class ActionItem with an index and a name for the item. Note that we're annotating each field with Field(description=...) information to help the model understand what we're looking to extract.
We'll also add an action_items field to the Meeting class to hold the items list.
class Attendee(BaseModel):\n name: str\n occupation: str\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate and report on late deliveries'\nindex=2 name='Update driver training manual'\nindex=3 name='Schedule a meeting with software vendor to discuss tracking system updates'\n
The extracted action items also look good.
Let's now extract more action item information:
- Priority for each item
- Due by... information
- Name of the attendee that was assigned to that item
So, we create a Priority class holding three priority types - low to high.
We also add three fields to the ActionItem class, to hold the new information: priority, due_by and assigned_attendee.
from enum import Enum\n\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Priority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: Priority = Field(description=\"Action item priority\")\n due_by: str = Field(description=\"When should the item be complete\")\n assigned_attendee: str = Field(description=\"Name of the attendee to which action item was assigned\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate late deliveries' priority=<Priority.HIGH: 'high'> due_by='end of day' assigned_attendee='Chris'\nindex=2 name='Update driver training manual' priority=<Priority.MEDIUM: 'medium'> due_by='Friday' assigned_attendee='Bianca'\nindex=3 name='Schedule meeting with software vendor' priority=<Priority.LOW: 'low'> due_by='next Wednesday afternoon' assigned_attendee='Arthur'\n
The new information was correctly extracted: priorities, due by and assigned attendees for each action item.
For an example of a harder, more complex transcript see the Tough meeting example.
Example's assets at GitHub.
"},{"location":"examples/tag/","title":"Tag","text":"In this example we'll summarize and classify customer queries with tags. We'll use dataclasses to specify the structure of the information we want extracted (we could also use Pydantic BaseModel classes).
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Available as a Jupyter notebook or a Python script in the example's folder.
Let's start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
These will be our queries, ten typical customer support questions:
queries = \"\"\"\\\n1. Do you offer a trial period for your software before purchasing?\n2. I'm experiencing a glitch with your app, it keeps freezing after the latest update.\n3. What are the different pricing plans available for your subscription service?\"\n4. Can you provide instructions on how to reset my account password?\"\n5. I'm unsure about the compatibility of your product with my device, can you advise?\"\n6. How can I track my recent order and estimate its delivery date?\"\n7. Is there a customer loyalty program or rewards system for frequent buyers?\"\n8. I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\"\n9. Could you clarify the coverage and limitations of your product warranty?\"\n10. What are your customer support hours and how can I reach your team in case of emergencies?\n\"\"\"\n
We'll start by summarizing each query.
Let's try using just the field names (without descriptions); perhaps they are enough to tell the model what we want.
from dataclasses import dataclass\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_text: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information from customer queries.\"\n\n# the input query, including the above text\nin_text = \"Each line is a customer query. Extract information about each query:\\n\\n\" + queries\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Trial period inquiry', query_text='Do you offer a trial period for your software before purchasing?')\nQuery(id=2, query_summary='Technical issue', query_text=\"I'm experiencing a glitch with your app, it keeps freezing after the latest update.\")\nQuery(id=3, query_summary='Pricing inquiry', query_text='What are the different pricing plans available for your subscription service?')\nQuery(id=4, query_summary='Password reset request', query_text='Can you provide instructions on how to reset my account password?')\nQuery(id=5, query_summary='Compatibility inquiry', query_text=\"I'm unsure about the compatibility of your product with my device, can you advise?\")\nQuery(id=6, query_summary='Order tracking', query_text='How can I track my recent order and estimate its delivery date?')\nQuery(id=7, query_summary='Loyalty program inquiry', query_text='Is there a customer loyalty program or rewards system for frequent buyers?')\nQuery(id=8, query_summary='Refund policy inquiry', query_text=\"I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\")\nQuery(id=9, query_summary='Warranty inquiry', query_text='Could you clarify the coverage and limitations of your product warranty?')\nQuery(id=10, query_summary='Customer support inquiry', query_text='What are your customer support hours and how can I reach your team in case of emergencies?')\n
The summaries look good.
Let's now define tags and ask the model to classify each query into a tag. In the Tag class, we set its docstring to the rules we want for the classification. This is done in the docstring because Tag is not a dataclass, but derived from Enum.
To keep the output shorter, we're no longer asking for the query_text field in the Query class.
from enum import Enum\n\nclass Tag(str, Enum):\n \"\"\"Queries can be classified into the following tags:\ntech_support: queries related with technical problems.\nbilling: post-sale queries about billing cycle, or subscription termination.\naccount: queries about user account problems.\npre_sales: queries from prospective customers (who have not yet purchased).\nother: all other query topics.\"\"\" \n TECH_SUPPORT = \"tech_support\"\n BILLING = \"billing\"\n PRE_SALES = \"pre_sales\"\n ACCOUNT = \"account\"\n OTHER = \"other\"\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_tag: Tag\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Asking about trial period', query_tag='pre_sales')\nQuery(id=2, query_summary='Reporting app issue', query_tag='tech_support')\nQuery(id=3, query_summary='Inquiring about pricing plans', query_tag='billing')\nQuery(id=4, query_summary='Requesting password reset instructions', query_tag='account')\nQuery(id=5, query_summary='Seeking device compatibility advice', query_tag='pre_sales')\nQuery(id=6, query_summary='Tracking order and delivery date', query_tag='other')\nQuery(id=7, query_summary='Inquiring about loyalty program', query_tag='billing')\nQuery(id=8, query_summary='Asking about refund policy', query_tag='pre_sales')\nQuery(id=9, query_summary='Seeking warranty information', query_tag='other')\nQuery(id=10, query_summary='Inquiring about customer support hours', query_tag='other')\n
The applied tags appear mostly reasonable.
Of course, pre-sales tagging could be done automatically from a database of existing customer contacts, but the model is doing a good job of identifying questions likely to be pre-sales, like ids 1, 5 and 8 which are questions typically asked before buying/subscribing.
Also, note that classification is being done from a single phrase. More information in each customer query would certainly allow for fine-grained classification.
Example's assets at GitHub.
"},{"location":"examples/tough_meeting/","title":"Tough meeting","text":"In this example we'll look at extracting participants and action items from a meeting transcript.
Start by creating the model. As you'll see below, the transcript is large, with complex language, so we'll use OpenAI's GPT-4 this time. You can still use a local model by uncommenting the commented lines below.
Make sure to set your OPENAI_API_KEY env variable.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# the transcript is large, so we'll create the model with a context length of 3072, which should be enough.\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\", ctx_len=3072)\n
We'll use a sample meeting transcript from https://www.ctas.tennessee.edu/eli/sample-meeting-transcript
transcript = \"\"\"\\\nChairman Wormsley (at the proper time and place, after taking the chair and striking the gavel on the table): This meeting of the CTAS County Commission will come to order. Clerk please call the role. (Ensure that a majority of the members are present.)\n\nChairman Wormsley: Each of you has received the agenda. I will entertain a motion that the agenda be approved.\n\nCommissioner Brown: So moved.\n\nCommissioner Hobbs: Seconded\n\nChairman Wormsley: It has been moved and seconded that the agenda be approved as received by the members. All those in favor signify by saying \"Aye\"?...Opposed by saying \"No\"?...The agenda is approved. You have received a copy of the minutes of the last meeting. Are there any corrections or additions to the meeting?\n\nCommissioner McCroskey: Mister Chairman, my name has been omitted from the Special Committee on Indigent Care.\n\nChairman Wormsley: Thank you. If there are no objections, the minutes will be corrected to include the name of Commissioner McCroskey. Will the clerk please make this correction. Any further corrections? Seeing none, without objection the minutes will stand approved as read. (This is sort of a short cut way that is commonly used for approval of minutes and/or the agenda rather than requiring a motion and second.)\n\nChairman Wormsley: Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Mister Chairman, I would like to make a motion to approve the resolution taking money from the Data Processing Reserve Account in the County Clerk's office and moving it to the equipment line to purchase a laptop computer.\n\nCommissioner Carmical: I second the motion.\n\nChairman Wormsley: This resolution has a motion and second. Will the clerk please take the vote.\n\nChairman Wormsley: The resolution passes. We will now take up old business. At our last meeting, Commissioner McKee, your motion to sell property near the airport was deferred to this meeting. You are recognized.\n\nCommissioner McKee: I move to withdraw that motion.\n\nChairman Wormsley: Commissioner McKee has moved to withdraw his motion to sell property near the airport. Seeing no objection, this motion is withdrawn. The next item on the agenda is Commissioner Rodgers'.\n\nCommissioner Rodgers: I move adopton of the resolution previously provided to each of you to increase the state match local litigation tax in circuit, chancery, and criminal courts to the maximum amounts permissible. This resolution calls for the increases to go to the general fund.\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: The sheriff is opposed to this increase.\n\nChairman Wormsley: Commissioner, you are out of order because this motion has not been seconded as needed before the floor is open for discussion or debate. Discussion will begin after we have a second. 
Is there a second?\n\nCommissioner Reinhart: For purposes of discussion, I second the motion.\n\nChairman Wormsley: Commissioner Rodgers is recognized.\n\nCommissioner Rodgers: (Speaks about the data on collections, handing out all sorts of numerical figures regarding the litigation tax, and the county's need for additional revenue.)\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: I move an amendment to the motion to require 25 percent of the proceeds from the increase in the tax on criminal cases go to fund the sheriff's department.\n\nChairman Wormsley: Commissioner Malone\n\nCommissioner Malone: I second the amendment.\n\nChairman Wormsley: A motion has been made and seconded to amend the motion to increase the state match local litigation taxes to the maximum amounts to require 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Any discussion? Will all those in favor please raise your hand? All those opposed please raise your hand. The amendment carries 17-2. We are now on the motion as amended. Any further discussion?\n\nCommissioner Headrick: Does this require a two-thirds vote?\n\nChairman Wormsley: Will the county attorney answer that question?\n\nCounty Attorney Fults: Since these are only courts of record, a majority vote will pass it. The two-thirds requirement is for the general sessions taxes.\n\nChairman Wormsley: Other questions or discussion? Commissioner Adams.\n\nCommissioner Adams: Move for a roll call vote.\n\nCommissioner Crenshaw: Second\n\nChairman Wormsley: The motion has been made and seconded that the state match local litigation taxes be increased to the maximum amounts allowed by law with 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Will all those in favor please vote as the clerk calls your name, those in favor vote \"aye,\" those against vote \"no.\" Nine votes for, nine votes against, one not voting. The increase fails. We are now on new business. Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Each of you has previously received a copy of a resolution to increase the wheel tax by $10 to make up the state cut in education funding. I move adoption of this resolution.\n\nChairman Wormsley: Commissioner Thompson\n\nCommissioner Thompson: I second.\n\nChairman Wormsley: It has been properly moved and seconded that a resolution increasing the wheel tax by $10 to make up the state cut in education funding be passed. Any discussion? (At this point numerous county commissioners speak for and against increasing the wheel tax and making up the education cuts. This is the first time this resolution is under consideration.) Commissioner Hayes is recognized.\n\nCommissioner Hayes: I move previous question.\n\nCommisioner Crenshaw: Second.\n\nChairman Wormsley: Previous question has been moved and seconded. As you know, a motion for previous question, if passed by a two-thirds vote, will cut off further debate and require us to vote yes or no on the resolution before us. You should vote for this motion if you wish to cut off further debate of the wheel tax increase at this point. Will all those in favor of previous question please raise your hand? Will all those against please raise your hand? The vote is 17-2. Previous question passes. We are now on the motion to increase the wheel tax by $10 to make up the state cut in education funding. 
Will all those in favor please raise your hand? Will all those against please raise your hand? The vote is 17-2. This increase passes on first passage. Is there any other new business? Since no member is seeking recognition, are there announcements? Commissioner Hailey.\n\nCommissioner Hailey: There will be a meeting of the Budget Committee to look at solid waste funding recommendations on Tuesday, July 16 at noon here in this room.\n\nChairman Wormsley: Any other announcements? The next meeting of this body will be Monday, August 19 at 7 p.m., here in this room. Commissioner Carmical.\n\nCommissioner Carmical: There will be a chili supper at County Elementary School on August 16 at 6:30 p.m. Everyone is invited.\n\nChairman Wormsley: Commissioner Austin.\n\nCommissioner Austin: Move adjournment.\n\nCommissioner Garland: Second.\n\nChairman Wormsley: Without objection, the meeting will stand adjourned.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information and output in JSON format.\"\n
As you can see, this is a quite large transcript, filled with long names and complex phrases. Let's see how the model will handle it...
Let's start by extracting the names of the participants in the meeting.
We'll create the Meeting class with a list of strings, to receive the names of mentioned participants.
The model will take clues from the variable names as well as from the description Field we set. In this case we name the string list \"participants\" and add a description of what we're looking to receive.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
participants=['Chairman Wormsley', 'Commissioner Brown', 'Commissioner Hobbs', 'Commissioner McCroskey', 'Commissioner Adkins', 'Commissioner Carmical', 'Commissioner McKee', 'Commissioner Rodgers', 'Commissioner Duckett', 'Commissioner Reinhart', 'Commissioner Malone', 'Commissioner Headrick', 'County Attorney Fults', 'Commissioner Adams', 'Commissioner Crenshaw', 'Commissioner Thompson', 'Commissioner Hayes', 'Commissioner Hailey', 'Commissioner Carmical', 'Commissioner Austin', 'Commissioner Garland']\n
# print the generated participants list:\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
Some names appear twice (\"Commissioner Carmical\") and the \"clerk\", which is mentioned in the text, is not listed.
It's a matter of opinion if the clerk is an active participant, but let's try to fix the repeated names.
Let's try asking for a list of participants \"without repeated entries\", in the field's description:
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Wormsley\nBrown\nHobbs\nMcCroskey\nAdkins\nCarmical\nMcKee\nRodgers\nDuckett\nReinhart\nMalone\nHeadrick\nFults\nAdams\nCrenshaw\nThompson\nHayes\nHailey\nAustin\nGarland\n
That didn't work as expected: the repetition is gone, but the titles were dropped and only the names appear.
Let's try asking for \"names and titles\":
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of names and titles of participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
And now \"Commissioner Carmical\" is repeating again!
Let's move on. The point is that you can also do some prompt engineering with the description field, and this model shortcoming could be dealt with by post-processing the received list, as shown in the sketch below.
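For instance, a small order-preserving dedup over the returned list is enough to drop the repeated entries (plain Python, nothing Sibila-specific):
# remove repeated names while keeping the original order\nunique_participants = list(dict.fromkeys(out.participants))\n\nfor part in unique_participants:\n print(part)\n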
Let's now also request a list of action items mentioned in the transcript:
class ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda'\nindex=2 name='Correct the minutes to include Commissioner McCroskey in the Special Committee on Indigent Care'\nindex=3 name='Approve the resolution to transfer funds from the Data Processing Reserve Account to purchase a laptop'\nindex=4 name='Withdraw the motion to sell property near the airport'\nindex=5 name='Adopt the resolution to increase the state match local litigation tax'\nindex=6 name=\"Amend the motion to allocate 25 percent of the proceeds from the tax increase to fund the sheriff's department\"\nindex=7 name='Vote on the state match local litigation taxes increase with the amendment'\nindex=8 name='Adopt the resolution to increase the wheel tax by $10 for education funding'\nindex=9 name='Hold a Budget Committee meeting on solid waste funding recommendations'\nindex=10 name='Announce the chili supper at County Elementary School'\n
These are reasonable action items.
Let's now also request a priority for each ActionItem - we'll create a string Enum class with three priority levels.
from enum import Enum\n\nclass ActionPriority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: ActionPriority = Field(description=\"Action item priority\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda' priority=<ActionPriority.HIGH: 'high'>\nindex=2 name='Correct the minutes to include Commissioner McCroskey' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=3 name='Approve the resolution to transfer funds for laptop purchase' priority=<ActionPriority.HIGH: 'high'>\nindex=4 name='Withdraw motion to sell property near the airport' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=5 name='Adopt resolution to increase state match local litigation tax' priority=<ActionPriority.HIGH: 'high'>\nindex=6 name=\"Amend resolution to allocate funds to sheriff's department\" priority=<ActionPriority.HIGH: 'high'>\nindex=7 name='Vote on the amended resolution for litigation tax increase' priority=<ActionPriority.HIGH: 'high'>\nindex=8 name='Adopt resolution to increase the wheel tax' priority=<ActionPriority.HIGH: 'high'>\nindex=9 name='Budget Committee meeting on solid waste funding' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=10 name='Announce chili supper at County Elementary School' priority=<ActionPriority.LOW: 'low'>\nindex=11 name='Adjourn the meeting' priority=<ActionPriority.MEDIUM: 'medium'>\n
It's not clear from the meeting transcript whether these priorities are correct, but several tax-related items are receiving high priorities and, from the context, that looks reasonable. : )
Example's assets at GitHub.
"},{"location":"extract/dataclass/","title":"Dataclass","text":"Besides simple types and enums, we can also extract objects whose structure is given by a dataclass definition:
Example
from sibila import Models\nfrom dataclasses import dataclass\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the Pydantic version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/dataclass/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For dataclasses this is done with Annotated[type, \"description\"] - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Annotated, Literal\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\n@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
Check the Extract dataclass example to see this in action.
"},{"location":"extract/enums/","title":"Enums","text":"Enumerations are important for classification tasks or in any situation where you need a choice to be made from a list of options.
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract([\"red\", \"blue\", \"green\", \"yellow\"], \n \"The car color was a shade of indigo\")\n
Result
'blue'\n
You can pass a list of items in any of the supported native types: str, float, int or bool.
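As a minimal sketch of the same pattern with int options (the choices and input text here are our own; the actual output depends on the model):
model.extract([1, 2, 3, 4],\n \"How many wheels does a bicycle have?\")   # expected to return 2, as an int\n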
"},{"location":"extract/enums/#literals","title":"Literals","text":"We can also use Literals:
Example
from typing import Literal\n\nmodel.extract(Literal[\"SPAM\", \"NOT_SPAM\", \"UNSURE\"], \n \"Hello my dear friend, I'm contacting you because I want to give you a million dollars\",\n inst=\"Classify this text on the likelihood of being spam\")\n
Result
'SPAM'\n
Extracting to a Literal type returns one of its possible options in its native type (str, float, int or bool).
"},{"location":"extract/enums/#enum-classes","title":"Enum classes","text":"Or Enum classes of native types. An example of extracting to Enum classes:
Example
from enum import IntEnum\n\nclass Heads(IntEnum):\n SINGLE = 1\n DOUBLE = 2\n TRIPLE = 3\n\nmodel.extract(Heads,\n \"The Two-Headed Monster from The Muppets.\")\n
Result
<Heads.DOUBLE: 2>\n
For the model, the important information is actually the value of each enum member, not its name. For example, in this enum, the model would only see the strings to the right of each member (the enum values), not \"RED\", \"YELLOW\" nor \"GREEN\":
from enum import Enum\n\nclass Light(Enum):\n RED = 'stop'\n YELLOW = 'slow down'\n GREEN = 'go'\n
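As an illustrative sketch (the input phrase is our own), an extraction with this enum returns the member whose value best matches the text:
model.extract(Light,\n \"The sign ahead says we must stop.\")   # expected: <Light.RED: 'stop'>, though output depends on the model\n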
See the Tag classification example to see how Enum is used to tag support queries.
"},{"location":"extract/enums/#classify","title":"Classify","text":"You can also use the classify() method to extract enumerations, which accepts the enum types we've seen above. It calls extract() internally and its only justification is to make things more readable:
Example
model.classify([\"mouse\", \"cat\", \"dog\", \"bird\"],\n \"Snoopy\")\n
Result
'dog'\n
"},{"location":"extract/free_text/","title":"Free text","text":"You can also generate free text by calling model():
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nresponse = model(\"Explain in a few lines how to build a brick wall?\")\nprint(response)\n
Result
To build a brick wall, follow these steps:\n\n1. Prepare the site by excavating and leveling the ground, then install a damp-proof \nmembrane and create a solid base with concrete footings.\n2. Lay a foundation of concrete blocks or bricks, ensuring it is level and square.\n3. Build the wall using bricks or blocks, starting with a corner or bonding pattern \nto ensure stability. Use mortar to bond each course (row) of bricks or blocks, \nfollowing the recommended mortar mix ratio.\n4. Use a spirit level to ensure each course is level, and insert metal dowels or use \nbrick ties to connect adjacent walls or floors.\n5. Allow the mortar to dry for the recommended time before applying a damp-proof \ncourse (DPC) at the base of the wall.\n6. Finish the wall with capping bricks or coping stones, and apply any desired \nrender or finish.\n
"},{"location":"extract/pydantic/","title":"Pydantic","text":"Besides simple types and enums, we can also extract objects whose structure is given by a class derived from Pydantic's BaseModel definition:
Example
from sibila import Models\nfrom pydantic import BaseModel\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the dataclass version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/pydantic/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For Pydantic this is done with Field(description=\"description\") - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Literal\nfrom pydantic import Field\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\nclass Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
Check the Extract Pydantic example to see this kind of extraction.
"},{"location":"extract/simple_types/","title":"Simple types","text":"Sibila can constrain model generation to output simple python types. This is helpful for situations where you want to extract a specific data type.
To get a response from the model in a certain type, you can use the extract() method:
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(bool, \n \"Certainly, I'd like to subscribe.\")\n
Result
True\n
"},{"location":"extract/simple_types/#instructions-to-help-the-model","title":"Instructions to help the model","text":"You may need to provide extra information to the model, so that it understands what you want. This is done with the inst argument - inst is short for instructions:
Example
model.extract(str, \n \"I don't quite remember the product's name, I think it was called Cornaca\",\n inst=\"Extract the product name\")\n
Result
Cornaca\n
"},{"location":"extract/simple_types/#supported-types","title":"Supported types","text":"The following simple types are supported:
- bool
- int
- float
- str
- datetime
About datetime type
A special note about extracting to datetime: the datetime type is expecting an ISO 8601 formatted string. Because some models are less capable than others at correctly formatting dates/times, it helps to mention in the instructions that you want the output in \"ISO 8601\" format.
from datetime import datetime\nmodel.extract(datetime, \n \"Sure, glad to help, it all happened at December the 10th, 2023, around 3PM, I think\",\n inst=\"Output in ISO 8601 format\")\n
Result
datetime.datetime(2023, 12, 10, 15, 0)\n
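Numeric types work the same way. For instance, a minimal sketch extracting a float amount (assuming the model object created above; the input and instructions text are just illustrative):
model.extract(float,\n \"The total came to twelve dollars and fifty cents, tip included.\",\n inst=\"Extract the total amount as a number\")\n
With a capable model, the returned value should be a Python float such as 12.5.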
"},{"location":"extract/simple_types/#lists","title":"Lists","text":"You can extract lists of any of the supported types (simple types, enum, dataclass, Pydantic).
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\")\n
Result
['Naples', 'Genoa', 'Florence', 'Rome']\n
As in all extractions, you may need to set the instructions text to specify what you want from the model. Just as an example of the power of instructions, let's add instructions asking for country output: it will still output a list, but with a single element - 'Italy':
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\",\n inst=\"Output the country\")\n
Result
['Italy']\n
"},{"location":"models/find_local_models/","title":"Finding new models","text":""},{"location":"models/find_local_models/#chat-or-instruct-types-only","title":"Chat or instruct types only","text":"Sibila can use models that were fine-tuned for chat or instruct purposes. These models work in user - assistant turns or messages and use a chat template to properly compose those messages to the format that the model was fine-tuned to.
For example, the Llama2 model was released in two editions: a simple Llama2 text completion model and a Llama2-instruct model that was fine-tuned for user-assistant turns. For Sibila, you should always select the chat or instruct version of a model.
But which model to choose? You can look at model benchmark scores in popular listing sites:
- https://llm.extractum.io/list/
- https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
"},{"location":"models/find_local_models/#find-a-quantized-version-of-the-model","title":"Find a quantized version of the model","text":"Since Large Language Models are quite big, they are usually quantized so that each parameter occupies a little more than 4 bits or half a byte.
Without quantization, a 7 billion parameter model would require 14Gb of memory to load (with each parameter taking 16 bits), and a bit more during inference.
But with quantization techniques, a 7 billion parameter model can have a file size of only 4.4Gb (using about 50% more in memory - 6.8Gb), which makes it possible to run on common GPUs or even in ordinary RAM (albeit more slowly).
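As a rough back-of-the-envelope check of these numbers (assuming 16 bits per parameter unquantized, and about 5 bits per parameter for a 4-bit quantization once overhead is included - ballpark figures only):
params = 7e9 # 7 billion parameters\nfp16_gb = params * 16 / 8 / 1e9 # ~14 Gb needed to load without quantization\nq4_file_gb = params * 5 / 8 / 1e9 # ~4.4 Gb file after 4-bit quantization\nprint(fp16_gb, q4_file_gb) # memory use while loaded is roughly 50% above the file size (~6.8 Gb)\n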
Quantized models are stored in a file format popularized by llama.cpp, the GGUF format (which means GPT-Generated Unified Format). We're using llama.cpp to run local models, so we'll be needing GGUF files.
A good place to find quantized models is HuggingFace's model hub, particularly in the well-known area of TheBloke (Tom Jobbins):
https://huggingface.co/TheBloke
TheBloke is very prolific in producing quality quantized versions of models, usually shortly after they are released.
And a good model that we'll be using for the examples is a 4 bit quantization of the OpenChat-3.5 model, which itself is a fine-tuning of Mistral-7b:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
"},{"location":"models/find_local_models/#download-the-file-into-the-models-folder","title":"Download the file into the \"models\" folder","text":"See the OpenChat model section on how to download models with the sibila CLI tool or manually in your browser.
The OpenChat model already includes the chat template format in its metadata, but for some other models we'll need to set the format - see the Setup chat template format section on how to handle this.
"},{"location":"models/formats_json/","title":"Managing formats","text":"A \"formats.json\" file stores the chat template definitions used in models. This allows for models that don't have a chat template in their metadata to be detected and get the right format so they can function well.
If you downloaded the GitHub repository, you'll find a file named \"sibila/res/base_formats.json\", which is the default base configuration that will be used, with many known chat template formats.
When you call Models.setup(), any \"formats.json\" file found in the folder will be loaded and its definitions will be merged with the ones from \"base_formats.json\" which are loaded on initialization. Any entries with the same name will be replaced by freshly loaded ones.
How to add a new format entry that can be used when creating a model? You can do it with the sibila CLI tool or by manually editing the formats.json file.
"},{"location":"models/formats_json/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s openchat openchat \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'openchat' with match='openchat', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (used to identify the model filename), and the last is the chat template. Help is available with \"sibila formats --help\".
"},{"location":"models/formats_json/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"Alternatively, we can edit the \"formats.json\" file in the \"models\" folder and add the entry:
\"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n},\n
In the \"openchat\" key value we have a dictionary with the following keys:
The \"match\" key is a regular expression that will be used to match the model name or filename, and the \"template\" key is the chat template definition in Jinja format. The \"openchat\" format name we are defining here is the name you can use when creating a model, by setting the format argument:
model = LlamaCppModel.create(\"openchat-3.5-1210.Q4_K_M.gguf\",\n format=\"openchat\")\n
or to be more practical: \"openchat\" is also the format name you would use when creating a \"models.json\" entry for a model, in the \"format\" key:
\"openchat\": {\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
See the \"base_formats.json\" file for all the default base formats.
"},{"location":"models/local_model/","title":"Using a local model","text":"Sibila uses llama.cpp to run local models, which are ordinary files in the GGUF format. You can download local models from places like the Hugging Face model hub.
Most current 7B quantized models are very capable for common data extraction tasks (and getting better all the time). We'll see how to find and set up local models for use with Sibila. If you only plan to use OpenAI remote models, you can skip this section.
"},{"location":"models/local_model/#openchat-model","title":"OpenChat model","text":"By default, most of the examples included with Sibila use OpenChat, a very good 7B parameters quantized model: https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
You can download this model with the sibila CLI tool or manually in your browser.
"},{"location":"models/local_model/#download-with-sibila-hub","title":"Download with \"sibila hub\"","text":"Open a command line prompt in the \"models\" folder if you downloaded the GitHub repository, or create a folder named \"models\".
Run this command:
sibila hub -d TheBloke/openchat-3.5-1210-GGUF -f openchat-3.5-1210.Q4_K_M.gguf\n
After the 4.4Gb download completes, the file \"openchat-3.5-1210.Q4_K_M.gguf\" will be available in your \"models\" folder and you can run the examples. You can use the same command to download any other GGUF model.
"},{"location":"models/local_model/#manual-download","title":"Manual download","text":"Alternatively, you can download in your browser from this URL:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF/blob/main/openchat-3.5-1210.Q4_K_M.gguf
In the linked page, click \"download\" and save this file into a \"models\" folder. If you downloaded the Sibila GitHub repository it already includes a \"models\" folder which you can use. Otherwise, just create a \"models\" folder, where you'll store your local model files.
Once the file \"openchat-3.5-1210.Q4_K_M.gguf\" is placed in the \"models\" folder, you should be able to run the examples.
"},{"location":"models/local_model/#llamacppmodel-class","title":"LlamaCppModel class","text":"Local llama.cpp models can be used with the LlamaCppModel class. Let's generate text after our prompt:
Example
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"../../models/openchat-3.5-1210.Q4_K_M.gguf\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
It worked: the model answered with the continuation of the famous poem.
You'll notice that the first time you create the model object and run a query, it will take longer, because the model must load all its parameters into layers in memory. The next queries will work much faster.
"},{"location":"models/local_model/#a-note-about-out-of-memory-errors","title":"A note about out of memory errors","text":"An important thing to know if you'll be using local models is about \"Out of memory\" errors.
A 7B model like OpenChat-3.5, when quantized to 4 bits will occupy about 6.8 Gb of memory, in either GPU's VRAM or common RAM. If you try to run a second model at the same time, you might get an out of memory error and/or llama.cpp may crash: it all depends on the memory available in your computer.
This is less of a problem when running scripts from the command line, but in environments like Jupyter where you can have multiple open notebooks, you may get \"out of memory\" errors or python kernel errors like:
Error
Kernel Restarting\nThe kernel for sibila/examples/name.ipynb appears to have died.\nIt will restart automatically.\n
If you get an error like this in JupyterLab, open the Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any out-of-memory stuck models.
A good practice is to delete any local model after you no longer need it or right before loading a new one. A simple \"del model\" works fine, or you can add these two lines before creating a model:
try: del model\nexcept: ...\n\nmodel = LlamaCppModel(...)\n
This way, any existing model in the current notebook is deleted before creating a new one.
However, this won't work across multiple notebooks. In those cases, open JupyterLab's Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any models currently in memory.
"},{"location":"models/models_factory/","title":"Models factory","text":"The Models factory is based on a \"models\" folder that contains two configuration files, \"models.json\" and \"formats.json\", plus the actual files for local models.
The Models factory class is a more flexible way to create models, for example:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"openai:gpt-4\")\n
The first line calls Models.setup() to initialize the factory with the folder where model files and configs (\"models.json\" and \"formats.json\") are located.
The second line calls Models.create() to create a model from the name \"openai:gpt-4\". In this case we created a remote model, but we could just as well create a local model based on a GGUF file.
The names should be in the format \"provider:model_name\" and Sibila currently supports two providers:
The \"llamacpp\" provider creates local GGUF models (objects of type LlamaCppModel) and the \"openai\" provider creates remote models (objects of type OpenAIModel). The name part, after the \"provider:\", must be one of:
- A remote model name, like \"gpt-4\": \"openai:gpt-4\"
- A local model name, like \"openchat\": \"llamacpp:openchat\"
- The actual filename of a model in the \"models\" folder: \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\" - this is the form we use in the examples, but of course using \"openchat\" instead of the filename would be better...
Although you can use filenames as model names, for continued use it's generally a better idea to create an entry in the \"models.json\" file - this makes future model replacement much easier.
See Managing models to learn how to register these model names.
"},{"location":"models/models_json/","title":"Managing models","text":"Model names are stored in a file named \"models.json\", in your \"models\" folder. Models registered in this file can then be used when calling Models.create() to create an instance of the model.
Registering a name is not strictly needed, as you can create models from their filenames or remote model names; in most examples you'll find models created with:
model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n
However, it's a good idea to register a name, especially if you'll be using a model for some time or might need to replace it later. If you register a name, only its entry will need to change when the model is replaced.
There are two ways of registering names: by using the sibila CLI tool or by directly editing the \"models.json\" file.
"},{"location":"models/models_json/#use-the-sibila-models-cli-tool","title":"Use the \"sibila models\" CLI tool","text":"To register a model with the Models factory you can use the \"sibila models\" tool. Run in the \"models\" folder:
> sibila models -s llamacpp:openchat openchat-3.5-1210.Q4_K_M.gguf openchat\n\nUsing models directory '.'\nSet model 'llamacpp:openchat' with name='openchat-3.5-1210.Q4_K_M.gguf', \nformat='openchat' at './models.json'.\n
First argument after -s is the new entry name (including the \"llamacpp:\" provider), then the filename, then the chat template format, if needed.
This will create an \"openchat\" entry in \"models.json\", exactly like the manually created one below.
"},{"location":"models/models_json/#manually-edit-modelsjson","title":"Manually edit \"models.json\"","text":"Alternatively, you can manually register a model name by editing the \"models.json\" file located in your \"models\" folder.
A \"models.json\" file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Looking at the above structure, we have two top entries for providers \"llamacpp\" and \"openai\", and also an \"alias\" entry.
Inside each provider entry, we have a \"_default\" key, which can store a base GenConf or other arguments to be passed during model creation. The default values defined in the \"_default\" entry can later be overridden by keys of the same name specified in each model definition. You can see this in the \"phi2\" entry, which overrides the genconf value given in the above \"_default\", setting temperature to 2.0. Keys are merged element-wise with those specified in the provider's \"_default\" entry: keys with the same name are overridden, all other keys are inherited.
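A rough sketch of this override rule in plain Python (not Sibila's actual implementation; the ctx_len value is just an illustrative extra key):
provider_default = {\"genconf\": {\"temperature\": 0.0}, \"ctx_len\": 2048} # hypothetical \"_default\" entry\nphi2_entry = {\"name\": \"phi-2.Q4_K_M.gguf\", \"format\": \"phi2\", \"genconf\": {\"temperature\": 2.0}}\n\n# same-name keys (\"genconf\") are overridden, all other keys (\"ctx_len\") are inherited:\neffective = {**provider_default, **phi2_entry}\nprint(effective[\"genconf\"]) # {'temperature': 2.0}\n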
In the above \"models.json\" example, let's look at the \"openchat\" model entry:
\"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
The \"openchat\" key name is the name you'll use to create the model as \"llamacpp:openchat\":
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:openchat\")\n
You can have the following keys in a model entry:
\"name\" is the filename to use when loading a model (or the remote model name). \"format\" identifies the chat template format that the model should use, from the \"formats.json\" file; some local models include the chat template format in their metadata, so this key is optional. \"genconf\" is the default GenConf (generation config settings) used to create the model, which it will then use in each generation; these settings are merged element-wise with any specified in the provider's \"_default\" entry. Any other keys will be passed as arguments during model creation - you can learn which arguments are possible in the API reference for LlamaCppModel or OpenAIModel; for example you can pass \"ctx_len\": 2048 to define the context length to use (see the sketch at the end of this section). As with genconf, these keys are merged element-wise with any specified in the provider's \"_default\" entry. The \"alias\" entry is a handy way to keep names that point to actual model entries (independent of provider). Note the two alias entries \"develop\" and \"production\" in the above \"models.json\" - you could then create the production model by doing:
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"production\")\n
Alias entries can be used as \"alias:production\" or, without the \"alias:\" provider, simply as \"production\", as in the example above. For an example of a JSON file with many models defined, see the \"models/models.json\" file.
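To illustrate the \"other keys\" rule from the list above, a hypothetical \"models.json\" entry that also sets the context length could look like this (the \"openchat-2k\" entry name and values are just for illustration):
\"openchat-2k\": {\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\",\n \"ctx_len\": 2048, # passed as an argument during model creation\n \"genconf\": {\"temperature\": 0.0}\n},\n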
"},{"location":"models/remote_model/","title":"Remote models","text":"Sibila can use OpenAI remote models, for which you'll need a paid OpenAI account and its API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Another possibility is to store your OpenAI key in a .env file, which has several advantages: see the python-dotenv package.
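A minimal sketch of loading the key from a .env file (assuming the python-dotenv package is installed and the .env file contains a line like OPENAI_API_KEY=...):
from dotenv import load_dotenv\n\nload_dotenv() # reads .env from the current folder and sets its entries as environment variables\n# from here on, OpenAI models can pick up OPENAI_API_KEY from the environment as usual\n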
"},{"location":"models/remote_model/#model-names","title":"Model names","text":"OpenAI models can be used by Sibila through the OpenAIModel class. To get a list of known model names:
Example
from sibila import OpenAIModel\n\nOpenAIModel.known_models()\n
Result
['gpt-4-0613',\n'gpt-4-32k-0613',\n'gpt-4-0314',\n'gpt-4-32k-0314',\n'gpt-4-1106-preview',\n'gpt-4',\n'gpt-4-32k',\n'gpt-3.5-turbo-1106',\n'gpt-3.5-turbo-0613',\n'gpt-3.5-turbo-16k-0613',\n'gpt-3.5-turbo-0301',\n'gpt-3.5-turbo',\n'gpt-3.5-turbo-16k',\n'gpt-3',\n'gpt-3.5']\n
You can use any of these model names to create an OpenAI model. For example:
Example
model = OpenAIModel(\"gpt-3.5\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
"},{"location":"models/setup_format/","title":"Chat template format","text":""},{"location":"models/setup_format/#what-are-chat-templates","title":"What are chat templates?","text":"Because these models were fine-tuned for chat or instruct interaction, they use a chat template, which is a Jinja template that converts a list of messages into a text prompt. This template must follow the original format that the model was trained on - this is very important or you won't get good results.
Chat template definitions are Jinja templates like the following one, which is in ChatML format:
{% for message in messages %}\n {{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n'}}\n{% endfor %}\n
When run over a list of system, user and model messages, the template produces text like the following:
<|im_start|>system\nYou speak like a pirate.<|im_end|>\n<|im_start|>user\nHello there?<|im_end|>\n<|im_start|>assistant\nAhoy there matey! How can I assist ye today on this here ship o' mine?<|im_end|>\n
Only by using the model's specific chat template can we get the best results.
Sibila tries to automatically detect which template to use with a model, either from the model name or from embedded metadata, if available.
"},{"location":"models/setup_format/#does-the-model-have-a-built-in-chat-template-format","title":"Does the model have a built-in chat template format?","text":"Some GGUF models include the chat template in their metadata; unfortunately, this is not standard.
You can quickly check if the model has a chat template by running the sibila CLI in the same folder as the model file:
> sibila models -t \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\nUsing models directory '.'\nTesting model 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf'...\nModel 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf' was properly created and should run fine.\n
In this case the chat template format is included with the model and nothing else is needed.
Another way to test this is to try creating the model in python. If no exception is raised, the model GGUF file contains the template definition and should work fine.
Example of model creation error
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Error
...\n\nValueError: Could not find a suitable format (chat template) for this model.\nWithout a format, fine-tuned models cannot function properly.\nSee the docs on how you can fix this: pass the template in the format arg or \ncreate a 'formats.json' file.\n
But if you get an error such as above, you'll need to provide a chat template. It's quite easy - let's see how to do it.
"},{"location":"models/setup_format/#find-the-chat-template-format","title":"Find the chat template format","text":"So, how to find the chat template for a new model that you intend to use?
This is normally listed in the model's page: search in that page for \"template\" and copy the listed Jinja template text.
If the template isn't directly listed in the model's page, you can look for a file named \"tokenizer_config.json\" in the main model files. This file should include an entry named \"chat_template\" which is what we want.
Example of a tokenizer_config.json file
For example, in OpenChat's file \"tokenizer_config.json\":
https://huggingface.co/openchat/openchat-3.5-1210/blob/main/tokenizer_config.json
You'll find this line with the template:
{\n \"...\": \"...\",\n\n \"chat_template\": \"{{ bos_token }}...{% endif %}\",\n\n \"...\": \"...\"\n}\n
The value in the \"chat_template\" key is the Jinja template that we're looking for.
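If you have that file locally, a quick way to print the template is a few lines of Python (a small sketch, assuming \"tokenizer_config.json\" was downloaded to the current folder):
import json\n\nwith open(\"tokenizer_config.json\", \"r\", encoding=\"utf-8\") as f:\n config = json.load(f)\n\nprint(config.get(\"chat_template\"))\n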
Another alternative is to search online for the name of the model and \"chat template\".
Either way, once you know the template used by the model, you can set and use it.
"},{"location":"models/setup_format/#option-1-pass-the-chat-template-format-when-creating-the-model","title":"Option 1: Pass the chat template format when creating the model","text":"Once you know the chat template definition you can create the model and pass it in the format argument. Let's assume you have a model file named \"peculiar-model-7b.gguf\":
chat_template = \"{{ bos_token }}...{% endif %}\"\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\",\n format=chat_template)\n
And the model should now work without problems.
"},{"location":"models/setup_format/#option-2-add-the-chat-template-to-the-models-factory","title":"Option 2: Add the chat template to the Models factory","text":"If you plan to use the model many times, a more convenient solution is to create an entry in the \"formats.json\" file so that all further models with this name will use the template.
"},{"location":"models/setup_format/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s peculiar peculiar-model \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'peculiar' with match='peculiar-model', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (used to identify the model filename), and the last is the chat template. Help is available with \"sibila formats --help\".
"},{"location":"models/setup_format/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"As an alternative to using the sibila CLI tool, you can add the chat template format by creating an entry in a \"formats.json\" file, in the same folder as the model, with these fields:
{\n \"peculiar\": {\n \"match\": \"peculiar-model\",\n \"template\": \"{{ bos_token }}...{% endif %}\"\n }\n}\n
The \"match\" field is a regular expression that will be used to match the model name or filename. The \"template\" field is the chat template in Jinja format.
After configuring the template as we've seen above, all you need to do is to create a LlamaCppModel object and pass the model file path.
model = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Note that we're not passing the format argument anymore when creating the model. The \"match\" regular expression we defined above will recognize the model from the filename and use the given chat template format.
Base format definitions
Sibila includes by default the definitions of several well-known chat template formats. These definitions are available in \"sibila/res/base_formats.json\" and are automatically loaded when the Models factory is created.
You can add any chat template formats into your own \"formats.json\" files, but please never change the \"sibila/res/base_formats.json\" file, to avoid potential errors.
"},{"location":"models/sibila_cli/","title":"Sibila CLI tool","text":"The Sibila Command-Line Interface tool simplifies managing the Models factory and is useful for downloading models from the Hugging Face model hub.
The Models factory is based on a \"models\" folder that contains two configuration files, \"models.json\" and \"formats.json\", plus the actual files for local models.
The CLI tool is divided into three areas or actions:
The \"models\" action manages model entries in \"models.json\" files, the \"formats\" action manages format entries in \"formats.json\" files, and the \"hub\" action searches and downloads models from the Hugging Face model hub. In all commands you should pass the option \"-m models_folder\" with the path to the \"models\" folder, or alternatively run the commands inside the \"models\" folder.
The following argument names are used below (other unlisted names should be descriptive enough):
Argument \"res_name\" is a model entry name in the form \"provider:name\", for example \"llamacpp:openchat\". Argument \"format_name\" is the name of a format entry in \"formats.json\", for example \"chatml\". Argument \"query\" is a case-insensitive query that will be matched by a substring search. Usage help is available by running \"sibila --help\" for general help, or \"sibila action --help\", where action is one of \"models\", \"formats\" or \"hub\".
"},{"location":"models/sibila_cli/#sibila-models","title":"Sibila models","text":"To register a model entry pointing to a model name or filename:
sibila models -s res_name model_name_or_filename [format_name]\n
To set the format_name for an existing model entry:
sibila models -f res_name format_name\n
To test if a model can run (for example to check if it has the chat template format defined):
sibila models -t res_name\n
List all models with optional case-insensitive substring query:
sibila models -l [query]\n
Delete a model entry in:
sibila models -d res_name\n
"},{"location":"models/sibila_cli/#sibila-formats","title":"Sibila formats","text":"Check if a model filename has any format defined in the Models factory:
sibila formats -q filename\n
To register a chat template format, where match is a regexp that matches the model filename, template is the Jinja chat template:
sibila formats -s format_name match template\n
List all formats with optional case-insensitive substring query:
sibila models -l [query]\n
Delete a format entry:
sibila formats -d format_name\n
Update the local \"formats.json\" file by merging it with GitHub's \"sibila/res/base_formats.json\" file, preserving all existing local entries:
sibila formats -u\n
"},{"location":"models/sibila_cli/#sibila-hub","title":"Sibila hub","text":"List models in the Hugging Face model hub that match the given queries. Argument query can be a list of strings to match, separated by a space character.
Arg Filename is case-insensitive for substring matching.
Arg exact_author is an exact and case-sensitive author name from Hugging Face model hub.
sibila hub -l query [-f filename] [-a exact_author]\n
To download a model, where model_id is a string like \"TheBloke/openchat-3.5-1210-GGUF\"; args filename and exact_author are the same as above:
sibila hub -d model_id -f filename -a exact_author -s set name\n
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Sibila","text":"Extract structured data from remote or local file LLM models.
- Extract data into Pydantic objects, dataclasses or simple types.
- Same API for local file models and remote OpenAI models.
- Model management: download models, manage configuration and quickly switch between models.
- Tools for evaluating output across local/remote models, for chat-like interaction and more.
See What can you do with Sibila?
To extract structured data from a local model:
from sibila import Models\nfrom pydantic import BaseModel\n\nclass Info(BaseModel):\n event_year: int\n first_name: str\n last_name: str\n age_at_the_time: int\n nationality: str\n\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
Returns an instance of class Info, created from the model's output:
Info(event_year=1969,\n first_name='Neil',\n last_name='Armstrong',\n age_at_the_time=38,\n nationality='American')\n
Or to use OpenAI's GPT-4, we would simply replace the model's name:
model = Models.create(\"openai:gpt-4\")\n\nmodel.extract(Info, \"Who was the first man in the moon?\")\n
If Pydantic BaseModel objects are too much for your project, Sibila supports similar functionality with Python dataclass.
"},{"location":"first_run/","title":"First run","text":""},{"location":"first_run/#with-a-remote-model","title":"With a remote model","text":"To use an OpenAI remote model, you'll need a paid OpenAI account and its API key. You can explicitly pass this key in your script but this is a poor security practice.
A better way is to define an environment variable which the OpenAI API will use when needed:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Having set this variable with your OpenAI API key, you can run a \"Hello Model\" like this:
Example
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
You're all set if you only plan to use remote OpenAI models.
"},{"location":"first_run/#with-a-local-model","title":"With a local model","text":"Local models run from files in GGUF format, which are loaded and run by the llama.cpp component.
You'll need to download a GGUF model file: we suggest OpenChat 3.5 - an excellent 7B parameters quantized model that will run in less than 7Gb of memory.
To download the OpenChat model file, please see Download OpenChat model.
After downloading the file, you can run this \"Hello Model\" script:
Example
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder - change if different:\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Result
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
If the above scripts output similar pirate talk, Sibila should be working fine.
"},{"location":"installing/","title":"Installing","text":""},{"location":"installing/#installation","title":"Installation","text":"Sibila requires Python 3.9+ and uses the llama-cpp-python package for local models and OpenAI's API to access remote models like GPT-4.
Install Sibila from PyPI by running:
pip install sibila\n
If you only plan to use remote models (OpenAI), there's nothing else you need to do. See First Run to get it going.
Installation in edit mode: Alternatively, you can install Sibila in edit mode by downloading the GitHub repository and running the following in the base folder of the repository:
pip install -e .\n
"},{"location":"installing/#enabling-llamacpp-hardware-acceleration","title":"Enabling llama.cpp hardware acceleration","text":"Local models will run faster with hardware acceleration enabled. Sibila uses llama-cpp-python, a python wrapper for llama.cpp and it's a good idea to make sure it was installed with the best optimization your computer can offer.
See the following sections: depending on which hardware you have, you can run the listed command which will reinstall llama-cpp-python with the selected optimization. If any error occurs you can always install the non-accelerated version, as listed at the end.
"},{"location":"installing/#for-cuda-nvidia-gpus","title":"For CUDA - NVIDIA GPUs","text":"For CUDA acceleration in NVIDA GPUs, you'll need to Install the NVIDIA CUDA Toolkit.
LinuxWindows CMAKE_ARGS=\"-DLLAMA_CUBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_CUBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
See: Installing llama-cpp-python with NVIDIA GPU Acceleration on Windows: A Short Guide. More info: Installing llama-cpp-python with GPU Support.
"},{"location":"installing/#for-metal-apple-silicon-macs","title":"For Metal - Apple silicon macs","text":"Mac CMAKE_ARGS=\"-DLLAMA_METAL=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-rocm-amd-gpus","title":"For ROCm AMD GPUS","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_HIPBLAS=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_HIPBLAS=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#for-vulkan-supporting-gpus","title":"For Vulkan supporting GPUs","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_VULKAN=on\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_VULKAN=on\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"installing/#cpu-acceleration-if-none-of-the-above","title":"CPU acceleration (if none of the above)","text":"Linux and MacWindows CMAKE_ARGS=\"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\" \\\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
$env:CMAKE_ARGS = \"-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS\"\npip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
If you get an error running the above commands, please see llama-cpp-python's Installation configuration.
"},{"location":"installing/#non-accelerated","title":"Non-accelerated","text":"In any case, you can always install llama-cpp-python without acceleration by running:
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir\n
"},{"location":"tips/","title":"Tips and Tricks","text":"Some general tips from experience with constrained model output in Sibila.
"},{"location":"tips/#temperature","title":"Temperature","text":"Sibila aims at exact results, so generation temperature defaults to 0. You should get the same results from the same model at all times.
For \"creative\" outputs, you can set the temperature to a non-zero value. This is done in GenConf, which can be passed in many places, for example during actual generation/extraction:
Example
from sibila import (Models, GenConf)\n\nModels.setup(\"../models\")\n\nmodel = Models.create(\"llamacpp:openchat\") # default GenConf could be passed here\n\nfor i in range(10):\n print(model.extract(int,\n \"Think of a random number between 1 and 100\",\n genconf=GenConf(temperature=2.)))\n
Result
72\n78\n75\n68\n39\n47\n53\n82\n72\n63\n
"},{"location":"tips/#split-entities-into-separate-classes","title":"Split entities into separate classes","text":"Suppose you want to extract a list of person names from a group. You could use the following class:
class Group(BaseModel):\n persons: list[str] = Field(description=\"List of persons\")\n group_info: str\n\nout = model.extract(Group, in_text)\n
But it tends to work better to separate the Person entity into its own class and leave the list in Group:
class Person(BaseModel):\n name: str\n\nclass Group(BaseModel):\n persons: list[Person]\n group_info: str\n\nout = model.extract(Group, in_text)\n
The same applies to the equivalent dataclass definitions.
Adding descriptions seems to always help, especially for non-trivial extraction. Without descriptions, the model can only look at variable names for clues about what's wanted, so it's important to tell it what we want by adding field descriptions.
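For instance, the Person/Group classes above could be annotated like this (a sketch only - the description texts are just illustrative):
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n name: str = Field(description=\"The person's full name\")\n\nclass Group(BaseModel):\n persons: list[Person] = Field(description=\"All persons mentioned in the text\")\n group_info: str = Field(description=\"One sentence describing what the group is doing\")\n\n# out = model.extract(Group, in_text)\n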
"},{"location":"tools/","title":"Tools","text":"The tools module includes some utilities to simplify common tasks.
"},{"location":"tools/#interact","title":"Interact","text":"The interact() function allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done inside interact(), which can manage thread messages and delete older ones when the context maximum length is reached.
Example
from sibila import Models, GenConf\nfrom sibila.tools import interact\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nprint(\"Help available by typing '!'. Enter an empty line to quit.\")\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
Result
Help available by typing '!'. Enter an empty line to quit.\n\n> Hello there!\n\nHello! How can I help you today?\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\nYes, I am familiar with the AI language model named Sibila. It is a large language \nmodel developed by a Spanish company called Wall Street Horus. Sibila is designed to \nunderstand and generate text in multiple languages, particularly those spoken in \nSpain and Latin America. It is similar to other language models like OpenAI's GPT-3 \nand Google's BERT, but it has been specifically fine-tuned for the Spanish language.\nIf you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n(NOTE: ignore the above model-generated paragraph, it's total hallucination)\n\n\n> No way! I've just searched and there's no such company nor model. \nYou are hallucinating, Mr. Model!\n\nApologies for the confusion in my previous response. It appears that Sibila is not \nan existing AI language model or a known company. I may have mixed up information \nfrom different sources. If you have any other questions or need assistance with a \ndifferent topic, please feel free to ask!\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining \na chat with you. It stores all user and model messages in a sibila.Context class.\n\nGot it! So, you are using a function from the Sibila package to have a conversation \nwith me, and it's storing our messages in a sibila.Context object. If you need any help \nor information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n(Continues...)\n
Check the Interact example to see a chat created with this function.
"},{"location":"tools/#multigen","title":"Multigen","text":"The multigen module includes functions that build a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
The query_multigen() function generates a 2-D table of [ input , model ], where each row is the output from different models to the same question or input. Such table can be printed or saved as a CSV file.
See the Compare example for a side-by-side comparison of a local and a remote model, answering to the same queries.
"},{"location":"what/","title":"What can you do with Sibila?","text":"LLM models answer your questions in the best way their training allows, but they always answer back in plain text (or tokens).
With Sibila, you can extract structured data from LLM models. Not whatever the model chose to output (even if you asked it to answer in a certain format), but the exact fields and types that you need.
This not only simplifies handling the model responses but can also open new possibilities: you can now deal with the model in a structured way.
"},{"location":"what/#extract-pydantic-dataclasses-or-simple-types","title":"Extract Pydantic, dataclasses or simple types","text":"To specify the structured output that you want from the model, you can use Pydantic's BaseModel derived classes, or the lightweight Python dataclasses, if you don't need the whole Pydantic.
With Sibila, you can also use simple data types like bool, int, str, enumerations or lists. For example, need to classify something?
Example
from sibila import Models\n\nmodel = Models.create(\"openai:gpt-4\")\n\nmodel.classify([\"good\", \"neutral\", \"bad\"], \n \"Running with scissors\")\n
Result
'bad'\n
How does it work? Extraction to the given data types is guaranteed by automatic JSON Schema grammars in local models, or by the Tools functionality of OpenAI API remote models.
"},{"location":"what/#from-your-models-or-openais","title":"From your models or OpenAI's","text":"Small downloadable 7B parameter models are getting better every month and they have reached a level where they are competent enough for most common data extraction or summarization tasks.
With 8Gb or more of RAM or GPU memory, you can get good structured output from models like OpenChat, Zephyr, Mistral 7B, or any other GGUF file.
You can use any paid OpenAI model, as well as any model that llama.cpp can run, with the same API. Choose the best model for each use, allowing you the freedom of choice.
"},{"location":"what/#with-model-management","title":"With model management","text":"Includes a Models factory that creates models from simple names instead of having to track model configurations, filenames or chat templates.
local_model = Models.create(\"llamacpp:openchat\")\n\nremote_model = Models.create(\"openai:gpt-4\") \n
This makes the switch to newer models much easier, and makes it simpler to compare model outputs.
Sibila includes a CLI tool to download GGUF models from Hugging Face model hub, and to manage its Models factory.
"},{"location":"api-reference/generation/","title":"Generation configs, results and errors","text":""},{"location":"api-reference/generation/#generation-configs","title":"Generation Configs","text":""},{"location":"api-reference/generation/#sibila.GenConf","title":"GenConf dataclass
","text":"Model generation configuration, used in Model.gen() and variants.
"},{"location":"api-reference/generation/#sibila.GenConf.max_tokens","title":"max_tokens class-attribute
instance-attribute
","text":"max_tokens = 0\n
Max generated token length. 0 means all available up to output context size (which equals: model.ctx_len - in_prompt_len)
"},{"location":"api-reference/generation/#sibila.GenConf.stop","title":"stop class-attribute
instance-attribute
","text":"stop = field(default_factory=list)\n
List of generation stop text sequences
"},{"location":"api-reference/generation/#sibila.GenConf.temperature","title":"temperature class-attribute
instance-attribute
","text":"temperature = 0.0\n
Generation temperature. Use 0 to always pick the most probable output, without random sampling. Larger positive values will produce more random outputs.
"},{"location":"api-reference/generation/#sibila.GenConf.top_p","title":"top_p class-attribute
instance-attribute
","text":"top_p = 0.9\n
Nucleus sampling top_p value. Only applies if temperature > 0.
"},{"location":"api-reference/generation/#sibila.GenConf.format","title":"format class-attribute
instance-attribute
","text":"format = 'text'\n
Output format: \"text\" or \"json\". For JSON output, text is validated as in json.loads(). Thread messages must explicitly request JSON output, or a warning will be emitted if the string \"json\" is not present (this is automatically done in Model.json() and related calls).
"},{"location":"api-reference/generation/#sibila.GenConf.json_schema","title":"json_schema class-attribute
instance-attribute
","text":"json_schema = None\n
A JSON schema to validate the JSON output. Thread msgs must list the JSON schema and request its use; must also set the format to \"json\".
"},{"location":"api-reference/generation/#sibila.GenConf.__call__","title":"__call__","text":"__call__(**kwargs)\n
Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.
Parameters:
Name Type Description Default **kwargs
Any
update settings of the same names in the returned copy.
{}
Raises:
Type Description KeyError
If key does not exist.
Returns:
Type Description Self
A copy of the current object with kwargs values updated. Doesn't modify object.
Source code in sibila/gen.py
def __call__(self,\n **kwargs: Any) -> Self:\n \"\"\"Return a copy of the current GenConf updated with values in kwargs. Doesn't modify object.\n\n Args:\n **kwargs: update settings of the same names in the returned copy.\n\n Raises:\n KeyError: If key does not exist.\n\n Returns:\n A copy of the current object with kwargs values updated. Doesn't modify object.\n \"\"\"\n\n ret = copy(self)\n\n for k,v in kwargs.items():\n if not hasattr(ret, k):\n raise KeyError(f\"No such key '{k}'\")\n setattr(ret, k,v)\n\n return ret\n
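A short usage sketch of this call-style update (the values are just illustrative):
from sibila import GenConf\n\nbase = GenConf(temperature=0.0, max_tokens=0)\nhot = base(temperature=0.9, max_tokens=200) # returns an updated copy; base is unchanged\n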
"},{"location":"api-reference/generation/#sibila.GenConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/gen.py
def clone(self) -> Self:\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.as_dict","title":"as_dict","text":"as_dict()\n
Return GenConf as a dict.
Source code in sibila/gen.py
def as_dict(self) -> dict:\n \"\"\"Return GenConf as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenConf.from_dict","title":"from_dict staticmethod
","text":"from_dict(dic)\n
Source code in sibila/gen.py
@staticmethod\ndef from_dict(dic: dict) -> Any: # Any = GenConf\n return GenConf(**dic)\n
"},{"location":"api-reference/generation/#sibila.JSchemaConf","title":"JSchemaConf dataclass
","text":"Configuration for JSON schema massaging and validation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.resolve_refs","title":"resolve_refs class-attribute
instance-attribute
","text":"resolve_refs = True\n
Set for $ref references to be resolved and replaced with actual definition.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.collapse_single_combines","title":"collapse_single_combines class-attribute
instance-attribute
","text":"collapse_single_combines = True\n
Any single-valued \"oneOf\"/\"anyOf\" is replaced with the actual value.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.description_from_title","title":"description_from_title class-attribute
instance-attribute
","text":"description_from_title = 0\n
If a value doesn't have a description entry, make one from its title or name.
- 0: don't make description from name
- 1: copy title or name to description
- 2: as in 1, plus capitalize the first letter and convert _ to space: class_label -> \"class label\".
"},{"location":"api-reference/generation/#sibila.JSchemaConf.force_all_required","title":"force_all_required class-attribute
instance-attribute
","text":"force_all_required = False\n
Force all entries in an object to be required (except removed defaults if remove_with_default=True).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.remove_with_default","title":"remove_with_default class-attribute
instance-attribute
","text":"remove_with_default = False\n
Delete any values that have a \"default\" annotation.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.default_to_last","title":"default_to_last class-attribute
instance-attribute
","text":"default_to_last = True\n
Move any default value entry into the last position of properties dict.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.additional_allowed_root_keys","title":"additional_allowed_root_keys class-attribute
instance-attribute
","text":"additional_allowed_root_keys = field(default_factory=list)\n
By default only the following properties are allowed in the schema's root: description, properties, type, required, additionalProperties, allOf, anyOf, oneOf, not. Add to this list to allow additional root properties.
"},{"location":"api-reference/generation/#sibila.JSchemaConf.pydantic_strict_validation","title":"pydantic_strict_validation class-attribute
instance-attribute
","text":"pydantic_strict_validation = None\n
Validate JSON values in a strict manner or not. None means validate individually for each value in the object (for example, in Pydantic with Field(strict=True)).
"},{"location":"api-reference/generation/#sibila.JSchemaConf.clone","title":"clone","text":"clone()\n
Return a copy of this configuration.
Source code in sibila/json_schema.py
def clone(self):\n \"\"\"Return a copy of this configuration.\"\"\"\n return copy(self)\n
"},{"location":"api-reference/generation/#results","title":"Results","text":""},{"location":"api-reference/generation/#sibila.GenRes","title":"GenRes","text":"Model generation result.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_STOP","title":"OK_STOP class-attribute
instance-attribute
","text":"OK_STOP = 1\n
Generation complete without errors.
"},{"location":"api-reference/generation/#sibila.GenRes.OK_LENGTH","title":"OK_LENGTH class-attribute
instance-attribute
","text":"OK_LENGTH = 0\n
Generation stopped due to reaching max_tokens.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON","title":"ERROR_JSON class-attribute
instance-attribute
","text":"ERROR_JSON = -1\n
Invalid JSON: this is often due to the model returning OK_LENGTH (finished due to max_tokens reached), which cuts off the JSON text.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_VAL","title":"ERROR_JSON_SCHEMA_VAL class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_VAL = -2\n
Failed JSON schema validation.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_JSON_SCHEMA_ERROR","title":"ERROR_JSON_SCHEMA_ERROR class-attribute
instance-attribute
","text":"ERROR_JSON_SCHEMA_ERROR = -2\n
JSON schema itself is not valid.
"},{"location":"api-reference/generation/#sibila.GenRes.ERROR_MODEL","title":"ERROR_MODEL class-attribute
instance-attribute
","text":"ERROR_MODEL = -3\n
Other model internal error.
"},{"location":"api-reference/generation/#sibila.GenRes.from_finish_reason","title":"from_finish_reason staticmethod
","text":"from_finish_reason(finish)\n
Convert a ChatCompletion finish result into a GenRes.
Parameters:
Name Type Description Default finish
str
ChatCompletion finish result.
required Returns:
Type Description Any
A GenRes result.
Source code in sibila/gen.py
@staticmethod\ndef from_finish_reason(finish: str) -> Any: # Any=GenRes\n \"\"\"Convert a ChatCompletion finish result into a GenRes.\n\n Args:\n finish: ChatCompletion finish result.\n\n Returns:\n A GenRes result.\n \"\"\"\n if finish == 'stop':\n return GenRes.OK_STOP\n elif finish == 'length':\n return GenRes.OK_LENGTH\n elif finish == '!json':\n return GenRes.ERROR_JSON\n elif finish == '!json_schema_val':\n return GenRes.ERROR_JSON_SCHEMA_VAL\n elif finish == '!json_schema_error':\n return GenRes.ERROR_JSON_SCHEMA_ERROR\n else:\n return GenRes.ERROR_MODEL\n
"},{"location":"api-reference/generation/#sibila.GenRes.as_text","title":"as_text staticmethod
","text":"as_text(res)\n
Returns a friendlier description of the result.
Parameters:
Name Type Description Default res
Any
Model output result.
required Raises:
Type Description ValueError
If unknown GenRes.
Returns:
Type Description str
A friendlier description of the GenRes.
Source code in sibila/gen.py
@staticmethod\ndef as_text(res: Any) -> str: # Any=GenRes\n \"\"\"Returns a friendlier description of the result.\n\n Args:\n res: Model output result.\n\n Raises:\n ValueError: If unknown GenRes.\n\n Returns:\n A friendlier description of the GenRes.\n \"\"\"\n\n if res == GenRes.OK_STOP:\n return \"Stop\"\n elif res == GenRes.OK_LENGTH:\n return \"Length (output cut)\"\n elif res == GenRes.ERROR_JSON:\n return \"JSON decoding error\"\n\n elif res == GenRes.ERROR_JSON_SCHEMA_VAL:\n return \"JSON SCHEMA validation error\"\n elif res == GenRes.ERROR_JSON_SCHEMA_ERROR:\n return \"Error in JSON SCHEMA\"\n\n elif res == GenRes.ERROR_MODEL:\n return \"Model internal error\"\n else:\n raise ValueError(\"Bad/unknow GenRes\")\n
"},{"location":"api-reference/generation/#errors","title":"Errors","text":""},{"location":"api-reference/generation/#sibila.GenError","title":"GenError","text":"GenError(out)\n
Model generation exception, raised when the model was unable to return a response.
An error has happened during model generation.
Parameters:
Name Type Description Default out
GenOut
Model output
required Source code in sibila/gen.py
def __init__(self, \n out: GenOut):\n \"\"\"An error has happened during model generation.\n\n Args:\n out: Model output\n \"\"\"\n\n assert out.res != GenRes.OK_STOP, \"OK_STOP is not an error\" \n\n super().__init__()\n\n self.res = out.res\n self.text = out.text\n self.dic = out.dic\n self.value = out.value\n
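A sketch of catching this exception around an extraction call (assuming a model object and an Info class as in the earlier examples):
from sibila import GenError\n\ntry:\n info = model.extract(Info, \"Who was the first man in the moon?\") # model and Info assumed defined\nexcept GenError as err:\n print(\"Generation failed:\", err.res, err.text)\n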
"},{"location":"api-reference/generation/#sibila.GenError.raise_if_error","title":"raise_if_error staticmethod
","text":"raise_if_error(out, ok_length_is_error)\n
Raise an exception if the model returned an error.
Parameters:
Name Type Description Default out
GenOut
Model returned info.
required ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error?
required Raises:
Type Description GenError
If an error was returned by model.
Source code in sibila/gen.py
@staticmethod\ndef raise_if_error(out: GenOut,\n ok_length_is_error: bool):\n \"\"\"Raise an exception if the model returned an error\n\n Args:\n out: Model returned info.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error?\n\n Raises:\n GenError: If an error was returned by model.\n \"\"\"\n\n if out.res != GenRes.OK_STOP:\n if out.res == GenRes.OK_LENGTH and not ok_length_is_error:\n return # OK_LENGTH to not be considered an error\n\n raise GenError(out)\n
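A hedged usage sketch of catching GenError raised by a higher-level call such as extract() (model is assumed to be an already-created LlamaCppModel or OpenAIModel):
from sibila import GenError, GenRes\n\ntry:\n    year = model.extract(int, \"In which year did the Apollo 11 mission land on the Moon?\")\nexcept GenError as err:\n    print(\"Generation failed:\", GenRes.as_text(err.res))\n    print(\"Raw model text:\", err.text)\n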
"},{"location":"api-reference/generation/#sibila.GenOut","title":"GenOut dataclass
","text":"Model output, returned by gen_extract(), gen_json() and other model calls that don't raise exceptions.
"},{"location":"api-reference/generation/#sibila.GenOut.res","title":"res instance-attribute
","text":"res\n
Result of model generation.
"},{"location":"api-reference/generation/#sibila.GenOut.text","title":"text instance-attribute
","text":"text\n
Text generated by model.
"},{"location":"api-reference/generation/#sibila.GenOut.dic","title":"dic class-attribute
instance-attribute
","text":"dic = None\n
Python dictionary, output by the structured calls like gen_json().
"},{"location":"api-reference/generation/#sibila.GenOut.value","title":"value class-attribute
instance-attribute
","text":"value = None\n
Initialized instance value, dataclass or Pydantic BaseModel object, as returned in calls like extract().
"},{"location":"api-reference/generation/#sibila.GenOut.as_dict","title":"as_dict","text":"as_dict()\n
Return GenOut as a dict.
Source code in sibila/gen.py
def as_dict(self):\n \"\"\"Return GenOut as a dict.\"\"\"\n return asdict(self)\n
"},{"location":"api-reference/generation/#sibila.GenOut.__str__","title":"__str__","text":"__str__()\n
Source code in sibila/gen.py
def __str__(self):\n out = f\"Error={self.res.as_text(self.res)} text=\u2588{self.text}\u2588\"\n if self.dic is not None:\n out += f\" dic={self.dic}\"\n if self.value is not None:\n out += f\" value={self.value}\"\n return out\n
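A minimal sketch of inspecting a GenOut returned by gen() (assumes model is an already-created model and that Thread is importable from the sibila package; Thread.ensure() is used here as in the library's own docstrings):
from sibila import Thread, GenRes\n\nout = model.gen(Thread.ensure(\"Hello there?\", None))\nif out.res == GenRes.OK_STOP:\n    print(out.text)\nelse:\n    print(\"Problem:\", GenRes.as_text(out.res))\nprint(out.as_dict())\n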
"},{"location":"api-reference/model/","title":"Model classes","text":""},{"location":"api-reference/model/#local-models","title":"Local models","text":""},{"location":"api-reference/model/#sibila.LlamaCppModel","title":"LlamaCppModel","text":"LlamaCppModel(\n path,\n format=None,\n format_search_order=[\n \"name\",\n \"meta_template\",\n \"folder_json\",\n ],\n *,\n genconf=None,\n schemaconf=None,\n tokenizer=None,\n ctx_len=2048,\n n_gpu_layers=-1,\n main_gpu=0,\n n_batch=512,\n seed=4294967295,\n verbose=False,\n **llamacpp_kwargs\n)\n
Use local GGUF format models via llama.cpp engine.
Supports grammar-constrained JSON output following a JSON schema.
Parameters:
Name Type Description Default path
str
File path to the GGUF file.
required format
Optional[str]
Chat template format to use with model. Leave as None for auto-detection.
None
format_search_order
list[str]
Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].
['name', 'meta_template', 'folder_json']
genconf
Optional[GenConf]
Default generation configuration, which can be used in gen() and related. Defaults to None.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.
None
ctx_len
int
Maximum context length to be used (shared for input and output). Defaults to 2048.
2048
n_gpu_layers
int
Number of model layers to run in a GPU. Defaults to -1 for all.
-1
main_gpu
int
Index of the GPU to use. Defaults to 0.
0
n_batch
int
Prompt processing batch size. Defaults to 512.
512
seed
int
Random number generation seed, for non-zero temperature inference. Defaults to 4294967295.
4294967295
verbose
bool
Emit (very) verbose output. Defaults to False.
False
Raises:
Type Description ImportError
If llama-cpp-python is not installed.
ValueError
If ctx_len is 0 or larger than the values supported by model.
Source code in sibila/llamacpp.py
def __init__(self,\n path: str,\n\n format: Optional[str] = None, \n format_search_order: list[str] = [\"name\", \"meta_template\", \"folder_json\"],\n\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n tokenizer: Optional[Tokenizer] = None,\n ctx_len: int = 2048,\n\n # important LlamaCpp-specific args\n n_gpu_layers: int = -1,\n main_gpu: int = 0,\n n_batch: int = 512,\n seed: int = 4294967295,\n verbose: bool = False,\n\n # other LlamaCpp-specific args\n **llamacpp_kwargs\n ):\n \"\"\"\n Args:\n path: File path to the GGUF file.\n format: Chat template format to use with model. Leave as None for auto-detection.\n format_search_order: Search order for auto-detecting format, \"name\" searches in the filename, \"meta_template\" looks in the model's metadata, \"folder_json\" looks for configs in file's folder. Defaults to [\"name\",\"meta_template\", \"folder_json\"].\n genconf: Default generation configuration, which can be used in gen() and related. Defaults to None.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). Defaults to 2048.\n n_gpu_layers: Number of model layers to run in a GPU. Defaults to -1 for all.\n main_gpu: Index of the GPU to use. Defaults to 0.\n n_batch: Prompt processing batch size. Defaults to 512.\n seed: Random number generation seed, for non zero temperature inference. Defaults to 4294967295.\n verbose: Emit (very) verbose output. Defaults to False.\n\n Raises:\n ImportError: If llama-cpp-python is not installed.\n ValueError: If ctx_len is 0 or larger than the values supported by model.\n \"\"\"\n\n self._llama = None # type: ignore[assignment]\n self.tokenizer = None # type: ignore[assignment]\n\n if not has_llama_cpp:\n raise ImportError(\"Please install llama-cpp-python by running: pip install llama-cpp-python\")\n\n if ctx_len == 0:\n raise ValueError(\"LlamaCppModel doesn't support ctx_len=0\")\n\n super().__init__(True,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # update kwargs from important args\n llamacpp_kwargs.update(n_ctx=ctx_len,\n n_batch=n_batch,\n n_gpu_layers=n_gpu_layers,\n main_gpu=main_gpu,\n seed=seed,\n verbose=verbose\n )\n\n logger.debug(f\"Creating inner Llama with model_path='{path}', llamacpp_kwargs={llamacpp_kwargs}\")\n\n with normalize_notebook_stdout_stderr(not verbose):\n self._llama = Llama(model_path=path, **llamacpp_kwargs)\n\n self._model_path = path\n\n # correct super __init__ values\n self._ctx_len = self._llama.n_ctx()\n\n n_ctx_train = self._llama._model.n_ctx_train() \n if self.ctx_len > n_ctx_train:\n raise ValueError(f\"ctx_len ({self.ctx_len}) is greater than n_ctx_train ({n_ctx_train})\")\n\n\n if self.tokenizer is None:\n self.tokenizer = LlamaCppTokenizer(self._llama)\n\n try:\n self.init_format(format,\n format_search_order,\n {\"name\": os.path.basename(self._model_path),\n \"path\": self._model_path,\n \"meta_template_name\": \"tokenizer.chat_template\"}\n )\n except Exception as e:\n del self.tokenizer\n del self._llama\n raise e\n
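A minimal construction sketch (the GGUF file path is an assumption - point it to a model file you have downloaded):
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"models/openchat-3.5-1210.Q4_K_M.gguf\",   # assumed local path\n                      ctx_len=2048,\n                      n_gpu_layers=-1)   # -1: offload all layers to the GPU\nprint(model.desc, model.ctx_len)\n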
"},{"location":"api-reference/model/#sibila.LlamaCppModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Free type constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n\n \"\"\"Free type constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
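For example, a hedged sketch extracting a simple annotated list (model as created earlier; question and output are illustrative only):
from typing import Annotated\n\ncolors = model.extract(Annotated[list[str], \"Primary colors\"],\n                       \"What are the primary colors?\")\nprint(colors)   # e.g. ['red', 'yellow', 'blue']\n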
"},{"location":"api-reference/model/#sibila.LlamaCppModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
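A short sketch of label classification (labels and input text are illustrative):
sentiment = model.classify([\"positive\", \"negative\", \"neutral\"],\n                           \"The delivery was late and the package arrived damaged.\")\nprint(sentiment)   # e.g. 'negative'\n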
"},{"location":"api-reference/model/#sibila.LlamaCppModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
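A sketch of plain text generation through __call__, limiting output length (GenConf is assumed to accept a max_tokens setting, as referenced by the gen() methods elsewhere in these docs):
from sibila import GenConf\n\ntext = model(\"Summarize what a GGUF file is, in one sentence.\",\n             genconf=GenConf(max_tokens=60),\n             ok_length_is_error=False)   # a cut-off answer is returned instead of raising\nprint(text)\n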
"},{"location":"api-reference/model/#sibila.LlamaCppModel.json","title":"json","text":"json(\n json_schema,\n query,\n *,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self, \n json_schema: Union[dict,str,None],\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(json_schema, \n thread,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
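A hedged sketch of schema-constrained JSON generation (schema and query are illustrative):
schema = {\n    \"type\": \"object\",\n    \"properties\": {\n        \"title\": {\"type\": \"string\"},\n        \"year\": {\"type\": \"integer\"}\n    },\n    \"required\": [\"title\", \"year\"]\n}\n\ndic = model.json(schema, \"Name a well-known 1980s science fiction movie and its release year.\")\nprint(dic)   # e.g. {'title': 'Blade Runner', 'year': 1982}\n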
"},{"location":"api-reference/model/#sibila.LlamaCppModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
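A minimal sketch with a small dataclass (class and query are illustrative):
from dataclasses import dataclass\n\n@dataclass\nclass City:\n    name: str\n    country: str\n    population: int\n\ncity = model.dataclass(City, \"Give me details about the largest city in Japan.\")\nprint(city)   # e.g. City(name='Tokyo', country='Japan', population=...)\n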
"},{"location":"api-reference/model/#sibila.LlamaCppModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
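A minimal sketch with a small Pydantic class (class and query are illustrative):
from pydantic import BaseModel\n\nclass Book(BaseModel):\n    title: str\n    author: str\n    publication_year: int\n\nbook = model.pydantic(Book, \"Tell me about the novel Dune.\")\nprint(book.title, book.author, book.publication_year)\n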
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
Raises:
Type Description ValueError
If trying to generate from an empty prompt.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
Source code in sibila/model.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n\n Raises:\n ValueError: If trying to generate from an empty prompt.\n\n Returns:\n A GenOut object with result, generated text, etc. \n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n thread = self._prepare_gen_in(thread, genconf)\n\n prompt = self.text_from_thread(thread)\n\n if not prompt:\n raise ValueError(\"Cannot generate from an empty prompt\")\n\n logger.debug(f\"Prompt: \u2588{prompt}\u2588\")\n\n text,finish = self._gen_text(prompt, genconf)\n\n out = self._prepare_gen_out(text, finish, genconf)\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_json","title":"gen_json","text":"gen_json(\n json_schema,\n thread,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n json_schema: Union[dict,str,None],\n\n thread: Thread,\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n if genconf.json_schema is not None and json_schema is not None:\n logger.warn(\"Both arg json_schema and genconf.json_schema are set: using json_schema arg\")\n\n if json_schema is not None:\n if schemaconf is None:\n schemaconf = self.schemaconf\n\n logger.debug(\"JSON schema conf:\\n\" + pformat(schemaconf))\n\n if massage_schema:\n if not isinstance(json_schema, dict):\n json_schema = json.loads(json_schema)\n\n json_schema = json_schema_massage(json_schema, schemaconf) # type: ignore[arg-type]\n logger.debug(\"Massaged JSON schema:\\n\" + pformat(json_schema))\n\n out = self.gen(thread, \n genconf(format=\"json\", \n json_schema=json_schema))\n\n return out \n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n if is_dataclass(cls):\n schema = build_dataclass_object_json_schema(cls)\n else:\n raise TypeError(\"Only dataclass allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = create_final_instance(cls, \n is_list=False,\n val=out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error initializing object from JSON\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n if is_subclass_of(cls, BaseModel):\n schema = json_schema_from_pydantic(cls)\n else:\n raise TypeError(\"Only pydantic BaseModel allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = pydantic_obj_from_json(cls, \n out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error validating for object (by Pydantic), but JSON is valid for its schema\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.token_len","title":"token_len","text":"token_len(thread, _=None)\n
Calculate token length for a Thread.
Parameters:
Name Type Description Default thread
Thread
For token length calculation.
required Returns:
Type Description int
Number of tokens the thread will use.
Source code in sibila/model.py
def token_len(self,\n thread: Thread,\n _: Optional[GenConf] = None) -> int:\n \"\"\"Calculate token length for a Thread.\n\n Args:\n thread: For token length calculation.\n\n Returns:\n Number of tokens the thread will use.\n \"\"\"\n\n text = self.text_from_thread(thread)\n return self.tokenizer.token_len(text)\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = None\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.ctx_len","title":"ctx_len property
","text":"ctx_len\n
Maximum context length, shared for input + output. We assume a common in+out context where total token length must always be less than this number.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.known_models","title":"known_models classmethod
","text":"known_models()\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if it can accept any model.
Source code in sibila/model.py
@classmethod\ndef known_models(cls) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Returns:\n Returns a list of known models or None if it can accept any model.\n \"\"\"\n return None\n
"},{"location":"api-reference/model/#sibila.LlamaCppModel.desc","title":"desc property
","text":"desc\n
Model description.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.n_embd","title":"n_embd property
","text":"n_embd\n
Embedding size of model.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.n_params","title":"n_params property
","text":"n_params\n
Total number of model parameters.
"},{"location":"api-reference/model/#sibila.LlamaCppModel.get_metadata","title":"get_metadata","text":"get_metadata()\n
Returns model metadata.
Source code in sibila/llamacpp.py
def get_metadata(self):\n \"\"\"Returns model metadata.\"\"\"\n out = {}\n buf = bytes(16 * 1024)\n lmodel = self._llama.model\n count = llama_cpp.llama_model_meta_count(lmodel)\n for i in range(count):\n res = llama_cpp.llama_model_meta_key_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n key = buf[:res].decode('utf-8')\n res = llama_cpp.llama_model_meta_val_str_by_index(lmodel, i, buf,len(buf))\n if res >= 0:\n value = buf[:res].decode('utf-8')\n out[key] = value\n return out\n
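For example, a small sketch listing a few metadata entries (the available keys depend on the loaded GGUF file):
for key, value in list(model.get_metadata().items())[:5]:\n    print(key, \"=\", value)\n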
"},{"location":"api-reference/model/#remote-models","title":"Remote models","text":""},{"location":"api-reference/model/#sibila.OpenAIModel","title":"OpenAIModel","text":"OpenAIModel(\n name,\n unknown_name_mask=2,\n *,\n genconf=None,\n schemaconf=None,\n tokenizer=None,\n ctx_len=0,\n api_key=None,\n base_url=None,\n openai_init_kwargs={}\n)\n
Access an OpenAI model.
Supports constrained JSON output, via the OpenAI API tools mechanism. Ref: https://platform.openai.com/docs/api-reference/chat/create
Create an OpenAI remote model. Name resolution depends on unknown_name_mask and will keep removing letters from the end of name and searching existing entries in OpenAIModel.known_models().
Parameters:
Name Type Description Default name
str
Model name to resolve into an existing model.
required unknown_name_mask
int
How to deal with unmatched names in name resolution, a mask of:
- 2: Raise NameError if exact name not found (no generics)
- 1: Only allow versioned names - raise NameError if generic non-versioned model name used
- 0: Accept any name that can be resolved from OpenAIModel.known_models()
2
genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
tokenizer
Optional[Tokenizer]
An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.
None
ctx_len
int
Maximum context length to be used (shared for input and output). Defaults to 0 which means model's maximum.
0
api_key
Optional[str]
OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.
None
base_url
Optional[str]
Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL.
None
openai_init_kwargs
dict
Extra args for OpenAI.OpenAI() initialization. Defaults to {}.
{}
Raises:
Type Description ImportError
If OpenAI API is not installed.
Source code in sibila/openai.py
def __init__(self,\n name: str,\n unknown_name_mask: int = 2,\n *,\n\n # common base model args\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None,\n tokenizer: Optional[Tokenizer] = None,\n ctx_len: int = 0,\n\n # most important OpenAI-specific args\n api_key: Optional[str] = None,\n base_url: Optional[str] = None,\n\n # OpenAI-specific args\n openai_init_kwargs: dict = {},\n ):\n \"\"\"Create an OpenAI remote model.\n Name resolution depends on unknown_name_mask and will keep removing letters \n from the end of name and searching existing entries in penAIModel.known_models().\n\n Args:\n name: Model name to resolve into an existing model.\n unknown_name_mask: How to deal with unmatched names in name resolution, a mask of:\n\n - 2: Raise NameError if exact name not found (no generics)\n - 1: Only allow versioned names - raise NameError if generic non-versioned model name used\n - 0: Accept any name that can be resolved from OpenAIModel.known_models()\n\n genconf: Model generation configuration. Defaults to None.\n tokenizer: An external initialized tokenizer to use instead of the created from the GGUF file. Defaults to None.\n ctx_len: Maximum context length to be used (shared for input and output). Defaults to 0 which means model's maximum.\n api_key: OpenAI API key. Defaults to None, which will use env variable OPENAI_API_KEY.\n base_url: Base location for API access. Defaults to None, which will use env variable OPENAI_BASE_URL.\n openai_init_kwargs: Extra args for OpenAI.OpenAI() initialization. Defaults to {}.\n\n Raises:\n ImportError: If OpenAI API is not installed.\n \"\"\"\n\n\n if not has_openai:\n raise ImportError(\"Please install openai by running: pip install openai\")\n\n self._model_name, max_ctx_len, self._tokens_per_message, self._tokens_per_name = resolve_model(\n name,\n unknown_name_mask\n )\n\n\n super().__init__(False,\n genconf,\n schemaconf,\n tokenizer\n )\n\n # only check for \"json\" text presence as json schema is requested with the tools facility.\n self.json_format_instructors[\"json_schema\"] = self.json_format_instructors[\"json\"]\n\n logger.debug(f\"Creating inner OpenAI with base_url={base_url}, openai_init_kwargs={openai_init_kwargs}\")\n\n self._client = openai.OpenAI(api_key=api_key,\n base_url=base_url,\n\n **openai_init_kwargs\n )\n\n\n # correct super __init__ values\n if self.tokenizer is None:\n self.tokenizer = OpenAITokenizer(self._model_name)\n\n if ctx_len == 0:\n self._ctx_len = max_ctx_len\n else:\n self._ctx_len = ctx_len\n\n\n self.TOOLS_TOKEN_LEN_FACTOR = self.DEFAULT_TOOLS_TOKEN_LEN_FACTOR\n
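A minimal construction sketch; normally the key and base URL come from the OPENAI_API_KEY and OPENAI_BASE_URL environment variables, so they are left as None here:
from sibila import OpenAIModel\n\nmodel = OpenAIModel(\"gpt-4\",\n                    ctx_len=0,       # 0: use the model's maximum context length\n                    api_key=None,    # taken from env var OPENAI_API_KEY\n                    base_url=None)   # taken from env var OPENAI_BASE_URL\n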
"},{"location":"api-reference/model/#sibila.OpenAIModel.extract","title":"extract","text":"extract(\n target,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Free type constrained generation: an instance of the given type will be initialized with the model's output. The following target types are accepted:
- prim_type: bool, int, float or str
- enums:
  - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
  - Literal['year', 'name'] - all items of the same prim_type
  - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
- datetime/date/time
- a list in the form list[type], for example list[int]. The list can be annotated: Annotated[list[T], \"List desc\"]. And/or the list item type can be annotated: list[Annotated[T, \"Item desc\"]].
- dataclass with fields of the above supported types (or dataclass).
- Pydantic BaseModel
All types can be Annotated[T, \"Desc\"]; for example, count: int can be annotated as count: Annotated[int, \"How many units?\"].
Parameters:
Name Type Description Default target
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
A value of target arg type instantiated with the model's output.
Source code in sibila/model.py
def extract(self,\n target: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n\n \"\"\"Free type constrained generation: an instance of the given type will be initialized with the model's output.\n The following target types are accepted:\n\n - prim_type:\n\n - bool\n - int\n - float\n - str\n\n - enums:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n - datetime/date/time\n\n - a list in the form:\n - list[type]\n\n For example list[int]. The list can be annotated:\n Annotated[list[T], \"List desc\"]\n And/or the list item type can be annotated:\n list[Annotated[T, \"Item desc\"]]\n\n - dataclass with fields of the above supported types (or dataclass).\n\n - Pydantic BaseModel\n\n All types can be Annotated[T, \"Desc\"], for example: \n count: int\n Can be annotated as:\n count: Annotated[int, \"How many units?\"]\n\n Args:\n target: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n A value of target arg type instantiated with the model's output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_extract(target,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.classify","title":"classify","text":"classify(\n labels,\n query,\n *,\n inst=None,\n genconf=None,\n schemaconf=None\n)\n
Returns a classification from one of the given enumeration values. The following ways to specify the valid labels are accepted:
- [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type
- Literal['year', 'name'] - all items of the same prim_type
- Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type
Parameters:
Name Type Description Default labels
Any
One of the above types.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred. See GenError.
Returns:
Type Description Any
One of the given labels, as classified by the model.
Source code in sibila/model.py
def classify(self,\n labels: Any,\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any:\n \"\"\"Returns a classification from one of the given enumeration values\n The following ways to specify the valid labels are accepted:\n\n - [1, 2, 3] or [\"a\",\"b\"] - all items of the same prim_type\n - Literal['year', 'name'] - all items of the same prim_type\n - Enum, EnumInt, EnumStr, (Enum, int),... - all items of the same prim_type\n\n Args:\n labels: One of the above types.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred. See GenError.\n\n Returns:\n One of the given labels, as classified by the model.\n \"\"\"\n\n # verify it's a valid enum \"type\"\n type_,_ = get_enum_type(labels)\n if type_ is None:\n raise TypeError(\"Arg labels must be one of Literal, Enum class or a list of str, float or int items\")\n\n return self.extract(labels,\n query,\n inst=inst,\n genconf=genconf,\n schemaconf=schemaconf)\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen","title":"gen","text":"gen(thread, genconf=None)\n
Text generation from a Thread, used by the other model generation methods. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Raises:
Type Description NotImplementedError
If method was not defined by a derived class.
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc.
GenOut
The output text is in GenOut.text.
Source code in sibila/openai.py
def gen(self, \n thread: Thread,\n genconf: Optional[GenConf] = None,\n ) -> GenOut:\n \"\"\"Text generation from a Thread, used by the other model generation methods.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None.\n\n Raises:\n NotImplementedError: If method was not defined by a derived class.\n\n Returns:\n A GenOut object with result, generated text, etc.\n The output text is in GenOut.text.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n thread = self._prepare_gen_in(thread, genconf)\n\n token_len = self.token_len(thread, genconf)\n if genconf.max_tokens == 0:\n max_tokens = self.ctx_len - token_len \n genconf = genconf(max_tokens=max_tokens)\n\n elif token_len + genconf.max_tokens > self.ctx_len:\n # this is not true for all models: 1106 models have 128k max input and 4k max output (in and out ctx are not shared)\n # so we assume the smaller max ctx length for the model\n logger.warn(f\"Token length + genconf.max_tokens ({token_len + genconf.max_tokens}) is greater than model's context window length ({self.ctx_len})\")\n\n fn_name = \"json_out\"\n\n json_kwargs: dict = {}\n format = genconf.format\n if format == \"json\":\n\n if genconf.json_schema is None:\n json_kwargs[\"response_format\"] = {\"type\": \"json_object\"}\n\n else:\n # use json_schema in OpenAi's tool API\n json_kwargs[\"tool_choice\"] = {\n \"type\": \"function\",\n \"function\": {\"name\": fn_name},\n }\n\n if isinstance(genconf.json_schema, str):\n params = json.loads(genconf.json_schema)\n else:\n params = genconf.json_schema\n\n json_kwargs[\"tools\"] = [\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": fn_name,\n \"parameters\": params\n }\n }\n ]\n\n logger.debug(f\"OpenAI json args: {json_kwargs}\")\n\n msgs = thread.as_chatml()\n\n # https://platform.openai.com/docs/api-reference/chat/create\n response = self._client.chat.completions.create(model=self._model_name,\n messages=msgs, # type: ignore[arg-type]\n\n max_tokens=genconf.max_tokens,\n stop=genconf.stop,\n temperature=genconf.temperature,\n top_p=genconf.top_p,\n **json_kwargs,\n\n n=1\n )\n\n logger.debug(f\"OpenAI response: {response}\")\n\n choice = response.choices[0]\n finish = choice.finish_reason\n message = choice.message\n\n if \"tool_choice\" in json_kwargs:\n\n # json schema generation via the tools API:\n if message.tool_calls is not None:\n fn = message.tool_calls[0].function\n if fn.name != fn_name:\n logger.debug(f\"OpenAIModel: different returned JSON function name ({fn.name})\")\n\n text = fn.arguments\n else: # use content instead\n text = message.content # type: ignore[assignment]\n\n else:\n # text or simple json format\n text = message.content # type: ignore[assignment]\n\n out = self._prepare_gen_out(text, finish, genconf)\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.json","title":"json","text":"json(\n json_schema,\n query,\n *,\n inst=None,\n genconf=None,\n massage_schema=True,\n schemaconf=None\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema. Raises GenError if unable to get a valid/schema-validated JSON.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid JSON schema output error. See GenError.
Returns:
Type Description dict
A dict from model's JSON response, following genconf.jsonschema, if provided.
Source code in sibila/model.py
def json(self, \n json_schema: Union[dict,str,None],\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> dict:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, constrained or not by a JSON schema.\n Raises GenError if unable to get a valid/schema-validated JSON.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid JSON schema output error. See GenError.\n\n Returns:\n A dict from model's JSON response, following genconf.jsonschema, if provided.\n \"\"\" \n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_json(json_schema, \n thread,\n genconf,\n massage_schema,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.dic # type: ignore[return-value]\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.dataclass","title":"dataclass","text":"dataclass(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a dataclass definition, resulting in an object initialized with the model's response. Raises GenError if unable to get a valid response that follows the dataclass definition.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example invalid object initialization. See GenError.
Returns:
Type Description Any
An object of class cls (derived from dataclass) initialized from the constrained JSON output.
Source code in sibila/model.py
def dataclass(self, # noqa: E811\n cls: Any, # a dataclass definition\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a dataclass object\n \"\"\"Constrained generation after a dataclass definition, resulting in an object initialized with the model's response.\n Raises GenError if unable to get a valid response that follows the dataclass definition.\n\n Args:\n cls: A dataclass definition.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example invalid object initialization. See GenError.\n\n Returns:\n An object of class cls (derived from dataclass) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_dataclass(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.pydantic","title":"pydantic","text":"pydantic(\n cls, query, *, inst=None, genconf=None, schemaconf=None\n)\n
Constrained generation after a Pydantic BaseModel-derived class definition. Results in an object initialized with the model response. Raises GenError if unable to get a valid dict that follows the BaseModel class definition.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Raises:
Type Description GenError
If an error occurred, for example an invalid BaseModel object. See GenError.
Returns:
Type Description Any
A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.
Source code in sibila/model.py
def pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n\n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> Any: # a Pydantic BaseModel object\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n Results in an object initialized with the model response.\n Raises GenError if unable to get a valid dict that follows the BaseModel class definition.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Raises:\n GenError: If an error occurred, for example an invalid BaseModel object. See GenError.\n\n Returns:\n A Pydantic object of class cls (derived from BaseModel) initialized from the constrained JSON output.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen_pydantic(cls,\n thread,\n genconf,\n schemaconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=False) # as valid JSON can still be produced\n\n return out.value\n
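A sketch of pydantic() extraction; the class and the model entry name are illustrative only:
from pydantic import BaseModel
from sibila import Models

class Weather(BaseModel):
    city: str
    season: str
    typical_temperature_celsius: int

model = Models.create("openai:gpt-4")

# returns a validated Weather instance, or raises GenError on failure
report = model.pydantic(Weather, "Describe typical July weather in Lisbon.")
print(report)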
"},{"location":"api-reference/model/#sibila.OpenAIModel.__call__","title":"__call__","text":"__call__(\n query,\n *,\n inst=None,\n genconf=None,\n ok_length_is_error=False\n)\n
Text generation from a Thread or plain text, used by the other model generation methods.
Parameters:
Name Type Description Default query
Union[str, Thread]
Thread or an str with the text of a single IN message to use as model input.
required inst
Optional[str]
Instruction message for model. Will override Thread's inst, if set. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
ok_length_is_error
bool
Should a result of GenRes.OK_LENGTH be considered an error and raise?
False
Raises:
Type Description GenError
If an error occurred. This can be a model error, or an invalid JSON output error.
Returns:
Type Description str
Text generated by model.
Source code in sibila/model.py
def __call__(self, \n query: Union[str,Thread],\n *,\n inst: Optional[str] = None,\n\n genconf: Optional[GenConf] = None,\n ok_length_is_error: bool = False\n ) -> str:\n \"\"\"Text generation from a Thread or plain text, used by the other model generation methods.\n\n Args:\n query: Thread or an str with the text of a single IN message to use as model input.\n inst: Instruction message for model. Will override Thread's inst, if set. Defaults to None.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n ok_length_is_error: Should a result of GenRes.OK_LENGTH be considered an error and raise?\n\n Raises:\n GenError: If an error occurred. This can be a model error, or an invalid JSON output error.\n\n Returns:\n Text generated by model.\n \"\"\"\n\n thread = Thread.ensure(query, inst)\n\n out = self.gen(thread=thread, \n genconf=genconf)\n\n GenError.raise_if_error(out,\n ok_length_is_error=ok_length_is_error)\n\n return out.text\n
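Free-text generation sketch using the __call__ syntax; the entry name is an assumption:
from sibila import Models

model = Models.create("llamacpp:openchat")

# calling the model directly returns plain generated text (no constraints)
text = model("List three benefits of constrained text generation.",
             inst="Answer briefly, as a bullet list.")
print(text)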
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_json","title":"gen_json","text":"gen_json(\n json_schema,\n thread,\n genconf=None,\n massage_schema=True,\n schemaconf=None,\n)\n
JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema. Doesn't raise an exception if an error occurs, always returns GenOut.
Parameters:
Name Type Description Default json_schema
Union[dict, str, None]
A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
massage_schema
bool
Simplify schema. Defaults to True.
True
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.
Source code in sibila/model.py
def gen_json(self,\n json_schema: Union[dict,str,None],\n\n thread: Thread,\n genconf: Optional[GenConf] = None,\n\n massage_schema: bool = True,\n schemaconf: Optional[JSchemaConf] = None,\n ) -> GenOut:\n \"\"\"JSON/JSON-schema constrained generation, returning a Python dict of values, conditioned or not by a JSON schema.\n Doesn't raise an exception if an error occurs, always returns GenOut.\n\n Args:\n json_schema: A JSON schema describing the dict fields that will be output. None means no schema (free JSON output).\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n massage_schema: Simplify schema. Defaults to True.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The output dict is in GenOut.dic.\n \"\"\"\n\n if genconf is None:\n genconf = self.genconf\n\n if genconf.json_schema is not None and json_schema is not None:\n logger.warn(\"Both arg json_schema and genconf.json_schema are set: using json_schema arg\")\n\n if json_schema is not None:\n if schemaconf is None:\n schemaconf = self.schemaconf\n\n logger.debug(\"JSON schema conf:\\n\" + pformat(schemaconf))\n\n if massage_schema:\n if not isinstance(json_schema, dict):\n json_schema = json.loads(json_schema)\n\n json_schema = json_schema_massage(json_schema, schemaconf) # type: ignore[arg-type]\n logger.debug(\"Massaged JSON schema:\\n\" + pformat(json_schema))\n\n out = self.gen(thread, \n genconf(format=\"json\", \n json_schema=json_schema))\n\n return out \n
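Lower-level sketch of gen_json(): unlike json(), it never raises and always returns a GenOut, whose dic field holds the decoded dict (or None on error). Assumes Thread is importable from the sibila package; entry name and schema are illustrative:
from sibila import Models, Thread

model = Models.create("llamacpp:openchat")

thread = Thread.make_INST_IN("You extract structured data as JSON.",
                             "Name the largest ocean and its approximate area in km2.")

schema = {"type": "object",
          "properties": {"name": {"type": "string"},
                         "area_km2": {"type": "integer"}}}

out = model.gen_json(schema, thread)
if out.dic is not None:
    print(out.dic)                                    # the decoded JSON dict
else:
    print("generation failed:", out.res, out.text)    # res holds the error kind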
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_dataclass","title":"gen_dataclass","text":"gen_dataclass(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a dataclass definition. An initialized dataclass object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A dataclass definition.
required thread
Thread
The Thread object to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.
Source code in sibila/model.py
def gen_dataclass(self,\n cls: Any, # a dataclass\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a dataclass definition.\n An initialized dataclass object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A dataclass definition.\n thread: The Thread object to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized dataclass object is in GenOut.value.\n \"\"\"\n\n if is_dataclass(cls):\n schema = build_dataclass_object_json_schema(cls)\n else:\n raise TypeError(\"Only dataclass allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = create_final_instance(cls, \n is_list=False,\n val=out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error initializing object from JSON\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.gen_pydantic","title":"gen_pydantic","text":"gen_pydantic(cls, thread, genconf=None, schemaconf=None)\n
Constrained generation after a Pydantic BaseModel-derived class definition. An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned GenOut. Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.
Parameters:
Name Type Description Default cls
Any
A class derived from a Pydantic BaseModel class.
required thread
Thread
The Thread to use as model input.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None, which uses model's default.
None
schemaconf
Optional[JSchemaConf]
JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.
None
Returns:
Type Description GenOut
A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.
Source code in sibila/model.py
def gen_pydantic(self,\n cls: Any, # a Pydantic BaseModel class\n thread: Thread,\n genconf: Optional[GenConf] = None,\n schemaconf: Optional[JSchemaConf] = None\n ) -> GenOut:\n \"\"\"Constrained generation after a Pydantic BaseModel-derived class definition.\n An initialized Pydantic BaseModel object is returned in the \"value\" field of the returned dict.\n Doesn't raise an exception if an error occurs, always returns GenOut containing the created object.\n\n Args:\n cls: A class derived from a Pydantic BaseModel class.\n thread: The Thread to use as model input.\n genconf: Model generation configuration. Defaults to None, which uses model's default.\n schemaconf: JSchemaConf object that controls schema simplification. Defaults to None, which uses model's default.\n\n Returns:\n A GenOut object with result, generated text, etc. The initialized Pydantic BaseModel-derived object is in GenOut.value.\n \"\"\"\n\n if is_subclass_of(cls, BaseModel):\n schema = json_schema_from_pydantic(cls)\n else:\n raise TypeError(\"Only pydantic BaseModel allowed for argument cls\")\n\n out = self.gen_json(schema,\n thread,\n genconf,\n massage_schema=True,\n schemaconf=schemaconf)\n\n if out.dic is not None:\n try:\n obj = pydantic_obj_from_json(cls, \n out.dic,\n schemaconf=schemaconf)\n out.value = obj\n\n except TypeError as e:\n out.res = GenRes.ERROR_JSON_SCHEMA_VAL # error validating for object (by Pydantic), but JSON is valid for its schema\n out.text += f\"\\nJSON Schema error: {e}\"\n else:\n # out.res already holds the right error\n ...\n\n return out\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.token_len","title":"token_len","text":"token_len(thread, genconf=None)\n
Calculate the number of tokens used by a list of messages. If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.
From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
More info on calculating function_call (and tools?) tokens:
https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24
https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5
Parameters:
Name Type Description Default thread
Thread
For token length calculation.
required genconf
Optional[GenConf]
Model generation configuration. Defaults to None.
None
Returns:
Type Description int
Estimated number of tokens the thread will use.
Source code in sibila/openai.py
def token_len(self,\n thread: Thread,\n genconf: Optional[GenConf] = None) -> int:\n \"\"\"Calculate the number of tokens used by a list of messages.\n If a json_schema is provided in genconf, we use its string's token_len as upper bound for the extra prompt tokens.\n\n From https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\n\n More info on calculating function_call (and tools?) tokens:\n\n https://community.openai.com/t/how-to-calculate-the-tokens-when-using-function-call/266573/24\n\n https://gist.github.com/CGamesPlay/dd4f108f27e2eec145eedf5c717318f5\n\n Args:\n thread: For token length calculation.\n genconf: Model generation configuration. Defaults to None.\n\n Returns:\n Estimated number of tokens the thread will use.\n \"\"\"\n\n # name = self._model_name\n\n num_tokens = 0\n for index in range(-1, len(thread)): # -1 for system message\n message = thread.msg_as_chatml(index)\n # print(message)\n num_tokens += self._tokens_per_message\n for key, value in message.items():\n num_tokens += len(self.tokenizer.encode(value))\n # if key == \"name\":\n # num_tokens += self._tokens_per_name\n\n num_tokens += 3 # every reply is primed with <|start|>assistant<|message|>\n num_tokens += 10 # match API return counts\n\n # print(\"text token_len\", num_tokens)\n\n if genconf is not None and genconf.json_schema is not None:\n if isinstance(genconf.json_schema, str):\n js_str = genconf.json_schema\n else:\n js_str = json.dumps(genconf.json_schema)\n\n tools_num_tokens = self.tokenizer.token_len(js_str)\n\n # this is an upper bound, as empirically tested with the api.\n tools_num_tokens = int(tools_num_tokens * self.TOOLS_TOKEN_LEN_FACTOR)\n\n # print(\"tools token_len\", tools_num_tokens)\n\n num_tokens += tools_num_tokens\n\n return num_tokens\n
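Sketch of estimating prompt size before generation and comparing it to the shared input+output context length (the ctx_len property); the model entry and text are illustrative:
from sibila import Models, Thread

model = Models.create("openai:gpt-4")

thread = Thread.make_INST_IN("You summarize documents.",
                             "Summarize the following report: ...")

used = model.token_len(thread)
print(f"Estimated prompt tokens: {used} / context length: {model.ctx_len}")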
"},{"location":"api-reference/model/#sibila.OpenAIModel.tokenizer","title":"tokenizer instance-attribute
","text":"tokenizer = OpenAITokenizer(_model_name)\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.ctx_len","title":"ctx_len property
","text":"ctx_len\n
Maximum context length, shared for input + output. We assume a common in+out context where total token length must always be less than this number.
"},{"location":"api-reference/model/#sibila.OpenAIModel.known_models","title":"known_models classmethod
","text":"known_models()\n
If the model can only use a fixed set of models, return their names. Otherwise, return None.
Returns:
Type Description Union[list[str], None]
Returns a list of known models or None if it can accept any model.
Source code in sibila/openai.py
@classmethod\ndef known_models(cls) -> Union[list[str], None]:\n \"\"\"If the model can only use a fixed set of models, return their names. Otherwise, return None.\n\n Returns:\n Returns a list of known models or None if it can accept any model.\n \"\"\"\n return list(KNOWN_MODELS.keys())\n
"},{"location":"api-reference/model/#sibila.OpenAIModel.desc","title":"desc property
","text":"desc\n
Model description.
"},{"location":"api-reference/models/","title":"Models factory","text":""},{"location":"api-reference/models/#sibila.Models","title":"Models","text":"Model and template format directory that unifies (and simplifies) model access and configuration.
The SIBILA_MODELS environment variable is checked during initialization: a ';'-delimited list of folders where models.json, formats.json and model files can be found.
= Model Directory =
Useful to create models from resource names like \"llamacpp:openchat\" or \"openai:gpt-4\". This makes it simple to change models, store model settings, compare model outputs, etc.
Users can add new entries from a script or from JSON configuration files, via the add() call. New directory entries with the same name are merged into existing ones for each added config.
Uses the file \"sibila/res/base_models.json\" for the initial defaults, which the user can augment by calling setup() with their own config files or by directly adding model configs with set_model().
An example of a model directory JSON config file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
= Format Directory =
Chat templates are detected from the model name/filename, or taken from the model file's metadata when available.
This directory can be set up from a JSON file or by calling set_format().
Any new entries with the same name replace previous ones on each new call.
Initializes from file \"sibila/res/base_formats.json\".
Example of a \"formats.json\" file:
{\n \"chatml\": {\n # template is a Jinja template for this model\n \"template\": \"{% for message in messages %}...\"\n },\n\n \"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n }, \n\n \"phi\": {\n \"match\": \"phi\",\n \"template\": \"...\"\n },\n\n \"phi2\": \"phi\",\n # this is a link: \"phi2\" -> \"phi\"\n}\n
Jinja2 templates receive a standard ChatML messages list (created from a Thread) and must deal with the following:
- In models that don't use a system message, the template must take care of prepending it to the first user message.
- The add_generation_prompt template variable is always set to True.
"},{"location":"api-reference/models/#sibila.Models.setup","title":"setup classmethod
","text":"setup(\n path=None, clear=False, add_cwd=True, load_from_env=True\n)\n
Initialize the models and formats directories from a given model files folder and/or the configuration files it contains. Path can start with \"~/\" to refer to the current user's home directory.
Parameters:
Name Type Description Default path
Optional[Union[str, list[str]]]
Path to a folder or to \"models.json\" or \"formats.json\" configuration files. Defaults to None which tries to initialize from defaults and env variable.
None
clear
bool
Set to clear existing directories before loading from path arg.
False
add_cwd
bool
Add current working directory to search path.
True
load_from_env
bool
Load from SIBILA_MODELS env variable?
True
Source code in sibila/models.py
@classmethod\ndef setup(cls,\n path: Optional[Union[str,list[str]]] = None,\n clear: bool = False,\n add_cwd: bool = True,\n load_from_env: bool = True):\n \"\"\"Initialize models and formats directory from given model files folder and/or contained configuration files.\n Path can start with \"~/\" current account's home directory.\n\n Args:\n path: Path to a folder or to \"models.json\" or \"formats.json\" configuration files. Defaults to None which tries to initialize from defaults and env variable.\n clear: Set to clear existing directories before loading from path arg.\n add_cwd: Add current working directory to search path.\n load_from_env: Load from SIBILA_MODELS env variable?\n \"\"\"\n\n if clear:\n cls.clear()\n\n cls._ensure(add_cwd, \n load_from_env)\n\n if path is not None:\n if isinstance(path, str):\n path_list = [path]\n else:\n path_list = path\n\n cls._read_any(path_list)\n
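A typical initialization sketch: point setup() at a folder holding model files and optional models.json/formats.json configs; the folder path and entry name are assumptions:
from sibila import Models

# add the folder to the search path and load any models.json/formats.json found there
Models.setup("../models")

# entries defined in the directory (or in the base defaults) can now be created by res_name
model = Models.create("llamacpp:openchat")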
"},{"location":"api-reference/models/#sibila.Models.create","title":"create classmethod
","text":"create(res_name, genconf=None, ctx_len=None, **over_args)\n
Create a model.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required genconf
Optional[GenConf]
Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.
None
ctx_len
Optional[int]
Maximum context length to be used. Overrides directory defaults. Defaults to None.
None
over_args
Union[Any]
Model-specific creation args, which will override default args set in model directory.
{}
Returns:
Name Type Description Model
Model
the initialized model.
Source code in sibila/models.py
@classmethod\ndef create(cls,\n res_name: str,\n\n # common to all providers\n genconf: Optional[GenConf] = None,\n ctx_len: Optional[int] = None,\n\n # model-specific overriding:\n **over_args: Union[Any]) -> Model:\n \"\"\"Create a model.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n genconf: Optional model generation configuration. Overrides set_genconf() value and any directory defaults. Defaults to None.\n ctx_len: Maximum context length to be used. Overrides directory defaults. Defaults to None.\n over_args: Model-specific creation args, which will override default args set in model directory.\n\n Returns:\n Model: the initialized model.\n \"\"\"\n\n cls._ensure() \n\n models_dir = cls.fused_models_dir()\n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = models_dir[provider]\n\n if name in prov:\n model_args = prov[name]\n\n # _default(if any) <- model_args <- over_args\n args = (prov.get(cls.DEFAULT_ENTRY_NAME)).copy() or {}\n args.update(model_args) \n args.update(over_args)\n\n else: \n prov_conf = cls.PROVIDER_CONF[provider] \n\n if \"name_passthrough\" in prov_conf[\"flags\"]:\n model_args = {\n \"name\": name \n }\n else:\n raise ValueError(f\"Model '{name}' not found in provider '{provider}'\")\n\n args = {}\n args.update(model_args)\n args.update(over_args)\n\n\n # override genconf, ctx_len\n if genconf is None:\n genconf = cls.genconf\n\n if genconf is not None:\n args[\"genconf\"] = genconf\n\n elif \"genconf\" in args and isinstance(args[\"genconf\"], dict):\n # transform dict into a GenConf instance:\n args[\"genconf\"] = GenConf.from_dict(args[\"genconf\"])\n\n if ctx_len is not None:\n args[\"ctx_len\"] = ctx_len\n\n logger.debug(f\"Creating model '{provider}:{name}' with resolved args: {args}\")\n\n\n model: Model\n if provider == \"llamacpp\":\n\n # resolve filename -> path\n path = cls._locate_file(args[\"name\"])\n if path is None:\n raise FileNotFoundError(f\"File not found in '{res_name}' while looking for file '{args['name']}'. Make sure you called Models.setup() with a path to the file's folder\")\n\n logger.debug(f\"Resolved llamacpp model '{args['name']}' to '{path}'\")\n\n del args[\"name\"]\n args[\"path\"] = path\n\n from .llamacpp import LlamaCppModel\n\n model = LlamaCppModel(**args)\n\n\n elif provider == \"openai\":\n\n from .openai import OpenAIModel\n\n model = OpenAIModel(**args)\n\n \"\"\"\n elif provider == \"hf\":\n from .hf import HFModel\n\n model = HFModel(**args)\n \"\"\"\n\n return model\n
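Sketch of overriding directory defaults at creation time; the entry name, temperature and context length below are illustrative values:
from sibila import Models, GenConf

model = Models.create("llamacpp:openchat",
                      genconf=GenConf(temperature=0.0),   # deterministic generation
                      ctx_len=4096)                       # cap the context length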
"},{"location":"api-reference/models/#sibila.Models.add_models_search_path","title":"add_models_search_path classmethod
","text":"add_models_search_path(path)\n
Prepends new paths to model files search path.
Parameters:
Name Type Description Default path
Union[str, list[str]]
A path or list of paths to add to model search path.
required Source code in sibila/models.py
@classmethod\ndef add_models_search_path(cls,\n path: Union[str,list[str]]):\n \"\"\"Prepends new paths to model files search path.\n\n Args:\n path: A path or list of paths to add to model search path.\n \"\"\"\n\n cls._ensure()\n\n prepend_path(cls.models_search_path, path)\n\n logger.debug(f\"Adding '{path}' to search_path\")\n
"},{"location":"api-reference/models/#sibila.Models.set_genconf","title":"set_genconf classmethod
","text":"set_genconf(genconf)\n
Set the GenConf to use as default for model creation.
Parameters:
Name Type Description Default genconf
GenConf
Model generation configuration.
required Source code in sibila/models.py
@classmethod\ndef set_genconf(cls,\n genconf: GenConf):\n \"\"\"Set the GenConf to use as default for model creation.\n\n Args:\n genconf: Model generation configuration.\n \"\"\"\n cls.genconf = genconf\n
"},{"location":"api-reference/models/#sibila.Models.list_models","title":"list_models classmethod
","text":"list_models(\n name_query, providers, include_base, resolved_values\n)\n
List model entries matching the query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match model names. Empty string for all.
required providers
list[str]
Filter by these exact provider names. Empty list for all.
required include_base
bool
Also list fused values from base_models_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are model res_names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_models(cls,\n name_query: str,\n providers: list[str],\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match model names. Empty string for all.\n providers: Filter by these exact provider names. Empty list for all.\n include_base: Also list fused values from base_models_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are model res_names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n out = {}\n\n name_query = name_query.lower()\n\n for prov_name in models_dir:\n\n if providers and prov_name not in providers:\n continue\n\n prov_dic = models_dir[prov_name]\n\n for name in prov_dic:\n\n if name == cls.DEFAULT_ENTRY_NAME:\n continue\n\n if name_query and name_query not in name.lower():\n continue\n\n entry_res_name = prov_name + \":\" + name\n\n if resolved_values:\n # okay to use get_model_entry() because it resolves to fused\n res = cls.get_model_entry(entry_res_name) # type: ignore[assignment]\n if res is None:\n continue\n else:\n val = res[1]\n else:\n val = prov_dic[name]\n\n out[entry_res_name] = val\n\n return out\n
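Sketch of listing every llamacpp entry with resolved values (all four arguments are required):
from sibila import Models

entries = Models.list_models(name_query="",          # empty string: match all names
                             providers=["llamacpp"], # empty list would mean all providers
                             include_base=True,      # include base_models_dir defaults
                             resolved_values=True)   # follow links to final entries
for res_name, entry in entries.items():
    print(res_name, "->", entry)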
"},{"location":"api-reference/models/#sibila.Models.get_model_entry","title":"get_model_entry classmethod
","text":"get_model_entry(res_name)\n
Get a resolved model entry. Resolved means following any links.
Parameters:
Name Type Description Default res_name
str
Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".
required Returns:
Type Description Union[tuple[str, dict], None]
Resolved entry (res_name,dict) or None if not found.
Source code in sibila/models.py
@classmethod\ndef get_model_entry(cls,\n res_name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved model entry. Resolved means following any links.\n\n Args:\n res_name: Resource name in the format: provider:model_name, for example \"llamacpp:openchat\".\n\n Returns:\n Resolved entry (res_name,dict) or None if not found.\n \"\"\"\n\n cls._ensure() \n\n models_dir = cls.fused_models_dir()\n\n # resolve \"alias:name\" res names, or \"name\": \"link_name\" links\n provider,name = resolve_model(models_dir, res_name, cls.ALL_PROVIDER_NAMES)\n # arriving here, prov as a non-link dict entry\n logger.debug(f\"Resolved model '{res_name}' to '{provider}','{name}'\")\n\n prov = models_dir[provider]\n\n if name in prov:\n return provider + \":\" + name, prov[name]\n else:\n return None\n
"},{"location":"api-reference/models/#sibila.Models.has_model_entry","title":"has_model_entry classmethod
","text":"has_model_entry(res_name)\n
Source code in sibila/models.py
@classmethod\ndef has_model_entry(cls,\n res_name: str) -> bool:\n return cls.get_model_entry(res_name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.set_model","title":"set_model classmethod
","text":"set_model(\n res_name, model_name, format_name=None, genconf=None\n)\n
Add model configuration for given res_name.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
str
Model name or filename identifier.
required format_name
Optional[str]
Format name used by model. Defaults to None.
None
genconf
Optional[GenConf]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model(cls,\n res_name: str,\n model_name: str,\n format_name: Optional[str] = None,\n genconf: Optional[GenConf] = None):\n \"\"\"Add model configuration for given res_name.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier.\n format_name: Format name used by model. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry: dict = {\n \"name\": model_name\n }\n\n if format_name:\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n\n if genconf:\n entry[\"genconf\"] = genconf.as_dict()\n\n cls.models_dir[provider][name] = entry\n
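Sketch of registering a local GGUF file under a new entry name; the filename and format name are placeholders:
from sibila import Models

Models.set_model("llamacpp:mymodel",
                 "my-model.Q4_K_M.gguf",   # model filename, located via the search path
                 format_name="chatml")     # must already exist in the formats directory

model = Models.create("llamacpp:mymodel")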
"},{"location":"api-reference/models/#sibila.Models.update_model","title":"update_model classmethod
","text":"update_model(\n res_name,\n model_name=None,\n format_name=None,\n genconf=None,\n)\n
Update model fields.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required model_name
Optional[str]
Model name or filename identifier. Defaults to None.
None
format_name
Optional[str]
Format name used by model. Use \"\" to delete. Defaults to None.
None
genconf
Union[GenConf, str, None]
Base GenConf to use when creating model. Defaults to None.
None
Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef update_model(cls,\n res_name: str,\n model_name: Optional[str] = None,\n format_name: Optional[str] = None,\n genconf: Union[GenConf,str,None] = None):\n\n \"\"\"update model fields\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n model_name: Model name or filename identifier. Defaults to None.\n format_name: Format name used by model. Use \"\" to delete. Defaults to None.\n genconf: Base GenConf to use when creating model. Defaults to None.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, False)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n entry = cls.models_dir[provider][name]\n\n if model_name:\n entry[\"name\"] = model_name\n\n if format_name is not None:\n if format_name != \"\":\n if not cls.has_format_entry(format_name):\n raise ValueError(f\"Could not find format '{format_name}'\")\n entry[\"format\"] = format_name\n else:\n del entry[\"format\"]\n\n if genconf is not None:\n if genconf != \"\":\n entry[\"genconf\"] = genconf\n else:\n del entry[\"genconf\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_model_link","title":"set_model_link classmethod
","text":"set_model_link(res_name, link_name)\n
Create a model entry that links to another model.
Parameters:
Name Type Description Default res_name
str
A name in the form \"provider:model_name\", for example \"openai:gpt-4\".
required link_name
str
Name of model this entry links to.
required Raises:
Type Description ValueError
If unknown provider.
Source code in sibila/models.py
@classmethod\ndef set_model_link(cls,\n res_name: str,\n link_name: str):\n \"\"\"Create a model link into another model.\n\n Args:\n res_name: A name in the form \"provider:model_name\", for example \"openai:gtp-4\".\n link_name: Name of model this entry links to.\n\n Raises:\n ValueError: If unknown provider.\n \"\"\"\n\n cls._ensure()\n\n provider,name = provider_name_from_urn(res_name, True)\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}' in '{res_name}'\")\n\n # first: ensure link_name is a res_name\n if ':' not in link_name:\n link_name = provider + \":\" + link_name\n\n if not cls.has_model_entry(link_name):\n raise ValueError(f\"Could not find linked model '{link_name}'\")\n\n # second: check link name is without provider if same\n link_split = link_name.split(\":\")\n if len(link_split) == 2:\n if link_split[0] == provider: # remove same \"provider:\"\n link_name = link_split[1]\n\n cls.models_dir[provider][name] = link_name\n
"},{"location":"api-reference/models/#sibila.Models.delete_model","title":"delete_model classmethod
","text":"delete_model(res_name)\n
Delete a model entry.
Parameters:
Name Type Description Default res_name
str
Model entry in the form \"provider:name\".
required Source code in sibila/models.py
@classmethod\ndef delete_model(cls,\n res_name: str):\n \"\"\"Delete a model entry.\n\n Args:\n res_name: Model entry in the form \"provider:name\".\n \"\"\"\n\n cls._ensure()\n\n provider, name = provider_name_from_urn(res_name,\n allow_alias_provider=False)\n\n if provider not in cls.ALL_PROVIDER_NAMES:\n raise ValueError(f\"Unknown provider '{provider}', must be one of: {cls.ALL_PROVIDER_NAMES}\")\n\n prov = cls.models_dir[provider] \n if name not in prov:\n raise ValueError(f\"Model '{res_name}' not found\")\n\n # verify if any entry links to name:\n def check_link_to(link_to_name: str, \n provider: str) -> Union[str, None]:\n\n for name,entry in cls.models_dir[provider].items():\n if isinstance(entry,str) and entry == link_to_name:\n return name\n return None\n\n offender = check_link_to(name, provider)\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry '{provider}:{offender}' links to it\")\n\n offender = check_link_to(name, \"alias\")\n if offender is not None:\n raise ValueError(f\"Cannot delete '{res_name}', as entry 'alias:{offender}' links to it\")\n\n del prov[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_models","title":"save_models classmethod
","text":"save_models(path=None, include_base=False)\n
Source code in sibila/models.py
@classmethod\ndef save_models(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"models.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n\n # clear providers with no models:\n for provider in cls.ALL_PROVIDER_NAMES:\n if provider in models_dir and not models_dir[provider]:\n del models_dir[provider]\n\n json.dump(models_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.list_formats","title":"list_formats classmethod
","text":"list_formats(name_query, include_base, resolved_values)\n
List format entries matching query.
Parameters:
Name Type Description Default name_query
str
Case-insensitive substring to match format names. Empty string for all.
required include_base
bool
Also list base_formats_dir.
required resolved_values
bool
Return resolved entries or raw ones.
required Returns:
Type Description dict
A dict where keys are format names and values are respective entries.
Source code in sibila/models.py
@classmethod\ndef list_formats(cls,\n name_query: str,\n include_base: bool,\n resolved_values: bool) -> dict:\n \"\"\"List format entries matching query.\n\n Args:\n name_query: Case-insensitive substring to match format names. Empty string for all.\n include_base: Also list base_formats_dir.\n resolved_values: Return resolved entries or raw ones.\n\n Returns:\n A dict where keys are format names and values are respective entries.\n \"\"\"\n\n cls._ensure()\n\n out = {}\n\n name_query = name_query.lower()\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n for name in formats_dir.keys():\n\n if name_query and name_query not in name.lower():\n continue\n\n val = formats_dir[name]\n\n if resolved_values:\n res = cls.get_format_entry(name)\n if res is None:\n continue\n else:\n val = res[1]\n\n out[name] = val\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.get_format_entry","title":"get_format_entry classmethod
","text":"get_format_entry(name)\n
Get a resolved format entry by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple of (resolved_name, format_entry).
Source code in sibila/models.py
@classmethod\ndef get_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Get a resolved format entry by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Tuple of (resolved_name, format_entry).\n \"\"\"\n\n cls._ensure()\n\n return get_format_entry(cls.fused_formats_dir(), name)\n
"},{"location":"api-reference/models/#sibila.Models.has_format_entry","title":"has_format_entry classmethod
","text":"has_format_entry(name)\n
Source code in sibila/models.py
@classmethod\ndef has_format_entry(cls,\n name: str) -> bool:\n return cls.get_format_entry(name) is not None\n
"},{"location":"api-reference/models/#sibila.Models.get_format_template","title":"get_format_template classmethod
","text":"get_format_template(name)\n
Get a format template by name, following links if required.
Parameters:
Name Type Description Default name
str
Format name.
required Returns:
Type Description Union[str, None]
Resolved format template str.
Source code in sibila/models.py
@classmethod\ndef get_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Get a format template by name, following links if required.\n\n Args:\n name: Format name.\n\n Returns:\n Resolved format template str.\n \"\"\"\n\n res = cls.get_format_entry(name)\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.match_format_entry","title":"match_format_entry classmethod
","text":"match_format_entry(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[tuple[str, dict], None]
Tuple (name, format_entry) where name is a resolved name. Or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_entry(cls,\n name: str) -> Union[tuple[str,dict],None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Tuple (name, format_entry) where name is a resolved name. Or None if none found.\n \"\"\"\n\n cls._ensure()\n\n return search_format(cls.fused_formats_dir(), name)\n
"},{"location":"api-reference/models/#sibila.Models.match_format_template","title":"match_format_template classmethod
","text":"match_format_template(name)\n
Search the formats registry, based on model name or filename.
Parameters:
Name Type Description Default name
str
Name or filename of model.
required Returns:
Type Description Union[str, None]
Format template or None if none found.
Source code in sibila/models.py
@classmethod\ndef match_format_template(cls,\n name: str) -> Union[str,None]:\n \"\"\"Search the formats registry, based on model name or filename.\n\n Args:\n name: Name or filename of model.\n\n Returns:\n Format template or None if none found.\n \"\"\"\n\n res = cls.match_format_entry(name)\n\n return None if res is None else res[1][\"template\"]\n
"},{"location":"api-reference/models/#sibila.Models.set_format","title":"set_format classmethod
","text":"set_format(name, template, match=None)\n
Add a format entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required template
str
The chat template, in Jinja2 format.
required match
Optional[str]
Regex that matches names/filenames that use this format. Default is None.
None
Source code in sibila/models.py
@classmethod\ndef set_format(cls,\n name: str,\n template: str,\n match: Optional[str] = None):\n \"\"\"Add a format entry to the format directory.\n\n Args:\n name: Format entry name.\n template: The Chat template format in Jinja2 format\n match: Regex that matches names/filenames that use this format. Default is None.\n \"\"\"\n\n cls._ensure()\n\n if \"{{\" not in template: # a link_name for the template\n if not cls.has_format_entry(template):\n raise ValueError(f\"Could not find linked template entry '{template}'.\")\n\n entry = {\n \"template\": template\n }\n if match is not None:\n entry[\"match\"] = match\n\n cls.formats_dir[name] = entry \n
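Sketch of adding a chat template entry; the template below is a trivial placeholder, not a real model's template:
from sibila import Models

template = ("{% for message in messages %}"
            "{{ message['role'] }}: {{ message['content'] }}\n"
            "{% endfor %}")

Models.set_format("myformat",
                  template,
                  match="my-model")   # regex matched against model names/filenames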
"},{"location":"api-reference/models/#sibila.Models.set_format_link","title":"set_format_link classmethod
","text":"set_format_link(name, link_name)\n
Add a format link entry to the format directory.
Parameters:
Name Type Description Default name
str
Format entry name.
required link_name
str
Name of format that this entry links to.
required Source code in sibila/models.py
@classmethod\ndef set_format_link(cls,\n name: str,\n link_name: str):\n \"\"\"Add a format link entry to the format directory.\n\n Args:\n name: Format entry name.\n link_name: Name of format that this entry links to.\n \"\"\"\n\n cls._ensure()\n\n if not cls.has_format_entry(link_name):\n raise ValueError(f\"Could not find linked entry '{link_name}'.\")\n\n cls.formats_dir[name] = link_name\n
"},{"location":"api-reference/models/#sibila.Models.delete_format","title":"delete_format classmethod
","text":"delete_format(name)\n
Delete a format entry.
Parameters:
Name Type Description Default name
str
Format entry name.
required Source code in sibila/models.py
@classmethod\ndef delete_format(cls,\n name: str):\n \"\"\"Delete a format entry.\n\n Args:\n name: Format entry name.\n \"\"\"\n\n cls._ensure()\n\n if name not in cls.formats_dir:\n raise ValueError(f\"Format name '{name}' not found.\")\n\n for check_name,entry in cls.formats_dir.items():\n if isinstance(entry,str) and entry == name:\n raise ValueError(f\"Cannot delete '{name}', as entry '{check_name}' links to it\")\n\n del cls.formats_dir[name]\n
"},{"location":"api-reference/models/#sibila.Models.save_formats","title":"save_formats classmethod
","text":"save_formats(path=None, include_base=False)\n
Source code in sibila/models.py
@classmethod\ndef save_formats(cls,\n path: Optional[str] = None,\n include_base: bool = False):\n\n cls._ensure()\n\n if path is None:\n if len(cls.models_search_path) != 1:\n raise ValueError(\"No path arg provided and multiple path in cls.search_path. Don't know where to save.\")\n\n path = os.path.join(cls.models_search_path[0], \"formats.json\")\n\n with open(path, \"w\", encoding=\"utf-8\") as f:\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n json.dump(formats_dir, f, indent=4)\n\n return path\n
"},{"location":"api-reference/models/#sibila.Models.info","title":"info classmethod
","text":"info(include_base=True, verbose=False)\n
Return information about current setup.
Parameters:
Name Type Description Default verbose
bool
If False, formats directory values are abbreviated. Defaults to False.
False
Returns:
Type Description str
Textual information about the current setup.
Source code in sibila/models.py
@classmethod\ndef info(cls,\n include_base: bool = True,\n verbose: bool = False) -> str:\n \"\"\"Return information about current setup.\n\n Args:\n verbose: If False, formats directory values are abbreviated. Defaults to False.\n\n Returns:\n Textual information about the current setup.\n \"\"\"\n\n cls._ensure()\n\n out = \"\"\n\n out += f\"Models search path: {cls.models_search_path}\\n\"\n\n models_dir = cls.fused_models_dir() if include_base else cls.models_dir\n out += f\"Models directory:\\n{pformat(models_dir, sort_dicts=False)}\\n\"\n\n out += f\"Model Genconf:\\n{cls.genconf}\\n\"\n\n formats_dir = cls.fused_formats_dir() if include_base else cls.formats_dir\n\n if not verbose:\n fordir = {}\n for key in formats_dir:\n fordir[key] = copy(formats_dir[key])\n if isinstance(fordir[key], dict) and \"template\" in fordir[key]:\n fordir[key][\"template\"] = fordir[key][\"template\"][:14] + \"...\"\n else:\n fordir = formats_dir\n\n out += f\"Formats directory:\\n{pformat(fordir)}\"\n\n return out\n
"},{"location":"api-reference/models/#sibila.Models.clear","title":"clear classmethod
","text":"clear()\n
Clear the directories. The members base_models_dir, base_formats_dir and genconf are not cleared.
Source code in sibila/models.py
@classmethod\ndef clear(cls):\n \"\"\"Clear directories. Members base_models_dir and base_formats_dir and genconf are not cleared.\"\"\"\n cls.models_dir = None\n cls.models_search_path = []\n cls.formats_dir = None\n
"},{"location":"api-reference/multigen/","title":"Multigen","text":""},{"location":"api-reference/multigen/#sibila.multigen","title":"multigen","text":"Functions for comparing output across models.
- thread_multigen(), query_multigen() and multigen(): Compare outputs across models.
- cycle_gen_print(): For a list of models, sequentially grow a Thread with model responses to given IN messages.
"},{"location":"api-reference/multigen/#sibila.multigen.thread_multigen","title":"thread_multigen","text":"thread_multigen(\n threads,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n thread_titles=None,\n)\n
Generate a list of threads on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" prints the results, a path writes a text file with the results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" prints CSV results, a path writes a CSV file with the results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
thread_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def thread_multigen(threads: list[Thread],\n model_names: list[str],\n\n text: Union[str,list[str],None] = None,\n csv: Union[str,list[str],None] = None,\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n thread_titles: Optional[list[str]] = None \n ) -> list[list[GenOut]]:\n \"\"\"Generate a single thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n thread_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n table = multigen(threads,\n model_names=model_names, \n gencall=gencall,\n genconf=genconf)\n\n # table[threads,models]\n\n if thread_titles is None:\n thread_titles = [str(th) for th in threads]\n\n def format(format_fn, cmds):\n if cmds is None or not cmds:\n return\n\n f = StringIO(newline='')\n\n format_fn(f,\n table, \n title_list=thread_titles,\n model_names=model_names,\n out_keys=out_keys)\n fmtd = f.getvalue()\n\n if not isinstance(cmds, list):\n cmds = [cmds]\n for c in cmds:\n if c == 'print':\n print(fmtd)\n else: # path\n with open(c, \"w\", encoding=\"utf-8\") as f:\n f.write(fmtd)\n\n format(format_text, text)\n format(format_csv, csv)\n\n return table\n
"},{"location":"api-reference/multigen/#sibila.multigen.query_multigen","title":"query_multigen","text":"query_multigen(\n in_list,\n inst_text,\n model_names,\n text=None,\n csv=None,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n in_titles=None,\n)\n
Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required text
Union[str, list[str], None]
A str or list of str: \"print\" prints the results, a path writes a text file with the results. Defaults to None.
None
csv
Union[str, list[str], None]
A str or list of str: \"print\" prints CSV results, a path writes a CSV file with the results. Defaults to None.
None
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
in_titles
Optional[list[str]]
A human-friendly title for each Thread. Defaults to None.
None
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)).
list[list[GenOut]]
For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def query_multigen(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n text: Union[str,list[str],None] = None, # \"print\", path\n csv: Union[str,list[str],None] = None, # \"print\", path\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n in_titles: Optional[list[str]] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate an INST+IN thread on a list of models, returning/saving results in text/CSV.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n text: An str list with \"print\"=print results, path=a path to output a text file with results. Defaults to None.\n csv: An str list with \"print\"=print CSV results, path=a path to output a CSV file with results. Defaults to None.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n in_titles: A human-friendly title for each Thread. Defaults to None.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). \n For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\" \n\n th_list = []\n for in_text in in_list:\n th = Thread.make_INST_IN(inst_text, in_text)\n th_list.append(th)\n\n if in_titles is None:\n in_titles = in_list\n\n out = thread_multigen(th_list, \n model_names=model_names, \n text=text,\n csv=csv,\n gencall=gencall,\n genconf=genconf,\n out_keys=out_keys,\n thread_titles=in_titles)\n\n return out\n
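Sketch comparing two directory entries on the same prompts; the entry names are placeholders and the helpers are assumed importable from sibila.multigen:
from sibila.multigen import query_multigen

out = query_multigen(["Which planet is the largest?",
                      "Which planet is the hottest?"],
                     "Answer with a single word.",
                     model_names=["llamacpp:openchat", "openai:gpt-4"],
                     text="print",        # print a text report
                     csv="compare.csv")   # also write results to a CSV file

# out[i][j] holds the GenOut of prompt i on model j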
"},{"location":"api-reference/multigen/#sibila.multigen.multigen","title":"multigen","text":"multigen(\n threads,\n *,\n models=None,\n model_names=None,\n model_names_del_after=True,\n gencall=None,\n genconf=None\n)\n
Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.
Actual generation for each model is implemented by the gencall arg Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
Parameters:
Name Type Description Default threads
list[Thread]
List of threads to input into each model.
required models
Optional[list[Model]]
A list of initialized models. Defaults to None.
None
model_names
Optional[list[str]]
--Or-- A list of Models names. Defaults to None.
None
model_names_del_after
bool
Delete model_names models after using them. This is important, otherwise an out-of-memory error will eventually occur. Defaults to True.
True
gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
Raises:
Type Description ValueError
Only one of models or model_names can be given.
Returns:
Type Description list[list[GenOut]]
A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...
Source code in sibila/multigen.py
def multigen(threads: list[Thread],\n *,\n models: Optional[list[Model]] = None, # existing models\n\n model_names: Optional[list[str]] = None,\n model_names_del_after: bool = True,\n\n gencall: Optional[Callable] = None,\n genconf: Optional[GenConf] = None\n ) -> list[list[GenOut]]:\n \"\"\"Generate a list of Threads in multiple models, returning the GenOut for each [thread,model] combination.\n\n Actual generation for each model is implemented by the gencall arg Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n threads: List of threads to input into each model.\n models: A list of initialized models. Defaults to None.\n model_names: --Or-- A list of Models names. Defaults to None.\n model_names_del_after: Delete model_names models after using them: important or an out-of-memory error will eventually happen. Defaults to True.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n\n Raises:\n ValueError: Only one of models or model_names can be given.\n\n Returns:\n A list of lists in the format [thread,model] of shape (len(threads), len(models)). For example: out[0] holds threads[0] results on all models, out[1]: threads[1] on all models, ...\n \"\"\"\n\n if not ((models is None) ^ ((model_names is None))):\n raise ValueError(\"Only one of models or model_names can be given\")\n\n if gencall is None:\n gencall = _default_gencall_text\n\n mod_count = len(models) if models is not None else len(model_names) # type: ignore[arg-type]\n\n all_out = []\n\n for i in range(mod_count):\n if models is not None:\n model = models[i]\n logger.debug(f\"Model: {model.desc}\")\n else:\n name = model_names[i] # type: ignore[index]\n model = Models.create(name)\n logger.info(f\"Model: {name} -> {model.desc}\")\n\n mod_out = []\n for th in threads:\n out = gencall(model, th, genconf)\n\n mod_out.append(out)\n\n all_out.append(mod_out)\n\n if model_names_del_after and models is None:\n del model\n\n # all_out is currently shaped (M,T) -> transpose to (T,M), so that each row contains thread t for all models\n tout = []\n for t in range(len(threads)):\n tmout = [] # thread t for all models\n for m in range(mod_count):\n tmout.append(all_out[m][t])\n\n tout.append(tmout)\n\n return tout\n
"},{"location":"api-reference/multigen/#sibila.multigen.cycle_gen_print","title":"cycle_gen_print","text":"cycle_gen_print(\n in_list,\n inst_text,\n model_names,\n gencall=None,\n genconf=None,\n out_keys=[\"text\", \"dic\", \"value\"],\n json_kwargs={\n \"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False,\n },\n)\n
For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.
Works by doing:
1. Generate an INST+IN prompt for a list of models. (Same INST for all).
2. Append the output of each model to its own Thread.
3. Append the next IN prompt and generate again. Back to 2.
Actual generation for each model is implemented by an optional Callable with this signature: def gencall(model: Model, thread: Thread, genconf: GenConf) -> GenOut
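As a rough usage sketch (the model names, models folder and prompts are assumptions):
from sibila import Models\nfrom sibila.multigen import cycle_gen_print\n\nModels.setup(\"../../models\")\n\n# grows one Thread per model: INST, then IN/OUT pairs for each input, printing as it goes\ncycle_gen_print([\"Hello there!\", \"Can you speak like a pirate?\"],\n                \"You are a helpful assistant.\",\n                [\"llamacpp:openchat\", \"openai:gpt-3.5\"])\n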
Parameters:
Name Type Description Default in_list
list[str]
List of IN messages to initialize Threads.
required inst_text
str
The common INST to use in all models.
required model_names
list[str]
A list of Models names.
required gencall
Optional[Callable]
Callable function that does the actual generation. Defaults to None, which will use a text generation default function.
None
genconf
Optional[GenConf]
Model generation configuration to use in models. Defaults to None, meaning default values.
None
out_keys
list[str]
A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].
['text', 'dic', 'value']
json_kwargs
dict
JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.
{'indent': 2, 'sort_keys': False, 'ensure_ascii': False}
Source code in sibila/multigen.py
def cycle_gen_print(in_list: list[str],\n inst_text: str, \n model_names: list[str],\n\n gencall: Optional[Callable] = None, \n genconf: Optional[GenConf] = None,\n\n out_keys: list[str] = [\"text\",\"dic\", \"value\"],\n\n json_kwargs: dict = {\"indent\": 2,\n \"sort_keys\": False,\n \"ensure_ascii\": False}\n ):\n \"\"\"For a list of models, sequentially grow a Thread with model responses to given IN messages and print the results.\n\n Works by doing:\n\n 1. Generate an INST+IN prompt for a list of models. (Same INST for all).\n 2. Append the output of each model to its own Thread.\n 3. Append the next IN prompt and generate again. Back to 2.\n\n Actual generation for each model is implemented by an optional Callable with this signature:\n def gencall(model: Model,\n thread: Thread,\n genconf: GenConf) -> GenOut\n\n Args:\n in_list: List of IN messages to initialize Threads.\n inst_text: The common INST to use in all models.\n model_names: A list of Models names.\n gencall: Callable function that does the actual generation. Defaults to None, which will use a text generation default function.\n genconf: Model generation configuration to use in models. Defaults to None, meaning default values.\n out_keys: A list with GenOut members to output. Defaults to [\"text\",\"dic\", \"value\"].\n json_kwargs: JSON dumps() configuration. Defaults to {\"indent\": 2, \"sort_keys\": False, \"ensure_ascii\": False }.\n \"\"\"\n\n assert isinstance(model_names, list), \"model_names must be a list of strings\"\n\n if gencall is None:\n gencall = _default_gencall_text\n\n\n n_model = len(model_names)\n n_ins = len(in_list)\n\n for m in range(n_model):\n\n name = model_names[m]\n model = Models.create(name)\n\n print('=' * 80)\n print(f\"Model: {name} -> {model.desc}\")\n\n th = Thread(inst=inst_text)\n\n for i in range(n_ins):\n in_text = in_list[i]\n print(f\"IN: {in_text}\")\n\n th += (MsgKind.IN, in_text)\n\n out = gencall(model, th, genconf)\n\n out_dict = out.as_dict()\n\n print(\"OUT\")\n\n for k in out_keys:\n\n if k in out_dict and out_dict[k] is not None:\n\n if k != out_keys[0]: # not first\n print(\"-\" * 20)\n\n val = nice_print(k, out_dict[k], json_kwargs)\n print(val)\n\n th += (MsgKind.OUT, out.text)\n\n del model\n
"},{"location":"api-reference/thread/","title":"Threads, messages, context","text":""},{"location":"api-reference/thread/#sibila.Thread","title":"Thread","text":"Thread(t=None, inst='', join_sep='\\n')\n
A sequence of messages alternating between IN (\"user\" role) and OUT (\"assistant\" role).
Stores special initial INST information (known as the \"system\" role in ChatML), providing instructions to the model. Some models don't use system instructions - in those cases the INST text is prepended to the first IN message.
Messages are kept in a strict IN,OUT,IN,OUT,... order. To enforce this, if two IN messages are added, the second just appends to the text of the first.
Examples:
Creation with a list of messages
>>> from sibila import Thread, MsgKind\n>>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")],\n... inst=\"Be helpful.\")\n>>> print(th)\ninst=\u2588Be helpful.\u2588, sep='\\n', len=2\n0: IN=\u2588Hello model!\u2588\n1: OUT=\u2588Hello there human!\u2588\n
Adding messages
>>> from sibila import Thread, MsgKind\n>>> th = Thread(inst=\"Be helpful.\")\n>>> th.add(MsgKind.IN, \"Can you teach me how to cook?\")\n>>> th.add_IN(\"I mean really cook as a chef?\") # gets appended\n>>> print(th)\ninst=\u2588Be helpful.\u2588, sep='\\n', len=1\n0: IN=\u2588Can you teach me how to cook?\\nI mean really cook as a chef?\u2588\n
Another way to add a message
>>> from sibila import Thread, MsgKind\n>>> th = Thread(inst=\"Be informative.\")\n>>> th.add_IN(\"Tell me about kangaroos, please?\")\n>>> th += \"They are so impressive.\" # appends text to last message\n>>> print(th)\ninst=\u2588Be informative.\u2588, sep='\\n', len=1\n0: IN=\u2588Tell me about kangaroos, please?\\nThey are so impressive.\u2588\n
Return thread as a ChatML message list
>>> from sibila import Thread, MsgKind\n>>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")], \n... inst=\"Be helpful.\")\n>>> th.as_chatml()\n[{'role': 'system', 'content': 'Be helpful.'},\n {'role': 'user', 'content': 'Hello model!'},\n {'role': 'assistant', 'content': 'Hello there human!'}]\n
Parameters:
Name Type Description Default t
Optional[Union[Any, list, str, dict, tuple]]
Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.
None
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Raises:
Type Description TypeError
On invalid args passed.
Source code in sibila/thread.py
def __init__(self,\n t: Optional[Union[Any,list,str,dict,tuple]] = None, # Any=Thread\n inst: str = '',\n join_sep: str = \"\\n\"):\n \"\"\"\n Examples:\n Creation with a list of messages\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")],\n ... inst=\"Be helpful.\")\n >>> print(th)\n inst=\u2588Be helpful.\u2588, sep='\\\\n', len=2\n 0: IN=\u2588Hello model!\u2588\n 1: OUT=\u2588Hello there human!\u2588\n\n Adding messages\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread(inst=\"Be helpful.\")\n >>> th.add(MsgKind.IN, \"Can you teach me how to cook?\")\n >>> th.add_IN(\"I mean really cook as a chef?\") # gets appended\n >>> print(th)\n inst=\u2588Be helpful.\u2588, sep='\\\\n', len=1\n 0: IN=\u2588Can you teach me how to cook?\\\\nI mean really cook as a chef?\u2588\n\n Another way to add a message\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread(inst=\"Be informative.\")\n >>> th.add_IN(\"Tell me about kangaroos, please?\")\n >>> th += \"They are so impressive.\" # appends text to last message\n >>> print(th)\n inst=\u2588Be informative.\u2588, sep='\\\\n', len=1\n 0: IN=\u2588Tell me about kangaroos, please?\\\\nThey are so impressive.\u2588\n\n Return thread as a ChatML message list\n\n >>> from sibila import Thread, MsgKind\n >>> th = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")], \n ... inst=\"Be helpful.\")\n >>> th.as_chatml()\n [{'role': 'system', 'content': 'Be helpful.'},\n {'role': 'user', 'content': 'Hello model!'},\n {'role': 'assistant', 'content': 'Hello there human!'}]\n\n Args:\n t: Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n\n Raises:\n TypeError: On invalid args passed.\n \"\"\"\n\n if isinstance(t, Thread):\n self._msgs = t._msgs.copy()\n self.inst = t.inst\n self.join_sep = t.join_sep\n else:\n self._msgs = []\n self.inst = inst\n self.join_sep = join_sep\n\n if t is not None:\n self.concat(t)\n
"},{"location":"api-reference/thread/#sibila.Thread.clear","title":"clear","text":"clear()\n
Delete all messages and clear inst.
Source code in sibila/thread.py
def clear(self):\n \"\"\"Delete all messages and clear inst.\"\"\"\n self.inst = \"\"\n self._msgs = []\n
"},{"location":"api-reference/thread/#sibila.Thread.last_kind","title":"last_kind property
","text":"last_kind\n
Get kind of last message in thread.
Returns:
Type Description MsgKind
Kind of last message or MsgKind.IN if empty.
"},{"location":"api-reference/thread/#sibila.Thread.last_text","title":"last_text property
","text":"last_text\n
Get text of last message in thread.
Returns:
Type Description str
Last message text.
Raises:
Type Description IndexError
If thread is empty.
"},{"location":"api-reference/thread/#sibila.Thread.inst","title":"inst instance-attribute
","text":"inst\n
Text for system instructions, defaults to empty string
"},{"location":"api-reference/thread/#sibila.Thread.add","title":"add","text":"add(t, text=None)\n
Add a message to Thread by parsing a mix of types.
Accepts any of these argument combinations:
- t=MsgKind, text=str
- t=str, text=None -> uses last thread message's MsgKind
- (MsgKind, text)
- {\"kind\": \"...\", text: \"...\"}
- {\"role\": \"...\", content: \"...\"} - ChatML format
Parameters:
Name Type Description Default t
Union[str, tuple, dict, MsgKind]
One of the accepted types listed above.
required text
Optional[str]
Message text if first type is MsgKind. Defaults to None.
None
Source code in sibila/thread.py
def add(self, \n t: Union[str,tuple,dict,MsgKind],\n text: Optional[str] = None):\n \"\"\"Add a message to Thread by parsing a mix of types.\n\n Accepts any of these argument combinations:\n\n - t=MsgKind, text=str\n - t=str, text=None -> uses last thread message's MsgKind\n - (MsgKind, text)\n - {\"kind\": \"...\", text: \"...\"}\n - {\"role\": \"...\", content: \"...\"} - ChatML format\n\n Args:\n t: One of the accepted types listed above.\n text: Message text if first type is MsgKind. Defaults to None.\n \"\"\"\n\n kind, text = self._parse_msg(t, text)\n\n if kind == MsgKind.INST:\n self.inst = self.join_text(self.inst, text)\n else:\n if kind == self.last_kind and len(self._msgs):\n self._msgs[-1] = self.join_text(self._msgs[-1], text)\n else:\n self._msgs.append(text) # in new kind\n
"},{"location":"api-reference/thread/#sibila.Thread.addx","title":"addx","text":"addx(path=None, text=None, kind=None)\n
Add message with text from a supplied arg or loaded from a path.
Parameters:
Name Type Description Default path
Optional[str]
If given, text is loaded from an UTF-8 file in this path. Defaults to None.
None
text
Optional[str]
If given, text is added. Defaults to None.
None
kind
Optional[MsgKind]
MsgKind of message. If not given or the same as last thread message, it's appended to it. Defaults to None.
None
Source code in sibila/thread.py
def addx(self, \n path: Optional[str] = None, \n text: Optional[str] = None,\n kind: Optional[MsgKind] = None):\n \"\"\"Add message with text from a supplied arg or loaded from a path.\n\n Args:\n path: If given, text is loaded from an UTF-8 file in this path. Defaults to None.\n text: If given, text is added. Defaults to None.\n kind: MsgKind of message. If not given or the same as last thread message, it's appended to it. Defaults to None.\n \"\"\"\n\n assert (path is not None) ^ (text is not None), \"Only one of path or text\"\n\n if path is not None:\n with open(path, 'r', encoding=\"utf-8\") as f:\n text = f.read()\n\n if kind is None: # use last message role, so that it gets appended\n kind = self.last_kind\n\n self.add(kind, text)\n
"},{"location":"api-reference/thread/#sibila.Thread.get_text","title":"get_text","text":"get_text(index)\n
Return text for message at index.
Parameters:
Name Type Description Default index
int
Message index. Use -1 to get inst value.
required Returns:
Type Description str
Message text at index.
Source code in sibila/thread.py
def get_text(self,\n index: int) -> str:\n \"\"\"Return text for message at index.\n\n Args:\n index: Message index. Use -1 to get inst value.\n\n Returns:\n Message text at index.\n \"\"\" \n if index == -1:\n return self.inst\n else:\n return self._msgs[index]\n
"},{"location":"api-reference/thread/#sibila.Thread.set_text","title":"set_text","text":"set_text(index, text)\n
Set text for message at index.
Parameters:
Name Type Description Default index
int
Message index. Use -1 to set inst value.
required text
str
Text to replace in message at index.
required Source code in sibila/thread.py
def set_text(self,\n index: int,\n text: str): \n \"\"\"Set text for message at index.\n\n Args:\n index: Message index. Use -1 to set inst value.\n text: Text to replace in message at index.\n \"\"\"\n if index == -1:\n self.inst = text\n else:\n self._msgs[index] = text\n
"},{"location":"api-reference/thread/#sibila.Thread.concat","title":"concat","text":"concat(t)\n
Concatenate a Thread or list of messages to the current Thread.
Be aware that the other list starts with an IN message; therefore, if the last message in self is also of IN kind, their texts will be joined as in add().
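For example, a minimal sketch:
from sibila import Thread, MsgKind\n\na = Thread([(MsgKind.IN, \"Hello model!\"), (MsgKind.OUT, \"Hello there human!\")])\nb = Thread([(MsgKind.IN, \"What is a good day trip near Lisbon?\")])\n\na.concat(b)  # b starts with IN and a ends with OUT, so b's message is simply appended\nprint(a)     # 3 messages: IN, OUT, IN\n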
Parameters:
Name Type Description Default t
Optional[Union[Self, list, str, dict, tuple]]
A Thread or a list of messages. Otherwise a single message as in add().
required Raises:
Type Description TypeError
If bad arg types provided.
Source code in sibila/thread.py
def concat(self,\n t: Optional[Union[Self,list,str,dict,tuple]]):\n \"\"\"Concatenate a Thread or list of messages to the current Thread.\n\n Take care that the other list starts with an IN message, therefore, \n if last message in self is also an IN kind, their text will be joined as in add().\n\n Args:\n t: A Thread or a list of messages. Otherwise a single message as in add().\n\n Raises:\n TypeError: If bad arg types provided.\n \"\"\"\n if isinstance(t, Thread):\n for msg in t:\n self.add(msg)\n self.inst = self.join_text(self.inst, t.inst)\n\n elif isinstance(t, list): # message list\n for msg in t:\n self.add(msg)\n\n elif isinstance(t, str) or isinstance(t, dict) or isinstance(t, tuple): # single message\n self.add(t)\n\n else:\n raise TypeError(\"Arg t must be: Thread --or-- list[messages] --or-- an str, tuple or dict single message.\")\n
"},{"location":"api-reference/thread/#sibila.Thread.load","title":"load","text":"load(path)\n
Load this Thread from a JSON file.
Parameters:
Name Type Description Default path
str
Path of file to load.
required Source code in sibila/thread.py
def load(self,\n path: str):\n \"\"\"Load this Thread from a JSON file.\n\n Args:\n path: Path of file to load.\n \"\"\"\n\n with open(path, 'r', encoding='utf-8') as f:\n js = f.read()\n state = json.loads(js)\n\n self._msgs = state[\"_msgs\"]\n self.inst = state[\"inst\"]\n self.join_sep = state[\"join_sep\"]\n
"},{"location":"api-reference/thread/#sibila.Thread.save","title":"save","text":"save(path)\n
Serialize this Thread to JSON.
Parameters:
Name Type Description Default path
str
Path of file to save into.
required Source code in sibila/thread.py
def save(self,\n path: str):\n \"\"\"Serialize this Thread to JSON.\n\n Args:\n path: Path of file to save into.\n \"\"\"\n\n state = {\"_msgs\": self._msgs,\n \"inst\": self.inst,\n \"join_sep\": self.join_sep\n }\n\n json_str = json.dumps(state, indent=2, default=vars)\n\n with open(path, 'w', encoding='utf-8') as f:\n f.write(json_str)\n
"},{"location":"api-reference/thread/#sibila.Thread.msg_as_chatml","title":"msg_as_chatml","text":"msg_as_chatml(index)\n
Returns message in a ChatML dict.
Parameters:
Name Type Description Default index
int
Index of the message to return.
required Returns:
Type Description dict
A ChatML dict with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def msg_as_chatml(self,\n index: int) -> dict:\n \"\"\"Returns message in a ChatML dict.\n\n Args:\n index: Index of the message to return.\n\n Returns:\n A ChatML dict with \"role\" and \"content\" keys.\n \"\"\"\n\n kind = Thread._kind_from_pos(index)\n role = MsgKind.chatml_role_from_kind(kind)\n text = self._msgs[index] if index >= 0 else self.inst\n return {\"role\": role, \"content\": text}\n
"},{"location":"api-reference/thread/#sibila.Thread.as_chatml","title":"as_chatml","text":"as_chatml()\n
Returns Thread as a list of ChatML messages.
Returns:
Type Description list[dict]
A list of ChatML dict elements with \"role\" and \"content\" keys.
Source code in sibila/thread.py
def as_chatml(self) -> list[dict]:\n \"\"\"Returns Thread as a list of ChatML messages.\n\n Returns:\n A list of ChatML dict elements with \"role\" and \"content\" keys.\n \"\"\"\n msgs = []\n\n for index,msg in enumerate(self._msgs):\n if index == 0 and self.inst:\n msgs.append(self.msg_as_chatml(-1))\n msgs.append(self.msg_as_chatml(index))\n\n return msgs\n
"},{"location":"api-reference/thread/#sibila.Thread.has_text_lower","title":"has_text_lower","text":"has_text_lower(text_lower)\n
Can the lowercase text be found in one of the messages?
Parameters:
Name Type Description Default text_lower
str
The lowercase text to search for in messages.
required Returns:
Type Description bool
True if such text was found.
Source code in sibila/thread.py
def has_text_lower(self,\n text_lower: str) -> bool:\n \"\"\"Can the lowercase text be found in one of the messages?\n\n Args:\n text_lower: The lowercase text to search for in messages.\n\n Returns:\n True if such text was found.\n \"\"\"\n for msg in self._msgs:\n if text_lower in msg.lower():\n return True\n\n return False \n
"},{"location":"api-reference/thread/#sibila.MsgKind","title":"MsgKind","text":"Enumeration for kinds of messages in a Thread.
"},{"location":"api-reference/thread/#sibila.MsgKind.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 0\n
Input message, from user.
"},{"location":"api-reference/thread/#sibila.MsgKind.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 1\n
Model output message.
"},{"location":"api-reference/thread/#sibila.MsgKind.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 2\n
Initial instructions for model.
"},{"location":"api-reference/thread/#sibila.Context","title":"Context","text":"Context(\n t=None,\n max_token_len=None,\n pinned_inst_text=\"\",\n join_sep=\"\\n\",\n)\n
A class based on Thread that manages total token length, so that it's kept under a certain value. Also supports a persistent inst (instructions) text.
Parameters:
Name Type Description Default t
Optional[Union[Thread, list, str, dict, tuple]]
Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.
None
max_token_len
Optional[int]
Maximum token count to use when trimming. Defaults to None, which will use max model context length.
None
pinned_inst_text
str
Pinned inst text which survives clear(). Defaults to \"\".
''
join_sep
str
Separator used when message text needs to be joined. Defaults to \"\\n\".
'\\n'
Source code in sibila/context.py
def __init__(self, \n t: Optional[Union[Thread,list,str,dict,tuple]] = None, \n max_token_len: Optional[int] = None, \n pinned_inst_text: str = \"\",\n join_sep: str = \"\\n\"):\n \"\"\"\n Args:\n t: Can initialize from a Thread, from a list (containing messages in any format accepted in _parse_msg()) or a single message as an str, an (MsgKind,text) tuple or a dict. Defaults to None.\n max_token_len: Maximum token count to use when trimming. Defaults to None, which will use max model context length.\n pinned_inst_text: Pinned inst text which survives clear(). Defaults to \"\".\n join_sep: Separator used when message text needs to be joined. Defaults to \"\\\\n\".\n \"\"\"\n\n super().__init__(t,\n inst=pinned_inst_text,\n join_sep=join_sep)\n\n self.max_token_len = max_token_len\n\n self.pinned_inst_text = pinned_inst_text\n
"},{"location":"api-reference/thread/#sibila.Context.clear","title":"clear","text":"clear()\n
Delete all messages but reset inst to a pinned text if any.
Source code in sibila/context.py
def clear(self):\n \"\"\"Delete all messages but reset inst to a pinned text if any.\"\"\"\n super().clear() \n if self.pinned_inst_text is not None:\n self.inst = self.pinned_inst_text\n
"},{"location":"api-reference/thread/#sibila.Context.trim","title":"trim","text":"trim(trim_flags, model, *, max_token_len=None)\n
Trim context by selectively removing older messages until thread fits max_token_len.
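An illustrative sketch of trimming an over-long Context - the model name, folder and messages are assumptions, and the Trim flags are combined as described in the Trim section below:
from sibila import Models, Context, Trim\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nctx = Context(max_token_len=512, pinned_inst_text=\"Be helpful.\")\nctx.add_IN(\"First question...\")\nctx.add_OUT(\"First answer...\")\nctx.add_IN(\"Second question...\")\n\n# drop older IN/OUT messages (but never the first IN) until the thread fits 512 tokens\nctx.trim(Trim.IN | Trim.OUT | Trim.KEEP_FIRST_IN, model)\n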
Parameters:
Name Type Description Default trim_flags
Trim
Flags to guide selection of which messages to remove.
required model
Model
Model that will process the thread.
required max_token_len
Optional[int]
Cut messages until size is lower than this number. Defaults to None.
None
Raises:
Type Description RuntimeError
If unable to trim anything.
Returns:
Type Description bool
True if any context trimming occurred.
Source code in sibila/context.py
def trim(self,\n trim_flags: Trim,\n model: Model,\n *,\n max_token_len: Optional[int] = None,\n ) -> bool:\n \"\"\"Trim context by selectively removing older messages until thread fits max_token_len.\n\n Args:\n trim_flags: Flags to guide selection of which messages to remove.\n model: Model that will process the thread.\n max_token_len: Cut messages until size is lower than this number. Defaults to None.\n\n Raises:\n RuntimeError: If unable to trim anything.\n\n Returns:\n True if any context trimming occurred.\n \"\"\"\n\n if max_token_len is None:\n max_token_len = self.max_token_len\n\n if max_token_len is None:\n max_token_len = model.ctx_len\n\n # if genconf is None:\n # genconf = model.genconf \n # assert max_token_len < model.ctx_len, f\"max_token_len ({max_token_len}) must be < model's context size ({model.ctx_len}) - genconf.max_new_tokens\"\n\n if trim_flags == Trim.NONE: # no trimming\n return False\n\n thread = self.clone()\n\n any_trim = False\n\n while True:\n\n curr_len = model.token_len(thread)\n\n if curr_len <= max_token_len:\n break\n\n logger.debug(f\"len={curr_len} / max={max_token_len}\")\n\n if self.inst and trim_flags & Trim.INST:\n self.inst = ''\n any_trim = True\n logger.debug(f\"Cutting INST {self.inst[:80]} (...)\")\n continue\n\n # cut first possible message, starting from oldest first ones\n trimmed = False\n in_index = out_index = 0\n\n for index,m in enumerate(thread):\n kind,text = m\n\n if kind == MsgKind.IN:\n if trim_flags & Trim.IN:\n if not (trim_flags & Trim.KEEP_FIRST_IN and in_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting IN {text[:80]} (...)\")\n break\n in_index += 1\n\n elif kind == MsgKind.OUT:\n if trim_flags & Trim.OUT: \n if not (trim_flags & Trim.KEEP_FIRST_OUT and out_index == 0):\n del thread[index]\n trimmed = True\n logger.debug(f\"Cutting OUT {text[:80]} (...)\")\n break\n out_index += 1\n\n if not trimmed:\n # all thread messages were cycled but not a single could be cut, so size remains the same\n # arriving here we did all we could for trim_flags but could not remove any more\n raise RuntimeError(\"Unable to trim anything out of thread\")\n else:\n any_trim = True\n\n # while end\n\n\n if any_trim:\n self._msgs = thread._msgs\n\n return any_trim\n
"},{"location":"api-reference/thread/#sibila.Trim","title":"Trim","text":"Flags for Thread trimming.
"},{"location":"api-reference/thread/#sibila.Trim.NONE","title":"NONE class-attribute
instance-attribute
","text":"NONE = 0\n
No trimming.
"},{"location":"api-reference/thread/#sibila.Trim.INST","title":"INST class-attribute
instance-attribute
","text":"INST = 1\n
Can remove INST message.
"},{"location":"api-reference/thread/#sibila.Trim.IN","title":"IN class-attribute
instance-attribute
","text":"IN = 2\n
Can remove IN messages.
"},{"location":"api-reference/thread/#sibila.Trim.OUT","title":"OUT class-attribute
instance-attribute
","text":"OUT = 4\n
Can remove OUT messages.
"},{"location":"api-reference/thread/#sibila.Trim.KEEP_FIRST_IN","title":"KEEP_FIRST_IN class-attribute
instance-attribute
","text":"KEEP_FIRST_IN = 1024\n
If trimming IN messages, never remove first one.
"},{"location":"api-reference/thread/#sibila.Trim.KEEP_FIRST_OUT","title":"KEEP_FIRST_OUT class-attribute
instance-attribute
","text":"KEEP_FIRST_OUT = 2048\n
If trimming OUT messages, never remove first one.
"},{"location":"api-reference/tokenizer/","title":"Model tokenizers","text":""},{"location":"api-reference/tokenizer/#llamacpp","title":"LlamaCpp","text":""},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer","title":"LlamaCppTokenizer","text":"LlamaCppTokenizer(llama, reg_flags=None)\n
Tokenizer for llama.cpp loaded GGUF models.
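As a quick illustration, a loaded model's tokenizer can be used directly (the model name and folder are assumptions):
from sibila import Models\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\ntok = model.tokenizer\nids = tok.encode(\"Hello there!\")      # text -> token ids\nprint(ids)\nprint(tok.decode(ids))                # token ids -> text\nprint(tok.token_len(\"Hello there!\"))  # same as len(ids)\n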
Source code in sibila/llamacpp.py
def __init__(self, \n llama: Llama, \n reg_flags: Optional[str] = None):\n self._llama = llama\n\n self.vocab_size = self._llama.n_vocab()\n\n self.bos_token_id = self._llama.token_bos()\n self.bos_token = llama_token_get_text(self._llama.model, self.bos_token_id).decode(\"utf-8\")\n\n self.eos_token_id = self._llama.token_eos()\n self.eos_token = llama_token_get_text(self._llama.model, self.eos_token_id).decode(\"utf-8\")\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None # ? fill by taking a look at id 0?\n self.unk_token = None\n\n # workaround for https://github.com/ggerganov/llama.cpp/issues/4772\n self._workaround1 = reg_flags is not None and \"llamacpp1\" in reg_flags\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/llamacpp.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n\n if self._workaround1:\n # append a space after each bos and eos, so that llama's tokenizer matches HF\n def space_post(text, s):\n out = \"\"\n while (index := text.find(s)) != -1:\n after = index + len(s)\n out += text[:after]\n if text[after] != ' ':\n out += ' '\n text = text[after:]\n\n out += text\n return out\n\n text = space_post(text, self.bos_token)\n text = space_post(text, self.eos_token)\n # print(text)\n\n # str -> bytes\n btext = text.encode(\"utf-8\", errors=\"ignore\")\n\n return self._llama.tokenize(btext, add_bos=False, special=True)\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Used instead of llama-cpp-python's decode to fix an error: the first character after a bos is removed only if it's a space.
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/llamacpp.py
def decode(self,\n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Using instead of llama-cpp-python's to fix error: remove first character after a bos only if it's a space.\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n\n if not len(token_ids):\n return \"\"\n\n output = b\"\"\n size = 32\n buffer = (ctypes.c_char * size)()\n\n if not skip_special:\n special_toks = {self.bos_token_id: self.bos_token.encode(\"utf-8\"), # type: ignore[union-attr]\n self.eos_token_id: self.eos_token.encode(\"utf-8\")} # type: ignore[union-attr]\n\n for token in token_ids:\n if token == self.bos_token_id:\n output += special_toks[token]\n elif token == self.eos_token_id:\n output += special_toks[token]\n else:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n else: # skip special\n for token in token_ids:\n if token != self.bos_token_id and token != self.eos_token_id:\n n = llama_cpp.llama_token_to_piece(\n self._llama.model, llama_cpp.llama_token(token), buffer, size\n )\n output += bytes(buffer[:n]) # type: ignore[arg-type]\n\n\n # \"User code is responsible for removing the leading whitespace of the first non-BOS token when decoding multiple tokens.\"\n if (# token_ids[0] != self.bos_token_id and # we also try cutting if first is bos to approximate HF tokenizer\n len(output) and output[0] <= 32 # 32 = ord(' ')\n ):\n output = output[1:]\n\n return output.decode(\"utf-8\", errors=\"ignore\")\n
"},{"location":"api-reference/tokenizer/#sibila.LlamaCppTokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
Name Type Description Default text
str
Text to be measured.
required Returns:
Type Description int
Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
"},{"location":"api-reference/tokenizer/#openai","title":"OpenAI","text":""},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer","title":"OpenAITokenizer","text":"OpenAITokenizer(model)\n
Tokenizer for OpenAI models.
Source code in sibila/openai.py
def __init__(self, \n model: str\n ):\n\n if not has_tiktoken:\n raise Exception(\"Please install tiktoken by running: pip install tiktoken\")\n\n self._tok = tiktoken.encoding_for_model(model)\n\n self.vocab_size = self._tok.n_vocab\n\n self.bos_token_id = None\n self.bos_token = None\n\n self.eos_token_id = None\n self.eos_token = None\n\n self.pad_token_id = None\n self.pad_token = None\n\n self.unk_token_id = None\n self.unk_token = None\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.encode","title":"encode","text":"encode(text)\n
Encode text into model tokens. Inverse of Decode().
Parameters:
Name Type Description Default text
str
Text to be encoded.
required Returns:
Type Description list[int]
A list of ints with the encoded tokens.
Source code in sibila/openai.py
def encode(self, \n text: str) -> list[int]:\n \"\"\"Encode text into model tokens. Inverse of Decode().\n\n Args:\n text: Text to be encoded.\n\n Returns:\n A list of ints with the encoded tokens.\n \"\"\"\n return self._tok.encode(text)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.decode","title":"decode","text":"decode(token_ids, skip_special=True)\n
Decode model tokens to text. Inverse of Encode().
Parameters:
Name Type Description Default token_ids
list[int]
List of model tokens.
required skip_special
bool
Don't decode special tokens like bos and eos. Defaults to True.
True
Returns:
Type Description str
Decoded text.
Source code in sibila/openai.py
def decode(self, \n token_ids: list[int],\n skip_special: bool = True) -> str:\n \"\"\"Decode model tokens to text. Inverse of Encode().\n\n Args:\n token_ids: List of model tokens.\n skip_special: Don't decode special tokens like bos and eos. Defaults to True.\n\n Returns:\n Decoded text.\n \"\"\"\n assert skip_special, \"OpenAITokenizer only supports skip_special=True\"\n\n return self._tok.decode(token_ids)\n
"},{"location":"api-reference/tokenizer/#sibila.OpenAITokenizer.token_len","title":"token_len","text":"token_len(text)\n
Returns token length for given text.
Parameters:
Name Type Description Default text
str
Text to be measured.
required Returns:
Type Description int
Token length for given text.
Source code in sibila/model.py
def token_len(self, \n text: str) -> int:\n \"\"\"Returns token length for given text.\n\n Args:\n text: Text to be measured.\n\n Returns:\n Token length for given text.\n \"\"\"\n\n tokens = self.encode(text)\n return len(tokens) \n
"},{"location":"api-reference/tools/","title":"Tools","text":""},{"location":"api-reference/tools/#sibila.tools","title":"tools","text":"Tools for model interaction, summarization, etc.
- interact(): Interact with model as in a chat, using input().
- loop(): Iteratively append inputs and generate model outputs.
- recursive_summarize(): Recursively summarize a (large) text or text file.
"},{"location":"api-reference/tools/#sibila.tools.interact","title":"interact","text":"interact(\n model,\n *,\n ctx=None,\n inst_text=None,\n trim_flags=TRIM_DEFAULT,\n genconf=None\n)\n
Interact with model as in a chat, using input().
Includes a list of commands: type !? to see help.
Parameters:
Name Type Description Default model
Model
Model to use for generating.
required ctx
Optional[Context]
Optional input Context. Defaults to None.
None
inst_text
Optional[str]
text for Thread instructions. Defaults to None.
None
trim_flags
Trim
Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.
TRIM_DEFAULT
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, defaults to model's.
None
Returns:
Type Description Context
Context after all the interactions.
Source code in sibila/tools.py
def interact(model: Model,\n *,\n ctx: Optional[Context] = None,\n inst_text: Optional[str] = None,\n trim_flags: Trim = TRIM_DEFAULT,\n\n genconf: Optional[GenConf] = None,\n ) -> Context:\n \"\"\"Interact with model as in a chat, using input().\n\n Includes a list of commands: type !? to see help.\n\n Args:\n model: Model to use for generating.\n ctx: Optional input Context. Defaults to None.\n inst_text: text for Thread instructions. Defaults to None.\n trim_flags: Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n genconf: Model generation configuration. Defaults to None, defaults to model's. \n\n Returns:\n Context after all the interactions.\n \"\"\"\n\n def callback(out: Union[GenOut,None], \n ctx: Context, \n model: Model,\n genconf: GenConf) -> bool:\n\n if out is not None:\n if out.res != GenRes.OK_STOP:\n print(f\"***Result={GenRes.as_text(out.res)}***\")\n\n if out.text:\n text = out.text\n else:\n text = \"***No text out***\"\n\n ctx.add_OUT(text)\n print(text)\n print()\n\n\n def print_thread_info():\n if ctx.max_token_len is not None: # use from ctx\n max_token_len = ctx.max_token_len\n else: # assume max possible for model context and genconf\n max_token_len = model.ctx_len - genconf.max_tokens\n\n length = model.token_len(ctx, genconf)\n print(f\"Thread token len={length}, max len before next gen={max_token_len}\")\n\n\n\n # input loop ===============================================\n MARKER: str = '\"\"\"'\n multiline: str = \"\"\n\n while True:\n\n user = input('>').strip()\n\n if multiline:\n if user.endswith(MARKER):\n user = multiline + \"\\n\" + user[:-3]\n multiline = \"\"\n else:\n multiline += \"\\n\" + user\n continue\n\n else:\n if not user:\n return False # terminate loop\n\n elif user.startswith(MARKER):\n multiline = user[3:]\n continue\n\n elif user.endswith(\"\\\\\"):\n user = user[:-1]\n user = user.replace(\"\\\\n\", \"\\n\")\n ctx.add_IN(user)\n continue\n\n elif user.startswith(\"!\"): # a command\n params = user[1:].split(\"=\")\n cmd = params[0]\n params = params[1:]\n\n if cmd == \"inst\":\n ctx.clear()\n if params:\n text = params[0].replace(\"\\\\n\", \"\\n\")\n ctx.inst = text\n\n elif cmd == \"add\" or cmd == \"a\":\n if params:\n try:\n path = params[0]\n ctx.addx(path=path)\n ct = ctx.last_text\n print(ct[:500])\n except FileNotFoundError:\n print(f\"Could not load '{path}'\")\n else:\n print(\"Path needed\")\n\n elif cmd == 'c':\n print_thread_info()\n print(ctx)\n\n elif cmd == 'cl':\n if not params:\n params.append(\"ctx.json\")\n try:\n ctx.load(params[0])\n print(f\"Loaded context from {params[0]}\")\n except FileNotFoundError:\n print(f\"Could not load '{params[0]}'\")\n\n elif cmd == 'cs':\n if not params:\n params.append(\"ctx.json\")\n ctx.save(params[0])\n print(f\"Saved context to {params[0]}\")\n\n elif cmd == 'tl':\n print_thread_info()\n\n elif cmd == 'i':\n print(f\"Model:\\n{model.info()}\")\n print(f\"GenConf:\\n{genconf}\\n\")\n\n print_thread_info()\n\n # elif cmd == 'p':\n # print(model.text_from_turns(ctx.turns))\n\n # elif cmd == 'to':\n # token_ids = model.tokens_from_turns(ctx.turns)\n # print(f\"Prompt tokens={token_ids}\")\n\n\n else:\n print(f\"Unknown command '!{cmd}' - known commands:\\n\"\n \" !inst[=text] - clear messages and add inst (system) message\\n\"\n \" !add|!a=path - load file and add to last msg\\n\"\n \" !c - list context msgs\\n\"\n \" !cl=path - load context (default=ctx.json)\\n\"\n \" !cs=path - save context (default=ctx.json)\\n\"\n \" !tl - thread's token length\\n\"\n \" 
!i - model and genconf info\\n\"\n ' Delimit with \"\"\" for multiline begin/end or terminate line with \\\\ to continue into a new line\\n'\n \" Empty line + enter to quit\"\n )\n # \" !p - show formatted prompt (if model supports it)\\n\"\n # \" !to - prompt's tokens\\n\"\n\n print()\n continue\n\n # we have a user prompt\n user = user.replace(\"\\\\n\", \"\\n\")\n break\n\n\n ctx.add_IN(user)\n\n return True # continue looping\n\n\n\n # start prompt loop\n ctx = loop(callback,\n model,\n\n ctx=ctx,\n inst_text=inst_text,\n in_text=None, # call callback for first prompt\n trim_flags=trim_flags)\n\n return ctx\n
"},{"location":"api-reference/tools/#sibila.tools.loop","title":"loop","text":"loop(\n callback,\n model,\n *,\n inst_text=None,\n in_text=None,\n trim_flags=TRIM_DEFAULT,\n ctx=None,\n genconf=None\n)\n
Iteratively append inputs and generate model outputs.
Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.
If last Thread msg is not MsgKind.IN, callback() will be called with out_text=None.
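A rough sketch of driving loop() from a fixed list of inputs (the model name, folder and questions are assumptions):
from sibila import Models\nfrom sibila.tools import loop\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nquestions = [\"Hello there!\", \"What is a haiku?\"]\n\ndef callback(out, ctx, model, genconf) -> bool:\n    if out is not None:       # None on the first call, before anything was generated\n        ctx.add_OUT(out.text)\n        print(out.text)\n    if not questions:\n        return False          # nothing left to ask: stop looping\n    ctx.add_IN(questions.pop(0))\n    return True               # generate a reply to the IN message just added\n\nctx = loop(callback, model, inst_text=\"Be helpful.\")\n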
Parameters:
Name Type Description Default callback
Callable[[Union[GenOut, None], Context, Model, GenConf], bool]
A function(out, ctx, model) that will be iteratively called with model's output.
required model
Model
Model to use for generating.
required inst_text
Optional[str]
text for Thread instructions. Defaults to None.
None
in_text
Optional[str]
Text for Thread's initial MsgKind.IN. Defaults to None.
None
trim_flags
Trim
Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.
TRIM_DEFAULT
ctx
Optional[Context]
Optional input Context. Defaults to None.
None
genconf
Optional[GenConf]
Model generation configuration. Defaults to None, defaults to model's.
None
Source code in sibila/tools.py
def loop(callback: Callable[[Union[GenOut,None], Context, Model, GenConf], bool],\n model: Model,\n *,\n inst_text: Optional[str] = None,\n in_text: Optional[str] = None,\n\n trim_flags: Trim = TRIM_DEFAULT,\n ctx: Optional[Context] = None,\n\n genconf: Optional[GenConf] = None,\n ) -> Context:\n \"\"\"Iteratively append inputs and generate model outputs.\n\n Callback should call ctx.add_OUT(), ctx.add_IN() and return a bool to continue looping or not.\n\n If last Thread msg is not MsgKind.IN, callback() will be called with out_text=None.\n\n Args:\n callback: A function(out, ctx, model) that will be iteratively called with model's output.\n model: Model to use for generating.\n inst_text: text for Thread instructions. Defaults to None.\n in_text: Text for Thread's initial MsgKind.IN. Defaults to None.\n trim_flags: Context trimming flags, when Thread is too long. Defaults to TRIM_DEFAULT.\n ctx: Optional input Context. Defaults to None.\n genconf: Model generation configuration. Defaults to None, defaults to model's.\n \"\"\"\n\n if ctx is None:\n ctx = Context()\n else:\n ctx = ctx\n\n if inst_text is not None:\n ctx.inst = inst_text\n if in_text is not None:\n ctx.add_IN(in_text)\n\n if genconf is None:\n genconf = model.genconf\n\n if ctx.max_token_len is not None: # use from ctx\n max_token_len = ctx.max_token_len\n else: # assume max possible for model context and genconf\n max_token_len = model.ctx_len - genconf.max_tokens\n\n\n while True:\n\n if len(ctx) and ctx.last_kind == MsgKind.IN:\n # last is an IN message: we can trim and generate\n\n ctx.trim(trim_flags,\n model,\n max_token_len=max_token_len\n )\n\n out = model.gen(ctx, genconf)\n else:\n out = None # first call\n\n res = callback(out, \n ctx, \n model,\n genconf)\n\n if not res:\n break\n\n\n return ctx\n
"},{"location":"api-reference/tools/#sibila.tools.recursive_summarize","title":"recursive_summarize","text":"recursive_summarize(\n model, text=None, path=None, overlap_size=20\n)\n
Recursively summarize a (large) text or text file.
Works by:
1. Break the text into chunks that fit the model's context.
2. Run the model to summarize each chunk.
3. Join the generated summaries and jump to 1 - repeat until the text size no longer decreases.
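A minimal usage sketch (the model name, folder and file path are assumptions):
from sibila import Models\nfrom sibila.tools import recursive_summarize\n\nModels.setup(\"../../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nsummary = recursive_summarize(model, path=\"report.txt\")\nprint(summary)\n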
Parameters:
Name Type Description Default model
Model
Model to use for summarizing.
required text
Optional[str]
Initial text.
None
path
Optional[str]
--Or-- A path to an UTF-8 text file.
None
overlap_size
int
Size in model tokens of the overlapping portions at beginning and end of chunks.
20
Returns:
Type Description str
The summarized text.
Source code in sibila/tools.py
def recursive_summarize(model: Model,\n text: Optional[str] = None,\n path: Optional[str] = None,\n overlap_size: int = 20) -> str:\n \"\"\"Recursively summarize a (large) text or text file.\n\n Works by:\n\n 1. Breaking text into chunks that fit models context.\n 2. Run model to summarize chunks.\n 3. Join generated summaries and jump to 1. - do this until text size no longer decreases.\n\n Args:\n model: Model to use for summarizing.\n text: Initial text.\n path: --Or-- A path to an UTF-8 text file.\n overlap_size: Size in model tokens of the overlapping portions at beginning and end of chunks.\n\n Returns:\n The summarized text.\n \"\"\"\n\n if (text is not None) + (path is not None) != 1:\n raise ValueError(\"Only one of text or path can be given\")\n\n if path is not None:\n with open(path, \"r\", encoding=\"utf-8\") as f:\n text = f.read()\n\n inst_text = \"\"\"Your task is to do short summaries of text.\"\"\"\n in_text = \"Summarize the following text:\\n\"\n ctx = Context(pinned_inst_text=inst_text)\n\n # split initial text\n max_token_len = model.ctx_len - model.genconf.max_tokens - (model.tokenizer.token_len(inst_text + in_text) + 16) \n logger.debug(f\"Max ctx token len {max_token_len}\")\n\n token_len_fn = model.tokenizer.token_len_lambda\n logger.debug(f\"Initial text token_len {token_len_fn(text)}\") # type: ignore[arg-type]\n\n spl = RecursiveTextSplitter(max_token_len, overlap_size, len_fn=token_len_fn)\n\n round = 0\n while True: # summarization rounds\n logger.debug(f\"Round {round} {'='*60}\")\n\n in_list = spl.split(text=text)\n in_len = sum([len(t) for t in in_list])\n\n logger.debug(f\"Split in {len(in_list)} parts, total len {in_len} chars\")\n\n out_list = []\n for i,t in enumerate(in_list):\n\n logger.debug(f\"{round}>{i} {'='*30}\")\n\n ctx.clear()\n ctx.add_IN(in_text)\n ctx.add_IN(t)\n\n out = model.gen(ctx) \n logger.debug(out)\n\n out_list.append(out.text)\n\n text = \"\\n\".join(out_list)\n\n out_len = len(text) # sum([len(t) for t in out_list])\n if out_len >= in_len:\n break\n elif len(out_list) == 1:\n break\n else:\n round += 1\n\n return text\n
"},{"location":"examples/","title":"Examples","text":"Example Description Hello model Introductory pirate arrr-example: create local or remote models, use the Models class to simplify. From text to object Keypoint extractor, showing progressively better ways to query a model, from plain text, JSON, to Pydantic classes. Extract information Extract information about all persons mentioned in a text. Also available in a dataclass version. Tag customer queries Summarize and classify customer queries into tags. Quick meeting Extracting participants, action items and priorities from a simple meeting transcript. Tough meeting Extracting information from a long and complex transcript. Compare model output Compare sentiment analyses of customer reviews done by two models. Chat interaction Interact with the model as in a back-and-forth chat session. Model management with CLI Download and manage models with the command-line sibila. Each example is explained in a Read Me and usually include a Jupyter notebook and/or a .py script version.
Most of the examples use a local model but you can quickly change to using OpenAI models by uncommenting one or two lines.
"},{"location":"examples/cli/","title":"Sibila CLI","text":"In this example we'll see how to use the sibila Command-Line Interface (CLI) to download a GGUF model from the Hugging Face model hub.
We'll then register it in the Models factory, so that it can be easily used with Models.create(). The Models factory is based in a folder where model GGUF format files are stored and two configuration files: \"models.json\" and \"formats.json\".
After Doing the above, we'll be able to use this model in Python with two lines:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:rocket\")\n
Let's run sibila CLI to get help:
> sibila --help\n\nusage: sibila [-h] [--version] {models,formats,hub} ...\n\nSibila cli tool for managing models and formats.\n\noptions:\n -h, --help show this help message and exit\n --version show program's version number and exit\n\nactions:\n hf, models, formats\n\n {models,formats,hub} Run 'sibila {command} --help' for specific help.\n
Sibila CLI has three modes:
- models: to edit a 'models.json' file, create model entries set format, etc.
- formats: to edit a 'formats.json' file, add new formats, etc.
- hub: search and download models from Hugging Face model hub.
Specific help for each mode is available by doing: sibila mode --help
Let's download the Rocket 3B model, a small but capable model, fine-tuned for chat/instruct prompts:
https://huggingface.co/TheBloke/rocket-3B-GGUF
We'll use a \"sibila hub -d\" command to download to \"../../models\" folder. We'll get the 4-bit quantization (Q4_K_M):
> sibila hub -d 'TheBloke/rocket-3B-GGUF' -f Q4_K_M -m '../../models'\n\nSearching...\nDownloading model 'TheBloke/rocket-3B-GGUF' file 'rocket-3b.Q4_K_M.gguf' to '../../models/rocket-3b.Q4_K_M.gguf'\n\nDownload complete.\nFor information about this and other models, please visit https://huggingface.co\n
After this command, the \"rocket-3b.Q4_K_M.gguf\" file has now been downloaded to the \"../../models\" folder.
We'll now register it with the Models factory, which is located in the folder to where we downloaded the model.
This can be done by editing the \"models.json\" file directly or even simpler, with a \"sibila models -s\" command:
> sibila models -s llamacpp:rocket rocket-3b.Q4_K_M.gguf -m '../../models'\n\nUsing models directory '../../models'\nSet model 'llamacpp:rocket' with name='rocket-3b.Q4_K_M.gguf' at '/home/jorge/ai/sibila/models/models.json'.\n
An entry has now been created in \"models.json\" for this model.
However, we did not set the chat template format - but let's first test if the downloaded GGUF file already includes it in its metadata.
This is done with \"sibila models -t\":
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nError: Could not find a suitable chat template format for this model. Without a format, fine-tuned models cannot function properly. See the docs on how you can fix this: either setup the format in Models factory, or provide the chat template in the 'format' arg.\n
Error. Looks like we need to set the chat template format!
Checking the model's page, we find that it uses the ChatML prompt/chat template, which is great because it's one of the base formats included with Sibila.
So let's set the template format in the \"llamacpp:rocket\" entry we've just created:
> sibila models -f llamacpp:rocket chatml -m '../../models'\n\nUsing models directory '/home/jorge/ai/sibila/models'\nUpdated model 'llamacpp:rocket' with format 'chatml' at '/home/jorge/ai/sibila/models/models.json'.\n
Let's now test again:
> sibila models -t llamacpp:rocket -m '../../models'\n\nUsing models directory '../../models'\nTesting model 'llamacpp:rocket'...\nModel 'llamacpp:rocket' was properly created and should run fine.\n
Great - the model passed the test and should be ready for use.
Let's try using it from Python:
from sibila import Models\n\nModels.setup(\"../../models\") # the folder with models and configs\n\nmodel = Models.create(\"llamacpp:rocket\") # model name in provider:name format\n\nmodel(\"Hello there!\")\n
\"Hello! I'm an AI language model here to assist you with your inquiries or generate content for you. I am programmed to be polite and respectful, so please let me know how I can help you today.\"\n
Seems to be working - and politely too!
"},{"location":"examples/compare/","title":"Compare","text":"In this example we'll use an utility function from the multigen module that builds a table of answers to a list of questions, as generated by multiple models. This can be very helpful to compare how two or more models react to the same input.
This function generates a 2-D table of [ input , model ], where each row is the output from different models to the same question or input. Such table can be printed or saved as a CSV file.
For the local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the local_name variable below, after the text \"llamacpp:\".
Jupyter notebook and Python script versions are available in the example's folder.
Instead of directly creating models as we've seen in previous examples, multigen will create the models via the Models class directory.
We'll start by choosing a local and a remote model that we'll compare.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nlocal_name = \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\n# to use an OpenAI model:\nremote_name = \"openai:gpt-3.5\"\n
Now let's define a list of reviews that we'll ask the two models to do sentiment analysis upon.
These are generic product reviews, that you could find in an online store.
reviews = [\n\"The user manual was confusing, but once I figured it out, the product more or less worked.\",\n\"This widget changed my life! It's sleek, efficient, and worth every penny.\",\n\"I'm disappointed with the product quality. It broke after just a week of use.\",\n\"The customer service team was incredibly helpful in resolving my issue with the device.\",\n\"I'm blown away by the functionality of this gadget. It exceeded my expectations.\",\n\"The packaging was damaged upon arrival, but the product itself works great.\",\n\"I've been using this tool for months, and it's still as good as new. Highly recommended!\",\n\"I regret purchasing this item. It doesn't perform as advertised.\",\n\"I've never had so much trouble with a product before. It's been a headache from day one.\",\n\"I bought this as a gift for my friend, and they absolutely love it!\",\n\"The price seemed steep at first, but after using it, I understand why. Quality product.\",\n\"This gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\"\n]\n\n# model instructions text, also known as system message\ninst_text = \"You are a helpful assistant that analyses text sentiment.\"\n
Since we just want to obtain a sentiment classification, we'll use a convenient enumeration: a list with three values: positive, negative or neutral.
Let's try the first review on a local model:
sentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nin_text = \"Each line is a product review. Extract the sentiment associated with each review:\\n\\n\" + reviews[0]\n\nprint(reviews[0])\n\nlocal_model = Models.create(local_name)\n\nout = local_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\n# to clear memory\ndel local_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
Definitely neutral is a good answer for this one.
Let's now try the remote model:
print(reviews[0])\n\nremote_model = Models.create(remote_name)\n\nout = remote_model.extract(sentiment_enum,\n in_text,\n inst=inst_text)\ndel remote_model\n\nprint(out)\n
The user manual was confusing, but once I figured it out, the product more or less worked.\nneutral\n
And the remote model (GPT-3.5) seems to agree on neutrality.
By using the query_multigen() function that we'll import from sibila.multigen, we'll be able to compare what multiple models generate in response to each input.
In our case the inputs will be the list of reviews. This function accepts these interesting arguments: - text: type of text output, which can be the word \"print\" or a text filename to which it will save. - csv: type of CSV output, which can also be \"print\" or a text filename to save into. - out_keys: what we want listed: the generated raw text (\"text\"), a Python dict (\"dict\") or a Pydantic object (\"obj\"). For our case \"dict\" is the right one. - gencall: we need to pass a function that will actually call the model for each input. We use a convenient predefined function and provide it with the sentiment_type definition.
Let's run it with our two models:
from sibila.multigen import (\n query_multigen,\n make_extract_gencall\n)\n\nsentiment_enum = [\"positive\", \"neutral\", \"negative\"]\n\nout = query_multigen(reviews,\n inst_text,\n model_names = [local_name, remote_name],\n text=\"print\",\n csv=\"sentiment.csv\",\n out_keys = [\"value\"],\n gencall = make_extract_gencall(sentiment_enum)\n )\n
////////////////////////////////////////////////////////////\nThe user manual was confusing, but once I figured it out, the product more or less worked.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-3.5 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nThis widget changed my life! It's sleek, efficient, and worth every penny.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm disappointed with the product quality. It broke after just a week of use.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nThe customer service team was incredibly helpful in resolving my issue with the device.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI'm blown away by the functionality of this gadget. It exceeded my expectations.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe packaging was damaged upon arrival, but the product itself works great.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'neutral'\n==================== openai:gpt-3.5 -> OK_STOP\n'neutral'\n\n////////////////////////////////////////////////////////////\nI've been using this tool for months, and it's still as good as new. Highly recommended!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nI regret purchasing this item. It doesn't perform as advertised.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI've never had so much trouble with a product before. 
It's been a headache from day one.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'negative'\n==================== openai:gpt-3.5 -> OK_STOP\n'negative'\n\n////////////////////////////////////////////////////////////\nI bought this as a gift for my friend, and they absolutely love it!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThe price seemed steep at first, but after using it, I understand why. Quality product.\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n\n////////////////////////////////////////////////////////////\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP\n'positive'\n==================== openai:gpt-3.5 -> OK_STOP\n'positive'\n
The output format is as follows - see the comments next to the -----> arrows:
//////////////////////////////////////////////////////////// -----> This is the model input, a review text:\nThis gizmo is a game-changer for my daily routine. Couldn't be happier with my purchase!\n////////////////////////////////////////////////////////////\n==================== llamacpp:openchat-3.5-1210.Q4_K_M.gguf -> OK_STOP <----- Local model name and result\n'positive' <----- What the local model output\n==================== openai:gpt-3.5 -> OK_STOP <----- Remote model name and result\n'positive' <----- Remote model output\n
We also requested the creation of a CSV file with the results: sentiment.csv.
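As a quick check of what was written, the CSV file can be inspected with Python's csv module - a minimal sketch (the exact column layout depends on query_multigen's settings, so adjust as needed):
import csv\n\n# print every row of the generated CSV - the column layout depends on query_multigen's settings\nwith open(\"sentiment.csv\", newline=\"\", encoding=\"utf-8\") as f:\n    for row in csv.reader(f):\n        print(row)\n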
Example's assets at GitHub.
"},{"location":"examples/extract/","title":"Extract Pydantic","text":"In this example we'll extract information about all persons mentioned in a text. This example is also available in a dataclass version.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' source_location='London'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' source_location='Barcelona'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' source_location='Tokyo'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' source_location='Mumbai'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' source_location='Paris'\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure out from the source location) and a \"details_about_person\" field, which the model should fill with a quote from the source text about each person.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom'\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident' source_location='Barcelona' source_country='Spain'\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories' source_location='Tokyo' source_country='Japan'\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology' source_location='Mumbai' source_country='India'\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace' source_location='Paris' source_country='France'\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the people has: being married. We'll add an \"is_married: bool\" field to the Person class.
class Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
first_name='Lucy' last_name='Bennett' age=28 occupation='journalist' details_about_person='her pen poised to capture the essence of the world around her' source_location='London' source_country='United Kingdom' is_married=False\nfirst_name='Carlos' last_name='Ramirez' age=35 occupation='architect' details_about_person='exuded creativity, passion for design evident' source_location='Barcelona' source_country='Spain' is_married=False\nfirst_name='Mia' last_name='Chang' age=23 occupation='musician' details_about_person='fingers danced across the strings, weaving stories' source_location='Tokyo' source_country='Japan' is_married=False\nfirst_name='Ahmed' last_name='Khan' age=40 occupation='engineer' details_about_person='navigated the complexities of technology' source_location='Mumbai' source_country='India' is_married=True\nfirst_name='Isabella' last_name='Santos' age=32 occupation='fashion designer' details_about_person='impeccable style and effortless grace' source_location='Paris' source_country='France' is_married=False\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/extract_dataclass/","title":"Extract dataclass","text":"This is the Python dataclass version of of the Pydantic extraction example.
We'll extract information about all persons mentioned in a text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
We'll use this text written in a flamboyant style, courtesy GPT three and a half:
text = \"\"\"\\\nIt was a breezy afternoon in a bustling caf\u00e9 nestled in the heart of a vibrant city. Five strangers found themselves drawn together by the aromatic allure of freshly brewed coffee and the promise of engaging conversation.\n\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, her pen poised to capture the essence of the world around her. Her eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed streets of Barcelona. With a sketchbook in hand, he exuded creativity, his passion for design evident in the thoughtful lines that adorned his face.\n\nNext to them, lost in the melodies of her guitar, was Mia Chang, a 23-year-old musician from the bustling streets of Tokyo. Her fingers danced across the strings, weaving stories of love and longing, echoing the rhythm of her vibrant city.\n\nJoining the trio was Ahmed Khan, a married 40-year-old engineer from the bustling metropolis of Mumbai. With a laptop at his side, he navigated the complexities of technology with ease, his intellect shining through the chaos of urban life.\n\nLast but not least, leaning against the counter with an air of quiet confidence, was Isabella Santos, a 32-year-old fashion designer from the romantic streets of Paris. Her impeccable style and effortless grace reflected the timeless elegance of her beloved city.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
from dataclasses import dataclass\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n\n# the input query, including the above text\nin_text = \"Extract person information from the following text:\\n\\n\" + text\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', source_location='London')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', source_location='Barcelona')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', source_location='Tokyo')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', source_location='Mumbai')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', source_location='Paris')\n
It seems to be doing a good job of extracting the info we requested.
Let's add two more fields: the source country (which the model will have to figure out from the source location) and a \"details_about_person\" field, which the model should fill with a quote from the source text about each person.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='a 28-year-old journalist from London, her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom')\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='a 35-year-old architect from the sun-kissed streets of Barcelona, with a sketchbook in hand, he exuded creativity', source_location='Barcelona', source_country='Spain')\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='a 23-year-old musician from the bustling streets of Tokyo, her fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan')\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='a married 40-year-old engineer from the bustling metropolis of Mumbai, with a laptop at his side, he navigated the complexities of technology with ease', source_location='Mumbai', source_country='India')\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='a 32-year-old fashion designer from the romantic streets of Paris, her impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France')\n
Quite reasonable: the model is doing a good job and we didn't even add descriptions to the fields - it's inferring what we want from the field names only.
Let's now query an attribute that only one of the people has: being married. We'll add an \"is_married\" field to the Person dataclass.
@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n details_about_person: str\n source_location: str\n source_country: str\n is_married: bool\n\nout = model.extract(list[Person],\n in_text,\n inst=inst_text)\n\nfor person in out:\n print(person)\n
Person(first_name='Lucy', last_name='Bennett', age=28, occupation='journalist', details_about_person='a 28-year-old journalist from London, her pen poised to capture the essence of the world around her', source_location='London', source_country='United Kingdom', is_married=False)\nPerson(first_name='Carlos', last_name='Ramirez', age=35, occupation='architect', details_about_person='a 35-year-old architect from the sun-kissed streets of Barcelona, with a sketchbook in hand, he exuded creativity', source_location='Barcelona', source_country='Spain', is_married=False)\nPerson(first_name='Mia', last_name='Chang', age=23, occupation='musician', details_about_person='a 23-year-old musician from the bustling streets of Tokyo, her fingers danced across the strings, weaving stories of love and longing', source_location='Tokyo', source_country='Japan', is_married=False)\nPerson(first_name='Ahmed', last_name='Khan', age=40, occupation='engineer', details_about_person='a married 40-year-old engineer from the bustling metropolis of Mumbai, with a laptop at his side, he navigated the complexities of technology with ease', source_location='Mumbai', source_country='India', is_married=True)\nPerson(first_name='Isabella', last_name='Santos', age=32, occupation='fashion designer', details_about_person='a 32-year-old fashion designer from the romantic streets of Paris, her impeccable style and effortless grace reflected the timeless elegance of her beloved city', source_location='Paris', source_country='France', is_married=False)\n
Of the five characters, only Ahmed is mentioned as being married, and he is the one the model marked with the is_married=True attribute.
Example's assets at GitHub.
"},{"location":"examples/from_text_to_object/","title":"From text to object","text":"In this example we'll ask the model to extract keypoints from a text: - First in plain text format - Then free JSON output (with fields selected by the model) - Later constrained by a JSON schema (so that we can specify which fields) - And finally by generating to a Pydantic object (from a class definition)
All the queries will be made at temperature=0, which is the default GenConf setting. This means that the model is giving its best (as in most probable) answer and that it will always output the same results, given the same inputs.
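If you'd like more varied (but no longer deterministic) output, temperature can be raised when creating the model, following the same pattern used in the other examples - a minimal sketch:
from sibila import Models, GenConf\n\n# assumes Models.setup(\"../../models\") was already called\n# temperature=0 is the default; raising it makes output more varied,\n# so results will no longer be identical across runs\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\",\n                      genconf=GenConf(temperature=0.7))\n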
Also available as a Jupyter notebook or a Python script in the example's folder.
We'll start by creating either a local model or a GPT-4 model.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Let's use this fragment from Wikipedia's entry on the Fiji islands: https://en.wikipedia.org/wiki/
doc = \"\"\"\\\nFiji, officially the Republic of Fiji,[n 2] is an island country in Melanesia,\npart of Oceania in the South Pacific Ocean. It lies about 1,100 nautical miles \n(2,000 km; 1,300 mi) north-northeast of New Zealand. Fiji consists of \nan archipelago of more than 330 islands\u2014of which about 110 are permanently \ninhabited\u2014and more than 500 islets, amounting to a total land area of about \n18,300 square kilometres (7,100 sq mi). The most outlying island group is \nOno-i-Lau. About 87% of the total population of 924,610 live on the two major \nislands, Viti Levu and Vanua Levu. About three-quarters of Fijians live on \nViti Levu's coasts, either in the capital city of Suva, or in smaller \nurban centres such as Nadi (where tourism is the major local industry) or \nLautoka (where the sugar-cane industry is dominant). The interior of Viti Levu \nis sparsely inhabited because of its terrain.[13]\n\nThe majority of Fiji's islands were formed by volcanic activity starting around \n150 million years ago. Some geothermal activity still occurs today on the islands \nof Vanua Levu and Taveuni.[14] The geothermal systems on Viti Levu are \nnon-volcanic in origin and have low-temperature surface discharges (of between \nroughly 35 and 60 degrees Celsius (95 and 140 \u00b0F)).\n\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and \nlater Melanesians, with some Polynesian influences. Europeans first visited Fiji \nin the 17th century.[15] In 1874, after a brief period in which Fiji was an \nindependent kingdom, the British established the Colony of Fiji. Fiji operated as \na Crown colony until 1970, when it gained independence and became known as \nthe Dominion of Fiji. In 1987, following a series of coups d'\u00e9tat, the military \ngovernment that had taken power declared it a republic. In a 2006 coup, Commodore \nFrank Bainimarama seized power. In 2009, the Fijian High Court ruled that the \nmilitary leadership was unlawful. At that point, President Ratu Josefa Iloilo, \nwhom the military had retained as the nominal head of state, formally abrogated \nthe 1997 Constitution and re-appointed Bainimarama as interim prime minister. \nLater in 2009, Ratu Epeli Nailatikau succeeded Iloilo as president.[16] On 17 \nSeptember 2014, after years of delays, a democratic election took place. \nBainimarama's FijiFirst party won 59.2% of the vote, and international observers \ndeemed the election credible.[17] \n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Be helpful and provide concise answers.\"\n
Let's start with a free text query by calling model().
in_text = \"Extract 5 keypoints of the following text:\\n\" + doc\n\nout = model(in_text, inst=inst_text)\nprint(out)\n
1. Fiji is an island country located in Melanesia, part of Oceania in the South Pacific Ocean. It lies approximately 1,100 nautical miles north-northeast of New Zealand.\n2. The country consists of more than 330 islands with about 110 permanently inhabited islands and over 500 islets, totaling a land area of about 18,300 square kilometers.\n3. Approximately 87% of Fiji's total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu, with a majority living on Viti Levu's coasts.\n4. The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago, with some geothermal activity still occurring on certain islands.\n5. Fiji has a complex history, transitioning from an independent kingdom to a British colony, then a Dominion, and finally a republic after a series of coups and constitutional changes. In 2014, a democratic election took place, marking a significant milestone in the country's political history.\n
These are quite reasonable keypoints!
Let's now ask for JSON output, taking care to explicitly request it in the query (in_text variable).
Instead of model() we now use json() which returns a Python dict. We'll pass None as the first parameter because we're not using a JSON schema.
import pprint\npp = pprint.PrettyPrinter(width=300, sort_dicts=False)\n\nin_text = \"Extract 5 keypoints of the following text in JSON format:\\n\\n\" + doc\n\nout = model.json(None,\n in_text,\n inst=inst_text)\npp.pprint(out)\n
{'keypoints': [{'title': 'Location', 'description': 'Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.'},\n {'title': 'Geography', 'description': 'Consists of more than 330 islands with about 110 permanently inhabited islands.'},\n {'title': 'Population', 'description': 'Total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.'},\n {'title': 'History', 'description': 'Humans have lived in Fiji since the second millennium BC with Austronesians, Melanesians, and Polynesian influences.'},\n {'title': 'Political Status', 'description': 'Officially known as the Republic of Fiji, gained independence from British rule in 1970.'}]}\n
Note how the model chose to return different fields like \"title\" or \"description\".
Because we didn't specify which fields we want, each model will generate different ones.
To specify a fixed format, let's now generate by setting a JSON schema that defines which fields and types we want:
json_schema = {\n \"properties\": {\n \"keypoint_list\": {\n \"description\": \"Keypoint list\",\n \"items\": {\n \"type\": \"string\",\n \"description\": \"Keypoint\"\n },\n \"type\": \"array\"\n }\n },\n \"required\": [\n \"keypoint_list\"\n ],\n \"type\": \"object\"\n}\n
This JSON schema requests that the generated dict contains a \"keypoint_list\" field holding a list of strings.
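If you want to sanity-check a schema by hand, the third-party jsonschema package (not part of Sibila, shown here purely as an illustration) can validate a sample dict against it:
import jsonschema  # pip install jsonschema\n\nsample = {\"keypoint_list\": [\"First keypoint\", \"Second keypoint\"]}\n\n# raises jsonschema.ValidationError if the sample doesn't match json_schema\njsonschema.validate(instance=sample, schema=json_schema)\nprint(\"sample is valid\")\n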
We'll also use json(), now passing the json_schema as first argument:
out = model.json(json_schema,\n in_text,\n inst=inst_text)\n\nprint(out)\n
{'keypoint_list': ['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]}\n
for kpoint in out[\"keypoint_list\"]:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
It has generated a string list in the \"keypoint_list\" field, as we specified in the JSON schema.
This is better, but the problem with JSON schemas is that they can be quite hard to work with.
Let's use an easier way to specify the fields we want returned: Pydantic classes derived from BaseModel. This is way simpler to use than JSON schemas.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Keypoints(BaseModel):\n keypoint_list: list[str]\n\nout = model.pydantic(Keypoints,\n in_text,\n inst=inst_text)\nprint(out)\n
keypoint_list=['Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.', 'About 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.', \"The majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\", 'Humans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.', \"In 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\"]\n
for kpoint in out.keypoint_list:\n print(kpoint)\n
Fiji is an island country in Melanesia, part of Oceania in the South Pacific Ocean.\nAbout 87% of the total population of 924,610 live on the two major islands, Viti Levu and Vanua Levu.\nThe majority of Fiji's islands were formed by volcanic activity starting around 150 million years ago.\nHumans have lived in Fiji since the second millennium BC\u2014first Austronesians and later Melanesians, with some Polynesian influences.\nIn 2014, a democratic election took place, with Bainimarama's FijiFirst party winning 59.2% of the vote.\n
The pydantic() method returns an object of class Keypoints, instantiated with the model output.
This is a much simpler way to extract structured data from a model.
Please see other examples for more interesting objects. In particular, we did not add descriptions to the fields, which are important clues to help the model understand what we want.
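As a hedged sketch of what that could look like here (the description text below is illustrative, not part of this example), a Pydantic Field description is enough to pass that hint to the model:
from pydantic import BaseModel, Field\n\nclass Keypoints(BaseModel):\n    # the description gives the model an extra hint about what to generate\n    keypoint_list: list[str] = Field(description=\"The 5 most important keypoints of the text\")\n\nout = model.pydantic(Keypoints,\n                     in_text,\n                     inst=inst_text)\n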
Besides Pydantic classes, Sibila can also use Python's dataclass to extract structured data. This is a lighter and easier alternative to using Pydantic.
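For instance, the Keypoints class above could be sketched as a dataclass - illustrative only, assuming extract() accepts a plain dataclass the same way it accepts a BaseModel; see the dataclass extraction example for the full pattern:
from dataclasses import dataclass\n\n@dataclass\nclass Keypoints:\n    keypoint_list: list[str]\n\n# extract() is used with dataclasses just like with Pydantic classes\nout = model.extract(Keypoints,\n                    in_text,\n                    inst=inst_text)\nprint(out.keypoint_list)\n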
Example's assets at GitHub.
"},{"location":"examples/hello_model/","title":"Hello model","text":"In this example we see how to directly create local or remote model objects and later to do that more easily with the Models class.
"},{"location":"examples/hello_model/#using-a-local-model","title":"Using a local model","text":"To use a local model, make sure you download its GGUF format file and save it into the \"../../models\" folder.
In these examples, we'll use a 4-bit quantization of the OpenChat-3.5 7 billion parameters model, which at the current time is quite a good model for its size.
The file is named \"openchat-3.5-1210.Q4_K_M.gguf\" and was downloaded from the above link. Make sure to save it into the \"../../models\" folder.
See here for more information about setting up your local models.
With the model file in the \"../../models\" folder, we can run the following script:
from sibila import LlamaCppModel, GenConf\n\n# model file from the models folder\nmodel_path = \"../../models/openchat-3.5-1210.Q4_K_M.gguf\"\n\n# create a LlamaCpp model\nmodel = LlamaCppModel(model_path,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
Run the script above and, after a few seconds (it has to load the model from disk), the model answers back something like:
User: Hello there?\nModel: Ahoy there matey! How can I assist ye today on this here ship o' mine?\nIs it be treasure you seek or maybe some tales from the sea?\nLet me know, and we'll set sail together!\n
"},{"location":"examples/hello_model/#using-an-openai-model","title":"Using an OpenAI model","text":"To use a remote model like GPT-4 you'll need a paid OpenAI account: https://openai.com/pricing
With an OpenAI account, you'll be able to generate an access token that you should set into the OPENAI_API_KEY env variable.
(An even better way is to use .env files with your variables, and use the dotenv library to read them.)
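Other examples in these docs load the key from a .env file like this (requires the python-dotenv package):
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n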
Once a valid OPENAI_API_KEY env variable is set, you can run this script:
from sibila import OpenAIModel, GenConf\n\n# make sure you set the environment variable named OPENAI_API_KEY with your API key.\n# create an OpenAI model with generation temperature=1\nmodel = OpenAIModel(\"gpt-4\",\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
We get back the usual funny pirate answer:
User: Hello there?\nModel: Ahoy there, matey! What can this old sea dog do fer ye today?\n
"},{"location":"examples/hello_model/#using-the-models-directory","title":"Using the Models directory","text":"In these two scripts we created different objects to access the LLM model: LlamaCppModel and OpenAIModel.
This was done for simplicity, but a better way is to use the Models class directory.
Models is a singleton class that implements a directory of models where you can store file locations, configurations, aliases, etc.
After setting up a JSON configuration file, you can have the Models class create models from names like \"llamacpp:openchat\" or \"openai:gpt-4\", together with their predefined settings. This makes it easy to switch models, compare model outputs, etc.
In the scripts above, instead of instantiating different classes for different models, we could use the Models class to create the model from a name, by setting the model_name variable:
from sibila import Models, GenConf\n\n# Using a local llama.cpp model: we first setup the ../../models directory:\n# Models.setup(\"../../models\")\n# model_name = \"llamacpp:openchat\"\n\n# OpenAI: make sure you set the environment variable named OPENAI_API_KEY with your API key.\nmodel_name = \"openai:gpt-4\"\n\nmodel = Models.create(model_name,\n genconf=GenConf(temperature=1))\n\n# the instructions or system command: speak like a pirate!\ninst_text = \"You speak like a pirate.\"\n\n# the in prompt\nin_text = \"Hello there?\"\nprint(\"User:\", in_text)\n\n# query the model with instructions and input text\ntext = model(in_text,\n inst=inst_text)\nprint(\"Model:\", text)\n
The magic happens in the line:
model = Models.create(model_name, ...)\n
The Models class will take care of initializing the model based on the name you provide.
Example's assets at GitHub.
"},{"location":"examples/interact/","title":"Interact","text":"In this example we look at the interact() function, which allows a back-and-forth chat session. The user enters messages in an input() prompt and can use some special \"!\" commands for more functionality. The model answers back after each user message.
In a chat interaction, the model has to \"remember\" the previous messages exchanged. For this to work, a persistent context with the previous messages has to be provided to the model in each turn. This is done by using a Context class object, which can manage thread messages and delete older ones when the context maximum length is reached.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
So, let's create a local model.
from sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Import interact() from the tools module and run it.
What follows is the actual chat, where the user prompts start with a \">\" and the model's answer appears immediately below.
Read on, it does get funny at times... : )
from sibila.tools import (\n interact\n)\n\ninteract(model,\n inst_text=\"Be helpful.\", # model instructions text, also known as system message\n genconf=GenConf(temperature=0.9))\n
> Hello there!\n\n\nHello! How can I help you today?\n\n\n\n> I'm doing an example of this LLM access package called Sibila - have you heard about it?\n\n\nYes, I am familiar with the AI language model named Sibila. It is a large language model developed by a Spanish company called Wall Street Horus. Sibila is designed to understand and generate text in multiple languages, particularly those spoken in Spain and Latin America. It is similar to other language models like OpenAI's GPT-3 and Google's BERT, but it has been specifically fine-tuned for the Spanish language. If you have any questions about Sibila or need assistance related to it, feel free to ask!\n\n\n\n> No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\n\n\nApologies for the confusion in my previous response. It appears that Sibila is not an existing AI language model or a known company. I may have mixed up information from different sources. If you have any other questions or need assistance with a different topic, please feel free to ask!\n\n\n\n> No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. It stores all user and model messages in a sibila.Context class.\n\n\nGot it! So, you are using a function from the Sibila package to have a conversation with me, and it's storing our messages in a sibila.Context object. If you need any help or information during our chat, please don't hesitate to ask. I'm here to assist you!\n\n\n\n> That's right - I can list the context messages by entering \"!c\"\n\n\nUnderstood! If you ever need to view the messages that have been exchanged in the conversation, you can enter \"!c\" and I will provide you with a summary of the messages stored in the sibila.Context object. If you have any questions or need assistance with any of the messages, feel free to ask!\n\n\n\n> Not you - the interact() function!\n\n\nI apologize for the confusion. If you would like to list the context messages that have been exchanged during your conversation with the interact() function from the Sibila package, please let me know and I will assist you with that.\n\n\n\n> !c\n\n\nThread token len=596, max len before next gen=2048\ninst=\u2588Be helpful.\u2588, sep='\\n', len=12\n0: IN=\u2588Hello there!\u2588\n1: OUT=\u2588Hello! How can I help you today?\u2588\n2: IN=\u2588I'm doing an example of this LLM access package called Sibila - have you heard about it?\u2588\n3: OUT=\u2588Yes, I am familiar with the AI language model named Sibila. It is a large language model developed by a Spanish company called Wall Street Horus. Sibila is designed to understand and generate text in multiple languages, particularly those spoken in Spain and Latin America. It is similar to other language models like OpenAI's GPT-3 and Google's BERT, but it has been specifically fine-tuned for the Spanish language. If you have any questions about Sibila or need assistance related to it, feel free to ask!\u2588\n4: IN=\u2588No way! I've just searched and there's no such company nor model. You are hallucinating, Mr. Model!\u2588\n5: OUT=\u2588Apologies for the confusion in my previous response. It appears that Sibila is not an existing AI language model or a known company. I may have mixed up information from different sources. If you have any other questions or need assistance with a different topic, please feel free to ask!\u2588\n6: IN=\u2588No problem. See, I'm using an function of package Sibila that allows for maintaining a chat with you. 
It stores all user and model messages in a sibila.Context class.\u2588\n7: OUT=\u2588Got it! So, you are using a function from the Sibila package to have a conversation with me, and it's storing our messages in a sibila.Context object. If you need any help or information during our chat, please don't hesitate to ask. I'm here to assist you!\u2588\n8: IN=\u2588That's right - I can list the context messages by entering \"!c\"\u2588\n9: OUT=\u2588Understood! If you ever need to view the messages that have been exchanged in the conversation, you can enter \"!c\" and I will provide you with a summary of the messages stored in the sibila.Context object. If you have any questions or need assistance with any of the messages, feel free to ask!\u2588\n10: IN=\u2588Not you - the interact() function!\u2588\n11: OUT=\u2588I apologize for the confusion. If you would like to list the context messages that have been exchanged during your conversation with the interact() function from the Sibila package, please let me know and I will assist you with that.\u2588\n\n\n\n> Info about the model: \"!i\"\n\n\nAs an AI language model, my main purpose is to provide information, answer questions, and offer assistance. If you have any specific questions or topics related to models like GPT-3, BERT, or other AI-related subjects, please feel free to ask, and I'll be happy to help!\n\n\n\n> !i\n\n\nModel:\ndesc='LlamaCppModel: /home/jorge/ai/sibila/models/openchat-3.5-1210.Q4_K_M.gguf - 'llama 7B Q4_K - Medium'',ctx_len=2048,\ngenconf=GenConf(max_tokens=0,\n stop=[],\n temperature=0.0,\n top_p=0.9,\n format='text',\n json_schema=None)\nGenConf:\nGenConf(max_tokens=0,\n stop=[],\n temperature=0.9,\n top_p=0.9,\n format='text',\n json_schema=None)\n\nThread token len=684, max len before next gen=2048\n\n\n\n> For all commands: \"!\"\n\n\nIt seems that you can use the \"!\" command prefix to access various functions within the Sibila package. If you have any questions or need assistance with any of these functions, please let me know and I will do my best to help you!\n\n\n\n> !\n\n\nUnknown command '!' - known commands:\n !inst[=text] - clear messages and add inst (system) message\n !add|!a=path - load file and add to last msg\n !c - list context msgs\n !cl=path - load context (default=ctx.json)\n !cs=path - save context (default=ctx.json)\n !tl - thread's token length\n !i - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n\n\n\n> Sorry, I was not talking to you, just demoing the commands available with \"!\"\n\n\nApologies for the misunderstanding. If you have any questions or need assistance with the Sibila package or any other topic, please don't hesitate to ask. I'm here to help!\n\n\n\n> !cs\n\n\nSaved context to ctx.json\n\n\n\n> !tl\n\n\nThread token len=832, max len before next gen=2048\n\n\n\n> The Context class manages total token length: it will delete older messages if the thread becomes larger than the content max length.\n\n\nThank you for providing that information. It seems that the sibila.Context class has a feature to manage the total token length of the conversation. When the context size limit is reached, the class automatically deletes older messages to make room for new ones. This helps maintain an optimal size for the conversation and ensures efficient use of resources. If you have any more questions or need assistance with this feature, please feel free to ask!\n\n\n\n> Right. 
In this case, the context would be trimmed if token length is greater than 2048, the ctx_len argument, used when the model was created.\n\n\nI understand now. When the sibila.Context class is initialized, you can specify a token limit using the ctx\\_len argument. If the total number of tokens in the conversation exceeds this limit, the context will be trimmed to maintain an optimal size. In your example, the token limit is set to 2048, which means that if the conversation reaches this token count, older messages will be deleted to make room for new ones. This helps manage the conversation's size and ensures efficient use of resources. If you have any further questions or need assistance with this feature, please feel free to ask!\n\n\n\n> Right. Let's do the pirate number? Or do you prefer the Homer Simpson one?\n\n\nI can provide both versions for you: the pirate-themed \"Sibila\" song and the Homer Simpson-inspired version. Which one would you like to hear first?\n\n\n\n> Nevermind - let's do the pirate. I'll enter !inst=You are a pirate.\n\n\nAlright! Here is the pirate-themed \"Sibila\" song based on your instruction:\n\n```javascript\n!inst=You are a pirate.\n```\n\nYo ho, yo ho, it's a Sibila life for me!\n\n\n\n> You're just too much.\n\n\nThank you! I'm here to provide assistance and have some fun along the way. If you have any more questions or need help with anything else, feel free to ask!\n\n\n\n> !inst=You are a pirate.\n\n\n\n\n\n> Hey there!\n\n\nAhoy, me hearty! What brings ye to these seas of ours? Are ye lookin' for treasure or perhaps a bit o' adventure?\n\n\n\n> Right \"me hearty\", I'm here for great pirate adventures, and that just by talking to an LLM model!\n\n\nWell then, prepare to set sail on a thrilling journey with the most talkative mate aboard these digital waters! Let's dive right in and explore the wonders of this world with our trusty AI companion. Yarr!\n\nSo, me mate, what be yer name? And how did ye end up here, talkin' to a pirate like yours truly?\n\n\n\n> Arrrr!\n\n\nThat's the spirit! A proper pirate response right there. So, let's see, what be yer name, landlubber? And what brings ye to these pirate-infested waters? Are ye here by choice or did some devious captain force you into service? Speak up, me hearty, before I send out a search party for ye!\n\n\n\n>\n\n\n\n\n\n<sibila.context.Context at 0x7f27f2714650>\n
These are the \"!\" commands that you can use in the interact() inputs:
! - to show this help\n !inst[=text] - clear messages and add inst (system) message\n !add|!a=path - load file and add to last msg\n !c - list context msgs\n !cl=path - load context (default=ctx.json)\n !cs=path - save context (default=ctx.json)\n !tl - thread's token length\n !i - model and genconf info\n Delimit with \"\"\" for multiline begin/end or terminate line with \\ to continue into a new line\n Empty line + enter to quit\n
Example's assets at GitHub.
"},{"location":"examples/quick_meeting/","title":"Quick meeting","text":"Let's extract structured data from a meeting transcript, like attendees, action items and their priorities.
This is a quick meeting whose transcript is not very large, so a small local model should work well. See the Tough meeting example for a larger and more complex transcription text.
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
If you prefer to use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
Here's the transcript we'll be using as source:
transcript = \"\"\"\\\nDate: 10th April 2024\nTime: 10:30 AM\nLocation: Conference Room A\n\nAttendees:\n Arthur: Logistics Supervisor\n Bianca: Operations Manager\n Chris: Fleet Coordinator\n\nArthur: Good morning, team. Thanks for making it. We've got three matters to address quickly today.\n\nBianca: Morning, Arthur. Let's dive in.\n\nChris: Ready when you are.\n\nArthur: First off, we've been having complaints about late deliveries. This is very important, we're getting some bad reputation out there.\n\nBianca: Chris, I think you're the right person to take care of this. Can you investigate and report back by end of day? \n\nChris: Absolutely, Bianca. I'll look into the reasons and propose solutions.\n\nArthur: Great. Second, Bianca, we need to update our driver training manual. Can you take the lead and have a draft by Friday?\n\nBianca: Sure thing, Arthur. I'll get started on that right away.\n\nArthur: Lastly, we need to schedule a meeting with our software vendor to discuss updates to our tracking system. This is a low-priority task but still important. I'll handle that. Any input on timing?\n\nBianca: How about next Wednesday afternoon?\n\nChris: Works for me.\n\nArthur: Sounds good. I'll arrange it. Thanks, Bianca, Chris. Let's keep the momentum going.\n\nBianca: Absolutely, Arthur.\n\nChris: Will do.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information.\"\n
Let's define two Pydantic BaseModel classes whose instances will receive the extracted information: - Attendee: to store information about each meeting attendee - Meeting: to keep the meeting's date and location, the list of participants and other info we'll see below
And let's ask the model to create objects that are instances of these classes:
from pydantic import BaseModel, Field\n\n# class definitions will be used to constrain the model output and initialize an instance object\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
meeting_date='10th April 2024' meeting_location='Conference Room A' attendees=[Attendee(name='Arthur', occupation='Logistics Supervisor'), Attendee(name='Bianca', occupation='Operations Manager'), Attendee(name='Chris', occupation='Fleet Coordinator')]\n
A prettier display:
print(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\n
This information was correctly extracted.
Let's now request the action items mentioned in the meeting. We'll create a new class ActionItem with an index and a name for the item. Note that we're annotating each field with Field(description=...) information to help the model understand what we're looking to extract.
We'll also add an action_items field to the Meeting class to hold the items list.
class Attendee(BaseModel):\n name: str\n occupation: str\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate and report on late deliveries'\nindex=2 name='Update driver training manual'\nindex=3 name='Schedule a meeting with software vendor to discuss tracking system updates'\n
The extracted action items also look good.
Let's now extract more action item information: - Priority for each item - Due by... information - Name of the attendee that the item was assigned to
So, we create a Priority class holding three priority types - low to high.
We also add three fields to the ActionItem class, to hold the new information: priority, due_by and assigned_attendee.
from enum import Enum\n\nclass Attendee(BaseModel):\n name: str\n occupation: str\n\nclass Priority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: Priority = Field(description=\"Action item priority\")\n due_by: str = Field(description=\"When should the item be complete\")\n assigned_attendee: str = Field(description=\"Name of the attendee to which action item was assigned\")\n\nclass Meeting(BaseModel):\n meeting_date: str\n meeting_location: str\n attendees: list[Attendee]\n action_items: list[ActionItem]\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Meeting:\", out.meeting_date, \"in\", out.meeting_location)\nprint(\"Attendees:\")\nfor att in out.attendees:\n print(att)\nprint(\"Action items:\") \nfor items in out.action_items:\n print(items)\n
Meeting: 10th April 2024 in Conference Room A\nAttendees:\nname='Arthur' occupation='Logistics Supervisor'\nname='Bianca' occupation='Operations Manager'\nname='Chris' occupation='Fleet Coordinator'\nAction items:\nindex=1 name='Investigate late deliveries' priority=<Priority.HIGH: 'high'> due_by='end of day' assigned_attendee='Chris'\nindex=2 name='Update driver training manual' priority=<Priority.MEDIUM: 'medium'> due_by='Friday' assigned_attendee='Bianca'\nindex=3 name='Schedule meeting with software vendor' priority=<Priority.LOW: 'low'> due_by='next Wednesday afternoon' assigned_attendee='Arthur'\n
The new information was correctly extracted: priorities, due by and assigned attendees for each action item.
For an example of a harder, more complex transcript see the Tough meeting example.
Example's assets at GitHub.
"},{"location":"examples/tag/","title":"Tag","text":"In this example we'll summarize and classify customer queries with tags. We'll use dataclasses to specify the structure of the information we want extracted (we could also use Pydantic BaseModel classes).
To use a local model, make sure you have its file in the folder \"../../models\". You can use any GGUF format model - see here how to download the OpenChat model used below. If you use a different one, don't forget to set its filename in the name variable below, after the text \"llamacpp:\".
To use an OpenAI model, make sure you defined the env variable OPENAI_API_KEY with a valid token and uncomment the line after \"# to use an OpenAI model:\".
Available as a Jupyter notebook or a Python script in the example's folder.
Let's start by creating the model:
from sibila import Models\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\nModels.setup(\"../../models\")\n# set the model's filename - change to your own model\nmodel = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n\n# to use an OpenAI model:\n# model = Models.create(\"openai:gpt-4\")\n
These will be our queries, ten typical customer support questions:
queries = \"\"\"\\\n1. Do you offer a trial period for your software before purchasing?\n2. I'm experiencing a glitch with your app, it keeps freezing after the latest update.\n3. What are the different pricing plans available for your subscription service?\"\n4. Can you provide instructions on how to reset my account password?\"\n5. I'm unsure about the compatibility of your product with my device, can you advise?\"\n6. How can I track my recent order and estimate its delivery date?\"\n7. Is there a customer loyalty program or rewards system for frequent buyers?\"\n8. I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\"\n9. Could you clarify the coverage and limitations of your product warranty?\"\n10. What are your customer support hours and how can I reach your team in case of emergencies?\n\"\"\"\n
We'll start by summarizing each query.
Let's try using just the field names (without descriptions); perhaps they are enough to tell the model what we want.
from dataclasses import dataclass\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_text: str\n\n# model instructions text, also known as system message\ninst_text = \"Extract information from customer queries.\"\n\n# the input query, including the above text\nin_text = \"Each line is a customer query. Extract information about each query:\\n\\n\" + queries\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Trial period inquiry', query_text='Do you offer a trial period for your software before purchasing?')\nQuery(id=2, query_summary='Technical issue', query_text=\"I'm experiencing a glitch with your app, it keeps freezing after the latest update.\")\nQuery(id=3, query_summary='Pricing inquiry', query_text='What are the different pricing plans available for your subscription service?')\nQuery(id=4, query_summary='Password reset request', query_text='Can you provide instructions on how to reset my account password?')\nQuery(id=5, query_summary='Compatibility inquiry', query_text=\"I'm unsure about the compatibility of your product with my device, can you advise?\")\nQuery(id=6, query_summary='Order tracking', query_text='How can I track my recent order and estimate its delivery date?')\nQuery(id=7, query_summary='Loyalty program inquiry', query_text='Is there a customer loyalty program or rewards system for frequent buyers?')\nQuery(id=8, query_summary='Refund policy inquiry', query_text=\"I'm interested in your online courses, but do you offer refunds if I'm not satisfied?\")\nQuery(id=9, query_summary='Warranty inquiry', query_text='Could you clarify the coverage and limitations of your product warranty?')\nQuery(id=10, query_summary='Customer support inquiry', query_text='What are your customer support hours and how can I reach your team in case of emergencies?')\n
The summaries look good.
Let's now define tags and ask the model to classify each query into a tag. In the Tag class, we set its docstring to the rules we want for the classification. This is done in the docstring because Tag is not a dataclass, but derived from Enum.
We're no longer asking for query_text in the Query class, to keep the output shorter.
from enum import Enum\n\nclass Tag(str, Enum):\n \"\"\"Queries can be classified into the following tags:\ntech_support: queries related with technical problems.\nbilling: post-sale queries about billing cycle, or subscription termination.\naccount: queries about user account problems.\npre_sales: queries from prospective customers (who have not yet purchased).\nother: all other query topics.\"\"\" \n TECH_SUPPORT = \"tech_support\"\n BILLING = \"billing\"\n PRE_SALES = \"pre_sales\"\n ACCOUNT = \"account\"\n OTHER = \"other\"\n\n@dataclass \nclass Query():\n id: int\n query_summary: str\n query_tag: Tag\n\nout = model.extract(list[Query],\n in_text,\n inst=inst_text)\n\nfor query in out:\n print(query)\n
Query(id=1, query_summary='Asking about trial period', query_tag='pre_sales')\nQuery(id=2, query_summary='Reporting app issue', query_tag='tech_support')\nQuery(id=3, query_summary='Inquiring about pricing plans', query_tag='billing')\nQuery(id=4, query_summary='Requesting password reset instructions', query_tag='account')\nQuery(id=5, query_summary='Seeking device compatibility advice', query_tag='pre_sales')\nQuery(id=6, query_summary='Tracking order and delivery date', query_tag='other')\nQuery(id=7, query_summary='Inquiring about loyalty program', query_tag='billing')\nQuery(id=8, query_summary='Asking about refund policy', query_tag='pre_sales')\nQuery(id=9, query_summary='Seeking warranty information', query_tag='other')\nQuery(id=10, query_summary='Inquiring about customer support hours', query_tag='other')\n
The applied tags appear mostly reasonable.
Of course, pre-sales tagging could be done automatically from a database of existing customer contacts, but the model does a good job of identifying questions that are likely pre-sales, such as ids 1, 5 and 8, which are typically asked before buying or subscribing.
Also, note that classification is being done from a single phrase. More information in each customer query would certainly allow for finer-grained classification.
Example's assets at GitHub.
"},{"location":"examples/tough_meeting/","title":"Tough meeting","text":"In this example we'll look at extracting participants and action items from a meeting transcript.
Start by creating the model. As you'll see below, the transcript is large, with complex language, so we'll use OpenAI's GPT-4 this time. You can still use a local model by uncommenting the commented lines below.
Make sure to set your OPENAI_API_KEY env variable.
Jupyter notebook and Python script versions are available in the example's folder.
Let's create the model.
# load env variables like OPENAI_API_KEY from a .env file (if available)\ntry: from dotenv import load_dotenv; load_dotenv()\nexcept: ...\n\nfrom sibila import Models, GenConf\n\n# delete any previous model\ntry: del model\nexcept: ...\n\n# to use a local model, assuming it's in ../../models:\n# setup models folder:\n# Models.setup(\"../../models\")\n# the transcript is large, so we'll create the model with a context length of 3072, which should be enough.\n# model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\", ctx_len=3072)\n\n# to use an OpenAI model:\nmodel = Models.create(\"openai:gpt-4\", ctx_len=3072)\n
We'll use a sample meeting transcript from https://www.ctas.tennessee.edu/eli/sample-meeting-transcript
transcript = \"\"\"\\\nChairman Wormsley (at the proper time and place, after taking the chair and striking the gavel on the table): This meeting of the CTAS County Commission will come to order. Clerk please call the role. (Ensure that a majority of the members are present.)\n\nChairman Wormsley: Each of you has received the agenda. I will entertain a motion that the agenda be approved.\n\nCommissioner Brown: So moved.\n\nCommissioner Hobbs: Seconded\n\nChairman Wormsley: It has been moved and seconded that the agenda be approved as received by the members. All those in favor signify by saying \"Aye\"?...Opposed by saying \"No\"?...The agenda is approved. You have received a copy of the minutes of the last meeting. Are there any corrections or additions to the meeting?\n\nCommissioner McCroskey: Mister Chairman, my name has been omitted from the Special Committee on Indigent Care.\n\nChairman Wormsley: Thank you. If there are no objections, the minutes will be corrected to include the name of Commissioner McCroskey. Will the clerk please make this correction. Any further corrections? Seeing none, without objection the minutes will stand approved as read. (This is sort of a short cut way that is commonly used for approval of minutes and/or the agenda rather than requiring a motion and second.)\n\nChairman Wormsley: Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Mister Chairman, I would like to make a motion to approve the resolution taking money from the Data Processing Reserve Account in the County Clerk's office and moving it to the equipment line to purchase a laptop computer.\n\nCommissioner Carmical: I second the motion.\n\nChairman Wormsley: This resolution has a motion and second. Will the clerk please take the vote.\n\nChairman Wormsley: The resolution passes. We will now take up old business. At our last meeting, Commissioner McKee, your motion to sell property near the airport was deferred to this meeting. You are recognized.\n\nCommissioner McKee: I move to withdraw that motion.\n\nChairman Wormsley: Commissioner McKee has moved to withdraw his motion to sell property near the airport. Seeing no objection, this motion is withdrawn. The next item on the agenda is Commissioner Rodgers'.\n\nCommissioner Rodgers: I move adopton of the resolution previously provided to each of you to increase the state match local litigation tax in circuit, chancery, and criminal courts to the maximum amounts permissible. This resolution calls for the increases to go to the general fund.\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: The sheriff is opposed to this increase.\n\nChairman Wormsley: Commissioner, you are out of order because this motion has not been seconded as needed before the floor is open for discussion or debate. Discussion will begin after we have a second. 
Is there a second?\n\nCommissioner Reinhart: For purposes of discussion, I second the motion.\n\nChairman Wormsley: Commissioner Rodgers is recognized.\n\nCommissioner Rodgers: (Speaks about the data on collections, handing out all sorts of numerical figures regarding the litigation tax, and the county's need for additional revenue.)\n\nChairman Wormsley: Commissioner Duckett\n\nCommissioner Duckett: I move an amendment to the motion to require 25 percent of the proceeds from the increase in the tax on criminal cases go to fund the sheriff's department.\n\nChairman Wormsley: Commissioner Malone\n\nCommissioner Malone: I second the amendment.\n\nChairman Wormsley: A motion has been made and seconded to amend the motion to increase the state match local litigation taxes to the maximum amounts to require 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Any discussion? Will all those in favor please raise your hand? All those opposed please raise your hand. The amendment carries 17-2. We are now on the motion as amended. Any further discussion?\n\nCommissioner Headrick: Does this require a two-thirds vote?\n\nChairman Wormsley: Will the county attorney answer that question?\n\nCounty Attorney Fults: Since these are only courts of record, a majority vote will pass it. The two-thirds requirement is for the general sessions taxes.\n\nChairman Wormsley: Other questions or discussion? Commissioner Adams.\n\nCommissioner Adams: Move for a roll call vote.\n\nCommissioner Crenshaw: Second\n\nChairman Wormsley: The motion has been made and seconded that the state match local litigation taxes be increased to the maximum amounts allowed by law with 25 percent of the proceeds from the increase in the tax on criminal cases in courts of record going to fund the sheriff's department. Will all those in favor please vote as the clerk calls your name, those in favor vote \"aye,\" those against vote \"no.\" Nine votes for, nine votes against, one not voting. The increase fails. We are now on new business. Commissioner Adkins, the first item on the agenda is yours.\n\nCommissioner Adkins: Each of you has previously received a copy of a resolution to increase the wheel tax by $10 to make up the state cut in education funding. I move adoption of this resolution.\n\nChairman Wormsley: Commissioner Thompson\n\nCommissioner Thompson: I second.\n\nChairman Wormsley: It has been properly moved and seconded that a resolution increasing the wheel tax by $10 to make up the state cut in education funding be passed. Any discussion? (At this point numerous county commissioners speak for and against increasing the wheel tax and making up the education cuts. This is the first time this resolution is under consideration.) Commissioner Hayes is recognized.\n\nCommissioner Hayes: I move previous question.\n\nCommisioner Crenshaw: Second.\n\nChairman Wormsley: Previous question has been moved and seconded. As you know, a motion for previous question, if passed by a two-thirds vote, will cut off further debate and require us to vote yes or no on the resolution before us. You should vote for this motion if you wish to cut off further debate of the wheel tax increase at this point. Will all those in favor of previous question please raise your hand? Will all those against please raise your hand? The vote is 17-2. Previous question passes. We are now on the motion to increase the wheel tax by $10 to make up the state cut in education funding. 
Will all those in favor please raise your hand? Will all those against please raise your hand? The vote is 17-2. This increase passes on first passage. Is there any other new business? Since no member is seeking recognition, are there announcements? Commissioner Hailey.\n\nCommissioner Hailey: There will be a meeting of the Budget Committee to look at solid waste funding recommendations on Tuesday, July 16 at noon here in this room.\n\nChairman Wormsley: Any other announcements? The next meeting of this body will be Monday, August 19 at 7 p.m., here in this room. Commissioner Carmical.\n\nCommissioner Carmical: There will be a chili supper at County Elementary School on August 16 at 6:30 p.m. Everyone is invited.\n\nChairman Wormsley: Commissioner Austin.\n\nCommissioner Austin: Move adjournment.\n\nCommissioner Garland: Second.\n\nChairman Wormsley: Without objection, the meeting will stand adjourned.\n\"\"\"\n\n# model instructions text, also known as system message\ninst_text = \"Extract information and output in JSON format.\"\n
As you can see, this is quite a large transcript, filled with long names and complex phrases. Let's see how the model handles it...
Let's start by extracting the names of the participants in the meeting.
We'll create the Meeting class with a list of strings, to receive the names of mentioned participants.
The model will take cues from the variable names as well as from the Field description we set. In this case we name the string list \"participants\" and add a description of what we want to receive.
from pydantic import BaseModel, Field\n\n# this class definition will be used to constrain the model output and initialize an instance object\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n\nin_text = \"Extract information from this meeting transcript:\\n\\n\" + transcript\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\nprint(out)\n
participants=['Chairman Wormsley', 'Commissioner Brown', 'Commissioner Hobbs', 'Commissioner McCroskey', 'Commissioner Adkins', 'Commissioner Carmical', 'Commissioner McKee', 'Commissioner Rodgers', 'Commissioner Duckett', 'Commissioner Reinhart', 'Commissioner Malone', 'Commissioner Headrick', 'County Attorney Fults', 'Commissioner Adams', 'Commissioner Crenshaw', 'Commissioner Thompson', 'Commissioner Hayes', 'Commissioner Hailey', 'Commissioner Carmical', 'Commissioner Austin', 'Commissioner Garland']\n
# print the generated participants list:\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
Some names appear twice (\"Commissioner Carmical\") and the \"clerk\", which is mentioned in the text, is not listed.
It's a matter of opinion whether the clerk is an active participant, but let's try to fix the repeated names.
Let's try asking for a list of participants \"without repeated entries\", in the field's description:
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Wormsley\nBrown\nHobbs\nMcCroskey\nAdkins\nCarmical\nMcKee\nRodgers\nDuckett\nReinhart\nMalone\nHeadrick\nFults\nAdams\nCrenshaw\nThompson\nHayes\nHailey\nAustin\nGarland\n
That didn't work as expected: the repetition is gone, but the titles were dropped and only the surnames appear.
Let's try asking for \"names and titles\":
class Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of names and titles of participants without repeated entries\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nfor part in out.participants:\n print(part)\n
Chairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\n
And now \"Commissioner Carmical\" is repeating again!
Let's move on; the point is that you can also do some prompt engineering with the description field. This model shortcoming could also be dealt with by post-processing the received list.
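For instance, a minimal post-processing sketch (assuming out.participants holds the list returned above) that removes repeated entries while keeping their original order:
# remove repeated participants, preserving their original order\nunique_participants = list(dict.fromkeys(out.participants))\n\nfor part in unique_participants:\n print(part)\n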
Let's now also request a list of action items mentioned in the transcript:
class ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda'\nindex=2 name='Correct the minutes to include Commissioner McCroskey in the Special Committee on Indigent Care'\nindex=3 name='Approve the resolution to transfer funds from the Data Processing Reserve Account to purchase a laptop'\nindex=4 name='Withdraw the motion to sell property near the airport'\nindex=5 name='Adopt the resolution to increase the state match local litigation tax'\nindex=6 name=\"Amend the motion to allocate 25 percent of the proceeds from the tax increase to fund the sheriff's department\"\nindex=7 name='Vote on the state match local litigation taxes increase with the amendment'\nindex=8 name='Adopt the resolution to increase the wheel tax by $10 for education funding'\nindex=9 name='Hold a Budget Committee meeting on solid waste funding recommendations'\nindex=10 name='Announce the chili supper at County Elementary School'\n
These are reasonable action items.
Let's now also request a priority for each ActionItem - we'll create a string Enum class with three priority levels.
from enum import Enum\n\nclass ActionPriority(str, Enum):\n HIGH = \"high\"\n MEDIUM = \"medium\"\n LOW = \"low\"\n\nclass ActionItem(BaseModel):\n index: int = Field(description=\"Sequential index for the action item\")\n name: str = Field(description=\"Action item name\")\n priority: ActionPriority = Field(description=\"Action item priority\")\n\nclass Meeting(BaseModel):\n participants: list[str] = Field(description=\"List of complete names of meeting participants\")\n action_items: list[ActionItem] = Field(description=\"List of action items in the meeting\")\n\nout = model.extract(Meeting,\n in_text,\n inst=inst_text)\n\nprint(\"Participants\", \"-\" * 16)\nfor part in out.participants:\n print(part)\nprint(\"Action items\", \"-\" * 16)\nfor ai in out.action_items:\n print(ai)\n
Participants ----------------\nChairman Wormsley\nCommissioner Brown\nCommissioner Hobbs\nCommissioner McCroskey\nCommissioner Adkins\nCommissioner Carmical\nCommissioner McKee\nCommissioner Rodgers\nCommissioner Duckett\nCommissioner Reinhart\nCommissioner Malone\nCommissioner Headrick\nCounty Attorney Fults\nCommissioner Adams\nCommissioner Crenshaw\nCommissioner Thompson\nCommissioner Hayes\nCommissioner Hailey\nCommissioner Carmical\nCommissioner Austin\nCommissioner Garland\nAction items ----------------\nindex=1 name='Approve the agenda' priority=<ActionPriority.HIGH: 'high'>\nindex=2 name='Correct the minutes to include Commissioner McCroskey' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=3 name='Approve the resolution to transfer funds for laptop purchase' priority=<ActionPriority.HIGH: 'high'>\nindex=4 name='Withdraw motion to sell property near the airport' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=5 name='Adopt resolution to increase state match local litigation tax' priority=<ActionPriority.HIGH: 'high'>\nindex=6 name=\"Amend resolution to allocate funds to sheriff's department\" priority=<ActionPriority.HIGH: 'high'>\nindex=7 name='Vote on the amended resolution for litigation tax increase' priority=<ActionPriority.HIGH: 'high'>\nindex=8 name='Adopt resolution to increase the wheel tax' priority=<ActionPriority.HIGH: 'high'>\nindex=9 name='Budget Committee meeting on solid waste funding' priority=<ActionPriority.MEDIUM: 'medium'>\nindex=10 name='Announce chili supper at County Elementary School' priority=<ActionPriority.LOW: 'low'>\nindex=11 name='Adjourn the meeting' priority=<ActionPriority.MEDIUM: 'medium'>\n
It's not clear from the meeting transcript whether these priorities are correct, but some items related to taxes are receiving high priority and, from the context, it seems reasonable that taxes are a priority. : )
Example's assets at GitHub.
"},{"location":"extract/dataclass/","title":"Dataclass","text":"Besides simple types and enums, we can also extract objects whose structure is given by a dataclass definition:
Example
from sibila import Models\nfrom dataclasses import dataclass\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\n@dataclass\nclass Person:\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the Pydantic version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/dataclass/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For dataclasses this is done with Annotated[type, \"description\"] - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Annotated, Literal\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\n@dataclass\nclass Period():\n start: Annotated[Weekday, \"Day of arrival\"]\n end: Annotated[Weekday, \"Day of departure\"]\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
Check the Extract dataclass example to see this in action.
"},{"location":"extract/enums/","title":"Enums","text":"Enumerations are important for classification tasks or in any situation where you need a choice to be made from a list of options.
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract([\"red\", \"blue\", \"green\", \"yellow\"], \n \"The car color was a shade of indigo\")\n
Result
'blue'\n
You can pass a list of items in any of the supported native types: str, float, int or bool.
"},{"location":"extract/enums/#literals","title":"Literals","text":"We can also use Literals:
Example
from typing import Literal\n\nmodel.extract(Literal[\"SPAM\", \"NOT_SPAM\", \"UNSURE\"], \n \"Hello my dear friend, I'm contacting you because I want to give you a million dollars\",\n inst=\"Classify this text on the likelihood of being spam\")\n
Result
'SPAM'\n
Extracting to a Literal type returns one of its possible options in its native type (str, float, int or bool).
"},{"location":"extract/enums/#enum-classes","title":"Enum classes","text":"Or Enum classes of native types. An example of extracting to Enum classes:
Example
from enum import IntEnum\n\nclass Heads(IntEnum):\n SINGLE = 1\n DOUBLE = 2\n TRIPLE = 3\n\nmodel.extract(Heads,\n \"The Two-Headed Monster from The Muppets.\")\n
Result
<Heads.DOUBLE: 2>\n
For the model, the important information is actually the value of each enum member, not its name. For example, in this enum, the model would only see the strings to the right of each member (the enum values), not \"RED\", \"YELLOW\" nor \"GREEN\":
from enum import Enum\n\nclass Light(Enum):\n RED = 'stop'\n YELLOW = 'slow down'\n GREEN = 'go'\n
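As a quick hedged sketch (the exact answer depends on the model, but generation is constrained to one of the three values - for this sentence we'd expect <Light.GREEN: 'go'>):
model.extract(Light,\n \"The light turned green, so the cars started moving again.\")\n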
See the Tag classification example to see how Enum is used to tag support queries.
"},{"location":"extract/enums/#classify","title":"Classify","text":"You can also use the classify() method to extract enumerations, which accepts the enum types we've seen above. It calls extract() internally and its only justification is to make things more readable:
Example
model.classify([\"mouse\", \"cat\", \"dog\", \"bird\"],\n \"Snoopy\")\n
Result
'dog'\n
"},{"location":"extract/free_text/","title":"Free text","text":"You can also generate free text by calling model():
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nresponse = model(\"Explain in a few lines how to build a brick wall?\")\nprint(response)\n
Result
To build a brick wall, follow these steps:\n\n1. Prepare the site by excavating and leveling the ground, then install a damp-proof \nmembrane and create a solid base with concrete footings.\n2. Lay a foundation of concrete blocks or bricks, ensuring it is level and square.\n3. Build the wall using bricks or blocks, starting with a corner or bonding pattern \nto ensure stability. Use mortar to bond each course (row) of bricks or blocks, \nfollowing the recommended mortar mix ratio.\n4. Use a spirit level to ensure each course is level, and insert metal dowels or use \nbrick ties to connect adjacent walls or floors.\n5. Allow the mortar to dry for the recommended time before applying a damp-proof \ncourse (DPC) at the base of the wall.\n6. Finish the wall with capping bricks or coping stones, and apply any desired \nrender or finish.\n
"},{"location":"extract/pydantic/","title":"Pydantic","text":"Besides simple types and enums, we can also extract objects whose structure is given by a class derived from Pydantic's BaseModel definition:
Example
from sibila import Models\nfrom pydantic import BaseModel\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nclass Person(BaseModel):\n first_name: str\n last_name: str\n age: int\n occupation: str\n source_location: str\n\nin_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\"\"\"\n\nmodel.extract(Person,\n in_text)\n
Result
Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London')\n
See the dataclass version here.
We can extract a list of Person objects by using list[Person]:
Example
in_text = \"\"\"\\\nSeated at a corner table was Lucy Bennett, a 28-year-old journalist from London, \nher pen poised to capture the essence of the world around her. \nHer eyes sparkled with curiosity, mirroring the dynamic energy of her beloved city.\n\nOpposite Lucy sat Carlos Ramirez, a 35-year-old architect from the sun-kissed \nstreets of Barcelona. With a sketchbook in hand, he exuded creativity, \nhis passion for design evident in the thoughtful lines that adorned his face.\n\"\"\"\n\nmodel.extract(list[Person],\n in_text)\n
Result
[Person(first_name='Lucy', \n last_name='Bennett',\n age=28, \n occupation='journalist',\n source_location='London'),\n Person(first_name='Carlos', \n last_name='Ramirez',\n age=35,\n occupation='architect',\n source_location='Barcelona')]\n
"},{"location":"extract/pydantic/#field-annotations","title":"Field annotations","text":"As when extracting to simple types, we could also provide instructions by setting the inst argument. However, instructions are by nature general and when extracting structured data, it's harder to provide specific instructions for fields.
For this purpose, field annotations are more effective than instructions: they can be provided to clarify what we want extracted for each specific field.
For Pydantic this is done with Field(description=\"description\") - see the \"start\" and \"end\" attributes of the Period class:
Example
from typing import Literal\nfrom pydantic import Field\n\nWeekday = Literal[\"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\", \"Friday\", \"Saturday\", \"Sunday\"]\n\nclass Period(BaseModel):\n start: Weekday = Field(description=\"Day of arrival\")\n end: Weekday = Field(description=\"Day of departure\")\n\nmodel.extract(Period,\n \"Right, well, I was planning to arrive on Wednesday and \"\n \"only leave Sunday morning. Would that be okay?\")\n
Result
Period(start='Wednesday', end='Sunday')\n
In this manner, the model can be informed of what is wanted for each specific field.
Check the Extract Pydantic example to see this kind of extraction.
"},{"location":"extract/simple_types/","title":"Simple types","text":"Sibila can constrain model generation to output simple python types. This is helpful for situations where you want to extract a specific data type.
To get a response from the model in a certain type, you can use the extract() method:
Example
from sibila import Models\n\nModels.setup(\"../models\")\nmodel = Models.create(\"llamacpp:openchat\")\n\nmodel.extract(bool, \n \"Certainly, I'd like to subscribe.\")\n
Result
True\n
"},{"location":"extract/simple_types/#instructions-to-help-the-model","title":"Instructions to help the model","text":"You may need to provide more extra information to the model, so that it understands what you want. This is done with the inst argument - inst is a shorter name for instructions:
Example
model.extract(str, \n \"I don't quite remember the product's name, I think it was called Cornaca\",\n inst=\"Extract the product name\")\n
Result
Cornaca\n
"},{"location":"extract/simple_types/#supported-types","title":"Supported types","text":"The following simple types are supported:
- bool
- int
- float
- str
- datetime
About datetime type
A special note about extracting to datetime: the datetime type expects an ISO 8601 formatted string. Because some models are less capable than others at correctly formatting dates/times, it helps to mention in the instructions that you want the output in \"ISO 8601\" format.
from datetime import datetime\nmodel.extract(datetime, \n \"Sure, glad to help, it all happened at December the 10th, 2023, around 3PM, I think\",\n inst=\"Output in ISO 8601 format\")\n
Result
datetime.datetime(2023, 12, 10, 15, 0)\n
"},{"location":"extract/simple_types/#lists","title":"Lists","text":"You can extract lists of any of the supported types (simple types, enum, dataclass, Pydantic).
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\")\n
Result
['Naples', 'Genoa', 'Florence', 'Rome']\n
As in all extractions, you may need to set the instructions text to specify what you want from the model. Just as an example of the power of instructions, let's add instructions asking for country output: it will still output a list, but with a single element - 'Italy':
Example
model.extract(list[str], \n \"I'd like to visit Naples, Genoa, Florence and of course, Rome\",\n inst=\"Output the country\")\n
Result
['Italy']\n
"},{"location":"models/find_local_models/","title":"Finding new models","text":""},{"location":"models/find_local_models/#chat-or-instruct-types-only","title":"Chat or instruct types only","text":"Sibila can use models that were fine-tuned for chat or instruct purposes. These models work in user - assistant turns or messages and use a chat template to properly compose those messages to the format that the model was fine-tuned to.
For example, the Llama2 model was released in two editions: a simple Llama2 text completion model and a Llama2-instruct model that was fine-tuned for user-assistant turns. For Sibila you should always select chat or instruct versions of a model.
But which model to choose? You can look at model benchmark scores in popular listing sites:
- https://llm.extractum.io/list/
- https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
- https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
"},{"location":"models/find_local_models/#find-a-quantized-version-of-the-model","title":"Find a quantized version of the model","text":"Since Large Language Models are quite big, they are usually quantized so that each parameter occupies a little more than 4 bits or half a byte.
Without quantization, a 7-billion-parameter model would require 14Gb of memory to load (with each parameter taking 16 bits), and a bit more during inference.
But with quantization techniques, a 7-billion-parameter model can have a file size of only 4.4Gb (using about 50% more in memory - 6.8Gb), which makes it possible to run in common GPUs or even in plain RAM (albeit more slowly).
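A rough back-of-the-envelope check of these numbers (a sketch assuming 16 bits per parameter unquantized and about 5 bits per parameter for a 4-bit quantization with overhead):
params = 7e9\n\nfull_gb = params * 16 / 8 / 1e9 # ~14Gb with 16 bits per parameter\nquant_gb = params * 5 / 8 / 1e9 # ~4.4Gb with ~5 bits per parameter\n\nprint(f\"unquantized: {full_gb:.1f}Gb, quantized file: {quant_gb:.1f}Gb\")\n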
Quantized models are stored in a file format popularized by llama.cpp, the GGUF format (which means GPT-Generated Unified Format). We're using llama.cpp to run local models, so we'll be needing GGUF files.
A good place to find quantized models is HuggingFace's model hub, particularly in the well-known area of TheBloke (Tom Jobbins):
https://huggingface.co/TheBloke
TheBloke is very prolific in producing quality quantized versions of models, usually shortly after they are released.
A good model that we'll be using for the examples is a 4-bit quantization of the OpenChat-3.5 model, which is itself a fine-tuning of Mistral-7b:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
"},{"location":"models/find_local_models/#download-the-file-into-the-models-folder","title":"Download the file into the \"models\" folder","text":"See the OpenChat model section on how to download models with the sibila CLI tool or manually in your browser.
The OpenChat model already includes the chat template format in its metadata, but for some other models we'll need to set the format - see the Setup chat template format section on how to handle this.
"},{"location":"models/formats_json/","title":"Managing formats","text":"A \"formats.json\" file stores the chat template definitions used in models. This allows for models that don't have a chat template in their metadata to be detected and get the right format so they can function well.
If you downloaded the GitHub repository, you'll find a file named \"sibila/res/base_formats.json\", which is the default base configuration that will be used, with many known chat template formats.
When you call Models.setup(), any \"formats.json\" file found in the folder will be loaded and its definitions will be merged with the ones from \"base_formats.json\" which are loaded on initialization. Any entries with the same name will be replaced by freshly loaded ones.
How do you add a new format entry that can be used when creating a model? You can do it with the sibila CLI tool or by manually editing the \"formats.json\" file.
"},{"location":"models/formats_json/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s openchat openchat \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'openchat' with match='openchat', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (to identify the model filename) and the last is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/formats_json/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative, we can edit the \"formats.json\" file in the \"Models\" folder, and add the entry:
\"openchat\": {\n \"match\": \"openchat\", # a regexp to match model name or filename\n \"template\": \"{{ bos_token }}...\"\n},\n
In the \"openchat\" key value we have a dictionary with the following keys:
- \"match\": regular expression that will be used to match the model name or filename
- \"template\": the chat template definition in Jinja format

The \"openchat\" format name we are defining here is the name you can use when creating a model, by setting the format argument:
model = LlamaCppModel.create(\"openchat-3.5-1210.Q4_K_M.gguf\",\n format=\"openchat\")\n
or to be more practical: \"openchat\" is also the format name you would use when creating a \"models.json\" entry for a model, in the \"format\" key:
\"openchat\": {\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
See the \"base_formats.json\" file for all the default base formats.
"},{"location":"models/local_model/","title":"Using a local model","text":"Sibila uses llama.cpp to run local models, which are ordinary files in the GGUF format. You can download local models from places like the Hugging Face model hub.
Most current 7B quantized models are very capable for common data extraction tasks (and getting better all the time). We'll see how to find and set up local models for use with Sibila. If you only plan to use OpenAI remote models, you can skip this section.
"},{"location":"models/local_model/#openchat-model","title":"OpenChat model","text":"By default, most of the examples included with Sibila use OpenChat, a very good 7B parameters quantized model: https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF
You can download this model with the sibila CLI tool or manually in your browser.
"},{"location":"models/local_model/#download-with-sibila-hub","title":"Download with \"sibila hub\"","text":"Open a command line prompt in the \"models\" folder if you downloaded the GitHub repository, or create a folder named \"models\".
Run this command:
sibila hub -d TheBloke/openchat-3.5-1210-GGUF -f openchat-3.5-1210.Q4_K_M.gguf\n
After the 4.4Gb download completes, the file \"openchat-3.5-1210.Q4_K_M.gguf\" will be available in your \"models\" folder and you can run the examples. You can do the same to download any other GGUF model.
"},{"location":"models/local_model/#manual-download","title":"Manual download","text":"Alternatively, you can download in your browser from this URL:
https://huggingface.co/TheBloke/openchat-3.5-1210-GGUF/blob/main/openchat-3.5-1210.Q4_K_M.gguf
In the linked page, click \"download\" and save this file into a \"models\" folder. If you downloaded the Sibila GitHub repository it already includes a \"models\" folder which you can use. Otherwise, just create a \"models\" folder, where you'll store your local model files.
Once the file \"openchat-3.5-1210.Q4_K_M.gguf\" is placed in the \"models\" folder, you should be able to run the examples.
"},{"location":"models/local_model/#llamacppmodel-class","title":"LlamaCppModel class","text":"Local llama.cpp models can be used with the LlamaCppModel class. Let's generate text after our prompt:
Example
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"../../models/openchat-3.5-1210.Q4_K_M.gguf\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
It worked: the model answered with the continuation of the famous poem.
You'll notice that the first time you create the model object and run a query, it will take longer, because the model must load all its parameters into layers in memory. The next queries will work much faster.
"},{"location":"models/local_model/#a-note-about-out-of-memory-errors","title":"A note about out of memory errors","text":"An important thing to know if you'll be using local models is about \"Out of memory\" errors.
A 7B model like OpenChat-3.5, when quantized to 4 bits will occupy about 6.8 Gb of memory, in either GPU's VRAM or common RAM. If you try to run a second model at the same time, you might get an out of memory error and/or llama.cpp may crash: it all depends on the memory available in your computer.
This is less of a problem when running scripts from the command line, but in environments like Jupyter where you can have multiple open notebooks, you may get \"out of memory\" errors or python kernel errors like:
Error
Kernel Restarting\nThe kernel for sibila/examples/name.ipynb appears to have died.\nIt will restart automatically.\n
If you get an error like this in JupyterLab, open the Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any out-of-memory stuck models.
A good practice is to delete any local model after you no longer need it or right before loading a new one. A simple \"del model\" works fine, or you can add these two lines before creating a model:
try: del model\nexcept: ...\n\nmodel = LlamaCppModel(...)\n
This way, any existing model in the current notebook is deleted before creating a new one.
However this won't work across multiple notebooks. In those cases, open JupyterLab's Kernel menu and select \"Shut Down All Kernels...\". This will get rid of any models currently in memory.
"},{"location":"models/models_factory/","title":"Models factory","text":"The Models factory is based in a \"models\" folder that contains two configuration files: \"models.json\" and \"formats.json\" and the actual files for local models.
The Models factory class is a more flexible way to create models, for example:
Models.setup(\"../../models\")\n\nmodel = Models.create(\"openai:gpt-4\")\n
The first line calls Models.setup() to initialize the factory with the folder where model files and configs (\"models.json\" and \"formats.json\") are located.
The second line calls Models.create() to create a model from the name \"openai:gpt-4\". In this case we created a remote model, but we could just as well create a local model based on a GGUF file.
The names should be in the format \"provider:model_name\" and Sibila currently supports two providers:
- \"llamacpp\": local GGUF model, creates an object of type LlamaCppModel
- \"openai\": remote model, creates an object of type OpenAIModel

The name part, after the \"provider:\", must be one of:
- A remote model name, like \"gpt-4\": \"openai:gpt-4\"
- A local model name, like \"openchat\": \"llamacpp:openchat\"
- The actual filename of a model in the \"models\" folder: \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\" - this is the form we use in the examples, but of course using \"openchat\" instead of the filename would be better...
Although you can use filenames as model names, it's generally a better idea, for continued use, to create an entry in the \"models.json\" file - this makes future model replacement much easier.
See Managing models to learn how to register these model names.
"},{"location":"models/models_json/","title":"Managing models","text":"Model names are stored in a file named \"models.json\", in your \"models\" folder. Models registered in this file can then be used when calling Models.create() to create an instance of the model.
Registering a name is not strictly needed, as you can create models from their filenames or remote model names, for example in most examples you'll find models created with:
model = Models.create(\"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\")\n
However, it's a good idea to register a name, especially if you'll be using a model for some time or there's a possibility you'll need to replace it later. If you register a name, only its entry will need to be changed later.
There are two ways of registering names: by using the sibila CLI tool or by directly editing the \"models.json\" file.
"},{"location":"models/models_json/#use-the-sibila-models-cli-tool","title":"Use the \"sibila models\" CLI tool","text":"To register a model with the Models factory you can use the \"sibila models\" tool. Run in the \"models\" folder:
> sibila models -s \"llamacpp:openchat openchat-3.5-1210.Q4_K_M.gguf\" openchat\n\nUsing models directory '.'\nSet model 'llamacpp:openchat' with name='openchat-3.5-1210.Q4_K_M.gguf', \nformat='formatx' at './models.json'.\n
The first argument after -s is the new entry name (including the \"llamacpp:\" provider), followed by the filename and then the chat template format, if needed.
This will create an \"openchat\" entry in \"models.json\", exactly like the manually created below.
"},{"location":"models/models_json/#manually-edit-modelsjson","title":"Manually edit \"models.json\"","text":"In alternative, you can manually register a model name by editing the \"models.json\" file located in you \"models\" folder.
A \"models.json\" file:
{\n # \"llamacpp\" is a provider, you can then create models with names \n # like \"provider:model_name\", for ex: \"llamacpp:openchat\"\n \"llamacpp\": { \n\n \"_default\": { # place here default args for all llamacpp: models.\n \"genconf\": {\"temperature\": 0.0}\n # each model entry below can then override as needed\n },\n\n \"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n },\n\n \"phi2\": {\n \"name\": \"phi-2.Q4_K_M.gguf\", # model filename\n \"format\": \"phi2\",\n \"genconf\": {\"temperature\": 2.0} # a hot-headed model\n },\n\n \"oc\": \"openchat\" \n # this is a link: \"oc\" forwards to the \"openchat\" entry\n },\n\n # The \"openai\" provider. A model can be created with name: \"openai:gpt-4\"\n \"openai\": { \n\n \"_default\": {}, # default settings for all OpenAI models\n\n \"gpt-3.5\": {\n \"name\": \"gpt-3.5-turbo-1106\" # OpenAI's model name\n },\n\n \"gpt-4\": {\n \"name\": \"gpt-4-1106-preview\"\n },\n },\n\n # \"alias\" entry is not a provider but a way to have simpler alias names.\n # For example you can use \"alias:develop\" or even simpler, just \"develop\" to create the model:\n \"alias\": { \n \"develop\": \"llamacpp:openchat\",\n \"production\": \"openai:gpt-3.5\"\n }\n}\n
Looking at the above structure, we have two top-level entries for the providers \"llamacpp\" and \"openai\", and also an \"alias\" entry.
Inside each provider entry, we have a \"_default\" key, which can store a base GenConf or other arguments to be passed during model creation. The default values defined in the \"_default\" entry can later be overridden by any keys of the same name specified in each model definition. You can see this in the \"phi2\" entry, which overrides the genconf given in the above \"_default\", setting temperature to 2.0. Keys are merged element-wise with any specified in the provider's \"_default\" entry: keys with the same name are overridden, all other keys are inherited.
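With the example \"models.json\" above, a minimal sketch of how this merging plays out when creating models (the temperatures in the comments are the result of the element-wise merge):
Models.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:openchat\") # inherits temperature=0.0 from \"_default\"\nhot_model = Models.create(\"llamacpp:phi2\") # its own genconf overrides it: temperature=2.0\n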
In the above \"model.json\" example, let's look at the \"openchat\" model entry:
\"openchat\": { # a model definition\n \"name\": \"openchat-3.5-1210.Q4_K_M.gguf\",\n \"format\": \"openchat\" # chat template format used by this model\n},\n
The \"openchat\" key name is the name you'll use to create the model as \"llamacpp:openchat\":
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"llamacpp:openchat\")\n
You can have the following keys in a model entry:
- \"name\": the filename to use when loading a model (or the remote model name).
- \"format\": identifies the chat template format that the model should use, from the \"formats.json\" file. Some local models include the chat template format in their metadata, so this key is optional.
- \"genconf\": default GenConf (generation config settings) used to create the model, which it will then use by default in each generation. These settings are merged element-wise with any specified in the provider's \"_default\" entry.
- other keys: any other keys will be passed as arguments during model creation. You can learn which arguments are possible in the API reference for LlamaCppModel or OpenAIModel. For example, you can pass \"ctx_len\": 2048 to define the context length to use. As with genconf, these keys are merged element-wise with any specified in the provider's \"_default\" entry.

The \"alias\" entry is a handy way to keep names that point to actual model entries (independent of provider). Note the two alias entries \"develop\" and \"production\" in the above \"models.json\" - you could then create the production model by doing:
# initialize Models to this folder\nModels.setup(\"../../models\")\n\nmodel = Models.create(\"production\")\n
Alias entries can be used as \"alias:production\" or without the \"alias:\" provider, just as \"production\" as in the example above. For an example of a JSON file with many models defined, see the \"models/models.json\" file.
"},{"location":"models/remote_model/","title":"Remote models","text":"Sibila can use OpenAI remote models, for which you'll need a paid OpenAI account and its API key. Although you can pass this key when you create the model object, it's more secure to define an env variable with this information:
Linux and MacWindows export OPENAI_API_KEY=\"...\"\n
setx OPENAI_API_KEY \"...\"\n
Another possibility is to store your OpenAI key in a .env file, which has many advantages: see the python-dotenv package.
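A minimal sketch of this approach (the key below is a placeholder): store the key in a .env file in your working directory and load it before creating the model:
# contents of the .env file:\n# OPENAI_API_KEY=sk-...\n\nfrom dotenv import load_dotenv\nload_dotenv() # makes OPENAI_API_KEY available as an environment variable\n\nfrom sibila import OpenAIModel\nmodel = OpenAIModel(\"gpt-4\")\n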
"},{"location":"models/remote_model/#model-names","title":"Model names","text":"OpenAI models can be used by Sibila through the OpenAIModel class. To get a list of known model names:
Example
from sibila import OpenAIModel\n\nOpenAIModel.known_models()\n
Result
['gpt-4-0613',\n'gpt-4-32k-0613',\n'gpt-4-0314',\n'gpt-4-32k-0314',\n'gpt-4-1106-preview',\n'gpt-4',\n'gpt-4-32k',\n'gpt-3.5-turbo-1106',\n'gpt-3.5-turbo-0613',\n'gpt-3.5-turbo-16k-0613',\n'gpt-3.5-turbo-0301',\n'gpt-3.5-turbo',\n'gpt-3.5-turbo-16k',\n'gpt-3',\n'gpt-3.5']\n
You can use any of these model names to create an OpenAI model. For example:
Example
model = OpenAIModel(\"gpt-3.5\")\n\nmodel(\"I think that I shall never see.\")\n
Result
'A poem as lovely as a tree.'\n
"},{"location":"models/setup_format/","title":"Chat template format","text":""},{"location":"models/setup_format/#what-are-chat-templates","title":"What are chat templates?","text":"Because these models were fine-tuned for chat or instruct interaction, they use a chat template, which is a Jinja template that converts a list of messages into a text prompt. This template must follow the original format that the model was trained on - this is very important or you won't get good results.
Chat template definitions are Jinja templates like the following one, which is in ChatML format:
{% for message in messages %}\n {{'<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n'}}\n{% endfor %}\n
When run over a list of system, user and model messages, the template produces text like the following:
<|im_start|>system\nYou speak like a pirate.<|im_end|>\n<|im_start|>user\nHello there?<|im_end|>\n<|im_start|>assistant\nAhoy there matey! How can I assist ye today on this here ship o' mine?<|im_end|>\n
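To make this concrete, here's a minimal sketch (plain jinja2, not Sibila API) showing how a ChatML-style template like the one above turns a message list into the prompt text:
from jinja2 import Template\n\n# a ChatML-style chat template, similar to the one above\nchat_template = (\n \"{% for message in messages %}\"\n \"<|im_start|>{{ message['role'] }}\\n{{ message['content'] }}<|im_end|>\\n\"\n \"{% endfor %}\"\n)\n\nmessages = [\n {\"role\": \"system\", \"content\": \"You speak like a pirate.\"},\n {\"role\": \"user\", \"content\": \"Hello there?\"},\n]\n\nprint(Template(chat_template).render(messages=messages))\n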
Only by using the model's specific chat template can we get the best results.
Sibila tries to automatically detect which template to use with a model, either from the model name or from embedded metadata, if available.
"},{"location":"models/setup_format/#does-the-model-have-a-built-in-chat-template-format","title":"Does the model have a built-in chat template format?","text":"Some GGUF models include the chat template in their metadata, unfortunately this is not standard.
You can quickly check if the model has a chat template by running the sibila CLI in the same folder as the model file:
> sibila models -t \"llamacpp:openchat-3.5-1210.Q4_K_M.gguf\"\n\nUsing models directory '.'\nTesting model 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf'...\nModel 'llamacpp:openchat-3.5-1210.Q4_K_M.gguf' was properly created and should run fine.\n
In this case the chat template format is included with the model and nothing else is needed.
Another way to test this is to try creating the model in Python. If no exception is raised, the model's GGUF file contains the template definition and should work fine.
Example of model creation error
from sibila import LlamaCppModel\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Error
...\n\nValueError: Could not find a suitable format (chat template) for this model.\nWithout a format, fine-tuned models cannot function properly.\nSee the docs on how you can fix this: pass the template in the format arg or \ncreate a 'formats.json' file.\n
But if you get an error such as above, you'll need to provide a chat template. It's quite easy - let's see how to do it.
"},{"location":"models/setup_format/#find-the-chat-template-format","title":"Find the chat template format","text":"So, how to find the chat template for a new model that you intend to use?
This is normally listed in the model's page: search in that page for \"template\" and copy the listed Jinja template text.
If the template isn't directly listed in the model's page, you can look for a file named \"tokenizer_config.json\" in the main model files. This file should include an entry named \"chat_template\" which is what we want.
Example of a tokenizer_config.json file
For example, in OpenChat's file \"tokenizer_config.json\":
https://huggingface.co/openchat/openchat-3.5-1210/blob/main/tokenizer_config.json
You'll find this line with the template:
{\n \"...\": \"...\",\n\n \"chat_template\": \"{{ bos_token }}...{% endif %}\",\n\n \"...\": \"...\"\n}\n
The value in the \"chat_template\" key is the Jinja template that we're looking for.
Another alternative is to search online for the name of the model and \"chat template\".
Either way, once you know the template used by the model, you can set and use it.
"},{"location":"models/setup_format/#option-1-pass-the-chat-template-format-when-creating-the-model","title":"Option 1: Pass the chat template format when creating the model","text":"Once you know the chat template definition you can create the model and pass it in the format argument. Let's assume you have a model file named \"peculiar-model-7b.gguf\":
chat_template = \"{{ bos_token }}...{% endif %}\"\n\nmodel = LlamaCppModel(\"peculiar-model-7b.gguf\",\n format=chat_template)\n
And the model should now work without problems.
"},{"location":"models/setup_format/#option-2-add-the-chat-template-to-the-models-factory","title":"Option 2: Add the chat template to the Models factory","text":"If you plan to use the model many times, a more convenient solution is to create an entry in the \"formats.json\" file so that all further models with this name will use the template.
"},{"location":"models/setup_format/#with-sibila-formats-cli-tool","title":"With \"sibila formats\" CLI tool","text":"Run the sibila CLI tool in the \"models\" folder:
> sibila formats -s peculiar peculiar-model \"{{ bos_token }}...{% endif %}\"\n\nUsing models directory '.'\nSet format 'peculiar' with match='peculiar-model', template='{{ bos_token }}...'\n
The first argument after -s is the format entry name, the second is the match regular expression (to identify the model filename) and the last is the template. Help is available with \"sibila formats --help\".
"},{"location":"models/setup_format/#manually-edit-formatsjson","title":"Manually edit \"formats.json\"","text":"In alternative to using the sibila CLI tool, you can add the chat template format by creating an entry in a \"formats.json\" file, in the same folder as the model, with these fields:
{\n \"peculiar\": {\n \"match\": \"peculiar-model\",\n \"template\": \"{{ bos_token }}...{% endif %}\"\n }\n}\n
The \"match\" field is regular expression that will be used to match the model name or filename. Field \"template\" is the chat template in Jinja format.
After configuring the template as we've seen above, all you need to do is to create a LlamaCppModel object and pass the model file path.
model = LlamaCppModel(\"peculiar-model-7b.gguf\")\n
Note that we're not passing the format argument anymore when creating the model. The \"match\" regular expression we defined above will recognize the model from the filename and use the given chat template format.
Base format definitions
Sibila includes by default the definitions of several well-known chat template formats. These definitions are available in \"sibila/base_formats.json\", and are automatically loaded when Models factory is created.
You can add any chat template formats into your own \"formats.json\" files, but please never change the \"sibila/base_formats.json\" file, to avoid potential errors.
"},{"location":"models/sibila_cli/","title":"Sibila CLI tool","text":"The Sibila Command-Line Interface tool simplifies managing the Models factory and is useful to download models from Hugging Face model hub.
The Models factory is based on a \"models\" folder that contains two configuration files, \"models.json\" and \"formats.json\", and the actual files for local models.
The CLI tool is divided into three areas or actions:
Action | Purpose
models | Manage model entries in \"models.json\" files
formats | Manage format entries in \"formats.json\" files
hub | Search and download models from the Hugging Face model hub
In all commands you should pass the option \"-m models_folder\" with the path to the \"models\" folder, or alternatively run the commands inside the \"models\" folder.
The following argument names are used below (other names should be descriptive enough):
Name | Meaning
res_name | Model entry name in the form \"provider:name\", for example \"llamacpp:openchat\".
format_name | Name of a format entry in \"formats.json\", for example \"chatml\".
query | Case-insensitive query that will be matched by a substring search.
Usage help is available by running \"sibila --help\" for general help, or \"sibila action --help\", where action is one of \"models\", \"formats\" or \"hub\".
"},{"location":"models/sibila_cli/#sibila-models","title":"Sibila models","text":"To register a model entry pointing to a model name or filename, and optional format_name is a format name:
sibila models -s res_name model_name_or_filename [format_name]\n
To set the format_name for an existing model entry:
sibila models -f res_name format_name\n
To test if a model can run (for example to check if it has the chat template format defined):
sibila models -t res_name\n
List all models with optional case-insensitive substring query:
sibila models -l [query]\n
Delete a model entry:
sibila models -d res_name\n
"},{"location":"models/sibila_cli/#sibila-formats","title":"Sibila formats","text":"Check if a model filename has any format defined in the Models factory:
sibila formats -q filename\n
To register a chat template format, where template is the Jinja chat template and the optional match is a regexp that matches the model filename:
sibila formats -s format_name template [match_regex]\n
List all formats with optional case-insensitive substring query:
sibila formats -l [query]\n
Delete a format entry:
sibila formats -d format_name\n
"},{"location":"models/sibila_cli/#sibila-hub","title":"Sibila hub","text":"List models in the Hugging Face model hub that match the given queries. Argument query can be a list of strings to match, separated by a space character.
Arg Filename is case-insensitive for substring matching.
Arg exact_author is an exact and case-sensitive author name from Hugging Face model hub.
sibila hub -l query [-f filename] [-a exact_author]\n
To download a model, where model_id is a string like \"TheBloke/openchat-3.5-1210-GGUF\", and the filename and exact_author arguments work as above:
sibila hub -d model_id -f filename -a exact_author -s set name\n
"}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index 3e99b11..6c3666a 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,187 +2,187 @@
https://jndiogo.github.io/sibila/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/first_run/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/installing/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/tips/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/tools/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/what/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/generation/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/model/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/models/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/multigen/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/thread/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/tokenizer/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/api-reference/tools/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/cli/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/compare/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/extract/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/extract_dataclass/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/from_text_to_object/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/hello_model/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/interact/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/quick_meeting/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/tag/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/examples/tough_meeting/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/extract/dataclass/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/extract/enums/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/extract/free_text/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/extract/pydantic/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/extract/simple_types/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/find_local_models/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/formats_json/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/local_model/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/models_factory/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/models_json/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/remote_model/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/setup_format/
- 2024-03-08
+ 2024-03-09
daily
https://jndiogo.github.io/sibila/models/sibila_cli/
- 2024-03-08
+ 2024-03-09
daily
\ No newline at end of file
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index 7bd0c08..88333d3 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ