documentation material

abacusai · Nov 27, 2024 · e9bb435 · e9bb435
1 parent 8a5a0f0
commit e9bb435
Show file tree

Hide file tree

Showing 10 changed files with 1,102 additions and 1,379 deletions.
diff --git a/README.md b/README.md
@@ -26,5 +26,9 @@ from abacusai import ApiClient
 client = ApiClient('YOUR_API_KEY')
 ```
 
+[API KEY](https://abacus.ai/app/profile/apikey)
+[Abacus.AI CheatSheet](https://github.com/abacusai/api-python/blob/main/examples/CheatSheet.md)
+[Abacus.AI API Examples](https://github.com/abacusai/api-python/blob/main/examples)
+
 ## License
 [MIT](https://github.com/abacusai/api-python/blob/main/LICENSE)
diff --git a/examples/CheatSheet.md b/examples/CheatSheet.md
@@ -0,0 +1,49 @@
+## Python SDK Documentation Examples
+The full Documentation of the Abacus.AI python SDK can be found [here](https://abacusai.github.io/api-python/autoapi/abacusai/index.html)
+
+Please note that from within the platform's UI, you will have access to template/example code for all below cases:
+- [Python Feature Group](https://abacus.ai/app/python_functions_list)
+- [Pipelines](https://abacus.ai/app/pipelines)
+- [Custom Loss Functions](https://abacus.ai/app/custom_loss_functions_list)
+- Custom Models & [Algorithms](https://abacus.ai/app/algorithm_list)
+- [Python Modules](https://abacus.ai/app/modules_list)
+- And others...
+
+Furthermore, the platform's `AI Engineer`, which is a ChatBot located on the bottom right of the screen, inside any project, also has access to our API's and will be able to provide you with examples.
+
+#### Abacus.AI - API Command Cheatsheet
+Here is a list of important methods you should keep saved somewhere. You can find examples for all of these methods inside the notebooks, or you can refer to the official documentation.
+
+```python
+from abacusai import ApiClient
+client = ApiClient('YOUR_API_KEY')
+#client.whatever_name_of_method_below()
+```
+
+| Method                                             | Explanation                                                                                                                                                                  |
+|----------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| suggest_abacus_apis                                | Describe what you need, and we will return the methods that will help you achieve it.                                                                                        |
+| describe_project                                   | Describe's project                                                                                                                                                           |
+| create_dataset_from_upload                         | Creates a dataset object from local data                                                                                                                                     |
+| describe_feature_group_by_table_name               | Describes the feature group using the table name                                                                                                                             |
+| describe_feature_group_version                     | Describes the feature group using the feature group version                                                                                                                  |
+| list_models                                        | List's models of a project                                                                                                                                                   |
+| extract_data_using_llm                             | Extracts data from a document. Allows you to create a json output and extract specific information from a document                                                           |
+| execute_data_query_using_llm                       | Runs SQL on top of feature groups based on natural language input. Can return both SQL and the result of SQL execution.                                                      |
+| get_chat_response                                  | Uses a chatLLM deployment. Can be used to add filters, change LLM and do advanced use cases using an agent on top of a ChatLLM deployment.                                   |
+| get_chat_response_with_binary_data                 | Same as above, but you can also send a binary dataset                                                                                                                        |
+| get_conversation_response                          | Uses a chatLLM deployment with conversation history. Useful when you need to use the API. You create a conversation ID and you send it or you use the one created by Abacus. |
+| get_conversation_response_with_binary_data         | Same as above, but you can also send a binary dataset                                                                                                                        |
+| evaluate_prompt                                    | LLM call for a user query. Can get JSON output using additional arguments                                                                                                    |
+| get_matching_documents                             | Gets the search results for a user query using document retriever directly. Can be used along with evaluate_prompt to create a customized chat llm like agent                |
+| get_relevant_snippets                              | Creates a doc retriever on the fly for retrieving search results                                                                                                             |
+| extract_document_data                              | Extract data from a PDF, Word document, etc using OCR or using the digital text.                                                                                             |
+| get_docstore_document                              | Download document from the doc store using their doc_id.                                                                                                                     |
+| get_docstore_document_data                         | Get extracted or embedded text from a document using their doc_id.                                                                                                                         |
+| stream_message                                     | Streams message on the UI for agents                                                                                                                                         |
+| update_feature_group_sql_definition                | Updates the SQL definition of a feature group                                                                                                                                |
+| query_database_connector                           | Executes a SQL query on top of a database connector. Will only work for connectors that support it.                                                                          |
+| export_feature_group_version_to_file_connector     | Exports a feature group to a file connector                                                                                                                                  |
+| export_feature_group_version_to_database_connector | Exports a feature group to a database connector                                                                                                                              |
+| create_dataset_version_from_file_connector         | Refreshes data from the file connector connected to the file connector.                                                                                                      |
+| create_dataset_version_from_database_connector     | Refreshes data from the file connector connected to the database connector.                                                                                                  |
diff --git a/examples/basics/basics.ipynb b/examples/basics/basics.ipynb
@@ -0,0 +1,291 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Connect to Abacus\n",
+    "You can find your API key here: [API KEY](https://abacus.ai/app/profile/apikey)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import abacusai\n",
+    "client = abacusai.ApiClient(\"YOUR API KEY\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Finding API's easily.\n",
+    "There are two ways to find API's easily in the Abacus platform:\n",
+    "1. Try auto-completion by using tab. Most API's follow expressive language so you can search them using the autocomplete feature.\n",
+    "2. Use the `suggest_abacus_apis` method. This method calls a large language model that has access to our full documentation. It can suggest you what API works for what you are trying to do.\n",
+    "3. Use the official [Python SDK documentation](https://abacusai.github.io/api-python/autoapi/abacusai/index.html) page which will have all the available methods and attributes of classes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "apis = client.suggest_abacus_apis(\"list feature groups in a project\", verbosity=2, limit=3)\n",
+    "for api in apis:\n",
+    "    print(f\"Method: {api.method}\")\n",
+    "    print(f\"Docstring: {api.docstring}\")\n",
+    "    print(\"---\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Project Level API's\n",
+    "You can find the ID easily by looking at the URL in your browser. For example, if your URL looks like this: `https://abacus.ai/app/projects/fsdfasg33?doUpload=true`, the project id is \"fsdfasg33\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Gets information about the project based on the ID.\n",
+    "project = client.describe_project(project_id=\"YOUR_PROJECT_ID\")\n",
+    "\n",
+    "# A list of all models trained under the project\n",
+    "models = client.list_models(project_id=\"YOUR_PROJECT_ID\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Load a Feature Group"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Loads the specific version of a FeatureGroup class object \n",
+    "fg = client.describe_feature_group_version(\"FEATURE_GROUP_VERSION\")\n",
+    "\n",
+    "# Loads the latest version of a FeatureGroup class object based on a name\n",
+    "fg = client.describe_feature_group_by_table_name(\"FEATURE_GROUP_NAME\")\n",
+    "\n",
+    "# Loads the FeatureGroup as a pandas dataframe\n",
+    "df = fg.load_as_pandas()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Add a Feature Group to the Project"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# First we connect our docstore to our project\n",
+    "client.add_feature_group_to_project(\n",
+    "    feature_group_id='FEATURE_GROUP_ID',\n",
+    "    project_id='PROJECT_ID',\n",
+    "    feature_group_type='CUSTOM_TABLE'  # You can set to DOCUMENTS if this is a document set\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Update the feature group SQL definition"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "client.update_feature_group_sql_definition('YOUR_FG_ID', 'SQL')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Creating a Dataset from local\n",
+    "For every dataset created, a feature group with the same name will also be generated. When you need to update the source data, just update the dataset directly and the feature group will also reflect those changes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import io\n",
+    "zip_filename= 'sample_data_folder.zip'\n",
+    "\n",
+    "with open(zip_filename, 'rb') as f:\n",
+    "    zip_file_content = f.read()\n",
+    "\n",
+    "zip_file_io = io.BytesIO(zip_file_content)\n",
+    "\n",
+    "# If the ZIP folder contains unstructured text documents (PDF, Word, etc.), then set `is_documentset` == True\n",
+    "upload = client.create_dataset_from_upload(table_name='MY_SAMPLE_DATA', file_format='ZIP', is_documentset=False)\n",
+    "upload.upload_file(zip_file_io)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Updating a Dataset from local"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "upload = client.create_dataset_version_from_upload(dataset_id='YOUR_DATASET_ID', file_format='ZIP')\n",
+    "upload.upload_file(zip_file_io)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Executing SQL using a connector"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "connector_id = \"YOUR_CONNECTOR_ID\"\n",
+    "sql_query = \"SELECT * FROM TABLE LIMIT 5\"\n",
+    "\n",
+    "result = client.query_database_connector(connector_id, sql_query)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Uploading a Dataset using a connector\n",
+    "\n",
+    "`doc_processing_config` is optional depending on if you want to load a document set or no. use the code below and change based on your application. \n",
+    "\n",
+    "Similar to `create_dataset_from_file_connector` you can use `create_dataset_from_database_connector`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# doc_processing_config = abacusai.DatasetDocumentProcessingConfig(\n",
+    "#     extract_bounding_boxes=True,\n",
+    "#     use_full_ocr=False,\n",
+    "#     remove_header_footer=False,\n",
+    "#     remove_watermarks=True,\n",
+    "#     convert_to_markdown=False,\n",
+    "# )\n",
+    "\n",
+    "dataset = client.create_dataset_from_file_connector(\n",
+    "    table_name=\"MY_TABLE_NAME\",\n",
+    "    location=\"azure://my-location:share/whatever/*\",\n",
+    "    # refresh_schedule=\"0 0 * * *\",  # Daily refresh at midnight UTC\n",
+    "    # is_documentset=True, #Only if this is an actual documentset (Meaning word documents, PDF files, etc)\n",
+    "    # extract_bounding_boxes=True,\n",
+    "    # document_processing_config=doc_processing_config,\n",
+    "    # reference_only_documentset=False,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Updating a Dataset using a connector"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "client.create_dataset_version_from_file_connector('DATASET_ID') # For file connector\n",
+    "client.create_dataset_version_from_database_connector('DATASET_ID')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Export A feature group to a connector\n",
+    "Below code will also work for non-SQL connectors like blob storages. The `database_feature_mapping` would be optional in those cases.\n",
+    "\n",
+    "You can find the `connector_id` [here](https://abacus.ai/app/profile/connected_services)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "WRITEBACK = 'Anonymized_Store_Week_Result'\n",
+    "MAPPING = {\n",
+    "  'COLUMN_1': 'COLUMN_1', \n",
+    "  'COLUMN_2': 'COLUMN_2', \n",
+    "}\n",
+    "\n",
+    "feature_group = client.describe_feature_group_by_table_name(f\"FEATURE_GROUP_NAME\")\n",
+    "feature_group.materialize() # To make sure we have latest version\n",
+    "feature_group_version = feature_group.latest_feature_group_version.feature_group_version\n",
+    "client.export_feature_group_version_to_database_connector(\n",
+    "    feature_group_version, \n",
+    "    database_connector_id='connector_id',\n",
+    "    object_name=WRITEBACK,\n",
+    "    database_feature_mapping=MAPPING, \n",
+    "    write_mode='insert'\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### "
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}