# Fix for the topic tree issues #11

**Merged** Nov 8, 2024 (6 commits)
**README.md** (39 additions, 40 deletions)

Promptwright was inspired by [redotvideo/pluto](https://github.com/redotvideo/pluto);
in fact, it started as a fork, but ended up largely being a rewrite that allows
dataset generation against a local LLM.

The library interfaces with Ollama, making it easy to just pull a model and run
Promptwright.

## Features

To run an example:
4. Set the `model_name` in the chosen example file to the model you have downloaded. The configuration below is taken from the example; a possible continuation is sketched after this list.

```python
tree = TopicTree(
    args=TopicTreeArguments(
        root_prompt="Creative Writing Prompts",
        model_system_prompt=system_prompt,
        tree_degree=5,  # Increase degree for more prompts
        tree_depth=4,  # Increase depth for more prompts
        temperature=0.9,  # Higher temperature for more creative variations
        model_name="ollama/llama3",  # Set the model name here
    )
)
engine = DataEngine(
    args=EngineArguments(
        instructions="Generate creative writing prompts and example responses.",
        system_prompt="You are a creative writing instructor providing writing prompts and example responses.",
        model_name="ollama/llama3",
        temperature=0.9,
        max_retries=2,
    )
)
```

> Contributor review comment on `model_name`: worth having as a constant or an env var that can be configured.
5. Run your chosen example file.
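
The configuration in step 4 only constructs the topic tree and the data engine. A complete example file would then build the tree and generate the dataset; the continuation below is a sketch only, and the `build_tree`, `create_data`, and `save` calls, along with their arguments, are assumptions about the API rather than confirmed signatures:

```python
# Hypothetical continuation of the step 4 configuration; method names
# and arguments are assumptions, not a confirmed Promptwright API.
tree.build_tree()              # expand the topic tree using the configured model
tree.save("topic_tree.jsonl")  # persist the generated topics

dataset = engine.create_data(
    num_steps=5,      # number of generation batches to run
    batch_size=1,     # prompts generated per batch
    topic_tree=tree,  # seed generation from the topic tree built above
)
dataset.save("creative_writing_prompts.jsonl")
```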

### Library Overview

#### Classes

- **Dataset**: A class for managing generated datasets.
- **LocalDataEngine**: The main engine responsible for interacting with the LLM client and generating datasets.
- **LocalEngineArguments**: A configuration class that defines the instructions, system prompt, model name, temperature, retries, and prompt templates used for generating data.
- **OllamaClient**: A client class for interacting with the Ollama API.
- **HFUploader**: A utility class for uploading datasets to Hugging Face (pass in the path to the dataset and token); a hypothetical usage sketch follows this list.
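
For the last item, usage might look roughly like the following; the constructor arguments and the `upload()` method are hypothetical, inferred only from the description above (a dataset path and a token), not from the actual class:

```python
# Hypothetical HFUploader usage; the argument names and the upload()
# method are assumptions based on the class description above.
uploader = HFUploader(
    dataset_path="creative_writing_prompts.jsonl",
    token="hf_...",  # your Hugging Face access token
)
uploader.upload()
```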

### Troubleshooting

If you encounter any errors while running the script, here are a few common troubleshooting steps:

1. **Restart Ollama**:
```bash
killall ollama && ollama serve
```

2. **Verify Model Installation**:
```bash
ollama pull {model_name}
```

3. **Check Ollama Logs**:
   Inspect the logs for any error messages that might provide more context on
   what went wrong; these can be found in the `~/.ollama/logs` directory.

## Model Compatibility

The library should work with most LLM models. It has been tested with the
following models so far:

- **Mistral**
- **LLaMA3**
- **Qwen2.5**

If you test any more models, please make a pull request to update this list!

## Unpredictable Behavior

The library is designed to generate synthetic data based on the prompts and
instructions provided. The quality of the generated data depends on the quality
of the prompts and the model used; the library does not guarantee the quality
of the generated data.

Large language models can sometimes generate unpredictable or inappropriate
content, and the authors of this library are not responsible for the content
generated by the models. We recommend reviewing the generated data before using
it in any production environment.

Large language models can also fail to follow the JSON formatting required by
the prompt and may generate invalid JSON. This is a known issue with the
underlying models, not the library. We handle these errors by retrying the
generation process and filtering out invalid JSON. The failure rate is low, but
it can happen; each failure is reported in a final summary.
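
This retry-and-filter behaviour happens inside the library, but as a rough illustration of the general pattern (not Promptwright's actual implementation; `generate_valid_json` and its signature are invented for this sketch):

```python
import json


def generate_valid_json(generate, max_retries=2):
    """Call `generate` until it returns parseable JSON, or give up.

    `generate` is any zero-argument callable returning a model response
    as a string. Returns (parsed_object_or_None, failure_count) so that
    invalid results can be filtered out and reported in a final summary.
    """
    failures = 0
    for _ in range(max_retries + 1):
        raw = generate()
        try:
            return json.loads(raw), failures
        except json.JSONDecodeError:
            failures += 1  # invalid JSON: count it and retry
    return None, failures  # caller filters out None results
```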

## Contributing

If something here could be improved, please open an issue or submit a pull request.
