By Alex Pareto and Sam Kaplan for the Pear+OpenAI hackathon.
Victor is a new architecture for higher-level intelligence with LLMs. Right now, LLMs are limited in their ability to learn: after initial training, you're pretty much limited to fine-tuning them, which is a manual and brittle process.
Victor is based on a simple insight: evolutionary algorithms over code are very good at improving the reasoning ability of LLMs (Eureka, DeepMind, etc.). Given a task `T(inputData) → Answer`, Victor automatically decomposes the task into sub-functions and creates a graph where each node is a function. Through this representation, we can define actions on the graph:
- Generate - Look at each function and write a new version. The new version may also introduce new sub-functions. Changes must be backwards compatible so that dependent functions continue to work. New versions are added to the graph but do not replace previous ones.
- Backpropagate - The generate step creates more functions; the backprop step evaluates and ranks them. A critical insight for automated prompt revision is how to effectively define an objective function over arbitrary prompts. Our initial implementation treats this as a ranking problem: two versions of a function are chosen to be compared, along with an output node. The output node can be the function itself, or it can be an upstream function. In this way, a change to a highly nested function can be evaluated by its effect on the final output.
- Evaluate - This means solving the task `T(inputData)`. Nodes are chosen based on their quality from backpropagation, with `temperature` introducing randomness (see the sketch after this list). All inputs and outputs are automatically captured for each node/function call, so this granular data can be used when backpropagating.
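To make these actions concrete, here is a minimal sketch of what the graph representation and the temperature-weighted version selection could look like. Every name below (`FunctionNode`, `pick_version`, the fields) is a hypothetical illustration, not Victor's actual code:

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical sketch of the program graph: every node is one version of a
# function, and its deps point at the sub-functions it calls.
@dataclass
class FunctionNode:
    name: str                 # logical function name, e.g. "extract_claims"
    version: int              # versions accumulate; old ones are never replaced
    source: str               # the function body / prompt for this version
    deps: list["FunctionNode"] = field(default_factory=list)
    quality: float = 0.0      # score assigned by the backpropagate step

def pick_version(versions: list[FunctionNode], temperature: float) -> FunctionNode:
    """Softmax-sample among competing versions of one function.

    Higher temperature flattens the distribution, adding the randomness
    used during the evaluate step."""
    weights = [math.exp(v.quality / temperature) for v in versions]
    return random.choices(versions, weights=weights, k=1)[0]
```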
With this, we have a system that can autonomously learn and improve. Generate improves the solution by rewriting functions and decomposing them into finer and finer steps; backprop then granularly evaluates the new functions against existing sample data.
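The pairwise ranking at the heart of backpropagation might look roughly like the following sketch. The `run_with_version` and `judge` callables are assumptions standing in for Victor's graph executor and its comparison mechanism (e.g. an LLM prompted to pick the better output); the real implementation may differ:

```python
from typing import Callable

def rank_pair(
    run_with_version: Callable[[str, object], str],  # (version_id, input) -> output at the chosen output node
    judge: Callable[[str, str], int],                # (out_a, out_b) -> 0 if the first output wins, else 1
    version_a: str,
    version_b: str,
    sample_inputs: list,
) -> tuple[int, int]:
    """Run the graph up to the chosen output node once per competing version,
    then let a judge score each sample. Returns (wins for a, wins for b)."""
    wins_a = wins_b = 0
    for x in sample_inputs:
        out_a = run_with_version(version_a, x)
        out_b = run_with_version(version_b, x)
        if judge(out_a, out_b) == 0:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a, wins_b
```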
Most impressively, the more tasks Victor is given, the better it becomes at new tasks, since it builds up a larger and larger repertoire of functions to draw on.
Victor is especially good for use cases involving transformations over data.
- Question answering
  - Given podcast transcripts, which are the best episodes?
- Automatically compute statistics across many subcomponents of data (e.g. financial data)
- Detailed data analysis and essay writing
- Agentic behavior - self-learning
- Working end to end
- Program graph schema
  - Function dependencies
  - Full invocation history
- Program generation
  - Automatically generate programs based on a task
- Program improvements
  - Sub-function decomposition and generation
- Program execution
  - Automatically compose dependencies and execute
  - Automatically store inputs and outputs of every function call node (code is transpiled to inject state management between each function call; see the sketch after this list)
  - Automatic rewriting on errors
- Evaluation, fitness, and backpropagation
- Frontend
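One simple way to realize the per-call state capture mentioned above is a wrapper around every generated function. This decorator is only an illustrative stand-in for the state management Victor injects during transpilation:

```python
import functools
import time

INVOCATION_LOG: list[dict] = []  # full invocation history, one record per call

def capture(node_name: str):
    """Record every call's inputs and outputs for the named function node.

    Victor injects equivalent state management during transpilation; this
    decorator only illustrates the idea."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            record = {"node": node_name, "args": args,
                      "kwargs": kwargs, "time": time.time()}
            try:
                record["output"] = fn(*args, **kwargs)
                return record["output"]
            finally:
                INVOCATION_LOG.append(record)  # recorded even if fn raises
        return wrapper
    return decorator
```

Applying `@capture("node_name")` to each generated function would then populate the granular invocation history that backpropagation consumes.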
The current implementation is very basic. A significant enhancement would be to conceptualize each node's quality not as a fixed value, but as a probability distribution. Each node i would maintain a belief about its quality, e.g. q_i ~ N(μ_i, σ_i²). The posterior distribution over q_i would be updated after every comparison made during backpropagation.
With this formulation, the system can strategically choose the next node to measure in a manner that maximizes information gain. This decision is based on the expected reduction in uncertainty about the node's fitness, quantified by the variance of the posterior distribution. The node selected for the next measurement, say node i*, is given by i* = argmax_i σ_i², where σ_i² is the variance of node i's posterior quality distribution.
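A minimal sketch of this formulation, assuming a Gaussian belief per node and a known observation noise (both assumptions; the real update rule would depend on how comparisons are modeled):

```python
from dataclasses import dataclass

@dataclass
class QualityBelief:
    mean: float = 0.0   # current estimate of the node's quality
    var: float = 1.0    # uncertainty (variance) of that estimate

def next_node_to_measure(beliefs: dict[str, QualityBelief]) -> str:
    """Pick the node whose posterior is most uncertain, i.e. the measurement
    with the largest expected reduction in uncertainty."""
    return max(beliefs, key=lambda n: beliefs[n].var)

def gaussian_update(b: QualityBelief, obs: float, obs_var: float) -> QualityBelief:
    """Conjugate Gaussian update after a noisy quality observation of the node."""
    new_var = 1.0 / (1.0 / b.var + 1.0 / obs_var)
    new_mean = new_var * (b.mean / b.var + obs / obs_var)
    return QualityBelief(new_mean, new_var)
```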
More evaluation modules could be written, such as an agentic evaluator that autonomously makes a deeper judgment about which output is better.
More complex backprop algorithms could be used, such as ones that allow for updating more than two nodes at once.
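One such scheme would be an Elo-style update applied across a whole ranked batch of versions, so a single comparison round adjusts every version in the batch at once. The rating scheme below is an assumption for illustration, not something Victor currently implements:

```python
def elo_batch_update(ratings: dict[str, float], ranked: list[str], k: float = 16.0) -> None:
    """Update ratings in place from one ranked batch (best version first),
    treating every ordered pair as a win for the higher-ranked version."""
    for i, winner in enumerate(ranked):
        for loser in ranked[i + 1:]:
            expected = 1.0 / (1.0 + 10.0 ** ((ratings[loser] - ratings[winner]) / 400.0))
            ratings[winner] += k * (1.0 - expected)
            ratings[loser] -= k * (1.0 - expected)
```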