PyFinder is the Python implementation of the Pathfinder tool for the Circles UBI project. It provides network flow algorithms and tools for analyzing and visualizing the flow of value through a network of trust-based connections, helping users understand and optimize value transfers in the Circles UBI ecosystem.
- Methodology
- Installation
- Usage
- Project Structure
- Class Descriptions
- Running the Script
- Examples
- Dashboard
The graph is constructed using two primary data sources:
- Trusts: This dataset defines which tokens each account trusts and is willing to accept.
- Balances: This dataset shows the token balances held by each account.
For each account (referred to as the "truster"), we follow these steps:
- Identify all tokens trusted by the account.
- Find all other accounts holding any of these trusted tokens.
- Create edges from these token-holding accounts to the truster.
- Each edge represents a potential flow of a specific token from the holder to the truster.
This process is repeated for all accounts, resulting in a complex, multi-directed graph.
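To make the construction concrete, here is a minimal sketch (not the repository's own code, which lives under src/; it assumes the two datasets are pandas DataFrames with the CSV columns listed later under Running the Script) that builds the multi-directed graph with networkx:

import networkx as nx
import pandas as pd

def build_trust_graph(df_trusts: pd.DataFrame, df_balances: pd.DataFrame) -> nx.MultiDiGraph:
    """Sketch: for every truster, add one edge per (holder, token) pair
    where the token is trusted by the truster and held by the holder."""
    g = nx.MultiDiGraph()
    holders = df_balances.groupby("tokenAddress")   # token -> rows of its holders
    for truster, trusted in df_trusts.groupby("truster"):
        for token in trusted["trustee"]:            # tokens this truster accepts
            if token not in holders.groups:
                continue
            for _, row in holders.get_group(token).iterrows():
                if row["account"] != truster:       # skip self-edges
                    g.add_edge(row["account"], truster, token=token)
    return g

Edge capacities are deliberately omitted here; how they are assigned is exactly the subject of the next paragraphs.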
In a naive implementation, each edge's capacity would be set to the token balance held by the sender. However, this approach presents a problem: if a sender has multiple edges for the same token (connecting to different trusters), it could lead to "balance non-conservation." This occurs because standard flow algorithms are not aware of the need to conserve total balance across multiple edges.
For example, if account A holds 100 units of token B and can send to both accounts C and D, a naive implementation might allow a total flow of 200 units (100 to C and 100 to D), which violates the actual balance constraint.
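A toy max-flow computation makes the violation visible. This is illustrative only, not PyFinder code; a hypothetical super-sink T is attached to both trusters so A's total outflow can be measured:

import networkx as nx

# A holds 100 units of token B; naive edges give each truster the full balance.
g = nx.DiGraph()
g.add_edge("A", "C", capacity=100)
g.add_edge("A", "D", capacity=100)
# Edges without a capacity attribute are treated as unbounded by networkx.
g.add_edge("C", "T")
g.add_edge("D", "T")

flow_value, _ = nx.maximum_flow(g, "A", "T")
print(flow_value)  # 200: twice the balance A actually holds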
To enforce balance conservation, we introduce intermediate nodes:
- For each unique combination of sender and token, we create an intermediate node.
- Instead of direct edges from sender to trusters, we now have:
  - An edge from the sender to the intermediate node
  - Edges from the intermediate node to each truster accepting that token

For example:
- If A can send token B to both C and D:
  - We create an intermediate node A_B
  - We add an edge from A to A_B with capacity equal to A's balance of token B
  - We add edges from A_B to C and from A_B to D
This structure ensures:
- The total outflow of token B from A is limited to its actual balance
- A_B acts as a "gate," enforcing the balance constraint
- The flow to C and D can be any combination, as long as their sum doesn't exceed A's balance of token B
By using this intermediate node structure, we automatically enforce balance conservation without needing to modify standard flow algorithms.
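Re-running the toy computation above with the gate node in place shows the constraint holding:

import networkx as nx

# Same toy network, now gated through the intermediate node A_B.
g = nx.DiGraph()
g.add_edge("A", "A_B", capacity=100)  # gate edge: A's entire balance of token B
g.add_edge("A_B", "C")                # C accepts token B (unbounded)
g.add_edge("A_B", "D")                # D accepts token B (unbounded)
g.add_edge("C", "T")
g.add_edge("D", "T")

flow_value, flow_dict = nx.maximum_flow(g, "A", "T")
print(flow_value)        # 100: conservation enforced by the gate edge
print(flow_dict["A_B"])  # any split between C and D, summing to 100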
For a concrete case, take a requested flow of 129130 mCRC from node 109023 to node 52226. The full graph implementation gives:
which can then be simplified to:
- Clone the repository:
git clone https://github.com/hdser/pyfinder.git
cd pyfinder
- Install dependencies using conda:
conda env create -f environment.yml
conda activate pyfinder
- Run the dashboard:
python run.py
Or run the dashboard with Docker:
docker compose up
PyFinder provides three main interfaces for analysis:
The dashboard offers a web-based interface for interactive analysis. Launch options:
# Direct Python launch
python run.py
# Docker Compose launch
docker compose up
# Manual Docker launch
docker build -t pyfinder .
docker run -p 5006:5006 pyfinder
Configuration:
# Default settings in run.py
port = int(os.getenv('PORT', '5006'))
host = os.getenv('HOST', '0.0.0.0')
websocket_max_size = int(os.getenv('BOKEH_WEBSOCKET_MAX_MESSAGE_SIZE', 20*1024*1024))
The CLI tool provides two operation modes:
python -m src.main \
--trust-file data/trust.csv \
--balance-file data/balance.csv \
--implementation networkx \
--source 0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0 \
--sink 0xe98f0672a8e31b408124f975749905f8003a2e04
Available implementations:
- networkx (Python-based, full algorithm support)
- graph_tool (C++-based, high performance)
- ortools (Industrial solver)
python -m src.main --benchmark \
--implementations networkx,graph_tool \
--output-dir benchmarks
Specialized tool for detecting arbitrage opportunities:
python arb.py \
--trust-file data/trust.csv \
--balance-file data/balance.csv \
--implementation graph_tool \
--source 0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0 \
--start-token 0xe98f0672a8e31b408124f975749905f8003a2e04 \
--end-token 0x123...
pyfinder/
├── src/                      # Core implementation
│   ├── graph/                # Graph implementations
│   │   ├── base.py           # Abstract base classes
│   │   ├── networkx_graph.py
│   │   ├── graphtool_graph.py
│   │   └── ortools_graph.py
│   ├── flow/                 # Flow algorithms
│   │   ├── analysis.py       # Flow analysis
│   │   ├── decomposition.py
│   │   └── utils.py
│   ├── visualization.py      # Visualization tools
│   ├── data_ingestion.py     # Data loading
│   └── graph_manager.py      # Orchestration
├── dashboard/                # Web interface
│   ├── components/           # UI components
│   └── visualization/        # Interactive viz
├── data/                     # Input data
└── output/                   # Analysis results
GraphManager is the core orchestration class that coordinates analysis:
from src.graph_manager import GraphManager

# Initialize with CSV data
manager = GraphManager(
    data_source=("data/trust.csv", "data/balance.csv"),
    graph_type="networkx"
)

# Analyze flow
results = manager.analyze_flow(
    source="0x3fb4...",
    sink="0xe98f...",
    flow_func=None,  # Use default algorithm
    requested_flow="4000000000000000000"
)

# Unpack results
flow_value, paths, simplified_flows, original_flows = results
DataIngestion handles data loading and preprocessing:
from src.data_ingestion import DataIngestion
import pandas as pd
# Load data
df_trusts = pd.read_csv("data/trust.csv")
df_balances = pd.read_csv("data/balance.csv")
# Initialize ingestion
data = DataIngestion(df_trusts, df_balances)
# Access mappings
node_id = data.get_id_for_address("0x3fb4...")
address = data.get_address_for_id("42")
Graph implementation wrappers expose a common interface over NetworkX, graph-tool, and OR-Tools:
import networkx as nx

from src.graph import NetworkXGraph

# Create graph
graph = NetworkXGraph(edges, capacities, tokens)

# Compute flow
flow_value, flow_dict = graph.compute_flow(
    source="42",
    sink="318",
    flow_func=nx.algorithms.flow.preflow_push
)

# Decompose into paths
paths, edge_flows = graph.flow_decomposition(
    flow_dict, "42", "318"
)
NetworkFlowAnalysis handles the core flow computations:
from src.flow.analysis import NetworkFlowAnalysis
# Initialize analyzer
analyzer = NetworkFlowAnalysis(graph)
# Analyze flow
flow_value, paths, simplified_flows, original_flows = analyzer.analyze_flow(
    source="42",
    sink="318",
    flow_func=None,
    requested_flow="4000000000000000000"
)
The Visualization class creates analysis visualizations:
from src.visualization import Visualization
# Initialize visualizer
viz = Visualization()
# Create path visualization
viz.plot_flow_paths(
    graph,
    paths,
    simplified_flows,
    id_to_address,
    "output/paths.png"
)

# Create full graph visualization
viz.plot_full_flow_paths(
    graph,
    original_flows,
    id_to_address,
    "output/full_graph.png"
)
Place data files in the data/ directory:
data/
├── trust.csv # Trust relationships
└── balance.csv # Account balances
Required CSV formats:
# trust.csv
trustee,truster
0x3fb4...,0xe98f...
# balance.csv
account,tokenAddress,demurragedTotalBalance
0x3fb4...,0xe98f...,4000000000000000000
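For a quick smoke test, toy input files with exactly these columns can be written with pandas (the addresses below are the sample values used throughout this README, not real accounts):

import pandas as pd

# Minimal inputs matching the required CSV formats above.
pd.DataFrame({
    "trustee": ["0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0"],
    "truster": ["0xe98f0672a8e31b408124f975749905f8003a2e04"],
}).to_csv("data/trust.csv", index=False)

pd.DataFrame({
    "account": ["0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0"],
    "tokenAddress": ["0xe98f0672a8e31b408124f975749905f8003a2e04"],
    "demurragedTotalBalance": ["4000000000000000000"],
}).to_csv("data/balance.csv", index=False)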
Run the analysis:
python -m src.main
Select mode when prompted:
Choose mode:
1. Run Mode (analyze specific flow)
2. Benchmark Mode (compare algorithms)
Enter choice (1-2):
For Run Mode, enter:
- Source address (42-char hex)
- Sink address (42-char hex)
- Flow amount (optional)
- Algorithm choice:
  - Preflow Push (default)
  - Edmonds-Karp
  - Shortest Augmenting Path
  - Boykov-Kolmogorov
  - Dinitz
The script produces:
- Console output:
  - Flow value
  - Computation time
  - Path details
  - Performance metrics
- Visualization files:
  - output/simplified_paths.png
  - output/full_paths.png
  - output/metrics.csv
from src.graph_manager import GraphManager
# Initialize manager
manager = GraphManager(
    data_source=("data/trust.csv", "data/balance.csv"),
    graph_type="networkx"
)

# Analyze maximum flow
results = manager.analyze_flow(
    source="0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0",
    sink="0xe98f0672a8e31b408124f975749905f8003a2e04"
)

# Process results
flow_value, paths, simplified_flows, original_flows = results
print(f"Maximum flow: {flow_value}")

for path, tokens, flow in paths:
    print("\nPath:")
    print(f"Nodes: {' -> '.join(path)}")
    print(f"Tokens: {' -> '.join(tokens)}")
    print(f"Flow: {flow}")
# Initialize with graph-tool for performance
manager = GraphManager(
    data_source=("data/trust.csv", "data/balance.csv"),
    graph_type="graph_tool"
)

# Find arbitrage opportunities
flow_value, paths, flows = manager.analyze_arbitrage(
    source="0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0",
    start_token="0xe98f0672a8e31b408124f975749905f8003a2e04",
    end_token="0x123..."
)

if flow_value > 0:
    print(f"\nFound arbitrage: {flow_value} mCRC")
    for path, tokens, amount in paths:
        print(f"\nPath: {' -> '.join(path)}")
        print(f"Token conversions: {' -> '.join(tokens)}")
        print(f"Amount: {amount} mCRC")
import time
from src.graph_manager import GraphManager
implementations = ["networkx", "graph_tool", "ortools"]
results = []
for impl in implementations:
    manager = GraphManager(
        data_source=("data/trust.csv", "data/balance.csv"),
        graph_type=impl
    )
    start_time = time.time()
    flow_value, _, _, _ = manager.analyze_flow(
        source="0x3fb47823a7c66553fb6560b75966ef71f5ccf1d0",
        sink="0xe98f0672a8e31b408124f975749905f8003a2e04"
    )
    duration = time.time() - start_time
    results.append({
        "implementation": impl,
        "flow_value": flow_value,
        "computation_time": duration
    })

for result in results:
    print(f"\n{result['implementation']}:")
    print(f"Flow: {result['flow_value']}")
    print(f"Time: {result['computation_time']:.4f}s")
The dashboard provides an interactive web interface for analysis:
- Data Source Configuration
  - File upload
  - PostgreSQL connection
  - Environment configuration
- Analysis Controls
  - Source/sink selection
  - Algorithm choice
  - Flow amount input
- Visualization Panels
  - Network overview
  - Path visualization
  - Flow metrics
- Results Display
  - Flow statistics
  - Path breakdown
  - Transaction list
- Direct Python:
python run.py
- Docker Compose:
docker compose up
- Manual Docker:
docker build -t pyfinder .
docker run -p 5006:5006 pyfinder
Access the dashboard at http://localhost:5006
This project is licensed under the MIT License - see the LICENSE file for details.