SIARE Quick Start Guide
Get a self-evolving RAG pipeline running in under 10 minutes.
What you’ll accomplish:
- Install SIARE and configure an LLM provider
- Run a demo that shows multi-agent execution and evaluation
- Understand the output and what comes next
Prerequisites
Choose your path based on your setup:
Option A: Local Development with Ollama (Free)
Requirements:
- Python 3.10 or higher
- Docker or Ollama installed locally
- 8GB+ RAM recommended
- No API keys required
Setup:
- Install Ollama: https://ollama.ai/download
- Pull a model: `ollama pull llama3.2`
- Verify Ollama is running: `ollama list`
Option B: Cloud LLM with OpenAI (API Key Required)
Requirements:
- Python 3.10 or higher
- OpenAI API key (https://platform.openai.com/api-keys)
- Active OpenAI account with credits
Setup:
- Get your API key from OpenAI dashboard
- Keep it ready for Step 2
Quick Start
Step 1: Clone and Setup
# Clone the repository
git clone https://github.com/synaptiai/siare.git
cd siare
# Create a virtual environment (recommended)
python -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Step 2: Configure Your Environment
For Option A (Ollama - Local):
# Start Ollama server (if not already running)
ollama serve
# In another terminal, pull the model
ollama pull llama3.2
# Verify the model is available
ollama list
# You should see llama3.2 in the list
For Option B (OpenAI - Cloud):
# Set your OpenAI API key
export OPENAI_API_KEY="sk-your-api-key-here"
# On Windows PowerShell:
# $env:OPENAI_API_KEY="sk-your-api-key-here"
# Verify it's set
echo $OPENAI_API_KEY
Step 3: Run the Demo
Execute the quickstart demo with your chosen provider:
# Option A: Using Ollama (local)
python -m siare.demos.agentic_rag_quickstart --provider ollama
# Option B: Using OpenAI (cloud)
python -m siare.demos.agentic_rag_quickstart --provider openai
The demo will run 3 iterations by default. You can customize this:
# Run 5 iterations
python -m siare.demos.agentic_rag_quickstart --provider ollama --iterations 5
# Run quietly (minimal output)
python -m siare.demos.agentic_rag_quickstart --provider openai --quiet
Step 4: Understand the Output
When you run the demo, you’ll see output like this:
============================================================
SIARE Quickstart Demo
============================================================
Provider: ollama
Model: llama3.2
Iterations: 3
[1/4] Services initialized
[2/4] Created RAG pipeline with 2 roles
--- Iteration 1/3 ---
Task 1: completed
Task 2: completed
Task 3: completed
Iteration accuracy: 78.33%
--- Iteration 2/3 ---
Task 1: completed
Task 2: completed
Task 3: completed
Iteration accuracy: 81.67%
--- Iteration 3/3 ---
Task 1: completed
Task 2: completed
Task 3: completed
Iteration accuracy: 83.33%
[3/4] Completed 3 iterations
[4/4] Results:
Initial score: 78.33%
Final score: 83.33%
Delta: +5.00%
============================================================
What Each Section Means:
- Services initialized: LLM provider, execution engine, and evaluation service are ready
- Created RAG pipeline with 2 roles:
  - retriever: Finds relevant documents
  - answerer: Generates answers based on retrieved documents
- Iteration results: Each iteration runs 3 sample tasks and calculates accuracy (see the worked example below)
- Final results: Shows performance across all iterations
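To illustrate the numbers: if the three tasks in an iteration scored 0.70, 0.80, and 0.85, their mean would be (0.70 + 0.80 + 0.85) / 3 ≈ 78.33%, which matches the format of the iteration accuracy shown above. The per-task scores you actually see will vary with your model and provider.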
Important Note: This demo runs the same pipeline multiple times to demonstrate execution and evaluation. For actual SOP evolution (where the system improves itself), use the EvolutionScheduler (see Run Full Evolution below).
What’s Next?
You’ve successfully run a multi-agent RAG pipeline! Here’s where to go based on your goal:
| I want to… | Do this |
|---|---|
| Customize prompts and add agents | Continue to Step 5: Customize Your Pipeline below |
| Run automatic evolution | Jump to Run Full Evolution |
| Understand how evolution works | Read Evolution Lifecycle |
| Build a domain-specific pipeline | Follow First Custom Pipeline |
| Deploy to production | See Deployment Guide |
Step 5: Customize Your Pipeline
The demo uses a simple 2-role pipeline. Let’s customize it step by step.
5.1 Add a New Agent (Ranker)
Add a document ranker between retriever and answerer:
from typing import List
from siare.core.models import ProcessConfig, RoleConfig, GraphEdge, RoleInput
def create_three_stage_rag(model: str) -> ProcessConfig:
"""Create a 3-stage RAG pipeline: retriever → ranker → answerer.
Args:
model: Model identifier (e.g., "llama3.2" for Ollama, "gpt-4o-mini" for OpenAI)
Returns:
ProcessConfig defining the multi-agent pipeline
"""
return ProcessConfig(
id="three_stage_rag",
version="1.0.0",
models={model: model},
tools=[],
roles=[
RoleConfig(
id="retriever",
model=model,
tools=[],
promptRef="retriever_prompt",
inputs=[RoleInput(from_="user_input")],
outputs=["documents"],
),
RoleConfig(
id="ranker", # New role: re-ranks retrieved documents
model=model,
tools=[],
promptRef="ranker_prompt",
inputs=[RoleInput(from_=["user_input", "retriever"])],
outputs=["ranked_docs"],
),
RoleConfig(
id="answerer",
model=model,
tools=[],
promptRef="answerer_prompt",
inputs=[RoleInput(from_=["user_input", "ranker"])],
outputs=["answer"],
),
],
graph=[
GraphEdge(from_="user_input", to="retriever"),
GraphEdge(from_=["user_input", "retriever"], to="ranker"),
GraphEdge(from_=["user_input", "ranker"], to="answerer"),
],
)
# Usage
sop = create_three_stage_rag("llama3.2")
print(f"Pipeline has {len(sop.roles)} roles: {[r.id for r in sop.roles]}")
What changed: We added a ranker role that sits between retriever and answerer. The ranker receives both the original query and retrieved documents, then outputs re-ranked documents for the answerer.
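Before wiring in prompts, you can sanity-check the new graph. The helper below is a minimal sketch, not part of the SIARE API; it only assumes the attribute names used in the config above (`roles`, `graph`, `id`, `from_`, `to`).

```python
from typing import List, Union

def _sources(edge_from: Union[str, List[str]]) -> List[str]:
    """Normalize from_, which may be a single id or a list of ids."""
    return edge_from if isinstance(edge_from, list) else [edge_from]

def check_wiring(sop) -> None:
    """Verify every edge targets a defined role and every source is a role or user_input."""
    role_ids = {role.id for role in sop.roles}
    known_sources = role_ids | {"user_input"}
    for edge in sop.graph:
        assert edge.to in role_ids, f"Edge targets unknown role: {edge.to}"
        for src in _sources(edge.from_):
            assert src in known_sources, f"Edge references unknown source: {src}"
    print(f"Wiring OK: {len(sop.graph)} edges across {len(role_ids)} roles")

check_wiring(create_three_stage_rag("llama3.2"))
```

A check like this catches typos in role ids or edges before you spend any LLM calls on a misconfigured pipeline.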
5.2 Customize Agent Prompts
Each role references a prompt by ID. Create the prompts that define agent behavior:
from siare.core.models import PromptGenome, RolePrompt
def create_three_stage_prompts() -> PromptGenome:
"""Create prompts for the three-stage RAG pipeline.
Returns:
PromptGenome containing all role prompts
"""
return PromptGenome(
id="three_stage_genome",
version="1.0.0",
rolePrompts={
"retriever_prompt": RolePrompt(
id="retriever_prompt",
content="""You are an expert document retrieval agent.
Given a query, find and return the TOP 5 most relevant passages.
Query: {query}
Focus on semantic relevance and recency.
Return as JSON: {"documents": [...]}""",
),
"ranker_prompt": RolePrompt(
id="ranker_prompt",
content="""You are a document relevance ranker.
Re-rank the following documents by relevance to the query.
Query: {query}
Documents: {documents}
Return the TOP 3 documents in order of relevance.
Return as JSON: {"ranked_docs": [...]}""",
),
"answerer_prompt": RolePrompt(
id="answerer_prompt",
content="""You are a precise question-answering agent.
Use ONLY the provided documents to answer. If unsure, say so.
Question: {query}
Documents: {ranked_docs}
Provide a concise, evidence-based answer with citations.""",
),
},
)
# Usage
genome = create_three_stage_prompts()
print(f"Prompts for: {list(genome.rolePrompts.keys())}")
What changed: We added a ranker_prompt and updated answerer_prompt to use {ranked_docs} instead of {documents}.
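Because roles reference prompts only by promptRef, it is easy to rename one side and forget the other. The check below is a small sketch (not an official SIARE utility) that relies only on the `promptRef` and `rolePrompts` fields shown above.

```python
def check_prompt_coverage(sop, genome) -> None:
    """Ensure every role's promptRef has a matching entry in the prompt genome."""
    missing = [role.promptRef for role in sop.roles if role.promptRef not in genome.rolePrompts]
    if missing:
        raise ValueError(f"No prompt defined for: {missing}")
    print(f"All {len(sop.roles)} roles have prompts: {sorted(genome.rolePrompts)}")

check_prompt_coverage(create_three_stage_rag("llama3.2"), create_three_stage_prompts())
```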
5.3 Add a Custom Evaluation Metric
Define domain-specific metrics to evaluate your pipeline:
from typing import Dict, List, Any
from siare.services.evaluation_service import EvaluationService
from siare.core.models import ExecutionTrace
def term_coverage_metric(trace: ExecutionTrace, task_data: Dict[str, Any]) -> float:
"""Check if the answer contains required domain terms.
Args:
trace: Execution trace containing role outputs
task_data: Task data including required_terms list
Returns:
Score between 0.0 and 1.0 indicating term coverage
"""
answer: str = trace.outputs.get("answer", "")
required_terms: List[str] = task_data.get("required_terms", [])
if not required_terms:
return 1.0 # No terms required = perfect score
matches = sum(1 for term in required_terms if term.lower() in answer.lower())
return matches / len(required_terms)
def citation_metric(trace: ExecutionTrace, task_data: Dict[str, Any]) -> float:
"""Check if the answer includes source citations.
Args:
trace: Execution trace containing role outputs
task_data: Task data (unused but required by interface)
Returns:
1.0 if citations present, 0.0 otherwise
"""
answer: str = trace.outputs.get("answer", "")
# Simple heuristic: check for citation patterns like [1], [Source], etc.
import re
has_citations = bool(re.search(r'\[[\w\d]+\]', answer))
return 1.0 if has_citations else 0.0
# Register custom metrics with the evaluation service
# evaluation_service = EvaluationService(llm_provider=your_provider)
# evaluation_service.register_metric_function("term_coverage", term_coverage_metric)
# evaluation_service.register_metric_function("has_citations", citation_metric)
What changed: We defined two custom metrics. term_coverage_metric checks if required domain terms appear in answers. citation_metric verifies that answers include source citations.
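Both metrics are plain functions, so you can smoke-test them without running a pipeline. In the sketch below, a SimpleNamespace stands in for an ExecutionTrace (only the outputs attribute is used); the sample answer and required terms are made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for an ExecutionTrace; in real runs the execution engine provides this object.
fake_trace = SimpleNamespace(
    outputs={"answer": "SIARE evolves pipelines through mutation and selection [1]."}
)
task = {"required_terms": ["mutation", "selection", "crossover"]}

print(term_coverage_metric(fake_trace, task))  # 2 of 3 terms present -> ~0.67
print(citation_metric(fake_trace, {}))         # "[1]" matches the citation pattern -> 1.0
```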
Run Full Evolution (Self-Improvement)
This is where SIARE’s power shines: automatic improvement of your pipeline through evolutionary optimization.
from typing import List
from siare.services.scheduler import EvolutionScheduler
from siare.services.director import DirectorService
from siare.services.gene_pool import GenePool
from siare.services.execution_engine import ExecutionEngine
from siare.services.evaluation_service import EvaluationService
from siare.core.models import (
EvolutionJob,
MetricConfig,
AggregationMethod,
MetricType,
Task,
SOPGene,
)
def run_evolution_example(llm_provider) -> List[SOPGene]:
"""Run an evolution job to automatically improve a RAG pipeline.
Args:
llm_provider: Configured LLM provider (OpenAI or Ollama)
Returns:
List of Pareto-optimal SOPs discovered during evolution
"""
# Initialize core services
gene_pool = GenePool()
execution_engine = ExecutionEngine(llm_provider=llm_provider)
evaluation_service = EvaluationService(llm_provider=llm_provider)
director = DirectorService(llm_provider=llm_provider)
scheduler = EvolutionScheduler(
gene_pool=gene_pool,
director=director,
execution_engine=execution_engine,
evaluation_service=evaluation_service,
)
# Define your task set (questions with ground truth answers)
task_set: List[Task] = [
Task(id="q1", input={"query": "What is SIARE?"}, groundTruth={"answer": "Self-Improving Agentic RAG Engine"}),
Task(id="q2", input={"query": "How does evolution work?"}, groundTruth={"answer": "Mutation and selection"}),
# Add more tasks for better evolution...
]
# Configure the evolution job
job = EvolutionJob(
id="rag_evolution_001",
baseSopIds=["three_stage_rag"], # Start from our 3-stage pipeline
taskSet=task_set,
metricsToOptimize=[
MetricConfig(
id="accuracy",
type=MetricType.LLM_JUDGE,
model="gpt-4o-mini",
promptRef="accuracy_judge",
inputs=["query", "answer", "groundTruth"],
aggregationMethod=AggregationMethod.MEAN,
),
MetricConfig(
id="cost",
type=MetricType.RUNTIME,
aggregationMethod=AggregationMethod.SUM,
),
],
constraints={
"maxCostPerTask": 0.10, # Max $0.10 per task
"minSafetyScore": 0.90, # Minimum 90% safety
},
maxGenerations=10, # Run 10 evolution cycles
populationSize=5, # Maintain 5 SOP variants
)
# Run evolution (this may take several minutes)
print("Starting evolution...")
scheduler.run_evolution(job)
print("Evolution complete!")
# Get the Pareto-optimal solutions (best trade-offs between accuracy and cost)
pareto_frontier: List[SOPGene] = gene_pool.get_pareto_frontier(
metrics=["accuracy", "cost"],
domain="rag",
)
# Display results
print(f"\nFound {len(pareto_frontier)} Pareto-optimal solutions:")
for sop in pareto_frontier:
print(f" SOP {sop.id} v{sop.version}")
print(f" Accuracy: {sop.metrics.get('accuracy', 0):.2%}")
print(f" Cost: ${sop.metrics.get('cost', 0):.4f}")
return pareto_frontier
What happens during evolution:
- Execute: Each SOP variant runs on your task set
- Evaluate: Metrics are computed (accuracy, cost, latency, etc.)
- Diagnose: AI Director analyzes failures and identifies weaknesses
- Mutate: Director proposes improvements (better prompts, new agents, rewired graphs)
- Select: Best solutions are kept, forming the next generation
The result is a Pareto frontier — a set of solutions that represent optimal trade-offs. You can then choose the SOP that best fits your needs (highest accuracy, lowest cost, or balanced).
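Choosing a candidate from the frontier is then ordinary list filtering. A minimal sketch, assuming the `metrics`, `id`, and `version` fields shown above and a hypothetical per-run cost budget:

```python
# pareto_frontier: the List[SOPGene] returned by run_evolution_example(...) above
COST_BUDGET = 0.05  # hypothetical budget, dollars per run

# Prefer the most accurate SOP that fits the budget; fall back to the full frontier if none do.
affordable = [s for s in pareto_frontier if s.metrics.get("cost", float("inf")) <= COST_BUDGET]
best = max(affordable or pareto_frontier, key=lambda s: s.metrics.get("accuracy", 0.0))
print(f"Selected {best.id} v{best.version}: "
      f"accuracy {best.metrics.get('accuracy', 0):.2%}, cost ${best.metrics.get('cost', 0):.4f}")
```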
Troubleshooting
Ollama Issues
Problem: RuntimeError: Ollama not running at http://localhost:11434
Solution:
# Check if Ollama is running
curl http://localhost:11434/api/tags
# If not running, start it
ollama serve
# Verify the model is pulled
ollama list
# If llama3.2 is missing:
ollama pull llama3.2
Problem: Slow performance with Ollama
Solution:
- Ensure you have at least 8GB RAM available
- Try a smaller model: `ollama pull llama3.2:1b`
- Use the smaller model: `--model llama3.2:1b`
OpenAI Issues
Problem: RuntimeError: OPENAI_API_KEY environment variable not set
Solution:
# Set the key
export OPENAI_API_KEY="sk-your-key-here"
# Verify it's set
echo $OPENAI_API_KEY
Problem: AuthenticationError or 401 Unauthorized
Solution:
- Verify your API key is valid at https://platform.openai.com/api-keys
- Check that your OpenAI account has available credits
- Ensure the key is exported correctly (no stray quotes or extra spaces)
General Issues
Problem: ModuleNotFoundError: No module named 'siare'
Solution:
# Make sure you're in the project directory
pwd # Should show .../siare
# Ensure dependencies are installed
pip install -r requirements.txt
# Install in development mode (optional)
pip install -e .
Problem: Tests failing or import errors
Solution:
# Verify Python version (must be 3.10+)
python --version
# Reinstall dependencies in a fresh venv
deactivate # if in a venv
rm -rf venv
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Learn More
Key Concepts
| Concept | Description |
|---|---|
| ProcessConfig (SOP) | Defines your multi-agent pipeline structure |
| PromptGenome | Contains all prompts used by agents |
| ExecutionEngine | Runs your pipeline as a directed acyclic graph (DAG) |
| EvaluationService | Measures pipeline performance across multiple metrics |
| Director | AI brain that diagnoses issues and proposes improvements |
| GenePool | Stores all SOP versions with their performance history |
Example Use Cases
| Domain | What SIARE Does |
|---|---|
| Customer Support RAG | Automatically improve answer quality and reduce costs |
| Legal Document Analysis | Evolve pipelines for better compliance detection |
| Research Paper Search | Optimize retrieval and summarization strategies |
| Clinical Trials | Match patients to trials with evolving criteria |
Documentation
| Topic | Link |
|---|---|
| System Architecture | SYSTEM_ARCHITECTURE.md |
| Data Models | DATA_MODELS.md |
| Configuration Reference | CONFIGURATION.md |
| Deployment Guide | DEPLOYMENT.md |
| Contributing | CONTRIBUTING.md |
Summary: What You’ve Learned
You’ve successfully:
- ✅ Installed SIARE and configured an LLM provider (Ollama or OpenAI)
- ✅ Run a multi-agent RAG pipeline
- ✅ Understood how to add agents, customize prompts, and add metrics
- ✅ Seen how to run automatic evolution for self-improvement
Continue Your Journey
| Next Step | Guide |
|---|---|
| Build a domain-specific pipeline | First Custom Pipeline |
| Understand evolution deeply | Evolution Lifecycle |
| Learn multi-agent design patterns | Multi-Agent Patterns |
| Add custom metrics and tools | Custom Extensions |
| Deploy to production | Deployment Guide |
Questions? Open an issue or start a discussion.