Lesson 2
45 min

Multi-Agent Systems

Workflow Patterns - Sequential, Parallel, and Loop patterns for agent coordination

The Problem with Monolithic Agents

Imagine you need an agent that researches a topic, writes an article, edits for grammar, fact-checks claims, and formats for publication.

You could put all of this into one agent with a huge instruction set. But that creates problems:

  • Hard to debug (which step failed?)
  • Hard to improve (change one thing, break another)
  • Unreliable (too many responsibilities)
  • Expensive (huge context = more tokens)
  • Inflexible (can't reuse components)
+-------------------------------------------------------------+
|                THE MONOLITHIC AGENT PROBLEM                 |
|              ---------------------------------              |
|                                                             |
|  instructions = """                                         |
|  You are a researcher, writer, editor, fact-checker,        |
|  and formatter. When asked to write an article:             |
|  1. First research the topic thoroughly...                  |
|  2. Then write a draft...                                   |
|  3. Then edit your draft...                                 |
|  4. Then fact-check all claims...                           |
|  5. Then format for publication...                          |
|  ... (500 more lines of instructions)                       |
|  """                                                        |
+-------------------------------------------------------------+

The Multi-Agent Solution

Instead of one agent doing everything, create a team of specialists. Each agent has one job and does it well.

Benefits:

  • Each agent has ONE clear job
  • Easy to test individually
  • Easy to improve one without breaking others
  • Cheaper (smaller contexts per agent)
  • Reusable (use same editor in different pipelines)

This is the same principle as microservices vs monoliths in software engineering.

+-------------------------------------------------------------+
|                  THE MULTI-AGENT SOLUTION                   |
|                                                             |
|          +---------------+                                  |
|          |  Coordinator  |  <-- Decides who does what       |
|          +-------+-------+                                  |
|                  |                                          |
|            +-----+-----+                                    |
|            |     |     |                                    |
|            v     v     v                                    |
|          +---+ +---+ +---+                                  |
|          | R | | W | | E |     R = Researcher               |
|          +---+ +---+ +---+     W = Writer                   |
|                                E = Editor                   |
+-------------------------------------------------------------+

Agent Delegation Pattern

A common pattern: one agent calls another as a tool. The "main" agent coordinates, and "specialist" agents each handle one specific task.

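In the example below, notice:

  • The main agent uses a more capable, more expensive model (gpt-4o) for coordination
  • The calculator uses a cheaper model (gpt-4o-mini) for a simple task
  • Matching model cost to task difficulty saves money without lowering quality on the hard steps

delegation.py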
from pydantic_ai import Agent, RunContext

# Specialist agent - does ONE thing well
calculator_agent = Agent(
    'openai:gpt-4o-mini',  # Cheaper model for simple task
    instructions='You are a calculator. Return only the numeric result.',
)

# Main agent - coordinates everything
main_agent = Agent(
    'openai:gpt-4o',
    instructions='You help with various tasks. Use the calculator for math.'
)

@main_agent.tool
async def calculate(ctx: RunContext[None], expression: str) -> str:
    """Calculate a mathematical expression."""
    result = await calculator_agent.run(
        f'Calculate: {expression}',
        usage=ctx.usage  # Track combined token usage
    )
    return result.output

# Use the system
result = main_agent.run_sync('What is 15% of 250?')
print(result.output)
# Example output (exact wording varies between runs): 15% of 250 is 37.5
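
The delegation flow can also be traced without any model calls. The sketch below is a stdlib-only stand-in, not the pydantic_ai API: `Usage`, `calculator_specialist`, and `main_coordinator` are hypothetical names, and a simple keyword rule replaces the LLM's decision to call the tool. It only illustrates how one shared usage object accumulates counts across the coordinator and the specialist.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical shared counter, standing in for the usage passed via ctx.usage."""
    requests: int = 0


async def calculator_specialist(expression: str, usage: Usage) -> str:
    # Stand-in for calculator_agent.run(...): handles only "X% of Y".
    usage.requests += 1
    percent, _, base = expression.partition('% of ')
    return str(float(percent) * float(base) / 100)


async def main_coordinator(prompt: str, usage: Usage) -> str:
    # Stand-in for main_agent. A real LLM decides when to call the tool;
    # here a keyword rule makes that decision (an assumption for the demo).
    usage.requests += 1
    if '% of' in prompt:
        expression = prompt.removeprefix('What is ').rstrip('?')
        answer = await calculator_specialist(expression, usage)
        return f'{expression} is {answer}'
    return 'No tool needed.'


usage = Usage()
reply = asyncio.run(main_coordinator('What is 15% of 250?', usage))
print(reply)           # 15% of 250 is 37.5
print(usage.requests)  # 2 (one coordinator call plus one specialist call)
```

Because both functions receive the same `Usage` instance, the final count covers the whole delegation chain, which is the purpose of passing `usage=ctx.usage` in the listing above.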

Research + Summarization System

A more practical example with two specialists:

research_summarize.py
from pydantic_ai import Agent, RunContext

# Agent 1: Researcher - finds information
research_agent = Agent(
    'openai:gpt-4o',
    instructions='''You are a research specialist.
    Find 2-3 key facts about the given topic.
    Be thorough but concise.'''
)

@research_agent.tool_plain
def web_search(query: str) -> str:
    """Search the web for information."""
    # Placeholder result: replace with a call to a real search API
    return f"Search results for '{query}': [relevant information here]"

# Agent 2: Summarizer - creates summaries
summarizer_agent = Agent(
    'openai:gpt-4o-mini',
    instructions='''You create concise summaries.
    Format as 3-5 bullet points.
    Keep it simple and clear.'''
)

# Coordinator agent
coordinator = Agent(
    'openai:gpt-4o',
    instructions='Coordinate research and summarization tasks.'
)

@coordinator.tool
async def research_topic(ctx: RunContext[None], topic: str) -> str:
    """Research a topic thoroughly."""
    result = await research_agent.run(f'Research: {topic}', usage=ctx.usage)
    return result.output

@coordinator.tool
async def summarize_text(ctx: RunContext[None], text: str) -> str:
    """Summarize the given text."""
    result = await summarizer_agent.run(f'Summarize:\n{text}', usage=ctx.usage)
    return result.output

# Use it
result = coordinator.run_sync('Research AI agents and give me a summary')
print(result.output)

Pattern 1: Sequential (Pipeline)

When to use: Steps depend on each other. Output of step 1 is input to step 2.

Example: Content creation pipeline where you cannot write without research, and cannot edit without a draft.

+----------+     +----------+     +----------+
| Research | --> |  Write   | --> |   Edit   |
+----------+     +----------+     +----------+
     |                |                |
     v                v                v
   facts            draft        final article

Sequential Pipeline Code

sequential.py
from pydantic_ai import Agent

# Three specialist agents
researcher = Agent('openai:gpt-4o',
    instructions='Research the topic. Provide key facts.')

writer = Agent('openai:gpt-4o',
    instructions='Write a clear article based on research.')

editor = Agent('openai:gpt-4o',
    instructions='Edit for clarity and grammar. Return improved version.')

async def content_pipeline(topic: str) -> str:
    # Step 1: Research
    research_result = await researcher.run(f'Research: {topic}')
    research = research_result.output

    # Step 2: Write (needs research)
    article_result = await writer.run(
        f'Write an article based on this research:\n{research}'
    )
    article = article_result.output

    # Step 3: Edit (needs draft)
    final_result = await editor.run(f'Edit this article:\n{article}')

    return final_result.output

# Use it
import asyncio
article = asyncio.run(content_pipeline('The future of AI'))
print(article)
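
When a pipeline grows beyond three steps, the chaining itself can be factored out. The following is a minimal sketch, assuming every stage is an async function from string to string; `run_pipeline`, `Stage`, and the three stub stages are hypothetical names that stand in for the `researcher.run`, `writer.run`, and `editor.run` calls above.

```python
import asyncio
from typing import Awaitable, Callable

# A stage takes the previous stage's output and returns its own output.
Stage = Callable[[str], Awaitable[str]]


async def run_pipeline(stages: list[Stage], initial: str) -> str:
    """Feed each stage's output into the next stage, in order."""
    value = initial
    for stage in stages:
        value = await stage(value)
    return value


# Hypothetical stand-ins for the three agents (no model calls).
async def research(topic: str) -> str:
    return f'facts({topic})'


async def write(facts: str) -> str:
    return f'draft({facts})'


async def edit(draft: str) -> str:
    return f'final({draft})'


article = asyncio.run(run_pipeline([research, write, edit], 'AI'))
print(article)  # final(draft(facts(AI)))
```

The nesting in the printed result shows the dependency order: the editor only ever sees the writer's output, and the writer only ever sees the researcher's.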

Pattern 2: Parallel

When to use: Steps are independent. They do not need each other's output.

Example: Company analysis where market, technical, and risk analyses are independent.

Why parallel is faster: If each analysis takes 5 seconds:

  • Sequential: 5 + 5 + 5 = 15 seconds
  • Parallel: max(5, 5, 5) = about 5 seconds (ignoring overhead and provider rate limits)
              +--------------+
         +--->|    Market    |---+
         |    |   Analysis   |   |
         |    +--------------+   |
+-----+  |    +--------------+   |    +----------+
|Input|--+--->|  Technical   |---+--->| Combined |
+-----+  |    |   Analysis   |   |    |  Report  |
         |    +--------------+   |    +----------+
         |    +--------------+   |
         +--->|     Risk     |---+
              |   Analysis   |
              +--------------+
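
The sum-versus-max timing comparison can be checked with stand-in tasks. This sketch assumes each analysis is dominated by waiting on I/O, which `asyncio.sleep` simulates; `fake_analysis` and `DELAY` are hypothetical, and the delay is scaled down from 5 seconds to 0.1 seconds so the demo finishes quickly.

```python
import asyncio
import time

DELAY = 0.1  # seconds; scaled-down stand-in for a 5-second analysis
NAMES = ('market', 'technical', 'risk')


async def fake_analysis(name: str) -> str:
    # Simulates waiting on a model response.
    await asyncio.sleep(DELAY)
    return name


async def run_sequential() -> float:
    start = time.perf_counter()
    for name in NAMES:
        await fake_analysis(name)  # each call waits for the previous one
    return time.perf_counter() - start


async def run_parallel() -> float:
    start = time.perf_counter()
    await asyncio.gather(*(fake_analysis(name) for name in NAMES))
    return time.perf_counter() - start


sequential_time = asyncio.run(run_sequential())
parallel_time = asyncio.run(run_parallel())
print(f'sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s')
```

The sequential time should land near 3 × DELAY and the parallel time near 1 × DELAY. Exact figures depend on the machine, so no fixed output is shown.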

Parallel Pattern Code

parallel.py
from pydantic_ai import Agent
import asyncio

# Three analysts
market_analyst = Agent('openai:gpt-4o',
    instructions='Analyze market trends. Be specific with numbers.')

tech_analyst = Agent('openai:gpt-4o',
    instructions='Analyze technical aspects. Focus on innovation.')

risk_analyst = Agent('openai:gpt-4o',
    instructions='Identify potential risks. Be thorough.')

async def parallel_analysis(company: str) -> dict:
    # Run all three IN PARALLEL using asyncio.gather
    results = await asyncio.gather(
        market_analyst.run(f'Market analysis for {company}'),
        tech_analyst.run(f'Technical analysis for {company}'),
        risk_analyst.run(f'Risk analysis for {company}')
    )

    return {
        'market': results[0].output,
        'technology': results[1].output,
        'risks': results[2].output
    }

# Much faster than running one by one
analysis = asyncio.run(parallel_analysis('Tesla'))
print(analysis)
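
One caveat with `asyncio.gather`: by default, the first exception raised by any task propagates to the caller, so one failed analyst means the other results are never returned. A possible way to keep partial results is `return_exceptions=True`, sketched below with stub analysts (`market`, `risk`, and `tolerant_analysis` are hypothetical names, and the failure is simulated).

```python
import asyncio


async def market(company: str) -> str:
    return f'market report for {company}'


async def risk(company: str) -> str:
    # Simulated failure, e.g. a provider error.
    raise RuntimeError('rate limit hit')


async def tolerant_analysis(company: str) -> dict[str, str]:
    names = ['market', 'risks']
    # With return_exceptions=True, exceptions come back as results
    # instead of being raised, so successful results are kept.
    results = await asyncio.gather(
        market(company), risk(company), return_exceptions=True
    )
    report: dict[str, str] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            report[name] = f'unavailable: {result}'
        else:
            report[name] = result
    return report


report = asyncio.run(tolerant_analysis('Tesla'))
print(report)
# {'market': 'market report for Tesla', 'risks': 'unavailable: rate limit hit'}
```

Whether a partial report is acceptable is a design choice. If every section is required, letting the exception propagate may be the better behavior.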

Pattern 3: Loop (Iterative Refinement)

When to use: You need to improve output until it meets a quality threshold.

Example: Writing that needs multiple drafts, code that needs to pass tests, designs that need approval.

+----------+     +----------+
|  Writer  |<--->|  Critic  |
+----------+     +----------+
     |                |
     |    Feedback    |
     +----------------+
    Repeat until approved

Loop Pattern Code

loop.py
from pydantic import BaseModel
from pydantic_ai import Agent

class Review(BaseModel):
    approved: bool
    feedback: str

writer = Agent('openai:gpt-4o',
    instructions='Write or improve content based on feedback.')

critic = Agent('openai:gpt-4o',
    output_type=Review,
    instructions='''Review the content critically.
    Set approved=True ONLY if it is excellent.
    Otherwise, give specific feedback for improvement.'''
)

async def iterative_writing(topic: str, max_rounds: int = 3) -> str:
    # Initial draft
    result = await writer.run(f'Write about: {topic}')
    content = result.output

    for round_num in range(max_rounds):
        # Get critique
        review_result = await critic.run(f'Review this:\n{content}')
        review = review_result.output

        if review.approved:
            print(f'Approved after {round_num + 1} round(s)')
            return content

        print(f'Round {round_num + 1}: {review.feedback}')

        # Improve based on feedback
        result = await writer.run(
            f'Improve this based on feedback.\n\n'
            f'Content: {content}\n\n'
            f'Feedback: {review.feedback}'
        )
        content = result.output

    print('Max rounds reached')
    return content

# Use it
import asyncio
final_content = asyncio.run(iterative_writing('Benefits of meditation'))
print(final_content)
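
The loop's control flow can be exercised without model calls. In this sketch, `stub_writer` and `stub_critic` are hypothetical deterministic stand-ins for the two agents (the critic approves once the draft has three sentences), and `refine` returns an approval flag so the caller can tell an approved draft from one that simply ran out of rounds.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Review:
    approved: bool
    feedback: str


async def stub_writer(content: str, feedback: str) -> str:
    # Hypothetical writer: every revision appends one sentence.
    return content + ' More detail.'


async def stub_critic(content: str) -> Review:
    # Hypothetical critic: approves once the draft has 3+ sentences.
    if content.count('.') >= 3:
        return Review(approved=True, feedback='')
    return Review(approved=False, feedback='Add more detail.')


async def refine(draft: str, max_rounds: int = 3) -> tuple[str, bool, int]:
    """Return (content, approved, reviews_used)."""
    content = draft
    for round_num in range(1, max_rounds + 1):
        review = await stub_critic(content)
        if review.approved:
            return content, True, round_num
        content = await stub_writer(content, review.feedback)
    # The cap guarantees termination even if the critic never approves.
    return content, False, max_rounds


content, approved, rounds = asyncio.run(refine('Meditation helps.'))
print(approved, rounds)  # True 3
```

With `max_rounds=1` the same call returns `approved=False`, which makes the "max rounds reached" case explicit in the return value rather than only in a printed message.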

Choosing the Right Pattern

Use this guide to pick the right pattern:

  1. Fixed Pipeline (A -> B -> C) - Steps depend on each other → Use SEQUENTIAL
  2. Concurrent Tasks (Run A, B, C at once) - Steps are independent → Use PARALLEL
  3. Iterative Refinement (A <-> B) - Need to improve until good enough → Use LOOP
  4. Dynamic Decisions (Let LLM decide) - Don't know in advance which agents to call → Use DELEGATION

Combining Patterns: Real systems often combine patterns. For example:

Sequential[
    Parallel[Research1, Research2, Research3],
    Loop[Writer, Critic],
    Editor
]

This researches 3 topics in parallel, writes and refines in a loop, then does a final edit.
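
The combined structure can be sketched in plain asyncio. The `Sequential[...]` notation above is shorthand, not a library API, and every function below (`research`, `write`, `critique`, `edit`, `combined`) is a hypothetical stub standing in for an agent call; the critic here approves any draft that has been revised once.

```python
import asyncio


async def research(topic: str) -> str:
    return f'facts about {topic}'


async def write(notes: str, feedback: str = '') -> str:
    suffix = ' (revised)' if feedback else ''
    return f'article from [{notes}]{suffix}'


async def critique(draft: str) -> str:
    # Returns feedback; an empty string means approved.
    return '' if draft.endswith('(revised)') else 'tighten the intro'


async def edit(draft: str) -> str:
    return draft + ' (edited)'


async def combined(topics: list[str], max_rounds: int = 3) -> str:
    # Parallel: the research tasks are independent of each other.
    notes = await asyncio.gather(*(research(t) for t in topics))
    joined = '; '.join(notes)

    # Loop: write, then refine until the critic has no feedback.
    draft = await write(joined)
    for _ in range(max_rounds):
        feedback = await critique(draft)
        if not feedback:
            break
        draft = await write(joined, feedback)

    # Sequential: the final edit needs the refined draft.
    return await edit(draft)


result = asyncio.run(combined(['A', 'B', 'C']))
print(result)
# article from [facts about A; facts about B; facts about C] (revised) (edited)
```

Each stage boundary is an ordinary `await`, so the outer function is itself the sequential pattern wrapping a parallel step and a loop.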

Key Takeaways
  1. One job per agent. Keep agents simple and focused. A "research agent" only researches.
  2. Choose the pattern by dependency. Steps depend on each other? Sequential. Steps are independent? Parallel. Need quality improvement? Loop. Don't know in advance? Delegation.
  3. Parallel = faster. Independent tasks can usually run in parallel, so total time approaches the slowest task rather than the sum.
  4. Loops = quality. When output quality matters, use a critic to iteratively improve, with a cap on the number of rounds.
  5. Combine patterns. Real systems use multiple patterns together.