Lesson 1
45 min

Your First AI Agent

From Prompt to Action - Install, configure, and build your first agent with tools

The Problem with Regular Chatbots

A regular chatbot (like ChatGPT without tools) can only do one thing: generate text based on its training data. When you ask "What is the weather in Dubai?", it cannot actually check the weather. It can only say "I don't have access to real-time data" or make something up.

This is a fundamental limitation. The model knows a lot, but it is frozen in time. It cannot:

  • Access current information
  • Check databases
  • Send emails
  • Make API calls
  • Do anything in the real world

How Agents Solve This

An Agent adds one critical capability: tools. A tool is just a function that the agent can call. When the agent needs information or wants to take an action, it calls the appropriate tool.

CHATBOT                            AGENT
-------                            -----
"What's the weather?"              "What's the weather?"
        |                                  |
        v                                  v
"I don't know, my                  *calls weather API*
 training ended in 2024"                   |
                                           v
                                   "It's 32C and sunny
                                    in Dubai right now"

The Agent Loop

Under the hood, an agent works in a loop. The key insight: the LLM decides when to use tools. You do not write if-else logic. The LLM reads the user's question, looks at available tools, and decides what to do. This is what makes agents powerful and flexible.

THE AGENT LOOP

1. RECEIVE: Get user message
        |
        v
2. THINK: LLM decides what to do
        |
        +---> "I can answer directly" ---> Go to step 5
        |
        +---> "I need to use a tool" ---> Go to step 3
        |
        v
3. ACT: Call the tool with arguments
        |
        v
4. OBSERVE: Get tool result, go back to step 2
        |
        v
5. RESPOND: Return final answer to user
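
To make the loop concrete, here is a minimal sketch of it in plain Python. Everything in it is hypothetical scaffolding: call_llm is a stand-in for a real LLM API call and the decision format is made up for illustration. Frameworks like PydanticAI implement this loop for you.

agent_loop_sketch.py
# Illustrative only -- PydanticAI runs this loop internally.
# `call_llm` is a hypothetical stand-in for a real LLM API call.

def run_agent(user_message: str, call_llm, tools: dict) -> str:
    """Minimal agent loop. `call_llm` must return either
    {"type": "final", "content": "..."} or
    {"type": "tool_call", "tool": "<name>", "args": {...}}."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        decision = call_llm(messages, tools)        # 2. THINK
        if decision["type"] == "final":
            return decision["content"]              # 5. RESPOND
        tool_fn = tools[decision["tool"]]           # 3. ACT
        result = tool_fn(**decision["args"])
        messages.append(                            # 4. OBSERVE, then loop
            {"role": "tool", "content": str(result)}
        )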

Why PydanticAI

PydanticAI is a Python framework for building agents. It was created by the team behind Pydantic (the most popular data validation library in Python). Key features:

  • Type safety: Uses Python type hints for tool parameters and outputs
  • Model agnostic: Works with OpenAI, Anthropic, Google, and others
  • Simple API: Just decorators and function calls
  • Production ready: Built-in support for testing, streaming, and async
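
The type-safety point is the most concrete of these. Because PydanticAI is built on Pydantic, an agent can return validated, structured data instead of free text. A minimal sketch, assuming a recent PydanticAI release where Agent accepts an output_type:

structured_output.py
from pydantic import BaseModel
from pydantic_ai import Agent

class CityWeather(BaseModel):
    city: str
    temperature_c: float
    conditions: str

# The model's reply is parsed and validated into CityWeather,
# so downstream code gets typed data instead of a string to parse
agent = Agent('openai:gpt-4o', output_type=CityWeather)

result = agent.run_sync('Describe typical summer weather in Dubai.')
print(result.output.temperature_c)  # a float, guaranteed by validation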

Install PydanticAI

Install the base package on its own, or with extras for specific providers:

terminal
pip install pydantic-ai

# Or for specific providers:
pip install 'pydantic-ai[openai]'
pip install 'pydantic-ai[anthropic]'
pip install 'pydantic-ai[google]'
pip install 'pydantic-ai[all]'

Configure Your API Key

PydanticAI reads API keys from environment variables. Set them before running your code:

terminal
# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google Gemini
export GOOGLE_API_KEY="..."
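
If you prefer not to export variables in every shell session, one common alternative (using the separate python-dotenv package, not part of pydantic-ai itself) is to keep keys in a local .env file and load them at startup:

load_keys.py
# Optional: load API keys from a .env file instead of exporting them.
# Requires: pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads KEY=value lines from ./.env into os.environ
if not os.environ.get('OPENAI_API_KEY'):
    raise RuntimeError('OPENAI_API_KEY is not set')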

Your First Agent

Three lines. You have an agent. But this agent has no tools, so it is basically just a chatbot. It can only use knowledge from its training data.

first_agent.py
from pydantic_ai import Agent

# Create agent
agent = Agent('openai:gpt-4o')

# Run it
result = agent.run_sync('What is 2 + 2?')
print(result.output)
# Output: 4

Understanding the Model String

PydanticAI uses a simple format for model names: provider:model-name

models.py
# OpenAI
agent = Agent('openai:gpt-4o')           # Best quality
agent = Agent('openai:gpt-4o-mini')      # Cheaper, faster

# Anthropic (Claude)
agent = Agent('anthropic:claude-sonnet-4-20250514')

# Google Gemini
agent = Agent('google-gla:gemini-2.0-flash')

# Groq (open source, fast)
agent = Agent('groq:llama-3.3-70b-versatile')

Adding Instructions

Instructions tell the agent how to behave. They are included in every request to the LLM. Think of them as the agent's "personality" or "job description". Good instructions are specific:

with_instructions.py
# Bad: vague
instructions = "Be helpful"

# Good: specific
agent = Agent(
    'openai:gpt-4o',
    instructions="""You are a customer support agent for TechCorp.
- Answer questions about our products
- Be polite but concise
- If you don't know something, say so
- Never discuss competitor products"""
)
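
Running the agent shows the instructions at work. BrandX below is a made-up competitor name, and the exact wording of the reply will vary by model and run:

with_instructions.py (continued)
result = agent.run_sync('What do you think of BrandX laptops?')
print(result.output)
# Expected behavior per the instructions: a polite refusal, e.g.
# "I'm sorry, I can only help with questions about TechCorp products."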

Adding a Tool

This is where agents become useful. A tool is a Python function that the agent can call.

agent_with_tool.py
from pydantic_ai import Agent

agent = Agent(
    'openai:gpt-4o',
    instructions='Use the weather tool to answer weather questions.'
)

@agent.tool_plain
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    # In real life, this would call a weather API
    weather_data = {
        'Dubai': 'Sunny, 35C',
        'London': 'Rainy, 12C',
        'Tokyo': 'Cloudy, 22C',
    }
    return weather_data.get(city, f'Weather data not available for {city}')

# Now ask about weather
result = agent.run_sync('What is the weather in Dubai?')
print(result.output)
# Output: The weather in Dubai is sunny and 35C.

How Tools Work Under the Hood

When you define a tool, PydanticAI does several things:

  1. Extracts the function signature: Parameter names, types, and the docstring
  2. Creates a JSON schema: This tells the LLM what the tool does and how to call it
  3. Sends schema to LLM: The LLM sees all available tools with each request
  4. LLM decides: Based on the user's question, the LLM decides whether to call a tool
  5. Executes the function: If the LLM calls the tool, PydanticAI runs your Python function
  6. Returns result to LLM: The LLM sees the result and can use it in its response

1. User asks: "What is the weather in Dubai?"
        |
        v
2. LLM sees available tools:
   - get_weather(city: str) -> str
     "Get current weather for a city"
        |
        v
3. LLM decides: "I should call get_weather"
   Returns: {"tool": "get_weather", "args": {"city": "Dubai"}}
        |
        v
4. PydanticAI executes: get_weather("Dubai")
   Result: "Sunny, 35C"
        |
        v
5. LLM receives result, generates response:
   "The weather in Dubai is sunny and 35C."
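
For step 2, the generated schema looks roughly like the dict below. This is illustrative, not the exact wire format, which differs slightly between providers, but all of them follow this JSON-schema shape:

tool_schema.py
# Roughly what the LLM sees for get_weather (illustrative)
get_weather_schema = {
    "name": "get_weather",
    "description": "Get current weather for a city.",  # from the docstring
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},    # from the type hints
        "required": ["city"],
    },
}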

Why Docstrings Matter

The LLM reads your function's docstring to understand what the tool does. A good docstring helps the LLM decide when to use the tool.

docstrings.py
# Bad: no docstring, LLM has no idea what this does
@agent.tool_plain
def get_data(x: str) -> str:
    return data[x]

# Good: clear docstring
@agent.tool_plain
def get_exchange_rate(from_currency: str, to_currency: str) -> float:
    """Get current exchange rate between two currencies.

    Args:
        from_currency: Source currency code (e.g., 'USD', 'EUR')
        to_currency: Target currency code (e.g., 'AED', 'GBP')

    Returns:
        Current exchange rate as a float
    """
    # In a real tool, 'rates' would come from a live FX data source;
    # here it stands in for a lookup table keyed by currency pair
    return rates.get((from_currency, to_currency), 0.0)

Three Ways to Run

PydanticAI provides three methods to run an agent:

Synchronous (blocking) - Use when: Simple scripts, testing, when you don't need async.

Asynchronous - Use when: Web servers, handling multiple requests, IO-bound operations.

Streaming - Use when: You want to show output as it generates (like ChatGPT does).

running_agents.py
# Synchronous (blocking)
result = agent.run_sync('Hello!')
print(result.output)

# Asynchronous
import asyncio

async def main():
    result = await agent.run('Hello!')
    print(result.output)

asyncio.run(main())

# Streaming
async def stream_response():
    async with agent.run_stream('Tell me a story') as response:
        async for text in response.stream_text():
            print(text, end='', flush=True)

asyncio.run(stream_response())

Understanding the Result Object

When you run an agent, you get back a RunResult object with several useful properties. The usage() method is important for production. LLM APIs charge per token, so tracking usage helps you monitor costs.

result_object.py
result = agent.run_sync('What is AI?')

# The answer
print(result.output)

# Token usage (for cost tracking)
print(result.usage())
# Usage(input_tokens=50, output_tokens=100, requests=1)

# All messages exchanged (for debugging)
print(result.all_messages())

# Just the new messages from this run
print(result.new_messages())
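
Because usage() exposes token counts, a rough cost estimate is a few lines of arithmetic. The prices below are placeholders, so substitute your provider's current rates:

cost_tracking.py
usage = result.usage()

# Placeholder prices -- check your provider's pricing page for real rates
INPUT_COST_PER_M = 2.50    # $ per 1M input tokens (hypothetical)
OUTPUT_COST_PER_M = 10.00  # $ per 1M output tokens (hypothetical)

cost = (usage.input_tokens * INPUT_COST_PER_M
        + usage.output_tokens * OUTPUT_COST_PER_M) / 1_000_000
print(f"Estimated cost: ${cost:.6f}")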

Complete Example: Search Agent

A more practical example that combines everything:

search_agent.py
from pydantic_ai import Agent

search_agent = Agent(
    'openai:gpt-4o',
    instructions='''You are a research assistant.
    Use the search tool to find current information.
    Always cite your sources.
    Be concise.'''
)

@search_agent.tool_plain
def web_search(query: str) -> str:
    """Search the web for information.

    Args:
        query: The search query

    Returns:
        Search results as a string
    """
    # Simulated search results (in production, use a real search API)
    results = {
        'python': 'Python is a programming language created by Guido van Rossum in 1991.',
        'ai agents': 'AI agents are autonomous systems that perceive, decide, and act.',
        'pydantic': 'Pydantic is a data validation library for Python using type hints.'
    }

    for key, value in results.items():
        if key in query.lower():
            return f"Found: {value}"

    return 'No relevant results found.'

# Use it
result = search_agent.run_sync('What are AI agents?')
print(result.output)
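
One thing this example does not show: each run is independent, with no memory of previous runs. To continue a conversation, pass the earlier messages back in via the message_history parameter; this is where new_messages() from the result object comes in. A sketch of a two-turn exchange:

conversation.py
# Each run starts fresh; feed prior messages back in to continue
first = search_agent.run_sync('What are AI agents?')

follow_up = search_agent.run_sync(
    'Summarize that in one sentence.',
    message_history=first.new_messages(),
)
print(follow_up.output)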

Key Takeaways

  1. Agent = LLM + Tools. The LLM thinks and decides; tools take actions.
  2. The agent loop. Agents work in a loop: receive, think, act, observe, respond. They can call multiple tools before answering.
  3. Tools are Python functions. Any function can become a tool with a decorator.
  4. The LLM decides when to use tools. You do not write if-else logic; the LLM figures it out from the question and the available tools.
  5. Docstrings are critical. The LLM reads them to understand what tools do.