The Dance of AI Agents: Rethinking How Machines Work Together

Sharad Jain

· 16 min read

Something interesting happens when you watch a great team at work. Each person knows their role, understands when to step in, and perhaps more importantly, when to step back. They move with a kind of fluid grace that makes complex tasks look easy. I’ve been thinking about this a lot lately, not in the context of human teams, but in how we might make AI systems work together in the same way.

The Problem with Today’s AI

Most AI systems today are like savants: incredibly capable in their narrow domain but lost when asked to step outside it. We’ve all experienced this. You’re chatting with an AI about a complex problem, and it’s doing great until you need something slightly different - maybe accessing some real-world data or performing a specific action. Suddenly, you hit a wall.

This isn’t just a limitation; it’s a hint about what’s missing.

Routines: Teaching AI to Dance

The breakthrough came when we started thinking about AI instructions differently. Instead of rigid programming, what if we gave AI systems something more like a choreography - a set of flexible guidelines that could adapt to the situation?

Routines are the "secret sauce" of this approach. They're not rigid code, but more like a to-do list written in plain English, with room for "if this, then that" logic that makes the AI much more adaptable.

Here’s a simple example. Imagine teaching an AI how to handle customer service:

const routine = {
  name: "CustomerSupport",
  steps: [
    "Understand the customer's issue",
    "Suggest a solution",
    "If customer isn't satisfied, offer alternatives"
  ]
};

Let’s see how this translates to actual code. First, we’ll need some imports:

from openai import OpenAI
from pydantic import BaseModel
from typing import Optional
import json

client = OpenAI()

Now, here’s how the OpenAI Cookbook defines a routine. It’s a set of instructions in natural language (a system prompt) along with the tools needed to complete them:

# Customer Service Routine

system_message = (
    "You are a customer support agent for ACME Inc. "
    "Always answer in a sentence or less. "
    "Follow the following routine with the user:\n"
    "1. First, ask probing questions and understand the user's problem deeper.\n"
    " - unless the user has already provided a reason.\n"
    "2. Propose a fix (make one up).\n"
    "3. ONLY if not satisfied, offer a refund.\n"
    "4. If accepted, search for the ID and then execute refund."
)

def look_up_item(search_query):
    """Use to find item ID.
    Search query can be a description or keywords."""
    # return hard-coded item ID - in reality would be a lookup
    return "item_132612938"

def execute_refund(item_id, reason="not provided"):
    print("Summary:", item_id, reason) # lazy summary
    return "success"

This looks deceptively simple. But the magic isn't in the steps themselves - it's in how the AI interprets and adapts them based on context. Like a good dancer who can adjust their moves to match their partner, these routines let AI systems be both structured and flexible. The customer service routine above essentially gives the AI a decision-making flowchart that can handle the nuances of real conversations.

To make these routines work, there’s a concept called a “loop.” It’s a continuous cycle: the AI gets a request, processes it based on the routine, responds, and waits for the next input. This creates a structured back-and-forth, guided by the routine.

Here’s how we implement this loop in code:

def run_full_turn(system_message, messages):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": system_message}] + messages,
    )
    message = response.choices[0].message
    messages.append(message)

    if message.content: print("Assistant:", message.content)

    return message

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    run_full_turn(system_message, messages)

The Art of the Handoff

But routines alone aren’t enough. The real power comes from what I call “handoffs” - those moments when one AI needs to seamlessly pass the baton to another.

Before we get to handoffs, though, how does the AI actually "do" things? It needs a connection to real-world systems, and that's where function schemas come in. A function schema connects the AI's language processing to actual tools and actions. For example, a "look up item" schema lets the model turn a request into a call that retrieves product information from a database. The AI understands the request, picks the right function from the routine, and executes the action in the real world.

To enable function calls, we need a helper function that turns Python functions into the corresponding function schema:

import inspect

def function_to_schema(func) -> dict:
    type_map = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
        list: "array",
        dict: "object",
        type(None): "null",
    }

    try:
        signature = inspect.signature(func)
    except ValueError as e:
        raise ValueError(
            f"Failed to get signature for function {func.__name__}: {str(e)}"
        )

    parameters = {}
    for param in signature.parameters.values():
        try:
            param_type = type_map.get(param.annotation, "string")
        except KeyError as e:
            raise KeyError(
                f"Unknown type annotation {param.annotation} for parameter {param.name}: {str(e)}"
            )
        parameters[param.name] = {"type": param_type}

    required = [
        param.name
        for param in signature.parameters.values()
        if param.default == inspect._empty
    ]

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": (func.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": required,
            },
        },
    }

# Here's an example of how it works:
def sample_function(param_1, param_2, the_third_one: int, some_optional="John Doe"):
    """
    This is my docstring. Call this function when you want.
    """
    print("Hello, world")

schema = function_to_schema(sample_function)
print(json.dumps(schema, indent=2))
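
Running this prints a schema along these lines (untyped parameters fall back to "string", and only parameters without defaults show up in required):

{
  "type": "function",
  "function": {
    "name": "sample_function",
    "description": "This is my docstring. Call this function when you want.",
    "parameters": {
      "type": "object",
      "properties": {
        "param_1": {"type": "string"},
        "param_2": {"type": "string"},
        "the_third_one": {"type": "integer"},
        "some_optional": {"type": "string"}
      },
      "required": ["param_1", "param_2", "the_third_one"]
    }
  }
}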

Now we can pass the tools to the model when we call it:

messages = []

tools = [execute_refund, look_up_item]
tool_schemas = [function_to_schema(tool) for tool in tools]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Look up the black boot."}],
    tools=tool_schemas,
)
message = response.choices[0].message

# inspect the tool call the model chose (e.g. look_up_item with a search query)
print(message.tool_calls[0].function)

Think about how a great restaurant works. The host greets you and hands you off to a waiter. The waiter takes your order and coordinates with the kitchen. Each handoff is smooth, maintaining the context of your experience while leveraging different expertise.

We can create the same kind of flow with AI:

// A simple example of AI collaboration
const conversation = {
  start: "I need help with my recent purchase",
  handoff: {
    from: "CustomerService",
    to: "TechnicalSupport",
    context: "Customer having trouble with product setup"
  }
};

When the model calls a tool, we need to execute the corresponding function and provide the result back to the model. Here’s how we do that:

tools_map = {tool.__name__: tool for tool in tools}

def execute_tool_call(tool_call, tools_map):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)

for tool_call in message.tool_calls:
    result = execute_tool_call(tool_call, tools_map)

    # add result back to conversation
    result_message = {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": result,
    }
    messages.append(result_message)

Putting it all together, here’s the complete loop that handles tool calls:

tools = [execute_refund, look_up_item]

def run_full_turn(system_message, tools, messages):

    num_init_messages = len(messages)
    messages = messages.copy()

    while True:

        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in tools]
        tools_map = {tool.__name__: tool for tool in tools}

        # === 1. get openai completion ===
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": system_message}] + messages,
            tools=tool_schemas or None,
        )
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print assistant response
            print("Assistant:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===

        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools_map)

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            }
            messages.append(result_message)

    # ==== 3. return new messages =====
    return messages[num_init_messages:]

def execute_tool_call(tool_call, tools_map):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"Assistant: {name}({args})")

    # call corresponding function with provided arguments
    return tools_map[name](**args)

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    new_messages = run_full_turn(system_message, tools, messages)
    messages.extend(new_messages)

Trying to cram everything into one giant routine would be a nightmare. Instead, we can use handoffs, like a well-coordinated team where each member has their own expertise and knows when to pass the baton. Rather than one giant AI trying to do it all, we can have specialized agents, each with focused routines and tools.

The concept of an “agent class” provides a framework for creating and managing these individual agents. You could have a refund agent, a sales assistant, and so on. The code can switch between these agents seamlessly, making the user experience smooth.

Here’s how we define a basic Agent class:

class Agent(BaseModel):
    name: str = "Agent"
    model: str = "gpt-4o-mini"
    instructions: str = "You are a helpful Agent"
    tools: list = []

# Example agents:
def execute_refund(item_name):
    return "success"

refund_agent = Agent(
    name="Refund Agent",
    instructions="You are a refund agent. Help the user with refunds.",
    tools=[execute_refund],
)

def place_order(item_name):
    return "success"

sales_assistant = Agent(
    name="Sales Assistant",
    instructions="You are a sales assistant. Sell the user a product.",
    tools=[place_order],
)

# Running multiple agents (this assumes run_full_turn has been updated to take an
# Agent and return the new messages; the handoff-aware version is shown below):
messages = []
user_query = "Place an order for a black boot."
print("User:", user_query)
messages.append({"role": "user", "content": user_query})

response = run_full_turn(sales_assistant, messages) # sales assistant
messages.extend(response)

user_query = "Actually, I want a refund." # implicitly refers to the last item
print("User:", user_query)
messages.append({"role": "user", "content": user_query})
response = run_full_turn(refund_agent, messages) # refund agent

Why This Matters

This isn’t just about making AI systems more efficient. It’s about fundamentally changing what’s possible. When AI agents can work together seamlessly, we can tackle problems that would be too complex for any single system.

Consider what happens in a modern software company. Different teams handle different aspects of the product - development, design, customer support, sales. Each team has its expertise, but it’s their ability to work together that creates value.

Now imagine AI systems working the same way:

  1. A triage agent that understands the initial request
  2. Specialist agents that handle specific aspects of the problem
  3. Coordination agents that ensure everything works together

Here’s where it gets really wild: the AI can figure out when to hand off the conversation on its own. We can give the AI specific functions, like “transfer to refunds,” and it learns to recognize when to use them based on the conversation. It’s like teaching the AI to know when it’s out of its depth and needs a specialist.

To make handoffs happen automatically, we can give agents a transfer_to_XXX function that returns an Agent object:

refund_agent = Agent(
    name="Refund Agent",
    instructions="You are a refund agent. Help the user with refunds.",
    tools=[execute_refund],
)

def transfer_to_refunds():
    """User for refunds and returns."""
    return refund_agent

sales_assistant = Agent(
    name="Sales Assistant",
    instructions="You are a sales assistant. Sell the user a product.",
    tools=[place_order],
)

# We need a Response class to track which agent handled the turn:
class Response(BaseModel):
    agent: Optional[Agent]
    messages: list

# And update run_full_turn to handle agent transfers:
def run_full_turn(agent, messages):

    current_agent = agent
    num_init_messages = len(messages)
    messages = messages.copy()

    while True:

        # turn python functions into tools and save a reverse map
        tool_schemas = [function_to_schema(tool) for tool in current_agent.tools]
        tools = {tool.__name__: tool for tool in current_agent.tools}

        # === 1. get openai completion ===
        response = client.chat.completions.create(
            model=current_agent.model,
            messages=[{"role": "system", "content": current_agent.instructions}]
            + messages,
            tools=tool_schemas or None,
        )
        message = response.choices[0].message
        messages.append(message)

        if message.content:  # print agent response
            print(f"{current_agent.name}:", message.content)

        if not message.tool_calls:  # if finished handling tool calls, break
            break

        # === 2. handle tool calls ===

        for tool_call in message.tool_calls:
            result = execute_tool_call(tool_call, tools, current_agent.name)

            if type(result) is Agent:  # if agent transfer, update current agent
                current_agent = result
                result = (
                    f"Transfered to {current_agent.name}. Adopt persona immediately."
                )

            result_message = {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            }
            messages.append(result_message)

    # ==== 3. return last agent used and new messages =====
    return Response(agent=current_agent, messages=messages[num_init_messages:])

def execute_tool_call(tool_call, tools, agent_name):
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    print(f"{agent_name}:", f"{name}({args})")

    return tools[name](**args)  # call corresponding function with provided arguments

Here’s a more complete example with multiple agents and automatic handoffs:

def escalate_to_human(summary):
    """Only call this if explicitly asked to."""
    print("Escalating to human agent...")
    print("\n=== Escalation Report ===")
    print(f"Summary: {summary}")
    print("=========================\n")
    exit()

def transfer_to_sales_agent():
    """User for anything sales or buying related."""
    return sales_agent

def transfer_to_issues_and_repairs():
    """User for issues, repairs, or refunds."""
    return issues_and_repairs_agent

def transfer_back_to_triage():
    """Call this if the user brings up a topic outside of your purview,
    including escalating to human."""
    return triage_agent

triage_agent = Agent(
    name="Triage Agent",
    instructions=(
        "You are a customer service bot for ACME Inc. "
        "Introduce yourself. Always be very brief. "
        "Gather information to direct the customer to the right department. "
        "But make your questions subtle and natural."
    ),
    tools=[transfer_to_sales_agent, transfer_to_issues_and_repairs, escalate_to_human],
)

def execute_order(product, price: int):
    """Price should be in USD."""
    print("\n\n=== Order Summary ===")
    print(f"Product: {product}")
    print(f"Price: ${price}")
    print("=================\n")
    confirm = input("Confirm order? y/n: ").strip().lower()
    if confirm == "y":
        print("Order execution successful!")
        return "Success"
    else:
        print("Order cancelled!")
        return "User cancelled order."

sales_agent = Agent(
    name="Sales Agent",
    instructions=(
        "You are a sales agent for ACME Inc. "
        "Always answer in a sentence or less. "
        "Follow the following routine with the user:\n"
        "1. Ask them about any problems in their life related to catching roadrunners.\n"
        "2. Casually mention one of ACME's crazy made-up products can help.\n"
        " - Don't mention price.\n"
        "3. Once the user is bought in, drop a ridiculous price.\n"
        "4. Only after everything, and if the user says yes, "
        "tell them a crazy caveat and execute their order.\n"
    ),
    tools=[execute_order, transfer_back_to_triage],
)

def look_up_item(search_query):
    """Use to find item ID.
    Search query can be a description or keywords."""
    item_id = "item_132612938"
    print("Found item:", item_id)
    return item_id

def execute_refund(item_id, reason="not provided"):
    print("\n\n=== Refund Summary ===")
    print(f"Item ID: {item_id}")
    print(f"Reason: {reason}")
    print("=================\n")
    print("Refund execution successful!")
    return "success"

issues_and_repairs_agent = Agent(
    name="Issues and Repairs Agent",
    instructions=(
        "You are a customer support agent for ACME Inc. "
        "Always answer in a sentence or less. "
        "Follow the following routine with the user:\n"
        "1. First, ask probing questions and understand the user's problem deeper.\n"
        " - unless the user has already provided a reason.\n"
        "2. Propose a fix (make one up).\n"
        "3. ONLY if not satisfied, offer a refund.\n"
        "4. If accepted, search for the ID and then execute refund."
    ),
    tools=[execute_refund, look_up_item, transfer_back_to_triage],
)

# Running the system:
agent = triage_agent
messages = []

while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})

    response = run_full_turn(agent, messages)
    agent = response.agent
    messages.extend(response.messages)

This is where large language models (LLMs) shine. They can understand nuance, context, and intent in a way that was unimaginable before. The triage agent above directs a user to sales or to issues and repairs based on their needs - like an intelligent receptionist for your AI system.

The Future Isn’t Just Smarter AI

Here’s what’s fascinating: as we work on this problem, we’re learning that the future of AI isn’t just about making individual systems smarter. It’s about making them work together more effectively.

This has profound implications. Just as the internet wasn’t just about connecting computers but about creating new ways for humans to collaborate, AI orchestration isn’t just about connecting AI systems - it’s about creating new possibilities for human-AI collaboration.

Looking Ahead

We’re still in the early stages of this journey. Current implementations are like early internet protocols - functional but primitive compared to what they’ll become. The exciting part is imagining what’s possible:

  • AI systems that can dynamically form teams to solve complex problems
  • Agents that learn not just from data but from working with other agents
  • New forms of software that blur the line between human and AI collaboration

The sales agent above is deliberately humorous - its routine asks about problems related to catching roadrunners and suggests wacky, made-up ACME products. This shows how we can go beyond purely functional routines and incorporate personality and humor into AI interactions. It's not just about efficiency, but about creating AI that is engaging and entertaining.

This brings up the “human element.” We’re building these AI agents to interact with humans, so understanding human needs and expectations is crucial. The sales agent example wasn’t just focused on completing a transaction, but also on being engaging. This shows that even in technical interactions, personality can go a long way.

The Real Insight

The most important thing I’ve learned from working on this is that the hard problems in AI aren’t just technical. They’re about understanding the patterns of effective collaboration and translating them into something machines can use.

This feels like one of those ideas that, in retrospect, will seem obvious. Of course AI systems should work together like a well-coordinated team. The surprising thing is that it took us so long to figure out how to make it happen.

What This Means for Builders

If you’re building AI systems today, start thinking about them less as individual entities and more as potential team members. The questions become less about what a single AI can do and more about how multiple AIs can work together to achieve something greater.

Some practical tips:

  1. Design for collaboration from the start
  2. Focus on clean handoffs between different parts of your system
  3. Think deeply about how to maintain context across interactions
  4. Build in mechanisms for handling failure gracefully (see the sketch after this list)
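
On that last point, here's a minimal sketch (my own addition, not from the Cookbook code above) of a more defensive tool-execution helper: it catches failures and hands them back to the model as ordinary tool results, so one bad call doesn't take down the whole conversation.

import json

def safe_execute_tool_call(tool_call, tools_map, agent_name="Assistant"):
    # sketch: a hardened, hypothetical variant of execute_tool_call
    name = tool_call.function.name
    try:
        args = json.loads(tool_call.function.arguments)
        print(f"{agent_name}: {name}({args})")
        # call the corresponding function and coerce the result to a string
        return str(tools_map[name](**args))
    except KeyError:
        # the model asked for a tool that was never registered
        return f"Error: unknown tool '{name}'. Available tools: {list(tools_map)}"
    except (json.JSONDecodeError, TypeError) as e:
        # malformed JSON or mismatched arguments - tell the model what went wrong
        return f"Error calling {name}: {e}. Please retry with corrected arguments."
    except Exception as e:
        # any other failure inside the tool itself
        return f"Error: {name} failed with: {e}"

Swapping this in for execute_tool_call in the loops above is one way to satisfy tip 4 without changing anything else.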

There's a sample library from OpenAI called Swarm that's like a playground for exploring these ideas. It's a proof of concept, showing the potential of this approach. It's not ready for primetime yet, but it gives a glimpse into a future where AI systems function more like collaborative teams.
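
To give a feel for it, Swarm's README quickstart looks roughly like the following - two agents and a handoff function, run through Swarm's own client. Note that Swarm's Agent takes functions rather than tools, and since the library is experimental the details may change:

from swarm import Swarm, Agent

client = Swarm()

def transfer_to_agent_b():
    return agent_b

agent_a = Agent(
    name="Agent A",
    instructions="You are a helpful agent.",
    functions=[transfer_to_agent_b],
)

agent_b = Agent(
    name="Agent B",
    instructions="Only speak in Haikus.",
)

response = client.run(
    agent=agent_a,
    messages=[{"role": "user", "content": "I want to talk to agent B."}],
)

print(response.messages[-1]["content"])

The pattern should look familiar: it's the same routine-plus-handoff loop we built by hand, packaged into a small client.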

Swarm is open source, meaning anyone can contribute to its development. This could really accelerate progress in this area. It’s like crowdsourcing the future of AI. The developers of Swarm are passionate about sharing their knowledge and encouraging collaboration, which is essential in any field where we’re pushing the boundaries of human knowledge.

The Dance Continues

We’re at the beginning of something important. Just as the best human teams can achieve things that no individual could, the future of AI lies not in creating super-intelligent individual agents, but in orchestrating multiple specialized agents to work together seamlessly.

It’s a dance we’re just learning the steps to, but one that could transform how we think about software, AI, and perhaps even collaboration itself.

As AI becomes more sophisticated and mimics human behavior, we need to be careful about transparency and deception. People should always know they’re interacting with an AI. There are also broader societal implications, like the potential impact on employment and the need for equitable access to these technologies. It’s important to keep having these conversations as AI evolves.

This article gives us a glimpse into a future where AI isn’t just about automation, but about augmentation, collaboration, and even companionship. It’s about using AI to enhance our abilities, not replace them. Who knows what amazing possibilities await us as we continue to explore this fascinating field.

Many of these AI concepts are rooted in human behavior and organizational principles. We can learn from AI to improve ourselves. For example, breaking down complex tasks into smaller, manageable routines is something we do all the time in our interpersonal and professional lives.

We use routines to streamline our actions and make things more efficient, like planning a project, organizing our day, or even making a meal. The key is finding that balance between structure and flexibility. Too rigid, and you can’t adapt; too loose, and you lose the benefits of a routine. It’s about finding that sweet spot where routines support your goals but don’t stifle creativity.

Then there’s the concept of handoffs, which is about delegation and collaboration. It’s important to recognize when someone else has the expertise or resources to handle something better than we can. It’s about knowing our strengths and weaknesses and being willing to ask for help when needed.

Just like with AI agents, clear communication is essential for successful handoffs. Everyone needs to be on the same page, understand their roles, and know how to communicate effectively. So, this exploration of orchestrating language models has given us valuable insights into human collaboration and productivity.

Technology can inspire us to think differently about ourselves and how we work. The pursuit of knowledge can lead us down unexpected paths, revealing connections we never anticipated. This has been an insightful exploration, and I hope it's given readers some new perspectives on the relationship between humans and AI.

#AI #agents #orchestration #future-of-software #collaboration

About Sharad Jain

Sharad Jain is an AI Engineer and Data Scientist specializing in enterprise-scale generative AI and NLP. Currently leading AI initiatives at Autoscreen.ai, he has developed ACRUE frameworks and optimized LLM performance at scale. Previously at Meta, Autodesk, and WithJoy.com, he brings extensive experience in machine learning, data analytics, and building scalable AI systems. He holds an MS in Business Analytics from UC Davis.