Modern AI agents frequently struggle not with the quality of their language models, but with the overhead involved in orchestrating multiple tool calls. Each individual tool invocation typically forces a new round‑trip to the model, adding latency and consuming tokens. The CodeAct feature integrated into Microsoft’s Agent Framework addresses this bottleneck by allowing an agent to bundle several small steps into a single executable code block.
How CodeAct Reduces Latency and Token Usage
When agents execute a chain of lightweight operations—such as fetching data, performing basic calculations, or assembling a report—they normally alternate between the model and each tool. For example, a five‑step plan would require five separate model turns: model → tool → model → tool …. CodeAct collapses this sequence into one call to an execute_code tool. The language model writes a short Python script that calls the same tools through a helper function, call_tool(...), and runs it once inside a sandboxed environment.
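To make the difference concrete, here is a sketch of the kind of script the model might emit for a single execute_code call. The `call_tool` helper is the one the article describes; it is stubbed here (with invented weather values) so the example runs standalone:

```python
# Hypothetical sketch of a model-generated script inside execute_code.
# call_tool(...) is the sandbox helper described above; this stub and its
# temperature values are invented purely for illustration.
def call_tool(name: str, **kwargs):
    temps = {"Seattle": 21.5, "Amsterdam": 17.0}  # stub data
    city = kwargs["city"]
    return {"city": city, "temperature_c": temps[city]}

# One execute_code invocation replaces several model <-> tool round-trips:
cities = ["Seattle", "Amsterdam"]
readings = {c: call_tool("get_weather", city=c) for c in cities}
warmest = max(readings, key=lambda c: readings[c]["temperature_c"])
print(f"{warmest} is warmer ({readings[warmest]['temperature_c']} °C)")
```

Everything between the two `call_tool` invocations and the final comparison happens in one model turn, instead of one turn per tool call.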
Benchmarking on representative workloads shows that this approach cuts end‑to‑end latency by roughly 50 % and token consumption by more than 60 %, all while preserving safety and isolation guarantees.
Integrating CodeAct into an Agent
The recommended integration uses the HyperlightCodeActProvider from the agent_framework_hyperlight package. The provider registers an execute_code tool for every run and injects instructions that describe the sandbox, available helpers, and how to call tools inside the code block.
```python
import asyncio

from agent_framework import Agent, tool
from agent_framework_hyperlight import HyperlightCodeActProvider


@tool
def get_weather(city: str) -> dict[str, float | str]:
    """Return the current weather for a city."""
    return {"city": city, "temperature_c": 21.5, "conditions": "partly cloudy"}


codeact = HyperlightCodeActProvider(
    tools=[get_weather],
    approval_mode="never_require",
)

# `client` is your chat-model client, created elsewhere in the application.
agent = Agent(
    client=client,
    name="CodeActAgent",
    instructions="You are a helpful assistant.",
    context_providers=[codeact],
)


async def main() -> None:
    result = await agent.run(
        "Get the weather for Seattle and Amsterdam and compare them."
    )
    print(result)


asyncio.run(main())
```
Why CodeAct Matters
Agents that perform frequent tool calling—such as data wrangling, light computation, chained lookups, or report generation—benefit most. By replacing multiple round‑trips with a single sandboxed execution, agents gain:
- Lower latency – one model turn instead of several.
- Reduced token usage – fewer messages and smaller prompts.
- Compact reasoning trace – the entire plan resides in a single code block rather than being scattered across tool‑call messages.
How CodeAct Works Internally
The agent_framework_hyperlight package exposes two main entry points: HyperlightCodeActProvider and HyperlightExecuteCodeTool. The provider registers the execute_code tool on every run and injects system‑prompt instructions that describe:
- The sandbox environment.
- Typed helpers for file mounts and network policy.
- How tools can be called via call_tool(...).
Inside the sandbox, the model‑generated Python program runs with no direct access to the host. It can only interact with the host through explicitly allowed file mounts or domain whitelists. When the script invokes call_tool("name", …), Hyperlight forwards that request back to the agent’s runtime, executes the corresponding tool there, and returns the result into the sandbox.
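Conceptually, the host side of this bridge behaves like a small dispatcher that maps tool names to registered callables. The sketch below illustrates that idea only; it is not Hyperlight's actual implementation, which additionally enforces file mounts, network policy, and approvals:

```python
# Conceptual sketch of host-side call_tool forwarding. Names and structure
# are invented for illustration; the real Hyperlight bridge does more.
from typing import Any, Callable


class ToolDispatcher:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def handle(self, name: str, **kwargs: Any) -> Any:
        # A call_tool request leaving the sandbox names a tool; the host
        # executes it outside the sandbox and returns only the result.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


dispatcher = ToolDispatcher()
dispatcher.register(
    "get_weather",
    lambda city: {"city": city, "temperature_c": 21.5},
)
result = dispatcher.handle("get_weather", city="Seattle")
```

The key property is that the sandboxed script never holds a reference to the tool function itself, only to its name; the host decides what actually runs.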
Approval Modes
Tools can be configured with an approval_mode that determines whether they are auto‑invoked or require a human decision. The two modes are:
- never_require – the tool runs automatically.
- always_require – every call triggers a function‑approval request that must be resolved by the agent host before execution.
When tools are passed to HyperlightCodeActProvider(tools=…), they are not exposed as first‑class tools to the model; instead, the model interacts with them only through the single execute_code tool. Approval applies to the entire code block rather than individual calls within it.
If a tool is also passed directly to Agent(tools=…), it becomes a first‑class tool and each call can have its own approval mode, independent of how the model chooses to invoke it (directly or via call_tool(...)). This flexibility allows developers to fine‑tune autonomy for specific tools.
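The resulting decision logic can be summarized in a self-contained sketch. The function name and data structures below are hypothetical (not the framework's real internals); only the tool names and the two mode strings come from the text above:

```python
# Hypothetical sketch of the approval decision described above. The
# registry shapes and needs_approval() are invented for illustration.
def needs_approval(tool_name: str,
                   codeact_tools: set[str],
                   direct_modes: dict[str, str]) -> bool:
    """Return True if invoking `tool_name` must wait for a human decision."""
    if tool_name == "execute_code":
        # Approval covers the whole code block, not the calls inside it.
        return direct_modes.get("execute_code", "never_require") == "always_require"
    if tool_name in codeact_tools and tool_name not in direct_modes:
        # Reachable only through execute_code; never surfaced individually.
        return False
    # First-class tool: its own approval mode applies.
    return direct_modes.get(tool_name) == "always_require"


# A tool registered only with the CodeAct provider is never gated itself;
# a first-class tool carries its own mode.
print(needs_approval("get_weather", {"get_weather"}, {}))
print(needs_approval("send_email", {"get_weather"},
                     {"send_email": "always_require"}))
```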
When to Adopt CodeAct
CodeAct is most advantageous for agents that:
- Rely heavily on chaining lightweight operations.
- Need to keep token budgets low or reduce response latency.
- Require a clear, auditable reasoning trail.
By encapsulating the entire workflow in a single sandboxed execution, developers can achieve significant performance gains without compromising safety or control.

AI isn’t right for every workflow, and part of our job is telling you where it isn’t. Get in touch and we’ll walk through where it makes sense for your business, and where it doesn’t.