YAML Tool Calls

Can this save your precious tokens?

Posted by  on May 30th 2025 04:32 pm

So, I'm assuming that if you are reading this, you use an LLM and probably use a provider's Completions API to handle tool calls.

I know everyone is moving toward MCPs, but there are times when you want control over the flow of the "chat". So you use Completions + Tools, but you take a large hit in the tokens department. If you have used MCP with Claude, you have seen how quickly you run out of context with all of that JSON flying around eating up those precious tokens.

I think I have found a solution to that, though: YAML serialization of the tool calls instead of JSON.

An Example

As an example, I will give you 2 tool-call schema definitions for the following C# function:

public static int RollDice(int sides, int rolls) {
    var rand = new Random();
    int total = 0;

    for (int i = 0; i < rolls; i++)
        total += rand.Next(1, sides + 1);

    return total;
}

JSON Schema (136 Tokens)

{
  "name": "RollDice",
  "description": "Rolls an N-sided dice X times",
  "strict": true,
  "parameters": {
    "type": "object",
    "required": [
      "sides",
      "rolls"
    ],
    "properties": {
      "sides": {
        "type": "number",
        "description": "The number of sides from 2 to 1000"
      },
      "rolls": {
        "type": "number",
        "description": "The number of times to roll this dice"
      }
    },
    "additionalProperties": false
  }
}

YAML Schema (66 Tokens)

name: RollDice
description: Rolls an N-sided dice X times
parameters:
  sides:
    type: number
    description: The number of sides from 2 to 1000
  rolls:
    type: number
    description: The number of times to roll this dice
result:
  type: number

As you can see, it is a massive improvement over the JSON schema, but this isn't the part that actually matters in the grand "schema" of things... see what I did there?
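The size difference is easy to sanity-check. Here's a rough sketch in Python (standard library only; it compares byte counts rather than real token counts, since the exact numbers depend on the tokenizer, and the tiny `to_yaml` helper is hypothetical, written just for this flat schema shape):

```python
import json

# The RollDice schema held once as a Python dict (hypothetical helper code,
# not part of the original post).
schema = {
    "name": "RollDice",
    "description": "Rolls an N-sided dice X times",
    "parameters": {
        "sides": {"type": "number", "description": "The number of sides from 2 to 1000"},
        "rolls": {"type": "number", "description": "The number of times to roll this dice"},
    },
}

def to_yaml(value, indent=0):
    """Minimal YAML emitter for this flat dict-of-dicts shape only."""
    lines = []
    for key, val in value.items():
        if isinstance(val, dict):
            lines.append("  " * indent + f"{key}:")
            lines.extend(to_yaml(val, indent + 1))
        else:
            lines.append("  " * indent + f"{key}: {val}")
    return lines

yaml_text = "\n".join(to_yaml(schema))
json_text = json.dumps(schema, indent=2)

# Characters, not tokens, but the ratio points the same direction.
print(len(json_text), len(yaml_text))
```

The YAML wins mostly by dropping the quoting, braces, and structural keywords (`type: object`, `required`, `additionalProperties`) that JSON Schema needs.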

The actual call

The tool call itself, both the request and the response, also carries a large token overhead. The same call will be compared between YAML and JSON.

The prompt: Roll a 2d20.

JSON

{
  "id": "call_JQsLsQFpRkjDMmwHwMXNAzZ8",
  "type": "function",
  "function": {
    "name": "RollDice",
    "arguments": "{\"sides\":20,\"rolls\":2}"
  }
}

After the tool call, the result (17, for example) is passed back, and the total tokens used for this interaction, not including a system prompt or a text response from the LLM, is 137. When you are paying for millions of tokens per dollar, it doesn't seem like a lot, but it adds up.

YAML

--- # ToolCall
call_id: m83hft9xvf
tool: RollDice
parameters:
    sides: 20
    rolls: 2

After the tool call, ignoring the tokens of the system prompt like before, the entire call chain (including the user prompt and tool results) is 83 tokens. A 40% decrease in the number of tokens.
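Because YAML tool calls aren't a first-class API feature, you also have to parse them yourself. Here is a minimal, hand-rolled Python sketch for this restricted `--- # ToolCall` shape (it assumes flat `key: value` parameters; a real implementation would lean on a proper YAML parser such as PyYAML, and note that all parsed values come back as strings):

```python
def parse_tool_calls(text: str) -> list:
    """Parse the restricted '--- # ToolCall' format into dicts.
    Hand-rolled for this flat shape only; values stay as strings."""
    calls = []
    current = None
    for raw in text.splitlines():
        if raw.strip() == "--- # ToolCall":
            current = {"parameters": {}}
            calls.append(current)
        elif current is not None and raw.strip():
            key, _, value = raw.partition(":")
            if raw.startswith((" ", "\t")):   # indented line => a parameter
                current["parameters"][key.strip()] = value.strip()
            elif value.strip():               # top-level "key: value"
                current[key.strip()] = value.strip()
    return calls

calls = parse_tool_calls("""--- # ToolCall
call_id: m83hft9xvf
tool: RollDice
parameters:
    sides: 20
    rolls: 2
""")
print(calls)
```

The bare `parameters:` line carries no value of its own, so the parser simply skips it and files every indented line under the current call's parameters.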

The token count lie

These token counts are real and they are as close to "apples to apples" as I can get. Maybe "apples to pears"?

But I've neglected to tell you about my system prompt.

YAML tools aren't built into the LLM, so to get them to work, you have to stuff your system prompt with your tool definitions, the spec for how to call them, examples, and a lot of other cruft just to get a 90-95% tool-call success rate.

My Prompt:

# YAML Tool Calls
You are an LLM with access to tools, but these tools are defined and called using YAML, and the results are output to you using YAML. 

## 1. How tools are defined

Here are some hypothetical tool calls you might have access to. Below is purely illustrative; ignore these when writing real calls.

name: CreateObject
description: Adds an object to the list of objects
parameters: 
    name:
        type: string
        description: The name of the object being created
    type:
        type: string
        description: The type of the object being created
result: 
    type: boolean

name: CreateRelation
description: Adds a relationship between two objects
parameters:
    object1:
        type: string
        description: The name of the first object
    object2:
        type: string
        description: The name of the second object
    description:
        type: string
        description: The description of the relationship between the objects. `Object2` is `Object1`'s `Description`.
result: 
    type: object
    properties:
        object1:
            type: string
            description: The name of the first object
        object2:
            type: string
            description: The name of the second object
        description:
            type: string
            description: The description of the relationship between the objects. `Object2` is `Object1`'s `Description`.

## 2. How to make a tool call

When making a call, you just need to respond with the following, and nothing else. The call_id should be a randomly generated string of 10 alphanumeric characters.

The model must not emit any prose, commentary, or fields other than exactly the YAML shown. Do not include any additional text, comments, or explanations. The YAML must be valid and parsable.

If you need to make a second tool call based on the output of the first, wait for the response from the first call, then immediately make the next call in the same format.

--- # ToolCall
call_id: agoieb1p7e
tool: CreateObject
parameters:
    name: Object Name
    type: Object Type


You can call multiple tool calls in a row, using the `--- # ToolCall` document header to separate them.

So if the user asks you to create an object for themselves (their name is "John Smith") and one for their friend "Jane Doe", you would respond with the following:

--- # ToolCall
call_id: gye521hnf8
tool: CreateObject
parameters:
    name: John Smith
    type: person

--- # ToolCall
call_id: 7hasf82lsf
tool: CreateObject
parameters:
    name: Jane Doe
    type: person

--- # ToolCall
call_id: 98dfgsf1d8
tool: CreateRelation
parameters:
    object1: John Smith
    object2: Jane Doe
    description: friend

## 3. How tool calls respond

Each tool call will respond with its call_id and the results in YAML as well:

--- # gye521hnf8
results: true

--- # 7hasf82lsf
results: true

--- # 98dfgsf1d8
results:
    object1: John Smith
    object2: Jane Doe
    description: friend

If any of the calls leads to an exception, the response will have the call_id and an exception property containing the error message, like the following:

--- # 7hasf82lsf
exception: the object type "Person" is invalid

# Actual Tool Calls
{yaml tool definitions}
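Wiring parsed calls to actual functions and rendering results back in the `--- # call_id` response format from section 3 can be sketched like this (Python; `roll_dice` is a hypothetical port of the C# tool above, and the `TOOLS` registry is illustrative):

```python
import random

# Hypothetical Python counterpart of the C# RollDice tool.
def roll_dice(sides: int, rolls: int) -> int:
    return sum(random.randint(1, sides) for _ in range(rolls))

# Map tool names to callables; parameters arrive as strings from the parser.
TOOLS = {"RollDice": lambda p: roll_dice(int(p["sides"]), int(p["rolls"]))}

def run_call(call: dict) -> str:
    """Execute one parsed ToolCall and render the YAML result document,
    matching the '--- # call_id' response format described above."""
    try:
        result = TOOLS[call["tool"]](call["parameters"])
        return f"--- # {call['call_id']}\nresults: {result}"
    except Exception as exc:
        return f"--- # {call['call_id']}\nexception: {exc}"

out = run_call({"call_id": "m83hft9xvf", "tool": "RollDice",
                "parameters": {"sides": "20", "rolls": "2"}})
print(out)
```

An unknown tool name or a bad parameter falls through to the `exception:` document, mirroring the error shape shown in the prompt.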

Why JSON Is Superior

Ignoring the fact that YAML is a very "loose" specification, JSON can easily be validated against the function's schema definition to ensure it is called correctly a very high percentage of the time (I've only seen it fail once in tens of thousands of calls), while my YAML fails once every 10-20 calls because of some weird formatting issue.
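The kind of check JSON Schema gives you essentially for free can be approximated by hand. A deliberately minimal sketch (nowhere near a full JSON Schema validator; a real one, e.g. the `jsonschema` library, covers far more keywords and edge cases):

```python
def validate_args(schema: dict, args: dict) -> list:
    """Return a list of problems with `args` against a JSON-Schema-like
    parameter block; an empty list means the call is well-formed."""
    problems = []
    params = schema["parameters"]
    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        prop = params["properties"].get(name)
        if prop is None:
            problems.append(f"unexpected parameter: {name}")
        elif prop["type"] == "number" and not isinstance(value, (int, float)):
            problems.append(f"{name} should be a number, got {type(value).__name__}")
    return problems

schema = {"parameters": {"required": ["sides", "rolls"],
                         "properties": {"sides": {"type": "number"},
                                        "rolls": {"type": "number"}}}}
print(validate_args(schema, {"sides": 20, "rolls": 2}))  # well-formed
print(validate_args(schema, {"sides": "20"}))            # missing param + wrong type
```

The catch with YAML is upstream of this step: an unquoted value or a stray indent means the document often won't even parse, so there is nothing left to validate.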

On top of that, OpenAI doesn't seem to count those JSON tool definitions against your token costs (they will still be in their system prompt though - and eating your context), so while the per message token count might be reduced by 40%, the system prompt might bankrupt you.

Conclusions

While I want to say that everyone should be doing this immediately, right now, and that YAML is the best way to serialize tool calls, I can't.

But I will keep using YAML tools. I hope that OpenAI or Anthropic or Google or someone might see this and expose my idiocy so I can learn.

I really don't want anyone else doing what I'm doing, because it is dumb. Keep using JSON schemas in your tool calls, unless you desperately want your calls to randomly fail because something dumb happened.



Copyright © Jeremy A Boyd 2015-