So, I'm assuming that if you're reading this, you use an LLM and probably use its Completions API to handle tool calls.
I know everyone is moving toward MCPs, but there are times when you want control over the flow of the "chat". So you use Completions + Tools, but you take a large hit in the token department, and if you have used MCP with Claude, you've seen how quickly you run out of context with all of that JSON flying around eating up those precious tokens.
I think I have found a solution to that, though: YAML serialization of the tool calls instead of JSON.
An Example
As an example, I will give you 2 tool-call schema definitions for the following C# function:
public static int RollDice(int sides, int rolls) {
    var rand = new Random();
    int total = 0;
    for (int i = 0; i < rolls; i++)
        total += rand.Next(1, sides + 1);
    return total;
}
JSON Schema (136 Tokens)
{
  "name": "RollDice",
  "description": "Rolls an N-sided dice X times",
  "strict": true,
  "parameters": {
    "type": "object",
    "required": [
      "sides",
      "rolls"
    ],
    "properties": {
      "sides": {
        "type": "number",
        "description": "The number of sides from 2 to 1000"
      },
      "rolls": {
        "type": "number",
        "description": "The number of times to roll this dice"
      }
    },
    "additionalProperties": false
  }
}
YAML Schema (66 Tokens)
name: RollDice
description: Rolls an N-sided dice X times
parameters:
  sides:
    type: number
    description: The number of sides from 2 to 1000
  rolls:
    type: number
    description: The number of times to roll this dice
result:
  type: number
As you can see, it is a massive improvement over the JSON schema, but this isn't the part that actually matters in the grand "schema" of things... see what I did there?
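If you want to generate definitions like that from C# instead of writing them by hand, here is a minimal sketch using the YamlDotNet package. The ToolDefinition and ParamDef shapes are made up for illustration and aren't part of any SDK; only the YamlDotNet calls are real.
// A minimal sketch: a tool definition as a plain C# object, serialized to YAML with YamlDotNet.
using System.Collections.Generic;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

public class ParamDef
{
    public string Type { get; set; }
    public string Description { get; set; }
}

public class ToolDefinition
{
    public string Name { get; set; }
    public string Description { get; set; }
    public Dictionary<string, ParamDef> Parameters { get; set; }
    public Dictionary<string, string> Result { get; set; }
}

public static class ToolYaml
{
    private static readonly ISerializer Serializer = new SerializerBuilder()
        .WithNamingConvention(CamelCaseNamingConvention.Instance)
        .Build();

    public static string ToYaml(ToolDefinition tool) => Serializer.Serialize(tool);
}
Feeding a RollDice definition through ToolYaml.ToYaml produces YAML in the same shape as the 66-token block above.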
The actual call
The tool call itself, both the request and the response, also has a huge token overhead. Below, the same call is compared in JSON and YAML.
The prompt: Roll a 2d20.
JSON
{
  "id": "call_JQsLsQFpRkjDMmwHwMXNAzZ8",
  "type": "function",
  "function": {
    "name": "RollDice",
    "arguments": "{\"sides\":20,\"rolls\":2}"
  }
}
After the tool call, the result (17, for example) is passed back, and the total tokens used for this interaction, not including a system prompt or a text response from the LLM, is 137 tokens. When you are paying for millions of tokens per dollar, it doesn't seem like a lot, but it adds up.
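If you want to check numbers like these yourself, here is a tiny sketch using the SharpToken package and the cl100k_base encoding; exact counts vary by model and tokenizer, so treat my figures as ballpark.
// Count tokens for any string (a schema, a tool call, a result) with a GPT-style tokenizer.
using SharpToken;

public static class TokenCounter
{
    private static readonly GptEncoding Encoding = GptEncoding.GetEncoding("cl100k_base");

    public static int Count(string text) => Encoding.Encode(text).Count;
}
Something like TokenCounter.Count(jsonCall) versus TokenCounter.Count(yamlCall) is enough to reproduce a comparison like this one.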
YAML
--- # ToolCall
call_id: m83hft9xvf
tool: RollDice
parameters:
  sides: 20
  rolls: 2
After the tool call, ignoring the tokens of the system prompt as before, the entire call chain (including the user prompt and the tool results) comes to 83 tokens. That is roughly a 40% decrease in the number of tokens.
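Since the API isn't parsing YAML calls for you, you also need a little client-side plumbing to read them back out. Here is a rough sketch with YamlDotNet; the ToolCall class and the naive split on "---" are assumptions for illustration, not anything official.
// Parse the assistant's reply into ToolCall objects. The split on "---" is deliberately naive
// and assumes the marker never appears inside a value.
using System;
using System.Collections.Generic;
using System.Linq;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.NamingConventions;

public class ToolCall
{
    public string CallId { get; set; }   // maps to call_id via the naming convention
    public string Tool { get; set; }
    public Dictionary<string, object> Parameters { get; set; }
}

public static class ToolCallParser
{
    private static readonly IDeserializer Deserializer = new DeserializerBuilder()
        .WithNamingConvention(UnderscoredNamingConvention.Instance)
        .IgnoreUnmatchedProperties()
        .Build();

    public static IEnumerable<ToolCall> Parse(string reply) =>
        reply.Split(new[] { "---" }, StringSplitOptions.RemoveEmptyEntries)
             .Select(doc => doc.Trim())
             .Where(doc => doc.Length > 0)
             .Select(doc => Deserializer.Deserialize<ToolCall>(doc));
}
From there, dispatching is just a switch on call.Tool that converts the parameter dictionary into the int arguments RollDice expects, with the result serialized back as the `--- # call_id` response documents shown later in the prompt.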
The token count lie
These token counts are real and they are as close to "apples to apples" as I can get. Maybe "apples to pears"?
But I've neglected to tell you about my system prompt.
YAML tools aren't built into the LLM, so to get them to work, you have to stuff your system prompt with your tool definitions, the spec for how to call them, examples, and a lot of other cruft just to get a 90-95% tool-call success rate (a sketch of how that splice could look follows the prompt).
My Prompt:
# YAML Tool Calls
You are an LLM with access to tools, but these tools are defined and called using YAML, and the results are output to you using YAML.
## 1. How tools are defined
Here are some hypothetical tools you might have access to. The definitions below are purely illustrative; ignore them when writing real calls.
name: CreateObject
description: Adds an object to the list of objects
parameters:
  name:
    type: string
    description: The name of the object being created
  type:
    type: string
    description: The type of the object being created
result:
  type: boolean
name: CreateRelation
description: Adds a relationship between two objects
parameters:
  object1:
    type: string
    description: The name of the first object
  object2:
    type: string
    description: The name of the second object
  description:
    type: string
    description: The description of the relationship between the objects. `Object2` is `Object1`'s `Description`.
result:
  type: object
  properties:
    object1:
      type: string
      description: The name of the first object
    object2:
      type: string
      description: The name of the second object
    description:
      type: string
      description: The description of the relationship between the objects. `Object2` is `Object1`'s `Description`.
## 2. How to make a tool call
When making a call, you just need to respond with the following, and nothing else. The call_id should be a randomly generated string of 10 alphanumeric characters.
Do not emit any prose, commentary, or fields other than exactly the YAML shown; no additional text, comments, or explanations. The YAML must be valid and parsable.
If you need to make a second tool call based on the output of the first, wait for the response from the first call, then immediately make the next call in the same format.
--- # ToolCall
call_id: agoieb1p7e
tool: CreateObject
parameters:
  name: Object Name
  type: Object Type
You can make multiple tool calls in a row, using the `--- # ToolCall` document header to separate them.
So if the user asks you to create an object for themselves (their name is "John Smith") and one for their friend, "Jane Doe", you would respond with the following:
--- # ToolCall
call_id: gye521hnf8
tool: CreateObject
parameters:
  name: John Smith
  type: person
--- # ToolCall
call_id: 7hasf82lsf
tool: CreateObject
parameters:
  name: Jane Doe
  type: person
--- # ToolCall
call_id: 98dfgsf1d8
tool: CreateRelation
parameters:
  object1: John Smith
  object2: Jane Doe
  description: friend
## 3. How tool calls respond
Each tool call will respond with its call_id and its results in YAML as well:
--- # gye521hnf8
results: true
--- # 7hasf82lsf
results: true
--- # 98dfgsf1d8
results:
  object1: John Smith
  object2: Jane Doe
  description: friend
If any of the calls had led to an exception, the response would have a call_id and an exception property, with the exception property holding the error message, like the following:
--- # 7hasf82lsf
exception: the object type "Person" is invalid
# Actual Tool Calls
{yaml tool definitions}
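That {yaml tool definitions} placeholder is meant to be replaced at runtime with the real definitions. Here is one way that splice could look, reusing the hypothetical ToolYaml helper sketched earlier:
// Build the final system prompt by splicing the serialized YAML definitions into the template.
using System.Collections.Generic;
using System.Linq;

public static class SystemPromptBuilder
{
    public static string Build(string promptTemplate, IEnumerable<ToolDefinition> tools)
    {
        // Each real tool definition becomes its own YAML document under "# Actual Tool Calls".
        var definitions = string.Join("\n---\n", tools.Select(ToolYaml.ToYaml));
        return promptTemplate.Replace("{yaml tool definitions}", definitions);
    }
}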
Why JSON Is Superior
Ignoring the fact that YAML is a very "loose" specification, JSON can easily be validated against the function's schema definition to ensure that it is called correctly a very high percentage of the time (I've only seen it fail once in tens of thousands of calls), while my YAML fails once every 10-20 calls because of some weird formatting issue.
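Because those failures are almost always malformed YAML rather than a wrong tool choice, one workaround is to catch the parse error and ask the model to re-emit the call. A rough sketch, reusing the ToolCallParser from earlier and assuming a sendToModel callback that returns the assistant's raw text:
// Try to parse the reply; on a YamlException, feed the problem back and ask for a clean re-emit.
using System;
using System.Collections.Generic;
using System.Linq;
using YamlDotNet.Core;

public static class ResilientToolCalls
{
    public static IReadOnlyList<ToolCall> ParseWithRetry(
        string reply, Func<string, string> sendToModel, int maxAttempts = 3)
    {
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            try
            {
                return ToolCallParser.Parse(reply).ToList();
            }
            catch (YamlException)
            {
                reply = sendToModel("Your last tool call was not valid YAML. " +
                                    "Re-send it using only the documented format.");
            }
        }
        throw new InvalidOperationException("Tool call could not be parsed as valid YAML.");
    }
}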
On top of that, OpenAI doesn't seem to count those JSON tool definitions against your token costs (they will still be in the system prompt, though, and eating your context), so while the per-message token count might be reduced by 40%, the system prompt might bankrupt you.
Conclusions
While I want to say that everyone should be doing this immediately, right now, and that YAML is the best way to serialize tool calls, I can't.
But I will keep using YAML tools. I hope that OpenAI or Anthropic or Google or someone MIGHT see this and can expose my idiocy so I can learn.
I really don't want anyone else doing what I'm doing, because it is dumb. Keep using JSON schemas in your tool calls, unless you desperately want your calls to randomly fail because something dumb happened.