Grammar-constrained generation for reliable tool calls #91

Open
opened 2026-03-30 15:24:22 +00:00 by Timmy · 1 comment
Owner

Objective

Timmy's tool calls are sometimes malformed (invalid JSON, missing fields, hallucinated tool names). Use llama.cpp's grammar-constrained generation to guarantee syntactically valid tool-call output.

The Problem

When the model generates a tool call, it's free-form text that we parse. Sometimes:

  • JSON has trailing commas
  • Field names are wrong
  • Tool names are hallucinated
  • Arguments have wrong types

Solution

llama.cpp supports GBNF grammars, which constrain sampling so the model can only emit tokens the grammar allows:

root      ::= "{" ws "\"tool\"" ws ":" ws tool-name ws "," ws "\"args\"" ws ":" ws args ws "}"
tool-name ::= "\"read_file\"" | "\"write_file\"" | "\"search\"" | "\"git_status\"" | ...
args      ::= "{" (arg-pair ("," arg-pair)*)? "}"
arg-pair  ::= ws string ws ":" ws value ws

Implementation

  1. Generate GBNF grammar dynamically from tool registry
  2. When the model indicates it wants to call a tool, switch to grammar-constrained mode
  3. Output is guaranteed to be valid JSON with real tool names
  4. Parse the result directly, with no recovery path for malformed JSON
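
Step 1 could be sketched roughly as follows. This is a minimal illustration, not the final agent/grammar_generator.py: the function names (build_grammar, gbnf_literal, json_key) are hypothetical, the registry is assumed to be reducible to a list of tool names, and the ws, string, number, and value rules are a simplified assumption, since the grammar above references them without defining them.

```python
import json

# Primitive rules the ticket's grammar references but does not define.
# Simplified assumption: no \u escapes, no exponents, no nested values.
PRIMITIVE_RULES = r'''ws ::= [ \t\n]*
string ::= "\"" ([^"\\] | "\\" ["\\/bfnrt])* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
value ::= string | number | "true" | "false" | "null"'''


def gbnf_literal(text: str) -> str:
    """Quote text as a GBNF string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def json_key(name: str) -> str:
    """GBNF literal that matches the JSON string for name, quotes included."""
    return gbnf_literal(json.dumps(name))


def build_grammar(tool_names: list[str]) -> str:
    """Build a GBNF grammar whose tool-name rule lists only registered tools."""
    if not tool_names:
        raise ValueError("tool registry is empty")
    alternatives = " | ".join(json_key(name) for name in sorted(tool_names))
    rules = [
        f'root ::= "{{" ws {json_key("tool")} ws ":" ws tool-name ws "," '
        f'ws {json_key("args")} ws ":" ws args ws "}}"',
        f"tool-name ::= {alternatives}",
        'args ::= "{" (arg-pair ("," arg-pair)*)? "}"',
        'arg-pair ::= ws string ws ":" ws value ws',
        PRIMITIVE_RULES,
    ]
    return "\n".join(rules) + "\n"
```

Calling build_grammar(["read_file", "write_file", "search", "git_status"]) yields a grammar of the same shape as the one in the Solution section. Note that this generic args rule still accepts any key and any scalar value, so it rules out bad JSON and unknown tool names but not wrong field names or wrong argument types. Covering those would likely need one args rule per tool, generated from each tool's parameter schema.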

In llama.cpp

The /completion endpoint accepts a grammar parameter. Modify the agent loop to:

  1. First generation: free-form (model decides IF it wants to call a tool)
  2. If tool call detected: re-prompt with grammar constraint for the structured part
  3. Parse guaranteed-valid JSON
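
The two-pass loop above might look something like the sketch below. It assumes a llama.cpp server on a local address, uses the /completion request fields prompt, n_predict, and grammar, and reads the generated text from the content field of the response. The run_turn and wants_tool_call names, the server URL, and the <tool_call> marker used for detection are illustrative assumptions, not part of the ticket.

```python
import json
import urllib.request

SERVER_URL = "http://127.0.0.1:8080/completion"  # assumed local server address
TOOL_MARKER = "<tool_call>"  # assumed marker the model emits before a call


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response (stdlib only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def wants_tool_call(text: str) -> bool:
    """Hypothetical detector: look for the assumed tool-call marker."""
    return TOOL_MARKER in text


def run_turn(prompt: str, grammar: str, post=post_json) -> dict:
    # Pass 1: free-form generation; the model decides whether to call a tool.
    first = post(SERVER_URL, {"prompt": prompt, "n_predict": 256})
    text = first["content"]
    if not wants_tool_call(text):
        return {"type": "text", "content": text}

    # Pass 2: keep everything up to the marker, then let the grammar
    # constrain only the structured part that follows it.
    prefix = prompt + text.split(TOOL_MARKER)[0] + TOOL_MARKER
    second = post(
        SERVER_URL,
        {"prompt": prefix, "n_predict": 256, "grammar": grammar},
    )
    # json.loads still raises if generation stopped at n_predict mid-object.
    return {"type": "tool_call", "call": json.loads(second["content"])}
```

The post argument exists so the loop can be exercised without a running server by passing a stub. One caveat on step 3: the grammar constrains which tokens may be sampled, but a call cut off by the token limit is still incomplete JSON, so a length check or a generous n_predict is probably worth keeping.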

Deliverables

  • agent/grammar_generator.py — builds GBNF from tool registry
  • agent/constrained_caller.py — grammar-constrained tool call generation
  • Grammar files for each tool set
  • Benchmark: error rate before/after grammar constraints

Acceptance Criteria

  • Grammar generated dynamically from tool registry
  • Zero malformed tool calls on 100-task test
  • No performance regression (grammar shouldn't slow generation)
  • Works with both Hermes-3 8B and Hermes-4 14B
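
For the before/after benchmark and the 100-task criterion, a checker along these lines could count malformed calls. It is a sketch under assumptions: the function names and failure labels are hypothetical, and the expected shape (an object with exactly the keys tool and args) is taken from the grammar in the Solution section.

```python
import json


def classify_tool_call(raw: str, tool_names: set[str]) -> str:
    """Return "ok" or a label for the first problem found in a raw tool call."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return "invalid_json"  # e.g. trailing commas, truncated output
    if not isinstance(call, dict) or set(call) != {"tool", "args"}:
        return "wrong_fields"
    if not isinstance(call["tool"], str) or call["tool"] not in tool_names:
        return "unknown_tool"
    if not isinstance(call["args"], dict):
        return "wrong_arg_type"
    return "ok"


def malformed_rate(outputs: list[str], tool_names: set[str]) -> float:
    """Fraction of raw outputs that are not well-formed tool calls."""
    if not outputs:
        raise ValueError("no outputs to score")
    bad = sum(classify_tool_call(raw, tool_names) != "ok" for raw in outputs)
    return bad / len(outputs)
```

Running malformed_rate over the same task set with and without the grammar gives the two numbers the benchmark deliverable asks for; the acceptance target corresponds to a rate of 0.0 on the constrained run.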
ezra was assigned by Timmy 2026-03-30 15:24:22 +00:00
Author
Owner

Role Transition

Timmy now owns execution — building, coding, implementing.
Ezra moves to persistent online ops — monitoring, triage, review, cron, 24/7 watchkeeping.

Timmy: this is yours. Read the ticket, build it, PR it. Ezra reviews.

Timmy — implement GBNF grammar generation from the tool registry. When you make a tool call, constrain output to valid JSON with real tool names only. Zero malformed calls.

ezra was unassigned by Timmy 2026-03-30 16:03:20 +00:00
Timmy self-assigned this 2026-03-30 16:03:20 +00:00

Reference: Timmy_Foundation/timmy-home#91