timmy-home/research/poka-yoke/contribution.md
Alexander Whitestone 7efe9877e1
paper: Poka-Yoke for AI Agents (NeurIPS draft)
Five lightweight guardrails for LLM agent systems:
1. JSON repair for tool arguments (1,400+ failures eliminated)
2. Tool hallucination detection
3. Return type validation
4. Path injection prevention
5. Context overflow prevention

44 lines of code, 455 µs overhead, zero quality degradation.
Draft: main.tex (NeurIPS format) + references.bib
2026-04-12 19:09:59 -04:00


Paper A: Poka-Yoke for AI Agents

One-Sentence Contribution

We introduce five failure-proofing guardrails for LLM-based agent systems that eliminate common runtime errors with zero quality degradation and negligible overhead.

The What

Five concrete guardrails, each under 20 lines of code (44 lines in total), each preventing an entire category of agent failure.

The Why

  • 1,400+ JSON parse failures in production agent logs
  • Tool hallucination wastes API budget on non-existent tools
  • Silent failures degrade quality without detection

The So What

As AI agents are deployed in production (crisis intervention, code generation, fleet ops), reliability is not optional. Small, testable guardrails outperform complex monitoring.

Target Venue

NeurIPS 2025 Workshop on Reliable Foundation Models or ICML 2026

Guardrails

  1. json-repair: Fix malformed tool call arguments (1,400+ failures eliminated)
  2. Tool hallucination detection: Block calls to non-existent tools
  3. Type validation: Ensure tool return types are serializable
  4. Path injection prevention: Block writes outside workspace
  5. Context overflow prevention: Mandatory compression triggers
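
To make the five guardrails concrete, here is a minimal Python sketch of what each check might look like. It is not the paper's code. The function names, the `KNOWN_TOOLS` registry, the trailing-comma repair rule, and the 0.8 compression threshold are all assumptions chosen for illustration, and the JSON repair handles only one kind of malformation.

```python
import json
import re
from pathlib import Path

# Hypothetical tool registry; these names are placeholders, not from the paper.
KNOWN_TOOLS = {"read_file", "write_file", "search"}


def repair_json(raw):
    """Guardrail 1 (simplified): parse tool arguments, retrying after a cheap repair."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Drop trailing commas before a closing brace or bracket, then retry.
        # Naive: a matching sequence inside a string value would also be altered.
        repaired = re.sub(r",\s*([}\]])", r"\1", raw.strip())
        return json.loads(repaired)  # still raises if the repair was not enough


def check_tool(name):
    """Guardrail 2: reject calls to tools that are not registered."""
    if name not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")


def check_serializable(result):
    """Guardrail 3: ensure a tool's return value can be serialized to JSON."""
    try:
        json.dumps(result)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"tool returned a non-serializable value: {exc}") from exc
    return result


def check_path(workspace, target):
    """Guardrail 4: resolve the target and refuse paths outside the workspace."""
    root = Path(workspace).resolve()
    resolved = (root / target).resolve()
    try:
        resolved.relative_to(root)  # raises ValueError if resolved is outside root
    except ValueError:
        raise PermissionError(f"path escapes workspace: {target!r}") from None
    return resolved


def needs_compression(used_tokens, limit, threshold=0.8):
    """Guardrail 5: report when context use crosses an (assumed) threshold."""
    return used_tokens >= threshold * limit
```

In an agent loop, checks like these would presumably run at the tool-call boundary: `repair_json` and `check_tool` before dispatch, `check_path` before any write, `check_serializable` on the result, and `needs_compression` before the next model call. Each one raises or returns a flag rather than failing silently, which is the property the note emphasizes.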