Five lightweight guardrails for LLM agent systems:
1. JSON repair for tool arguments (1,400+ failures eliminated)
2. Tool hallucination detection
3. Return type validation
4. Path injection prevention
5. Context overflow prevention

44 lines of code, 455 µs overhead, zero quality degradation. Draft: main.tex (NeurIPS format) + references.bib
Paper A: Poka-Yoke for AI Agents
One-Sentence Contribution
We introduce five failure-proofing guardrails for LLM-based agent systems that eliminate common runtime errors with zero quality degradation and negligible overhead.
The What
Five concrete guardrails, each under 20 lines of code, preventing entire categories of agent failures.
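To illustrate the scale involved, here is a minimal sketch of the first guardrail, JSON repair for tool-call arguments. This is not the paper's implementation (which may use the json-repair library); it hand-rolls fixes for a few common LLM failure modes, and the function name is illustrative.

```python
import json
import re

def repair_tool_args(raw: str) -> dict:
    """Best-effort repair of malformed JSON tool-call arguments.

    A hand-rolled sketch, not the paper's actual guardrail: it handles
    Markdown code fences, trailing commas, and single-quoted JSON.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    fixed = raw.strip()
    # Strip Markdown code fences the model sometimes wraps around JSON.
    fixed = re.sub(r"^```(?:json)?\s*|\s*```$", "", fixed)
    # Remove trailing commas before a closing brace/bracket.
    fixed = re.sub(r",\s*([}\]])", r"\1", fixed)
    # Naively swap single quotes for double quotes if none are present.
    if '"' not in fixed:
        fixed = fixed.replace("'", '"')
    return json.loads(fixed)
```

A repair of this shape stays well under the 20-line budget while turning a hard parse failure into a recovered tool call.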
The Why
- 1,400+ JSON parse failures in production agent logs
- Tool hallucination wastes API budget on non-existent tools
- Silent failures degrade quality without detection
The So What
As AI agents are deployed in production (crisis intervention, code generation, fleet ops), reliability is not optional. Small, testable guardrails outperform complex monitoring.
Target Venue
NeurIPS 2025 Workshop on Reliable Foundation Models or ICML 2026
Guardrails
- json-repair: Fix malformed tool-call arguments (1,400+ failures eliminated)
- Tool hallucination detection: Block calls to non-existent tools
- Return type validation: Ensure tool return values are JSON-serializable
- Path injection prevention: Block writes outside workspace
- Context overflow prevention: Mandatory compression triggers
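Three of the guardrails above (hallucination detection, return type validation, path injection prevention) can be sketched as simple pre/post-call checks. The class and method names below are illustrative assumptions, not the paper's API:

```python
import json
from pathlib import Path

class ToolGuard:
    """Illustrative sketch of three guardrails as cheap boolean checks."""

    def __init__(self, registered_tools, workspace: str):
        self.registered_tools = set(registered_tools)
        # Resolve once so later path comparisons use a canonical form.
        self.workspace = Path(workspace).resolve()

    def check_tool(self, name: str) -> bool:
        """Hallucination detection: reject tools the model invented."""
        return name in self.registered_tools

    def check_return(self, value) -> bool:
        """Type validation: reject return values that won't serialize."""
        try:
            json.dumps(value)
            return True
        except (TypeError, ValueError):
            return False

    def check_path(self, target: str) -> bool:
        """Path injection prevention: reject writes outside the workspace."""
        resolved = (self.workspace / target).resolve()
        return resolved == self.workspace or self.workspace in resolved.parents
```

Each check is a few lines and runs before (or just after) the tool call, which is consistent with the claimed sub-millisecond overhead for the whole set.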