Alexander Whitestone 7efe9877e1
paper: Poka-Yoke for AI Agents (NeurIPS draft)
Five lightweight guardrails for LLM agent systems:
1. JSON repair for tool arguments (1400+ failures eliminated)
2. Tool hallucination detection
3. Return type validation
4. Path injection prevention
5. Context overflow prevention

44 lines of code, 455 µs overhead, zero quality degradation.
Draft: main.tex (NeurIPS format) + references.bib
2026-04-12 19:09:59 -04:00


Literature Review: Poka-Yoke for AI Agents

This document collects related work for a paper on "Poka-Yoke for AI Agents: Failure-Proofing LLM-Based Agent Systems."

Total papers: 31

Agent reliability and error handling (SWE-bench, AgentBench)

  • SWE-bench Goes Live!

    • Authors: Linghao Zhang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Chengxing Xie, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang
    • Venue: cs.SE, 2025
    • URL: https://arxiv.org/abs/2505.23419v2
    • Relevance: Introduces a live benchmark for evaluating software engineering agents on real-world GitHub issues.
  • Training Software Engineering Agents and Verifiers with SWE-Gym

    • Authors: Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang
    • Venue: cs.SE, 2024
    • URL: https://arxiv.org/abs/2412.21139v2
    • Relevance: Presents a gym environment for training and verifying software engineering agents using SWE-bench.
  • SWE-Bench+: Enhanced Coding Benchmark for LLMs

    • Authors: Reem Aleithan, Haoran Xue, Mohammad Mahdi Mohajer, Elijah Nnorom, Gias Uddin, Song Wang
    • Venue: cs.SE, 2024
    • URL: https://arxiv.org/abs/2410.06992v2
    • Relevance: Enhances the SWE-bench benchmark with more diverse and challenging tasks for LLM evaluation.
  • AgentBench: Evaluating LLMs as Agents

    • Authors: Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang
    • Venue: cs.AI, 2023
    • URL: https://arxiv.org/abs/2308.03688v3
    • Relevance: Provides a comprehensive benchmark for evaluating LLMs as agents across multiple environments and tasks.
  • FHIR-AgentBench: Benchmarking LLM Agents for Realistic Interoperable EHR Question Answering

    • Authors: Gyubok Lee, Elea Bach, Eric Yang, Tom Pollard, Alistair Johnson, Edward Choi, Yugang Jia, Jong Ha Lee
    • Venue: cs.CL, 2025
    • URL: https://arxiv.org/abs/2509.19319v2
    • Relevance: Benchmarks LLM agents for healthcare question answering using FHIR interoperability standards.

Tool-use in LLMs (function calling, structured output)

  • MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning

    • Authors: Shuo Yin, Weihao You, Zhilong Ji, Guoqiang Zhong, Jinfeng Bai
    • Venue: cs.CL, 2024
    • URL: https://arxiv.org/abs/2405.07551v1
    • Relevance: Combines tool-use LLMs with data augmentation to improve mathematical reasoning capabilities.
  • Benchmarking LLM Tool-Use in the Wild

    • Authors: Peijie Yu, Wei Liu, Yifan Yang, Jinjian Li, Zelong Zhang, Xiao Feng, Feng Zhang
    • Venue: cs.HC, 2026
    • URL: https://arxiv.org/abs/2604.06185v1
    • Relevance: Evaluates LLM tool-use capabilities in real-world scenarios with diverse tools and APIs.
  • CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

    • Authors: Duo Wu, Jinghe Wang, Yuan Meng, Yanning Zhang, Le Sun, Zhi Wang
    • Venue: cs.AI, 2024
    • URL: https://arxiv.org/abs/2411.16313v3
    • Relevance: Enables LLMs to perform cost-aware tool planning for efficient task completion.
  • Asynchronous LLM Function Calling

    • Authors: In Gim, Seung-seob Lee, Lin Zhong
    • Venue: cs.CL, 2024
    • URL: https://arxiv.org/abs/2412.07017v1
    • Relevance: Introduces asynchronous function calling mechanisms to improve LLM agent concurrency.
  • An LLM Compiler for Parallel Function Calling

    • Authors: Sehoon Kim, Suhong Moon, Ryan Tabrizi, Nicholas Lee, Michael W. Mahoney, Kurt Keutzer, Amir Gholami
    • Venue: cs.CL, 2023
    • URL: https://arxiv.org/abs/2312.04511v3
    • Relevance: Proposes a compiler that parallelizes LLM function calls for improved efficiency.
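
The tool-use papers above repeatedly surface the same failure mode: calls whose arguments drift from the declared schema. As a working note for the paper, here is a minimal sketch of pre-dispatch argument checking against a JSON-Schema-like spec (the `validate_args` helper and the schema shape are illustrative assumptions, not drawn from any paper above):

```python
def validate_args(schema: dict, args: dict) -> list[str]:
    """Check model-proposed tool arguments against a minimal
    JSON-Schema-like spec before dispatching the call."""
    errors = []
    props = schema.get("properties", {})
    # Required keys must be present.
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    # Each supplied key must exist in the spec and match its declared type.
    py_types = {"string": str, "integer": int, "number": (int, float),
                "boolean": bool, "array": list, "object": dict}
    for key, val in args.items():
        if key not in props:
            errors.append(f"unexpected argument: {key}")
            continue
        expected = py_types.get(props[key].get("type"))
        if expected and not isinstance(val, expected):
            errors.append(f"{key}: expected {props[key]['type']}, "
                          f"got {type(val).__name__}")
    return errors
```

An empty list means the call may proceed; a non-empty list can be fed back to the model as a correction prompt instead of executing a doomed call.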

JSON repair and structured output enforcement

  • An adaptable JSON Diff Framework

  • Model and Program Repair via SAT Solving

    • Authors: Paul C. Attie, Jad Saklawi
    • Venue: cs.LO, 2007
    • URL: https://arxiv.org/abs/0710.3332v4
    • Relevance: Uses SAT solving techniques for automated repair of models and programs.
  • ASAP-Repair: API-Specific Automated Program Repair Based on API Usage Graphs

    • Authors: Sebastian Nielebock, Paul Blockhaus, Jacob Krüger, Frank Ortmeier
    • Venue: cs.SE, 2024
    • URL: https://arxiv.org/abs/2402.07542v1
    • Relevance: Automatically repairs API-related bugs using API usage graph analysis.
  • "We Need Structured Output": Towards User-centered Constraints on Large Language Model Output

    • Authors: Michael Xieyang Liu, Frederick Liu, Alexander J. Fiannaca, Terry Koo, Lucas Dixon, Michael Terry, Carrie J. Cai
    • Venue: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24), Honolulu, HI, USA, 2024
    • URL: https://arxiv.org/abs/2404.07362v1
    • Relevance: Advocates for user-defined constraints on LLM output to ensure structured and usable responses.
  • Validation of Modern JSON Schema: Formalization and Complexity

    • Authors: Cédric L. Lourenço, Vlad A. Manea
    • Venue: arXiv, 2023
    • URL: https://arxiv.org/abs/2307.10034v2
    • Relevance: Formalizes JSON Schema validation and analyzes its computational complexity.
  • Blaze: Compiling JSON Schema for 10x Faster Validation

    • Authors: Cédric L. Lourenço, Vlad A. Manea
    • Venue: arXiv, 2025
    • URL: https://arxiv.org/abs/2503.02770v2
    • Relevance: Compiles JSON Schema to optimized code for significantly faster validation.
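
Since the draft's first guardrail is JSON repair for tool arguments, a minimal sketch of the cheap, mechanical repair ladder such a guardrail can use may be useful alongside these references (the helper name and the exact fix set are assumptions, not the paper's implementation):

```python
import json
import re

def repair_json(text: str):
    """Try progressively heavier mechanical fixes on model-emitted JSON."""
    attempts = [text]
    # 1. Strip markdown code fences the model sometimes wraps around JSON.
    fenced = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    attempts.append(fenced)
    # 2. Drop trailing commas before a closing brace or bracket.
    attempts.append(re.sub(r",\s*([}\]])", r"\1", fenced))
    # 3. Close unbalanced brackets in reverse order of opening.
    #    (Naive: ignores brackets inside string literals -- fine for a sketch.)
    closers = []
    for ch in fenced:
        if ch in "{[":
            closers.append("}" if ch == "{" else "]")
        elif ch in "}]" and closers:
            closers.pop()
    attempts.append(re.sub(r",\s*$", "", fenced) + "".join(reversed(closers)))
    for candidate in attempts:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            continue
    raise ValueError(f"unrepairable JSON: {text[:80]!r}")
```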

Software engineering fault tolerance patterns

  • Orthogonal Fault Tolerance for Dynamically Adaptive Systems

    • Authors: Sobia K Khan
    • Venue: cs.SE, 2014
    • URL: https://arxiv.org/abs/1404.6830v1
    • Relevance: Introduces orthogonal fault tolerance mechanisms for self-adaptive software systems.
  • An Introduction to Software Engineering and Fault Tolerance

    • Authors: Patrizio Pelliccione, Henry Muccini, Nicolas Guelfi, Alexander Romanovsky
    • Venue: Introduction chapter, Software Engineering of Fault Tolerant Systems, Series on Software Engineering and Knowledge Engineering, 2010
    • URL: https://arxiv.org/abs/1011.1551v1
    • Relevance: Foundational survey of fault tolerance concepts and techniques in software engineering.
  • Scheduling and Checkpointing optimization algorithm for Byzantine fault tolerance in Cloud Clusters

    • Authors: Sathya Chinnathambi, Agilan Santhanam
    • Venue: cs.DC, 2018
    • URL: https://arxiv.org/abs/1802.00951v1
    • Relevance: Optimizes scheduling and checkpointing for Byzantine fault tolerance in cloud environments.
  • Low-Overhead Transversal Fault Tolerance for Universal Quantum Computation

    • Authors: Hengyun Zhou, Chen Zhao, Madelyn Cain, Dolev Bluvstein, Nishad Maskara, Casey Duckering, Hong-Ye Hu, Sheng-Tao Wang, Aleksander Kubica, Mikhail D. Lukin
    • Venue: quant-ph, 2024
    • URL: https://arxiv.org/abs/2406.17653v2
    • Relevance: Demonstrates low-overhead transversal fault tolerance for universal quantum computation; included as a cross-domain fault-tolerance reference.
  • Application-layer Fault-Tolerance Protocols

    • Authors: Vincenzo De Florio
    • Venue: cs.SE, 2016
    • URL: https://arxiv.org/abs/1611.02273v1
    • Relevance: Surveys fault-tolerance protocols at the application layer for distributed systems.
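
The classical patterns in this section (checkpointing, application-layer protocols) translate to agent pipelines as simple wrappers around flaky tool calls. A minimal sketch of retry with exponential backoff (the wrapper name and parameters are illustrative):

```python
import time

def call_with_retry(fn, attempts: int = 3, base_delay: float = 0.5):
    """Classic application-layer fault tolerance: retry a flaky call,
    doubling the wait between attempts, and re-raise on final failure."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))
```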

Poka-yoke (mistake-proofing) in software/ML systems

  • Some Spreadsheet Poka-Yoke

    • Authors: Bill Bekenn, Ray Hooper
    • Venue: Proc. European Spreadsheet Risks Int. Grp. (EuSpRIG) 2009 83-94 ISBN 978-1-905617-89-0, 2009
    • URL: https://arxiv.org/abs/0908.0930v1
    • Relevance: Applies poka-yoke (mistake-proofing) principles to spreadsheet design and error prevention.
  • AIBugHunter: A Practical Tool for Predicting, Classifying and Repairing Software Vulnerabilities

    • Authors: Michael Fu, Chakkrit Tantithamthavorn, Trung Le, Yuki Kume, Van Nguyen, Dinh Phung, John Grundy
    • Venue: arXiv, 2023
    • URL: https://arxiv.org/abs/2305.16615v1
    • Relevance: Provides an AI-driven tool for predicting, classifying, and repairing software vulnerabilities.
  • Morescient GAI for Software Engineering (Extended Version)

    • Authors: Marcus Kessel, Colin Atkinson
    • Venue: arXiv, 2024
    • URL: https://arxiv.org/abs/2406.04710v2
    • Relevance: Explores trustworthy and robust AI-assisted software engineering practices.
  • Holistic Adversarial Robustness of Deep Learning Models

    • Authors: Pin-Yu Chen, Sijia Liu
    • Venue: arXiv, 2022
    • URL: https://arxiv.org/abs/2202.07201v3
    • Relevance: Studies holistic adversarial robustness across multiple attack types and defenses in deep learning.
  • Defending Against Adversarial Machine Learning

    • Authors: Alison Jenkins
    • Venue: arXiv, 2019
    • URL: https://arxiv.org/abs/1911.11746v1
    • Relevance: Surveys defense techniques against adversarial attacks on machine learning models.
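
In the poka-yoke spirit of these papers (make the wrong thing hard to do rather than detect it later), the draft's return-type guardrail can be sketched as a one-check contract at the tool boundary (the helper is illustrative, not the paper's implementation):

```python
def validate_return(value, expected_type: type, tool_name: str):
    """Poka-yoke contract check: refuse to pass a mistyped tool result
    back into the agent loop; fail loudly at the boundary instead."""
    if not isinstance(value, expected_type):
        raise TypeError(
            f"{tool_name} returned {type(value).__name__}, "
            f"expected {expected_type.__name__}"
        )
    return value
```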

Hallucination detection in LLMs

  • Probabilistic distances-based hallucination detection in LLMs with RAG

    • Authors: Rodion Oblovatny, Alexandra Kuleshova, Konstantin Polev, Alexey Zaytsev
    • Venue: cs.CL, 2025
    • URL: https://arxiv.org/abs/2506.09886v2
    • Relevance: Detects hallucinations in LLMs using probabilistic distances within retrieval-augmented generation.
  • Efficient Hallucination Detection: Adaptive Bayesian Estimation of Semantic Entropy with Guided Semantic Exploration

    • Authors: Qiyao Sun, Xingming Li, Xixiang He, Ao Cheng, Xuanyu Ji, Hailun Lu, Runke Huang, Qingyong Hu
    • Venue: cs.CL, 2026
    • URL: https://arxiv.org/abs/2603.22812v1
    • Relevance: Detects hallucinations efficiently via adaptive Bayesian estimation of semantic entropy with guided semantic exploration.
  • Hallucination Detection with Small Language Models

    • Authors: Ming Cheung
    • Venue: IEEE International Conference on Data Engineering (ICDE) Workshop, 2025
    • URL: https://arxiv.org/abs/2506.22486v1
    • Relevance: Explores hallucination detection using smaller, more efficient language models.
  • First Hallucination Tokens Are Different from Conditional Ones

    • Authors: Jakob Snel, Seong Joon Oh
    • Venue: cs.LG, 2025
    • URL: https://arxiv.org/abs/2507.20836v4
    • Relevance: Analyzes differences between initial hallucination tokens and subsequent conditional tokens.
  • THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models

    • Authors: Mengfei Liang, Archish Arun, Zekun Wu, Cristian Munoz, Jonathan Lutch, Emre Kazim, Adriano Koshiyama, Philip Treleaven
    • Venue: NeurIPS Workshop on Socially Responsible Language Modelling Research 2024, 2024
    • URL: https://arxiv.org/abs/2409.11353v3
    • Relevance: Offers an end-to-end tool for mitigating and evaluating hallucinations in LLMs.
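
The detection methods above target factual hallucinations in generated text; the draft's second guardrail targets a narrower, cheaper case: calls to tools that do not exist. A minimal sketch (the registry contents and helper name are hypothetical):

```python
import difflib

# Hypothetical tool registry; a real agent would build this from its tool specs.
TOOL_REGISTRY = {"read_file", "write_file", "run_tests", "search_code"}

def check_tool_call(name: str):
    """Return None if the tool exists, else an error message the agent can
    feed back to the model, with a nearest-name hint when one is close."""
    if name in TOOL_REGISTRY:
        return None
    close = difflib.get_close_matches(name, TOOL_REGISTRY, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    return f"Unknown tool '{name}'.{hint}"
```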