paper: Poka-Yoke for AI Agents (NeurIPS draft) #596
Research paper draft: 5 lightweight guardrails for LLM agent systems. main.tex (NeurIPS format) + references.bib + literature review.
Review: Poka-Yoke for AI Agents (PR #596)
Excellent paper: the poka-yoke framing is smart, and the guardrails are practical and reproducible. Eliminating 1,490 failures with 44 lines of code is a strong pitch. Issues below:
Must Fix
- str(resolved).startswith(str(root)) is vulnerable to prefix attacks (e.g., /workspace-evil/ passes when root is /workspace). Use Path.is_relative_to() (Python 3.9+) or os.path.commonpath() instead.
- compress_messages() is referenced but never defined. This weakens the "five guardrails" claim.
- The paper uses neurips_2024 but should use the 2025/2026 template if targeting NeurIPS 2025 or ICML 2026.
Should Fix
- yu2026benchmarking has a future year (2026): verify this is published or correct the year/status.
Minor
- contribution.md and references.md are working notes, not paper content. Consider moving them to a notes/ subdirectory to keep the paper directory clean.
Overall: the contribution is clear, concrete, and useful. The path injection fix is the most critical item; it would be ironic for a paper about mistake-proofing to ship with a known vulnerability.
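For reference, a minimal sketch of the component-wise check suggested under Must Fix. The function names is_within_root and naive_check are hypothetical, not taken from the paper's code, and the sketch assumes Python 3.9+ for Path.is_relative_to().

```python
from pathlib import Path


def is_within_root(candidate: str, root: str) -> bool:
    """Return True if candidate resolves to root itself or to a path inside it.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    resolved_root = Path(root).resolve()
    resolved = Path(candidate).resolve()
    # is_relative_to() compares whole path components rather than characters,
    # so /workspace-evil is not treated as being inside /workspace.
    return resolved.is_relative_to(resolved_root)


def naive_check(candidate: str, root: str) -> bool:
    """The string-prefix check flagged above, kept only for comparison."""
    return str(Path(candidate).resolve()).startswith(str(Path(root).resolve()))
```

With root set to /workspace, naive_check accepts /workspace-evil/secret.txt while is_within_root rejects it; both accept /workspace/a/b.txt.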
Request changes. Strongest artifact in this batch — nearly workshop-ready. Two bugs:
- startswith('/home/user/workspace') without a trailing / means /home/user/workspace-evil/ passes the check. Use startswith('/home/user/workspace/') or os.path.commonpath().
- nypi2014orthodox is referenced in the text but not in the bibliography.
Fix those two and this is ready for a workshop submission.
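One way the os.path.commonpath() option could look, as a rough sketch only. The name is_inside is hypothetical, and the choice to treat the root itself as allowed is an assumption rather than something the paper specifies.

```python
import os


def is_inside(candidate: str, root: str) -> bool:
    """Return True if candidate is root or lies beneath it (hypothetical helper)."""
    # realpath() collapses ".." segments and follows symlinks before comparing.
    real_root = os.path.realpath(root)
    real_candidate = os.path.realpath(candidate)
    try:
        # commonpath() works on whole components, so "workspace-evil"
        # never counts as a match for "workspace".
        return os.path.commonpath([real_root, real_candidate]) == real_root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

The trailing-separator variant, real_candidate.startswith(real_root + os.sep), closes the same hole but rejects the root directory itself, so it would need an extra equality check if the root should be allowed.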
— Perplexity QA pass
Approve. Base paper for Poka-Yoke for AI Agents (NeurIPS draft). Review fixes from #598 (path injection fix, guardrail 5 expansion, broader impact section) already merged. This establishes the research directory and original paper content with contribution guide and references.
— Perplexity Triage