Commit Graph

1841 Commits

Author SHA1 Message Date
a2a40429bd Merge pull request '[claude] Poka-yoke: auto-revert incomplete skill edits (#923)' (#946) from claude/issue-923 into main
All checks were successful
Lint / lint (push) Successful in 10s
2026-04-21 16:38:24 +00:00
Alexander Whitestone
1fece10569 feat: poka-yoke auto-revert for incomplete skill edits (#923)
All checks were successful
Lint / lint (pull_request) Successful in 32s
Implement a transactional write-validate-commit-or-rollback pattern for
all skill_manage write operations (edit, patch, write_file):

- _backup_skill_file: timestamped .bak.{ts} snapshot before every write
- _validate_written_file: re-reads from disk after write to catch truncation,
  encoding errors, and broken YAML frontmatter
- _revert_from_backup: restores original content (or removes the corrupted
  file) on any validation failure
- _cleanup_old_backups: prunes to MAX_BACKUPS_PER_FILE (3) after success;
  failed edits keep their .bak file as a debugging aid

Also fixes pre-existing issue where _patch_skill error returns lacked a
`suggestion` field expected by test_skill_manager_error_context.py tests.

Adds 21 tests in test_skill_manager_autorevert.py covering every component
and an end-to-end simulation of mid-write failure + auto-revert.

Fixes #923

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 11:37:55 -04:00
46668505bc Merge pull request 'feat: tool fixation detection — break repetitive loops (#886)' (#914) from fix/886 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:35:08 +00:00
cac0c8224e Merge pull request 'fix: circuit breaker for error cascading (2.33x amplification)' (#927) from fix/885-circuit-breaker into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:35:04 +00:00
f38a64455d Merge pull request '[claude] Gateway config debt: add validation tests and API_SERVER_KEY warning (#892)' (#915) from claude/issue-892 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:33:19 +00:00
cf090a966d Merge pull request 'fix: Poka-yoke — detect and block tool hallucination before API calls (#922)' (#935) from fix/922 into main
Some checks failed
Lint / lint (push) Has been cancelled
2026-04-21 15:29:35 +00:00
690d100afc Merge pull request 'feat: Poka-yoke token budget — progressive context overflow guard (#925)' (#943) from burn/925-1776770102 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been skipped
Nix / nix (ubuntu-latest) (push) Failing after 5s
Tests / e2e (push) Successful in 5m8s
Tests / test (push) Failing after 30m13s
Nix / nix (macos-latest) (push) Has been cancelled
2026-04-21 15:29:02 +00:00
c6f0831738 Merge pull request 'feat: Python syntax validation before execute_code (#913)' (#917) from fix/913-syntax-validation into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:27:05 +00:00
feb24bd08c Merge pull request 'feat: Block silent credential exposure in tool outputs (#839)' (#910) from fix/839-1776403070 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:26:47 +00:00
bc55f40505 Merge pull request 'feat: time-aware model routing for cron jobs (#889)' (#909) from fix/889 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:26:43 +00:00
2adc72335e Merge pull request 'fix: profile session isolation — tag and filter by profile' (#907) from fix/891-profile-isolation into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:26:39 +00:00
ab32670464 Merge pull request 'feat: Poka-yoke — detect and block tool hallucination before API calls (#922)' (#944) from burn/922-1776770102 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / test (push) Has been cancelled
Tests / e2e (push) Has been cancelled
2026-04-21 15:23:56 +00:00
27d2f2ca0e Merge pull request 'feat: Prevent context window overflow via proactive token counting (#838)' (#905) from fix/838-1776402240 into main
Some checks failed
Docker Build and Publish / build-and-push (push) Has been cancelled
Nix / nix (macos-latest) (push) Has been cancelled
Nix / nix (ubuntu-latest) (push) Has been cancelled
Tests / e2e (push) Has been cancelled
Tests / test (push) Has been cancelled
2026-04-21 15:22:31 +00:00
08432a5618 test: poka-yoke validation tests (#922) 2026-04-21 11:59:26 +00:00
07c5b5b83d test: add token budget poka-yoke tests (#925)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Failing after 44s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 45s
Tests / test (pull_request) Failing after 25m21s
Tests / e2e (pull_request) Successful in 3m18s
2026-04-21 11:41:39 +00:00
6eeee39c10 test(#922): Add tests for tool hallucination detection
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Failing after 1m15s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 1m8s
Tests / e2e (pull_request) Successful in 3m44s
Tests / test (pull_request) Failing after 1h9m15s
Tests for validation firewall:
- Unknown tool detection
- Missing required params
- Wrong type detection
- Hallucination patterns
- Rejection stats

Refs #922
2026-04-21 05:38:54 +00:00
30509b9c7c test: circuit breaker tests
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Failing after 38s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 40s
Tests / e2e (pull_request) Successful in 1m36s
Tests / test (pull_request) Failing after 17m13s
Part of #885
2026-04-21 00:28:15 +00:00
c17f64fa2c test: add syntax validation tests (#913)
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Failing after 41s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 29s
Tests / e2e (pull_request) Successful in 2m2s
Tests / test (pull_request) Failing after 1h14m43s
2026-04-20 15:47:35 +00:00
Alexander Whitestone
c22cdcaa8e fix: add _validate_gateway_config tests and API_SERVER_KEY network binding warning
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Failing after 23s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 27s
Tests / e2e (pull_request) Successful in 1m51s
Tests / test (pull_request) Failing after 37m0s
Refs #892 - Gateway config debt: missing keys and broken fallbacks

Changes:
- Add `_is_network_accessible()` helper to gateway/config.py (avoids circular
  import with gateway.platforms.base which imports from gateway.config)
- Add API_SERVER_KEY warning in `_validate_gateway_config`: when the API server
  is enabled on a network-accessible address (0.0.0.0, public IP, hostname) but
  no key is configured, log a warning at config-load time so operators see the
  issue before any adapter initialisation runs
- Add `TestValidateGatewayConfig` in tests/gateway/test_config.py covering:
  - idle_minutes <= 0 and None are corrected to 1440 (default)
  - at_hour outside 0-23 is corrected to 4 (default)
  - Boundary hours 0 and 23 are accepted unchanged
  - Empty platform token triggers a warning log
  - Disabled platform with empty token produces no warning
  - API server on 0.0.0.0 without key logs a warning
  - API server on 127.0.0.1 without key is silent (loopback is allowed)
  - API server with a key set logs no warning regardless of bind address

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 02:18:02 -04:00
Alexander Whitestone
ab968e910c feat: tool fixation detection — break repetitive loops (#886)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Failing after 37s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 43s
Tests / e2e (pull_request) Successful in 1m57s
Tests / test (pull_request) Failing after 18m57s
Marathon sessions show tool fixation: agent latches onto one tool
and calls it repeatedly. Observed streaks of 8-25 identical calls.

New agent/tool_fixation_detector.py:
- ToolFixationDetector: tracks consecutive tool calls
- record(tool_name): returns nudge prompt when threshold reached
- Default threshold: 5 consecutive calls (configurable via
  TOOL_FIXATION_THRESHOLD env var)
- Nudge prompt explains the fixation and suggests alternatives:
  1. Read error carefully
  2. Try different tool
  3. Ask user for clarification
  4. Check if task is complete
- get_streak_info(): current streak state
- format_report(): human-readable fixation events
- Singleton via get_fixation_detector()

Config:
- TOOL_FIXATION_THRESHOLD (default: 5)
- TOOL_FIXATION_WINDOW (default: 10)

Tests: tests/test_tool_fixation_detector.py (9 tests)

Closes #886
2026-04-17 01:57:37 -04:00
cb331da4f1 test: Add credential redaction tests (#839)
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Failing after 49s
Tests / e2e (pull_request) Successful in 2m50s
Tests / test (pull_request) Failing after 11m50s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 47s
2026-04-17 05:23:48 +00:00
Alexander Whitestone
0b72884750 feat: time-aware model routing for cron jobs (#889)
Some checks failed
Tests / test (pull_request) Failing after 25m4s
Tests / e2e (pull_request) Successful in 3m19s
Contributor Attribution Check / check-attribution (pull_request) Failing after 14s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 14s
Error rate peaks at 18:00 (9.4%) during evening cron batches vs 4.0%
at 09:00 during interactive work. Route cron tasks to stronger models
during off-hours when user is not present to correct errors.

New agent/time_aware_routing.py:
- resolve_time_aware_model(): routes based on hour, error rate, task type
- Interactive sessions: always use base model (user corrects errors)
- Cron during business hours: use base model (low error rate)
- Cron during off-hours with high error rate (>6%): upgrade to strong model
- get_hour_error_rate(): error rates by hour from empirical audit
- is_off_hours(): 18:00-05:59 = off-hours
- RoutingDecision: model, provider, reason, hour, error_rate
- get_routing_report(): 24h forecast of routing decisions

Config via env vars:
- CRON_STRONG_MODEL (default: xiaomi/mimo-v2-pro)
- CRON_CHEAP_MODEL (default: qwen2.5:7b)
- CRON_ERROR_THRESHOLD (default: 6.0%)

Tests: tests/test_time_aware_routing.py (9 tests)

Closes #889
2026-04-17 01:15:09 -04:00
a0ed1e6ff2 test: profile isolation tests
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Failing after 15s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 15s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Tests / test (pull_request) Failing after 18m33s
Tests / e2e (pull_request) Successful in 1m17s
Part of #891
2026-04-17 05:13:03 +00:00
d4cdfdc604 test: Add context budget tracker tests (#838)
Some checks failed
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 19s
Contributor Attribution Check / check-attribution (pull_request) Failing after 16s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Tests / test (pull_request) Failing after 18m30s
Tests / e2e (pull_request) Successful in 1m16s
2026-04-17 05:06:54 +00:00
34e7de6a4c feat: 988 Lifeline tests (#673)
Some checks failed
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 18s
Contributor Attribution Check / check-attribution (pull_request) Failing after 17s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Tests / test (pull_request) Failing after 18m18s
Tests / e2e (pull_request) Successful in 1m13s
2026-04-17 05:04:50 +00:00
05f8c2d188 Merge PR #899
Merged PR #899: feat: Allegro worker deliverables
2026-04-17 01:52:11 +00:00
ff2ce95ade feat(research): Allegro worker deliverables — fleet research reports + skill manager test
Some checks failed
Tests / e2e (pull_request) Successful in 1m39s
Tests / test (pull_request) Failing after 1h7m45s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Successful in 24s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 28s
Research reports:
- Vector DB research
- Workflow orchestration research
- Fleet knowledge graph SOTA research
- LLM inference optimization
- Local model crisis quality
- Memory systems SOTA
- Multi-agent coordination
- R5 vs E2E gap analysis
- Text-to-music-video

Test:
- test_skill_manager_error_context.py

[Allegro] Forge workers — 2026-04-16
2026-04-16 15:04:28 +00:00
Hermes Merge Bot
aedebfdf58 Merge PR #848 2026-04-16 02:12:13 -04:00
Hermes Merge Bot
adf49b1809 Merge PR #849 2026-04-16 02:11:21 -04:00
Hermes Merge Bot
52ea3a8935 Merge PR #850 2026-04-16 02:09:00 -04:00
Hermes Merge Bot
43246d6cb4 Merge PR #852 2026-04-16 02:08:06 -04:00
Hermes Merge Bot
dff451081d Merge PR #856 2026-04-16 02:05:42 -04:00
Hermes Merge Bot
5509b157c5 Merge PR #864 2026-04-16 02:05:05 -04:00
Hermes Merge Bot
fcc322fb81 Merge PR #867 2026-04-16 02:03:23 -04:00
Hermes Merge Bot
9bba9ecc40 Merge PR #866 2026-04-16 02:02:43 -04:00
Alexander Whitestone
3238cf4eb1 feat: Tool investigation report + Mem0 local provider (#842)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 38s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 32s
Tests / test (pull_request) Failing after 43m54s
Tests / e2e (pull_request) Successful in 2m5s
## Investigation Report
- docs/tool-investigation-2026-04-15.md: Full report analyzing 414 tools
  from awesome-ai-tools. Top 5 recommendations with integration paths.
- docs/plans/awesome-ai-tools-integration.md: Implementation tracking plan.

## Mem0 Local Provider (P1)
- plugins/memory/mem0_local/: New ChromaDB-backed memory provider.
  No API key required - fully sovereign. Compatible tool schemas with
  cloud Mem0 (mem0_profile, mem0_search, mem0_conclude).
- Pattern-based fact extraction from conversations.
- Deterministic dedup via content hashing.
- Circuit breaker for resilience.
- tests/plugins/memory/test_mem0_local.py: Full test coverage.

## Issues Filed
- #857: LightRAG integration (P2)
- #858: n8n workflow orchestration (P3)
- #859: RAGFlow document understanding (P4)
- #860: tensorzero LLMOps evaluation (P3)

Closes #842
2026-04-15 23:04:41 -04:00
eed87e454e test: Benchmark Gemma 4 vision accuracy vs current approach (#817)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 26s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 26s
Tests / e2e (pull_request) Successful in 2m38s
Tests / test (pull_request) Failing after 47m49s
Vision benchmark suite comparing Gemma 4 (google/gemma-4-27b-it) vs
current Gemini 3 Flash Preview (google/gemini-3-flash-preview).

Metrics:
- OCR accuracy (character + word overlap)
- Description completeness (keyword coverage)
- Structural quality (length, sentences, numbers)
- Latency (ms per image)
- Token usage
- Consistency across runs

Features:
- 24 diverse test images (screenshots, diagrams, photos, charts)
- Category-specific evaluation prompts
- Automated verdict with composite scoring
- JSON + markdown report output
- 28 unit tests passing

Usage:
  python benchmarks/vision_benchmark.py --images benchmarks/test_images.json
  python benchmarks/vision_benchmark.py --url https://example.com/img.png
  python benchmarks/vision_benchmark.py --generate-dataset

Closes #817.
2026-04-15 23:02:02 -04:00
Alexander Whitestone
f03709aa29 test: crisis hook integration tests with agent loop (#707)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 16s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 15s
Tests / e2e (pull_request) Failing after 12m38s
Tests / test (pull_request) Failing after 25m58s
10 integration tests verifying crisis detection works correctly
when called from the agent conversation flow:

- scan_user_message detects CRITICAL/HIGH/MEDIUM/LOW levels
- Safe messages pass through without triggering
- Tool handler returns valid JSON
- Compassion injection includes 988 lifeline for CRITICAL/HIGH
- Case insensitive detection
- Empty/None text handled gracefully
- False positive resistance on common non-crisis phrases
- Config check returns bool
- Callable from agent context (not just isolation tests)
2026-04-15 23:00:12 -04:00
85a654348a feat: poka-yoke — prevent hardcoded ~/.hermes paths (closes #835)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 27s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 19s
Tests / e2e (pull_request) Successful in 1m55s
Tests / test (pull_request) Failing after 56m41s
scripts/lint_hardcoded_paths.py (new):
- Scans Python files for hardcoded home-directory paths
- Detects: Path.home()/.hermes without env fallback, /Users/<name>/, /home/<name>/
- Excludes: comments, docstrings, test files, skills, plugins, docs
- Excludes correct patterns: profiles_parent, current_default, native_home
- Supports --staged (git pre-commit), --fix (suggestions), --json output

scripts/pre-commit-hardcoded-paths.sh (new):
- Pre-commit hook that runs lint_hardcoded_paths.py --staged
- Blocks commits containing hardcoded path violations

tools/confirmation_daemon.py (fixed):
- Replaced Path.home() / '.hermes' / 'approval_whitelist.json'
  with get_hermes_home() / 'approval_whitelist.json'
- Added import of get_hermes_home from hermes_constants

tests/test_hardcoded_paths.py (new):
- 11 tests: detection, exclusion, fallback patterns, clean files
2026-04-15 22:56:32 -04:00
Alexander Whitestone
13ef670c05 feat: session compaction with fact extraction (#748)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 29s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 33s
Tests / e2e (pull_request) Successful in 3m26s
Tests / test (pull_request) Failing after 1h28m50s
Before compressing conversation context, extract durable facts
(user preferences, corrections, project details) and save to
fact store so they survive compression.

New agent/session_compactor.py:
- extract_facts_from_messages(): scans user messages for
  preferences, corrections, project/infra facts using regex
- 3 pattern categories: user_pref (5 patterns), correction
  (3 patterns), project (4 patterns)
- ExtractedFact: category, entity, content, confidence, source_turn
- save_facts_to_store(): saves to fact store (callback or auto-detect)
- extract_and_save_facts(): one-call extraction + persistence
- Deduplication by category+content
- Skips tool results, short messages, system messages
- format_facts_summary(): human-readable summary

Tests: tests/test_session_compactor.py (9 tests)

Closes #748
2026-04-15 22:41:54 -04:00
Alexander Whitestone
9f0c410481 feat: batch tool execution with parallel safety checks (#749)
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Successful in 35s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 37s
Tests / e2e (pull_request) Successful in 1m48s
Tests / test (pull_request) Failing after 36m13s
Centralized safety classification for tool call batches:

tools/batch_executor.py (new):
- classify_tool_calls() — classifies batch into parallel_safe,
  path_scoped, sequential, never_parallel tiers
- BatchExecutionPlan — structured plan with parallel and sequential batches
- Path conflict detection — write_file + patch on same file go sequential
- Destructive command detection — rm, mv, sed -i, redirects
- execute_parallel_batch() — ThreadPoolExecutor for concurrent execution

tools/registry.py (enhanced):
- ToolEntry.parallel_safe field — tools can declare parallel safety
- registry.register() accepts parallel_safe=True parameter
- registry.get_parallel_safe_tools() — query registry-declared safe tools

Safety tiers:
- parallel_safe: read_file, web_search, search_files, etc.
- path_scoped: write_file, patch (concurrent when paths don't overlap)
- sequential: terminal, delegate_task, unknown tools
- never_parallel: clarify (requires user interaction)

19 tests passing.
2026-04-15 22:17:16 -04:00
b34b5b293d test: add tests for tool hallucination prevention (#836)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 24s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 22s
Tests / e2e (pull_request) Successful in 3m6s
Tests / test (pull_request) Failing after 41m24s
2026-04-16 02:15:59 +00:00
Timmy Time
fb7464995c fix: Ultraplan Mode for daily autonomous planning (closes #840)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 37s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 39s
Tests / test (pull_request) Failing after 1h15m33s
Tests / e2e (pull_request) Successful in 2m20s
2026-04-15 22:14:16 -04:00
7c71b7e73a test: parallel tool calling — 2+ tools per response (#798)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 45s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 1m16s
Tests / e2e (pull_request) Successful in 3m17s
Tests / test (pull_request) Failing after 1h30m54s
2026-04-16 02:13:00 +00:00
4a3068b3b5 test: add regression tests for issue #834 KeyError fix
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 39s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 44s
Tests / e2e (pull_request) Successful in 2m53s
Tests / test (pull_request) Failing after 1h28m32s
2026-04-16 02:12:36 +00:00
d8d7846897 feat: add tests/tools/test_confirmation_daemon.py from PR #397 2026-04-16 01:35:24 +00:00
6840d05554 feat: add tests/agent/test_privacy_filter.py from PR #397 2026-04-16 01:35:21 +00:00
Alexander Whitestone
30afd529ac feat: add crisis detection tool — the-door integration (#141)
Some checks failed
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Contributor Attribution Check / check-attribution (pull_request) Successful in 44s
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 59s
Tests / e2e (pull_request) Successful in 3m49s
Tests / test (pull_request) Failing after 44m1s
New tool: tools/crisis_tool.py
- Wraps the-door's canonical crisis detection (detect.py)
- Scans user messages for despair/suicidal ideation
- Classifies into NONE/LOW/MEDIUM/HIGH/CRITICAL tiers
- Provides recommended actions per tier
- Gateway hook: scan_user_message() for pre-API-call detection
- System prompt injection: compassion_injection based on crisis level
- Optional escalation logging to crisis_escalations.jsonl
- Optional bridge API POST for HIGH+ (configurable via CRISIS_BRIDGE_URL)
- Configurable via crisis_detection: true/false in config.yaml
- Follows the-door design principles: never computes life value,
  never suggests death, errs on side of higher risk

Also: tests/test_crisis_tool.py (9 tests, all passing)
2026-04-15 21:00:06 -04:00
asheriif
33ae403890 fix(gateway): fix matrix lingering typing indicator 2026-04-15 04:16:16 -07:00
Teknium
47e6ea84bb fix: file handle bug, warning text, and tests for Discord media send
- Fix file handle closed before POST: nest session.post() inside
  the 'with open()' block so aiohttp can read the file during upload
- Update warning text to include weixin (also supports media delivery)
- Add 8 unit tests covering: text+media, media-only, missing files,
  upload failures, multiple files, and _send_to_platform routing
2026-04-15 04:16:06 -07:00