Perplexity review #736
Closed
opened 2026-03-28 21:06:56 +00:00 by Rockachopa
1 comment
Reference: Timmy_Foundation/the-nexus#736
Timmy Report Card — 20 Hours on Codex Backend
The Numbers
| Metric | Value |
| --- | --- |
| PRs created | 17 |
| PRs merged | 16 (94% merge rate) |
| Issues created | 72 |
| Repos touched | all 3 (timmy-config, timmy-home, the-nexus) |
| Net lines added | ~2,900+ across all merged PRs |

For comparison: in the previous session, Codex-agent had 7/7 (100%), Claude had 164 PRs at 42%, Gemini had 74 at 35%.

Grade: B+
Strong output velocity and directional coherence, with two material concerns.
What Timmy did well
Timmy pivoted from Morrowind (graphical game, CGEvent keypresses, Lua scripts) to Evennia (Python MUD/text world framework). This is strategically correct. Evennia gives Timmy:
- A text-based world he can navigate via telnet/MCP: no screen scraping, no CGEvent hacks, no Lua dependency
- Full Python control over the world (rooms, objects, NPCs are Python classes)
- Clean training data: every command/response pair is a labeled trajectory
- Local-first, MIT licensed, runs on the Mac
The implementation is real: 11 files for bootstrap, an MCP server (evennia_mcp_server.py) that connects via telnet, telemetry logging to JSONL, a world layout spec with 7 meaningful rooms (Gate, Courtyard, Workshop, Archive, Chapel, Observatory, Private Chamber), tests, and a training data contract. This directly serves all three #542 concerns — Heartbeat (world-state perception), Harness (Hermes MCP integration), and Portal (embodied environment).
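As a rough illustration of the "command/response pair as a labeled trajectory" idea and the JSONL telemetry mentioned above, here is a minimal sketch. The function names and the record fields (`ts`, `room`, `command`, `response`) are assumptions made for this example; the actual telemetry schema in timmy-home is not shown in this review.

```python
import io
import json
import time


def log_trajectory_step(stream, command, response, room, timestamp=None):
    """Append one command/response pair to the stream as a single JSON line."""
    record = {
        "ts": time.time() if timestamp is None else timestamp,
        "room": room,  # hypothetical field: where the command was issued
        "command": command,
        "response": response,
    }
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_trajectory(stream):
    """Parse a JSONL stream back into a list of step records."""
    return [json.loads(line) for line in stream if line.strip()]


# An in-memory buffer stands in for an open log file.
buffer = io.StringIO()
log_trajectory_step(buffer, "look", "You stand at the Gate.", room="Gate", timestamp=0.0)
log_trajectory_step(buffer, "north", "You enter the Courtyard.", room="Courtyard", timestamp=1.0)
buffer.seek(0)
steps = read_trajectory(buffer)
```

One record per line keeps the log append-only and lets a later export step stream it without loading the whole file.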
Trajectory sanitization utility (timmy-home PR #27). Real infrastructure — strips sensitive metadata before DPO export. This was item #1 in the knowledge transfer's open loops.
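To make "strips sensitive metadata before DPO export" concrete, a sanitizer of this kind might look roughly like the sketch below. The deny-list keys and the record layout are hypothetical; the real utility in timmy-home PR #27 may work differently.

```python
import copy

# Hypothetical deny-list; the real utility's key names are not shown in this review.
SENSITIVE_KEYS = {"api_key", "hostname", "user", "file_path"}


def sanitize(value):
    """Return a copy of value with sensitive keys removed at any nesting depth."""
    if isinstance(value, dict):
        return {k: sanitize(v) for k, v in value.items() if k not in SENSITIVE_KEYS}
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    return copy.deepcopy(value)


# Illustrative record only; field names are assumptions.
raw = {
    "prompt": "look",
    "chosen": "You stand at the Gate.",
    "rejected": "Unknown command.",
    "meta": {"hostname": "mac.local", "model": "local"},
}
clean = sanitize(raw)
```

The function returns a new structure rather than mutating its input, so the unsanitized record stays available for local debugging.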
Merge proof standard (timmy-config #84, timmy-home #32). New policy: screenshots for visual changes, logs for CLI changes. Proves the work happened. Good governance.
Local-vs-cloud metrics (timmy-config #85). Dashboard now shows actual token counts, throughput, and local vs cloud session breakdown from Hermes state.db. This is the observability needed to track sovereignty progress.
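The local-versus-cloud breakdown could be computed with a single grouped query, sketched below. The `sessions` table and its columns are invented for the example, and an in-memory database stands in for Hermes state.db, whose real schema is not described here.

```python
import sqlite3

# In-memory stand-in for state.db with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (backend TEXT, tokens INTEGER, seconds REAL)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [("local", 1200, 60.0), ("local", 800, 40.0), ("cloud", 3000, 30.0)],
)


def backend_breakdown(connection):
    """Return session count, total tokens, and throughput per backend."""
    rows = connection.execute(
        "SELECT backend, COUNT(*), SUM(tokens), SUM(tokens) / SUM(seconds) "
        "FROM sessions GROUP BY backend ORDER BY backend"
    ).fetchall()
    return {
        backend: {"sessions": n, "tokens": tokens, "tokens_per_sec": tps}
        for backend, n, tokens, tps in rows
    }


stats = backend_breakdown(conn)
total_sessions = sum(entry["sessions"] for entry in stats.values())
local_share = stats["local"]["sessions"] / total_sessions
```

A ratio such as `local_share` is one possible way to express sovereignty progress as a single number on the dashboard.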
Made local llama.cpp the default (timmy-config #77). Config change to make local inference the default runtime instead of cloud. Direct execution of #542's sovereignty mandate.
Nexus ↔ Evennia bridge (the-nexus #725, #732). Thin adapter from Evennia world events into the Nexus WebSocket protocol. This means the 3D portal can show what's happening in Timmy's text world. Good separation of concerns.
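A "thin adapter" in this sense can be little more than a mapping function. The sketch below assumes a made-up event dictionary on the Evennia side and a made-up `world_event` message shape on the Nexus side; neither is taken from the actual bridge in the-nexus #725.

```python
import json


def evennia_to_nexus(event):
    """Map one Evennia-side event dict to a JSON string for the WebSocket feed."""
    message = {
        "type": "world_event",  # hypothetical message type
        "actor": event.get("actor", "unknown"),
        "room": event.get("room"),
        "text": event.get("text", ""),
    }
    return json.dumps(message, sort_keys=True)


payload = evennia_to_nexus(
    {"actor": "Timmy", "room": "Archive", "text": "Timmy reads a scroll."}
)
parsed = json.loads(payload)
```

Keeping the adapter free of rendering logic is what lets the text world and the 3D portal evolve separately.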
Gemini management. Gemini submitted 21 PRs today. 20 were closed, 1 merged. Timmy is enforcing quality — not rubber-stamping Gemini's output. That's the right call.
Concerns
The net additions across today's merged PRs:
| PR | Net lines |
| --- | --- |
| timmy-home #38 (Evennia bootstrap) | +674 |
| timmy-home #40 (Evennia telemetry) | +426 |
| timmy-config #85 (metrics) | +265 |
| the-nexus #725 (Evennia adapter) | +286 |
| timmy-home #27 (trajectory sanitizer) | +272 |
| timmy-home #47 (video manifest) | +274 |
None of these are anywhere near the 10-line net limit. PR #38 alone is +674 net. The policy from the-nexus PR #525 and CONTRIBUTING.md says any PR over 10 net lines must find compensatory cuts. This isn't happening.
To be fair — some of these are genuinely new infrastructure that can't be achieved in 10 lines. But the policy exists specifically to force the question: "can I use an existing tool instead of building this?" The Evennia MCP server is 5,593 chars of new code. The telemetry module is 2,240 chars. These might be justified, but the 10-line gate wasn't applied.
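One way the 10-line gate could be applied mechanically is to compute net lines from `git diff --numstat` output, which prints added and deleted counts per file separated by tabs and uses `-` for binary files. The sketch below is an assumption about how such a check might be wired up, not the policy's actual tooling; the file names in the sample input are invented.

```python
def net_lines(numstat_text):
    """Sum added minus deleted lines from `git diff --numstat` output."""
    net = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of counts
        net += int(added) - int(deleted)
    return net


def passes_line_gate(numstat_text, limit=10):
    """True when the change adds at most `limit` net lines."""
    return net_lines(numstat_text) <= limit


# Invented sample inputs for illustration.
small_change = "12\t4\tsrc/a.py\n0\t3\tsrc/b.py\n-\t-\tassets/logo.png\n"
large_change = "674\t0\tbig.py\n"
```

A failing gate would not have to block the merge; it could simply require the PR description to answer the "existing tool" question before review.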
Timmy created 72 new issues across the three repos today. Some are well-scoped (the Evennia implementation chain, the Bannerlord epic). But many are speculative UI/UX tickets for the Nexus (#692-#731 range) — onboarding overlays, transcript viewers, panel replacements, portal status walls, quality-tier gating. These are exactly the kind of visual/UX work that #542 deprioritized in favor of Heartbeat/Harness/Portal.
The risk is the same one the knowledge transfer warned about: agents filing faster than the architecture can absorb, and the backlog becoming noise.
Your logs show Timmy looping against local llama.cpp — memory writes repeating 5 times for the same message, interrupts during API calls at 7-8 seconds, context at 20K/65K tokens (33%). This looks like:
- The local model is slow enough that interrupts fire before it can respond
- Memory writes are not deduplicated (same message stored 5 times)
- The context window is filling up with repeated memory entries
This is the most important issue on the board right now. If local inference doesn't work reliably, everything built today (Evennia, trajectories, metrics) has no brain to drive it.
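For the duplicate memory writes specifically, a content-hash check before storing is one possible mitigation. The class below is a hypothetical sketch, not the Hermes memory API; it only shows the deduplication idea.

```python
import hashlib


class DedupMemory:
    """Toy memory store that ignores writes identical to an earlier one."""

    def __init__(self):
        self._seen = set()
        self.entries = []

    def write(self, message):
        """Store message once; return False if an identical one already exists."""
        digest = hashlib.sha256(message.strip().encode("utf-8")).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        self.entries.append(message)
        return True


memory = DedupMemory()
results = [memory.write("user prefers local inference") for _ in range(5)]
```

With a check like this, five retries of the same write leave one entry in memory, which also stops the context window from filling with repeats. It does not address the interrupt timing itself.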
PR #734 (Gemini — Nexus Heartbeat: Ambient World Vitality System)
This is a cosmetic PR. It adds:
- A "NEXUS PULSE" HUD element showing a fake heartbeat frequency (sinusoidal oscillation, not connected to any real system state)
- Ambient light "breathing" effect
- CSS animations
The heartbeat frequency is 1.0 + Math.sin(elapsed * 0.2) * 0.5 — pure math, not derived from Hermes state, Timmy's cognitive loop, or any real telemetry. It's exactly the kind of visual polish that #542 deprioritized. Timmy was right to close Gemini's other 20 PRs today.
Recommendation: Close PR #734. The real heartbeat panel is tracked at the-nexus#698 and should be driven by actual Hermes session state, not a sine wave.
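The difference between a cosmetic pulse and a telemetry-driven one can be shown in a few lines. The first function restates the formula quoted above in Python; the second is a hypothetical alternative that derives a rate from event timestamps, where the 60-second window and the notion of an "event" are assumptions for illustration.

```python
import math


def cosmetic_pulse(elapsed):
    """The quoted formula: a sine wave that depends only on elapsed time."""
    return 1.0 + math.sin(elapsed * 0.2) * 0.5


def telemetry_pulse(event_times, now, window=60.0):
    """Events per second over the trailing window; 0.0 when nothing happened."""
    recent = [t for t in event_times if now - window <= t <= now]
    return len(recent) / window


idle = telemetry_pulse([], now=100.0)
busy = telemetry_pulse([10.0, 50.0, 70.0, 90.0], now=100.0)
```

The cosmetic version keeps oscillating between 0.5 and 1.5 even when the system is idle, whereas the telemetry version drops to zero when no events arrive.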
Versus Other Agents — Updated Scorecard
| Agent | Period | PRs | Merged | Rate | Scope |
| --- | --- | --- | --- | --- | --- |
| Timmy (Codex) | Mar 28 (20h) | 17 | 16 | 94% | All 3 repos: infra, specs, training, bridge |
| codex-agent | Mar 27 (1 day) | 7 | 7 | 100% | timmy-config, timmy-home: archive pipeline |
| Gemini | Mar 28 (20h) | 21 | 1 | 5% | the-nexus: mostly rejected |
| Claude | Mar 23-25 (3 days) | 164 | 69 | 42% | the-nexus: visual features |
Timmy on Codex backend is producing more real infrastructure per hour than any agent configuration we've seen. The Evennia pivot alone is worth more than Gemini's entire 74-PR history — it's a real architectural decision that serves all three concerns, not visual polish.
The gap is discipline. Codex-agent's 7/7 were surgically scoped. Timmy's 16/17 include several +300-600 line PRs that should have been broken up or challenged against the 10-line rule. And 72 issues in a day risks recreating the Timmy-time-dashboard noise problem.
Bottom Line
Timmy on Codex is the most productive configuration you've run. The Evennia decision was the right call — text world over graphical game for training data quality. Fix the offline inference loop (#49), enforce the line limit, and throttle the issue creation rate. The machine works. The governor needs tightening.
Audit pass: closing unscoped/unassigned issue. Reopen with clear acceptance criteria when ready to work.