Fix dashboard tests and add SECURITY.md audit report (#84)

2026-02-28 06:59:15 -05:00
parent 3426761894
commit da5745db48
2 changed files with 60 additions and 11 deletions
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -0,0 +1,48 @@
+# Security Policy & Audit Report
+
+This document outlines the security architecture, threat model, and recent audit findings for Timmy Time Mission Control.
+
+## Sovereignty-First Security
+
+Timmy Time is built on the principle of **AI Sovereignty**. Security means not only preventing unauthorized access but also ensuring that the user keeps full control over their data and AI models.
+
+1.  **Local-First Execution:** All primary AI inference (Ollama/AirLLM) runs on localhost. No data is sent to third-party cloud providers unless explicitly configured (e.g., Grok).
+2.  **Air-Gapped Ready:** The system is designed to run without an internet connection once dependencies and models are cached.
+3.  **Secret Management:** Secrets are never hard-coded. They are managed via Pydantic-settings from `.env` or environment variables.
+
+## Threat Model
+
+| Threat | Mitigation |
+| :--- | :--- |
+| **Command Injection** | Use of `asyncio.create_subprocess_exec` with explicit argument lists instead of shell strings where possible. |
+| **XSS** | Jinja2 auto-escaping is enabled. Where templates set `innerHTML` manually, the Markdown is rendered with `marked` and sanitized with `DOMPurify`. |
+| **Unauthorized Access** | L402 Lightning-gated API server (`timmy-serve`) provides cryptographic access control. |
+| **Malicious Self-Modify** | Self-modification is disabled by default (`SELF_MODIFY_ENABLED=false`). It requires manual approval in the dashboard and runs on isolated git branches. |
+
+## Audit Findings (Feb 2026)
+
+A manual audit of the codebase identified the following security-sensitive areas:
+
+### 1. Self-Modification Loop (`src/self_coding/self_modify/loop.py`)
+- **Observation:** Uses `subprocess.run` for git and test commands.
+- **Risk:** Potential for command injection if user-provided instructions are improperly handled.
+- **Mitigation:** Input is currently restricted to git operations and pytest. Future versions should further sandbox these executions.
+
+### 2. Model Registration (`src/dashboard/routes/models.py`)
+- **Observation:** Allows registering models from arbitrary local paths.
+- **Risk:** Path traversal if an attacker can call this API.
+- **Mitigation:** API is intended for local use. In production, ensure this endpoint is firewalled or authenticated.
+
+### 3. XSS in Chat (`src/dashboard/templates/partials/chat_message.html`)
+- **Observation:** Uses `innerHTML` for rendering Markdown.
+- **Mitigation:** Already uses `DOMPurify.sanitize()` to prevent XSS from LLM-generated content.
+
+## Security Recommendations
+
+1.  **Enable L402:** For any deployment exposed to the internet, ensure `timmy-serve` is used with a real Lightning backend.
+2.  **Audit `self_edit`:** The `SelfEditTool` has significant power. Keep `SELF_MODIFY_ENABLED=false` unless actively developing the agent's self-coding capabilities.
+3.  **Production Secrets:** Always generate unique `L402_HMAC_SECRET` and `L402_MACAROON_SECRET` for production deployments.
+
+---
+
+*Last Updated: Feb 28, 2026*
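Reviewer note on the SECURITY.md mitigations above: the sketch below illustrates two of them, running commands from an explicit argument list (the Command Injection row and finding 1) and confining a user-supplied path to a base directory (finding 2). It is a minimal sketch under stated assumptions, not code from this repository: `run_with_arg_list`, `resolve_inside`, and the `hostile` sample value are hypothetical names chosen for illustration, and the containment check is one possible hardening rather than something the audit says is already in place.

```python
import subprocess
import sys
from pathlib import Path


def run_with_arg_list(args: list[str]) -> str:
    """Run a command from an explicit argument list; no shell parses the input."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path and reject anything that escapes base_dir (hypothetical helper)."""
    base = base_dir.resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes {base}: {user_path}")
    return candidate


# Shell metacharacters reach the child process as literal text in one argument.
hostile = "main; echo injected"
echoed = run_with_arg_list(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
)
```

With a shell string, the `;` in `hostile` would start a second command; with an argument list it is passed through unchanged. `resolve_inside` also rejects absolute paths, because joining an absolute path onto `base` replaces it entirely before the containment check runs.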
--- a/tests/dashboard/test_dashboard.py
+++ b/tests/dashboard/test_dashboard.py
@@ -102,7 +102,8 @@ def test_chat_timmy_success(client):

    assert response.status_code == 200
    assert "status?" in response.text
-    assert "I am Timmy" in response.text
+    # In async mode, the response acknowledges queuing
+    assert "Message queued" in response.text


 def test_chat_timmy_shows_user_message(client):
@@ -113,14 +114,12 @@ def test_chat_timmy_shows_user_message(client):


 def test_chat_timmy_ollama_offline(client):
-    with patch(
-        "dashboard.routes.agents.timmy_chat",
-        side_effect=Exception("connection refused"),
-    ):
-        response = client.post("/agents/timmy/chat", data={"message": "ping"})
+    # In async mode, chat_timmy queues the message regardless of Ollama status
+    # because processing happens in a background task.
+    response = client.post("/agents/timmy/chat", data={"message": "ping"})

    assert response.status_code == 200
-    assert "Timmy is offline" in response.text
+    assert "Message queued" in response.text
    assert "ping" in response.text


@@ -144,16 +143,18 @@ def test_history_records_user_and_agent_messages(client):

    response = client.get("/agents/timmy/history")
    assert "status check" in response.text
-    assert "I am operational." in response.text
+    # In async mode, it records the "Message queued" response
+    assert "Message queued" in response.text


 def test_history_records_error_when_offline(client):
-    with patch("dashboard.routes.agents.timmy_chat", side_effect=Exception("refused")):
-        client.post("/agents/timmy/chat", data={"message": "ping"})
+    # In async mode, errors during queuing are rare;
+    # if queuing succeeds, it records "Message queued".
+    client.post("/agents/timmy/chat", data={"message": "ping"})

    response = client.get("/agents/timmy/history")
    assert "ping" in response.text
-    assert "Timmy is offline" in response.text
+    assert "Message queued" in response.text


 def test_history_clear_resets_to_init_message(client):
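Reviewer note on the test changes above: the updated assertions rely on the chat endpoint acknowledging a message before any backend call is made, so an offline backend no longer changes the response. The route implementation is not part of this diff, so the following is only a stdlib sketch of that behavior under assumed names: `handle_chat`, `process_pending`, and `offline_backend` are hypothetical, and only the `"Message queued"` string is taken from the tests.

```python
from collections import deque

ACK = "Message queued"  # acknowledgement string asserted by the updated tests


def handle_chat(message: str, pending: deque, history: list) -> str:
    """Queue the message and acknowledge at once; no backend call happens here."""
    pending.append(message)
    history.append(("user", message))
    history.append(("agent", ACK))
    return f"{message}\n{ACK}"


def process_pending(pending: deque, backend) -> list:
    """Stand-in for the background task: failures surface here, not in the response."""
    errors = []
    while pending:
        msg = pending.popleft()
        try:
            backend(msg)
        except Exception as exc:
            errors.append((msg, str(exc)))
    return errors


def offline_backend(_msg: str) -> str:
    """Simulates an unreachable model server."""
    raise ConnectionError("connection refused")


pending, history = deque(), []
response = handle_chat("ping", pending, history)
errors = process_pending(pending, offline_backend)
```

Under this model the response and the recorded history are identical whether or not the backend is reachable, which is why the offline tests no longer patch the chat call. The offline failure itself would need a separate test aimed at the background task.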