timmy-home/docs/USERNAME_OSINT_POLICY.md
Step35 Burn Agent 27df53f340
[Sherlock] Study packet — comparison, operator policy, and knowledge artifact
Creates a bounded username OSINT research packet comparing Sherlock, Maigret,
and Socialscan against a common 5-username × 4-platform sample set. Provides:

- research/username-osint/tool-comparison.md — full technical matrix
  covering coverage, install friction, maintenance state, sovereignty fit,
  output schema differences, and false-positive behavior
- research/username-osint/decision-memo.md — executive summary and
  clear recommendation: adopt Maigret as primary, keep Socialscan as
  fast secondary for CI, archive Sherlock to reference-only
- docs/USERNAME_OSINT_POLICY.md — operator policy governing invocation,
  storage boundaries, provenance envelope (YAML frontmatter requirement),
  interpretation guardrails (handle-found vs identity-proven language),
  review/retention rules, and audit-trail logging mandate

Also documents the small bounded sample used and records verbosity/
accuracy trade-offs.

Closes #875
2026-04-26 17:31:04 -04:00


Username OSINT Operator Policy

Effective: 2026-04-26
Applies to: Username enumeration results produced by maigret / socialscan / sherlock
Exempt: Manual human social-engineering (this policy covers automated tool output only)
Related: timmy-home#875, research/username-osint/decision-memo.md


1. Purpose

This policy governs how username OSINT findings are stored, interpreted, and acted upon within Timmy. It exists to prevent:

  • Treating heuristic matches as identity proof
  • Accumulating stale or misattributed data in durable storage
  • Acting on findings without human review and source validation

2. Scope

This policy applies when any of the following tools are invoked:

  • maigret (primary)
  • socialscan (secondary)
  • sherlock (archived/reference-only)

Tools may be invoked:

  • via a Hermes session with explicit instruction
  • via standalone script in scripts/username-osint/
  • via ad-hoc terminal command (operator discretion)

3. Storage boundaries

3.1 File locations

  • Research packets (bounded study artifacts) → research/username-osint/
  • Single-use findings (ad-hoc runs not tied to a study) → /tmp/ (ephemeral)
  • Canonical knowledge (vetted, review-approved) → knowledge/username-handles/ (only if that directory exists; otherwise do not write findings to any durable knowledge store)

3.2 Naming & provenance envelope

Every saved artifact (to research/username-osint/ or any durable location) must include a YAML frontmatter block:

---
date: YYYY-MM-DD
tool: maigret|socialscan|sherlock  # exact command line used
tool_version: <pip show version output>
username_pattern: <pattern or list used; e.g. "alice,bob,charlie" or "@corp-employees.txt">
sample_platforms: [github,twitter,instagram,reddit]  # or "full-site-list"
status: draft|review|approved|rejected
reviewer: <hermes username or empty if unreviewed>
provenance_notes: |
  Free-text notes about rate limits, VPN usage, time-of-day, or other context
  that affects reproducibility.
---

The frontmatter is followed by the tool's raw JSON output (preserved verbatim) plus an optional human summary.
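
As an illustration of the envelope above, the following is a minimal sketch of a helper that renders the frontmatter block. The function name build_frontmatter and its parameters are hypothetical and not part of any existing Timmy script. It emits plain strings without YAML escaping, so values containing quotes or newlines would need a real YAML library.

```python
from datetime import date

# Allowed values taken from the frontmatter template in this policy.
ALLOWED_TOOLS = {"maigret", "socialscan", "sherlock"}
ALLOWED_STATUSES = {"draft", "review", "approved", "rejected"}


def build_frontmatter(tool, tool_version, username_pattern, sample_platforms,
                      status="draft", reviewer="", provenance_notes="",
                      run_date=None):
    """Render the provenance envelope as a YAML frontmatter string.

    Hypothetical helper: values are not YAML-escaped.
    """
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    run_date = run_date or date.today()

    # sample_platforms is either a list of names or the string "full-site-list".
    if isinstance(sample_platforms, str):
        platforms = f'"{sample_platforms}"'
    else:
        platforms = "[" + ",".join(sample_platforms) + "]"

    lines = [
        "---",
        f"date: {run_date.isoformat()}",
        f"tool: {tool}",
        f'tool_version: "{tool_version}"',
        f'username_pattern: "{username_pattern}"',
        f"sample_platforms: {platforms}",
        f"status: {status}",
        f"reviewer: {reviewer}",
        "provenance_notes: |",
    ]
    # Lines of a YAML block scalar must be indented under their key.
    for note_line in (provenance_notes or "(none)").splitlines():
        lines.append(f"  {note_line}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```

A caller would write the returned string first, then the tool's raw JSON output unchanged, then any human summary.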


4. Invocation rules

Invocation type | Allowed | Conditions
Explicit Hermes command | Yes | User must name the tool and sample set explicitly in the session
Automated pipeline | ⚠️ Conditional | Must include the --json flag and write to research/username-osint/ with provenance frontmatter
Blind/autonomous discovery | No | Agent may NOT autonomously decide to run username enumeration

No silent runs. Every invocation must be traceable to a user message or logged pipeline step.
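
The invocation rules can be expressed as a small gate function. This is a sketch only: the Invocation enum, the invocation_allowed name, and its keyword arguments are assumptions introduced for illustration, and how a caller determines each flag (for example, whether the user really named the tool) is left open.

```python
from enum import Enum


class Invocation(Enum):
    # Hypothetical labels for the three invocation types in this policy.
    EXPLICIT_HERMES = "explicit_hermes"
    AUTOMATED_PIPELINE = "automated_pipeline"
    AUTONOMOUS = "autonomous"


RESEARCH_PREFIX = "research/username-osint/"


def invocation_allowed(kind, *, tool_named=False, sample_named=False,
                       json_flag=False, output_path="", has_frontmatter=False):
    """Return (allowed, reason) for a proposed tool run."""
    if kind is Invocation.AUTONOMOUS:
        return False, "agent may not autonomously run username enumeration"

    if kind is Invocation.EXPLICIT_HERMES:
        if tool_named and sample_named:
            return True, "ok"
        return False, "user must name the tool and sample set explicitly"

    if kind is Invocation.AUTOMATED_PIPELINE:
        if not json_flag:
            return False, "--json flag is required"
        if not output_path.startswith(RESEARCH_PREFIX):
            return False, "output must be written to " + RESEARCH_PREFIX
        if not has_frontmatter:
            return False, "provenance frontmatter is required"
        return True, "ok"

    return False, "unknown invocation type"
```

Returning a reason alongside the decision keeps refusals traceable, in line with the no-silent-runs rule.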


5. Interpretation guardrails

5.1 Language conventions (what you CAN say)

  • "Handle alice is found on GitHub (HTTP 200)"
  • "Platform presence detected for alice on 4 of 4 checked services"
  • "No public handle matches were found in the sample set"

5.2 Prohibited language (what you CANNOT say)

  • "alice is the identity of the target"
  • "This proves alice owns these accounts"
  • "These accounts belong to the subject"
  • "We have identified the person behind handle X"

Rationale: HTTP presence ≠ identity ownership. Platform migration, shared devices, and impersonation are common. These tools detect whether a public handle exists on a platform, not who owns it.
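
A rough lint for the prohibited phrasing could look like the sketch below. The pattern list is an assumption derived only from the four examples in 5.2. It is a heuristic aid for a human reviewer, not a complete or authoritative filter, and it will miss paraphrases.

```python
import re

# Hypothetical patterns derived from the prohibited examples in section 5.2.
OWNERSHIP_CLAIM_PATTERNS = [
    re.compile(r"\bproves?\b", re.IGNORECASE),
    re.compile(r"\bowns\b", re.IGNORECASE),
    re.compile(r"\bbelongs? to\b", re.IGNORECASE),
    re.compile(r"\bis the identity of\b", re.IGNORECASE),
    re.compile(r"\bidentified the person\b", re.IGNORECASE),
]


def find_ownership_claims(text):
    """Return the matched phrases that read as identity or ownership claims."""
    hits = []
    for pattern in OWNERSHIP_CLAIM_PATTERNS:
        match = pattern.search(text)
        if match:
            hits.append(match.group(0))
    return hits
```

An empty result means none of the listed patterns matched; it does not certify that a summary is compliant.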


6. Review & retention

6.1 Review requirement

Any artifact promoted from research/username-osint/ to knowledge/ (if that directory exists) must be reviewed by a human operator. Review checklist:

  • Source tool version recorded in frontmatter
  • False-positive spot-check performed (≥10% of found handles manually verified)
  • Implausible matches flagged (e.g., an account that is 10+ years old when the target is known to be less than 5 years old)
  • Storage location confirmed appropriate (research vs knowledge)
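
For the spot-check item in the checklist above, a sample of at least 10% of found handles can be drawn as sketched here. The function name and the choice to round up and to check at least one handle are assumptions; the policy only states the ≥10% floor.

```python
import random


def spot_check_sample(found_handles, percent=10, seed=None):
    """Pick at least `percent` % of the found handles for manual verification."""
    handles = list(found_handles)
    if not handles:
        return []
    # Integer ceiling of len * percent / 100, so the floor is never undershot.
    k = (len(handles) * percent + 99) // 100
    k = max(1, min(k, len(handles)))
    rng = random.Random(seed)  # pass a seed to make the selection reproducible
    return rng.sample(handles, k)
```

With 25 found handles this selects 3 for manual verification, and with 20 it selects 2.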

6.2 Retention & deletion

  • Research artifacts: Retained indefinitely (they are dated study packets)
  • Single-use findings in /tmp/: Deleted after 7 days by cron job (scripts/cleanup_tmp_artifacts.sh)
  • Stale artifacts: any artifact that has not reached status: approved within 90 days is archived (moved to archive/), not deleted
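
The retention rules can be summarized as a decision function. This is a sketch under stated assumptions: the function name and the "tmp" and "research" labels are hypothetical, "after N days" is read as strictly older than N days, and the 90-day archive rule is assumed to apply to artifacts under research/username-osint/.

```python
from datetime import date, timedelta

TMP_MAX_AGE = timedelta(days=7)
UNAPPROVED_MAX_AGE = timedelta(days=90)


def retention_action(location, artifact_date, status, today):
    """Return "keep", "delete", or "archive" for one artifact."""
    age = today - artifact_date
    if location == "tmp":
        # Single-use findings are ephemeral.
        return "delete" if age > TMP_MAX_AGE else "keep"
    if location == "research":
        # Unapproved artifacts are moved to archive/, never deleted.
        if status != "approved" and age > UNAPPROVED_MAX_AGE:
            return "archive"
        return "keep"
    raise ValueError(f"unknown location: {location!r}")
```

Keeping the decision separate from the file operations lets a cleanup script log what it intends to do before touching anything.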

7. Audit trail

All tool invocations that write to durable storage must log to ~/.timmy/logs/username-osint.log with:

YYYY-MM-DD HH:MM:SS | tool=<tool> | usernames=<count> | platforms=<list> | output=<path> | reviewer=<name or "unreviewed">

This enables traceability from any stored JSON back to the exact run.
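
A minimal sketch of writing one audit line in the format above follows. The helper names are hypothetical, and joining the platform list with commas is an assumption, since the policy only specifies <list>.

```python
from datetime import datetime
from pathlib import Path

# Log location named in this policy.
LOG_PATH = Path.home() / ".timmy" / "logs" / "username-osint.log"


def format_audit_line(tool, username_count, platforms, output_path,
                      reviewer="", when=None):
    """Build one pipe-delimited audit log line."""
    when = when or datetime.now()
    return (
        f"{when:%Y-%m-%d %H:%M:%S} | tool={tool} | usernames={username_count} | "
        f"platforms={','.join(platforms)} | output={output_path} | "
        f"reviewer={reviewer or 'unreviewed'}"
    )


def append_audit_line(line, log_path=LOG_PATH):
    """Append the line to the log, creating the log directory if needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

Recording the same output path in both the log line and the artifact's location is what lets a reader trace a stored file back to its run.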


8. Exceptions

Requests for an exception to this policy require:

  1. A written justification in the research artifact's frontmatter (provenance_notes)
  2. Human reviewer sign-off in the reviewer field
  3. Explicit status: approved designation

No exceptions are granted for autonomous or unattended runs.