Creates a bounded username OSINT research packet comparing Sherlock, Maigret, and Socialscan against a common 5-username × 4-platform sample set.

Provides:
- research/username-osint/tool-comparison.md: full technical matrix covering coverage, install friction, maintenance state, sovereignty fit, output schema differences, and false-positive behavior
- research/username-osint/decision-memo.md: executive summary and clear recommendation (adopt Maigret as primary, keep Socialscan as fast secondary for CI, archive Sherlock to reference-only)
- docs/USERNAME_OSINT_POLICY.md: operator policy governing invocation, storage boundaries, provenance envelope (YAML frontmatter requirement), interpretation guardrails (handle-found vs identity-proven language), review/retention rules, and audit-trail logging mandate

Also documents the small bounded sample used and records verbosity/accuracy trade-offs.

Closes #875

127 lines
4.8 KiB
Markdown
# Username OSINT Operator Policy

**Effective**: 2026-04-26

**Applies to**: Username enumeration results produced by `maigret` / `socialscan` / `sherlock`

**Exempt**: Manual human social-engineering (this policy covers automated tool output only)

**Related**: timmy-home#875, `research/username-osint/decision-memo.md`

---

## 1. Purpose

This policy governs how username OSINT findings are stored, interpreted, and acted upon within Timmy. It exists to prevent:
- Treating heuristic matches as identity proof
- Accumulating stale or misattributed data in durable storage
- Acting on findings without human review and source validation

---

## 2. Scope

This policy applies when any of the following tools are invoked:
- `maigret` (primary)
- `socialscan` (secondary)
- `sherlock` (archived/reference-only)

Tools may be invoked:
- via `hermes` session with explicit instruction
- via standalone script in `scripts/username-osint/`
- via ad-hoc terminal command (operator discretion)

---

## 3. Storage boundaries

### 3.1 File locations
- **Research packets** (bounded study artifacts) → `research/username-osint/`
- **Single-use findings** (ad-hoc runs not tied to a study) → `/tmp/` (ephemeral)
- **Canonical knowledge** (vetted, review-approved) → `knowledge/username-handles/` (only if that directory exists; otherwise, do not write to any durable knowledge store)

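
The routing above can be sketched as a small helper. This is a minimal illustration, not an existing Timmy module: the function name `destination_for`, the `kind` labels, and the `approved` flag are all hypothetical, and the paths are resolved relative to an assumed repository root.

```python
from pathlib import Path

# Directory names come from section 3.1; everything else here is illustrative.
RESEARCH_DIR = Path("research/username-osint")
KNOWLEDGE_DIR = Path("knowledge/username-handles")
EPHEMERAL_DIR = Path("/tmp")


def destination_for(kind: str, approved: bool = False,
                    repo_root: Path = Path(".")) -> Path:
    """Map an artifact kind to the directory section 3.1 assigns to it."""
    if kind == "research":
        return repo_root / RESEARCH_DIR
    if kind == "single-use":
        return EPHEMERAL_DIR
    if kind == "canonical":
        if not approved:
            raise PermissionError("canonical knowledge must be review-approved")
        target = repo_root / KNOWLEDGE_DIR
        if not target.is_dir():
            # Policy: if the directory does not exist, never write durably.
            raise FileNotFoundError(f"{target} does not exist; refusing to write")
        return target
    raise ValueError(f"unknown artifact kind: {kind!r}")
```

Raising instead of silently creating `knowledge/username-handles/` keeps the "otherwise, do not write" clause enforceable in code.
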
### 3.2 Naming & provenance envelope
Every artifact saved to `research/username-osint/` or any other durable location **must** begin with a YAML frontmatter block:

```yaml
---
date: YYYY-MM-DD
tool: maigret|socialscan|sherlock # exact command line used
tool_version: <pip show version output>
username_pattern: <pattern or list used; e.g. "alice,bob,charlie" or "@corp-employees.txt">
sample_platforms: [github,twitter,instagram,reddit] # or "full-site-list"
status: draft|review|approved|rejected
reviewer: <hermes username or empty if unreviewed>
provenance_notes: |
  Free-text notes about rate limits, VPN usage, time-of-day, or other context
  that affects reproducibility.
---
```

The frontmatter is followed by the tool's raw JSON output (preserved verbatim) plus an optional human summary.

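
A pre-save check for the envelope could look like the sketch below. It assumes the frontmatter has already been parsed into a plain `dict` (the parsing step is left out), and the function name `envelope_problems` is hypothetical. The rule that `status: approved` needs a non-empty `reviewer` is inferred from sections 6.1 and 8 rather than stated in the template itself.

```python
from datetime import date

REQUIRED_KEYS = (
    "date", "tool", "tool_version", "username_pattern",
    "sample_platforms", "status", "reviewer", "provenance_notes",
)
ALLOWED_TOOLS = {"maigret", "socialscan", "sherlock"}
ALLOWED_STATUS = {"draft", "review", "approved", "rejected"}


def envelope_problems(meta: dict) -> list[str]:
    """Return readable problems; an empty list means the envelope passes."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]

    if "date" in meta:
        try:
            date.fromisoformat(str(meta["date"]))
        except ValueError:
            problems.append("date is not a valid ISO date (expected YYYY-MM-DD)")

    if "tool" in meta:
        # Accept either a bare tool name or a command line that starts with one.
        words = str(meta["tool"]).split()
        if not words or words[0] not in ALLOWED_TOOLS:
            problems.append("tool does not start with a known tool name")

    if "status" in meta and str(meta["status"]) not in ALLOWED_STATUS:
        problems.append("status is not one of draft|review|approved|rejected")

    # Assumption: approval implies a named human reviewer.
    if meta.get("status") == "approved" and not meta.get("reviewer"):
        problems.append("approved artifacts need a named reviewer")

    return problems
```

Returning a list of problems rather than raising on the first one lets an operator fix every envelope issue in a single pass.
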
---

## 4. Invocation rules

| Invocation type | Allowed | Conditions |
|---|---|---|
| **Explicit Hermes command** | ✅ | User must name the tool and sample set explicitly in the session |
| **Automated pipeline** | ⚠️ | Must include `--json` flag and write to `research/username-osint/` with provenance frontmatter |
| **Blind/autonomous discovery** | ❌ | Agent may NOT autonomously decide to run username enumeration |

**No silent runs**. Every invocation must be traceable to a user message or logged pipeline step.

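
The table and the no-silent-runs rule can be read as a gate that runs before any tool is launched. The sketch below is one possible encoding under stated assumptions: `RunRequest`, its field names, and the `kind` labels are hypothetical, and `trace_ref` stands for whatever identifier links the run to a user message or pipeline step.

```python
from dataclasses import dataclass


@dataclass
class RunRequest:
    kind: str                          # "explicit", "pipeline", or "autonomous"
    trace_ref: str = ""                # user message id or pipeline step id
    names_tool_and_samples: bool = False
    json_output: bool = False
    output_dir: str = ""
    has_frontmatter: bool = False


def check_request(req: RunRequest) -> tuple[bool, str]:
    """Return (allowed, reason) according to the section 4 table."""
    if req.kind == "autonomous":
        return False, "autonomous username enumeration is prohibited"
    if not req.trace_ref:
        return False, "silent run: no user message or pipeline step to trace to"
    if req.kind == "explicit":
        if req.names_tool_and_samples:
            return True, "ok"
        return False, "user must name the tool and the sample set"
    if req.kind == "pipeline":
        ok = (
            req.json_output
            and req.output_dir.rstrip("/") == "research/username-osint"
            and req.has_frontmatter
        )
        if ok:
            return True, "ok"
        return False, "pipeline runs need JSON output, the research dir, and frontmatter"
    return False, f"unknown invocation kind: {req.kind!r}"
```

The autonomous check comes first so that no combination of other fields can turn a prohibited run into an allowed one.
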
---

## 5. Interpretation guardrails

### 5.1 Language conventions (what you CAN say)
- ✅ "Handle `alice` is found on GitHub (HTTP 200)"
- ✅ "Platform presence detected for `alice` on 4 of 4 checked services"
- ✅ "No public handle matches were found in the sample set"

### 5.2 Prohibited language (what you CANNOT say)
- ❌ "`alice` is the identity of the target"
- ❌ "This proves `alice` owns these accounts"
- ❌ "These accounts belong to the subject"
- ❌ "We have identified the person behind handle X"

**Rationale**: HTTP presence ≠ identity ownership. Platform migration, shared devices, and impersonation are common. These tools detect the *presence of a public handle*, not *ownership of an identity*.

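
A draft summary can be screened for the prohibited phrasings before it is saved. The patterns below are a heuristic sketch derived from the four examples in 5.2; they are not exhaustive, the labels and the function name `prohibited_phrases` are hypothetical, and a clean result does not replace human review.

```python
import re

# Each label maps to a pattern modeled on one of the 5.2 examples.
PROHIBITED_PATTERNS = {
    "identity claim": re.compile(r"\bis the identity of\b", re.IGNORECASE),
    "ownership proof": re.compile(r"\bproves?\b.*\bowns?\b", re.IGNORECASE),
    "belongs-to claim": re.compile(r"\bbelongs? to\b", re.IGNORECASE),
    "person identified": re.compile(r"\bidentified the person\b", re.IGNORECASE),
}


def prohibited_phrases(text: str) -> list[str]:
    """Return the labels of every prohibited pattern found in the text."""
    return [label for label, pattern in PROHIBITED_PATTERNS.items()
            if pattern.search(text)]
```

For instance, `prohibited_phrases("These accounts belong to the subject")` returns `["belongs-to claim"]`, while the three sentences listed in 5.1 return an empty list.
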
---

## 6. Review & retention

### 6.1 Review requirement
Any artifact promoted from `research/username-osint/` to `knowledge/` (if that directory exists) **must** be reviewed by a human operator. Review checklist:
- [ ] Source tool version recorded in frontmatter
- [ ] False-positive spot-check performed (≥10% of found handles manually verified)
- [ ] Implausible matches flagged (e.g., a handle that is 10+ years old when the target is known to be less than 5 years old)
- [ ] Storage location confirmed appropriate (research vs knowledge)

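
The ≥10% spot-check translates into a minimum sample size that rounds up, so that even a single found handle gets verified. A minimal sketch, with hypothetical function names and integer arithmetic to avoid floating-point rounding surprises:

```python
import random


def spot_check_size(found_count: int, min_percent: int = 10) -> int:
    """Smallest number of handles that covers at least min_percent of the total."""
    # Integer ceiling of found_count * min_percent / 100.
    return (found_count * min_percent + 99) // 100


def pick_spot_checks(found_handles, seed=None):
    """Randomly choose which found handles an operator verifies by hand."""
    handles = list(found_handles)
    rng = random.Random(seed)  # pass a seed to make the selection reproducible
    return rng.sample(handles, spot_check_size(len(handles)))
```

With 25 found handles the minimum is 3 manual checks; with 30 it is exactly 3, and with 31 it becomes 4.
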
### 6.2 Retention & deletion
- **Research artifacts**: Retained indefinitely (they are dated study packets)
- **Single-use findings** in `/tmp/`: Deleted after 7 days by cron job (`scripts/cleanup_tmp_artifacts.sh`)
- **Stale artifacts**: Any artifact that has not reached `status: approved` after 90 days is **archived** (moved to `archive/`), not deleted

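
The 90-day archive rule can be expressed as a pure decision function plus a dry-run planner. This sketch assumes age is measured from the frontmatter `date` field and that archived files keep their name under a top-level `archive/` directory; neither detail is specified by the policy, and both function names are hypothetical.

```python
from datetime import date
from pathlib import Path

MAX_AGE_DAYS = 90


def is_stale(artifact_date: date, status: str, today: date,
             max_age_days: int = MAX_AGE_DAYS) -> bool:
    """True when an unapproved artifact is older than the retention window."""
    return status != "approved" and (today - artifact_date).days > max_age_days


def plan_archive(artifacts, today: date):
    """Given (path, artifact_date, status) tuples, list (source, target) moves.

    Nothing is moved here; the caller decides whether to apply the plan.
    """
    return [
        (path, Path("archive") / path.name)
        for path, artifact_date, status in artifacts
        if is_stale(artifact_date, status, today)
    ]
```

Separating the plan from the move makes it possible to log or review the list before any file leaves `research/username-osint/`.
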
---

## 7. Audit trail

All tool invocations that write to durable storage **must** log to `~/.timmy/logs/username-osint.log` with:
```
YYYY-MM-DD HH:MM:SS | tool=<tool> | usernames=<count> | platforms=<list> | output=<path> | reviewer=<name or "unreviewed">
```

This enables traceability from any stored JSON back to the exact run.

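
A helper that produces this line keeps every script's log entries identical in shape. The sketch below is an assumption-laden illustration: the function names are hypothetical, and joining the platform list with commas is a guess at what `<list>` means in the format above.

```python
from datetime import datetime
from pathlib import Path

DEFAULT_LOG = Path.home() / ".timmy" / "logs" / "username-osint.log"


def format_audit_line(when: datetime, tool: str, usernames, platforms,
                      output: str, reviewer: str = "") -> str:
    """Build one audit entry in the pipe-separated format from section 7."""
    return " | ".join([
        when.strftime("%Y-%m-%d %H:%M:%S"),
        f"tool={tool}",
        f"usernames={len(usernames)}",         # the count, never the handles
        f"platforms={','.join(platforms)}",    # assumed comma-separated
        f"output={output}",
        f"reviewer={reviewer or 'unreviewed'}",
    ])


def append_audit_line(line: str, log_path: Path = DEFAULT_LOG) -> None:
    """Append the entry, creating the log directory on first use."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Logging only the username count, as the format specifies, keeps the handles themselves out of the log file.
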
---

## 8. Exceptions

A request for an exception to this policy requires:
1. A written justification in the research artifact's frontmatter (`provenance_notes`)
2. Human reviewer sign-off in the `reviewer` field
3. Explicit `status: approved` designation

No exceptions are granted for autonomous or unattended runs.