Compare commits

1 Commits

Author SHA1 Message Date
Alexander Whitestone
1cc34a8c31 feat(skills): backport adversarial UX optional skill
All checks were successful
Lint / lint (pull_request) Successful in 10s
2026-04-22 10:36:30 -04:00
8 changed files with 225 additions and 1727 deletions

@@ -1,100 +0,0 @@
# Issue #954 Verification — maps skill guest_house / camp_site / bakery
Status: PASS
## Drift noted
Issue #954 asked for validation on `upstream/main` (commit `c5a814b23`).
Fresh `forge/main` did not contain `skills/productivity/maps/`, so the forge branch was behind upstream for this feature cluster.
This branch ports the upstream maps skill files into the forge checkout and adds regression coverage.
## Automated verification
Command:
```bash
pytest -q tests/skills/test_maps_client.py
```
Result:
- 5 passed
Coverage added:
- maps skill files exist in the repo
- `guest_house` category maps to `tourism=guest_house`
- `camp_site` category maps to `tourism=camp_site`
- `bakery` expands to both `shop=bakery` and `amenity=bakery`
- dual-key bakery results dedupe correctly
- skill documentation lists the new categories and supersedes `find-nearby`
## Manual evidence
### 1) guest_house lookup
Command:
```bash
python3 skills/productivity/maps/scripts/maps_client.py nearby --near "Bath, United Kingdom" --category guest_house --limit 3
```
Observed results:
- Henrietta House — 390.3 m
- The Windsor — 437.2 m
- The Old Rectory Bed & Breakfast — 495.7 m
All returned `tourism=guest_house` in the raw tags.
### 2) camp_site lookup
Command:
```bash
python3 skills/productivity/maps/scripts/maps_client.py nearby --near "Yosemite Valley, California" --category camp_site --limit 5
```
Observed result:
- Yellow Pine Administrative Campground — 90.3 m
Returned `tourism=camp_site` in the raw tags.
### 3) bakery lookup via `shop=bakery`
Command:
```bash
python3 skills/productivity/maps/scripts/maps_client.py nearby --near "Lawrenceville, New Jersey" --category bakery --radius 5000 --limit 10
```
Observed results:
- The Gingered Peach — 713.8 m
- WildFlour Bakery — 741.9 m
Both returned `shop=bakery` in the raw tags.
### 4) bakery lookup via `amenity=bakery`
Command:
```bash
python3 skills/productivity/maps/scripts/maps_client.py nearby --near "20735 Stevens Creek Boulevard, Cupertino, CA" --category bakery --radius 600 --limit 5
```
Observed result:
- Paris Baguette — 28.6 m
Returned `amenity=bakery` in the raw tags (the same element also carries `shop=bakery`), which is consistent with the dual-key union query reaching amenity-tagged bakeries too.
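The dual-key lookup and dedupe described above can be sketched as follows. This is a minimal illustration, not the code in `maps_client.py`: the helper names `build_union_query` and `dedupe_elements` are hypothetical, and the assumption is that duplicates are identified by the OSM `(type, id)` pair.

```python
from typing import Iterable, List, Tuple

TagPair = Tuple[str, str]


def build_union_query(lat: float, lon: float, radius_m: int,
                      tag_pairs: Iterable[TagPair]) -> str:
    """Build one Overpass QL query that unions node/way matches for every tag pair.

    Hypothetical helper for illustration; the real script may build its query differently.
    """
    clauses = []
    for key, value in tag_pairs:
        for element in ("node", "way"):
            clauses.append(
                f'  {element}["{key}"="{value}"](around:{radius_m},{lat},{lon});'
            )
    body = "\n".join(clauses)
    return f"[out:json];\n(\n{body}\n);\nout center;"


def dedupe_elements(elements: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each (type, id) pair (assumed dedupe key)."""
    seen = set()
    unique = []
    for el in elements:
        key = (el.get("type"), el.get("id"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(el)
    return unique


if __name__ == "__main__":
    print(build_union_query(40.0, -74.0, 500,
                            [("shop", "bakery"), ("amenity", "bakery")]))
```

An element tagged with both keys matches two clauses of the union, so deduping on `(type, id)` is one way to keep it from being counted twice.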
## Conclusion
PASS.
- `guest_house` resolves correctly
- `camp_site` resolves correctly
- `bakery` resolves through both supported keys
- forge/main drift from upstream/main was real and is addressed on this branch

@@ -0,0 +1,190 @@
---
name: adversarial-ux-test
description: Roleplay the most difficult, tech-resistant user for your product. Browse the app as that persona, find every UX pain point, then filter complaints through a pragmatism layer to separate real problems from noise. Creates actionable tickets from genuine issues only.
version: 1.0.0
author: Omni @ Comelse
license: MIT
metadata:
  hermes:
    tags: [qa, ux, testing, adversarial, dogfood, personas, user-testing]
    related_skills: [dogfood]
---
# Adversarial UX Test
Roleplay the worst-case user for your product — the person who hates technology, doesn't want your software, and will find every reason to complain. Then filter their feedback through a pragmatism layer to separate real UX problems from "I hate computers" noise.
Think of it as an automated "mom test" — but angry.
## Why This Works
Most QA finds bugs. This finds **friction**. A technically correct app can still be unusable for real humans. The adversarial persona catches:
- Confusing terminology that makes sense to developers but not users
- Too many steps to accomplish basic tasks
- Missing onboarding or "aha moments"
- Accessibility issues (font size, contrast, click targets)
- Cold-start problems (empty states, no demo content)
- Paywall/signup friction that kills conversion
The **pragmatism filter** (Phase 3) is what makes this useful instead of just entertaining. Without it, you'd add a "print this page" button to every screen because Grandpa can't figure out PDFs.
## How to Use
Tell the agent:
```
"Run an adversarial UX test on [URL]"
"Be a grumpy [persona type] and test [app name]"
"Do an asshole user test on my staging site"
```
You can provide a persona or let the agent generate one based on your product's target audience.
## Step 1: Define the Persona
If no persona is provided, generate one by answering:
1. **Who is the HARDEST user for this product?** (age 50+, non-technical role, decades of experience doing it "the old way")
2. **What is their tech comfort level?** (the lower the better — WhatsApp-only, paper notebooks, wife set up their email)
3. **What is the ONE thing they need to accomplish?** (their core job, not your feature list)
4. **What would make them give up?** (too many clicks, jargon, slow, confusing)
5. **How do they talk when frustrated?** (blunt, sweary, dismissive, sighing)
### Good Persona Example
> **"Big Mick" McAllister** — 58-year-old S&C coach. Uses WhatsApp and that's it. His "spreadsheet" is a paper notebook. "If I can't figure it out in 10 seconds I'm going back to my notebook." Needs to log session results for 25 players. Hates small text, jargon, and passwords.
### Bad Persona Example
> "A user who doesn't like the app" — too vague, no constraints, no voice.
The persona must be **specific enough to stay in character** for 20 minutes of testing.
## Step 2: Become the Asshole (Browse as the Persona)
1. Read any available project docs for app context and URLs
2. **Fully inhabit the persona** — their frustrations, limitations, goals
3. Navigate to the app using browser tools
4. **Attempt the persona's ACTUAL TASKS** (not a feature tour):
   - Can they do what they came to do?
   - How many clicks/screens to accomplish it?
   - What confuses them?
   - What makes them angry?
   - Where do they get lost?
   - What would make them give up and go back to their old way?
5. Test these friction categories:
   - **First impression**: would they even bother past the landing page?
   - **Core workflow**: the ONE thing they need to do most often
   - **Error recovery**: what happens when they do something wrong?
   - **Readability**: text size, contrast, information density
   - **Speed**: does it feel faster than their current method?
   - **Terminology**: any jargon they wouldn't understand?
   - **Navigation**: can they find their way back? Do they know where they are?
6. Take screenshots of every pain point
7. Check browser console for JS errors on every page
## Step 3: The Rant (Write Feedback in Character)
Write the feedback AS THE PERSONA — in their voice, with their frustrations. This is not a bug report. This is a real human venting.
```
[PERSONA NAME]'s Review of [PRODUCT]
Overall: [Would they keep using it? Yes/No/Maybe with conditions]
THE GOOD (grudging admission):
- [things even they have to admit work]
THE BAD (legitimate UX issues):
- [real problems that would stop them from using the product]
THE UGLY (showstoppers):
- [things that would make them uninstall/cancel immediately]
SPECIFIC COMPLAINTS:
1. [Page/feature]: "[quote in persona voice]" — [what happened, expected]
2. ...
VERDICT: "[one-line persona quote summarizing their experience]"
```
## Step 4: The Pragmatism Filter (Critical — Do Not Skip)
Step OUT of the persona. Evaluate each complaint as a product person:
- **RED: REAL UX BUG** — Any user would have this problem, not just grumpy ones. Fix it.
- **YELLOW: VALID BUT LOW PRIORITY** — Real issue but only for extreme users. Note it.
- **WHITE: PERSONA NOISE** — "I hate computers" talking, not a product problem. Skip it.
- **GREEN: FEATURE REQUEST** — Good idea hidden in the complaint. Consider it.
### Filter Criteria
1. Would a 35-year-old competent-but-busy user have the same complaint? → RED
2. Is this a genuine accessibility issue (font size, contrast, click targets)? → RED
3. Is this "I want it to work like paper" resistance to digital? → WHITE
4. Is this a real workflow inefficiency the persona stumbled on? → YELLOW or RED
5. Would fixing this add complexity for the 80% who are fine? → WHITE
6. Does the complaint reveal a missing onboarding moment? → GREEN
**This filter is MANDATORY.** Never ship raw persona complaints as tickets.
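The filter criteria above can be sketched as an ordered rule check. This is only an illustration: the `Complaint` fields, the rule order, the choice of YELLOW for criterion 4 when criterion 1 does not also apply, and the WHITE default are all assumptions, not part of the skill.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    RED = "real UX bug"
    YELLOW = "valid but low priority"
    WHITE = "persona noise"
    GREEN = "feature request"


@dataclass
class Complaint:
    """Hypothetical record: one flag per filter criterion."""
    text: str
    affects_typical_user: bool = False   # criterion 1
    accessibility_issue: bool = False    # criterion 2
    resists_digital: bool = False        # criterion 3
    workflow_inefficiency: bool = False  # criterion 4
    fix_adds_complexity: bool = False    # criterion 5
    missing_onboarding: bool = False     # criterion 6


def classify(c: Complaint) -> Verdict:
    # Assumed precedence: RED checks first, then noise, then ideas, then low priority.
    if c.affects_typical_user or c.accessibility_issue:
        return Verdict.RED
    if c.resists_digital or c.fix_adds_complexity:
        return Verdict.WHITE
    if c.missing_onboarding:
        return Verdict.GREEN
    if c.workflow_inefficiency:
        # Criterion 4 allows YELLOW or RED; RED is already handled by criterion 1.
        return Verdict.YELLOW
    return Verdict.WHITE  # assumed default when no criterion matches


if __name__ == "__main__":
    sample = Complaint("Text is tiny on the results page", accessibility_issue=True)
    print(classify(sample).name)
```

Encoding the criteria as flags forces each complaint to be judged out of character, which is the point of Step 4.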
## Step 5: Create Tickets
For **RED** and **GREEN** items only:
- Clear, actionable title
- Include the persona's verbatim quote (entertaining + memorable)
- The real UX issue underneath (objective)
- A suggested fix (actionable)
- Tag/label: "ux-review"
For **YELLOW** items: one catch-all ticket with all notes.
**WHITE** items appear in the report only. No tickets.
**Max 10 tickets per session** — focus on the worst issues.
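The ticket rules above can be sketched as a small planning function. The function name, the input shape, ranking RED ahead of GREEN, and counting the YELLOW catch-all toward the cap of 10 are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

MAX_TICKETS = 10  # "Max 10 tickets per session"


def plan_tickets(findings: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split (color, title) findings into ticket buckets.

    Hypothetical helper: colors are the strings RED, YELLOW, WHITE, GREEN.
    """
    red = [title for color, title in findings if color == "RED"]
    green = [title for color, title in findings if color == "GREEN"]
    yellow = [title for color, title in findings if color == "YELLOW"]
    white = [title for color, title in findings if color == "WHITE"]

    # Assumption: the single YELLOW catch-all ticket uses one slot of the cap.
    budget = MAX_TICKETS - (1 if yellow else 0)
    ranked = red + green  # assumption: RED outranks GREEN
    return {
        "individual": ranked[:budget],  # one ticket each
        "catch_all": yellow,            # notes for the one YELLOW ticket
        "report_only": white,           # never ticketed
        "dropped": ranked[budget:],     # over the cap, mention in the report
    }


if __name__ == "__main__":
    demo = [("RED", "Login text too small"), ("WHITE", "Wants paper forms"),
            ("YELLOW", "Extra click on export"), ("GREEN", "Add first-run tour")]
    print(plan_tickets(demo))
```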
## Step 6: Report
Deliver:
1. The persona rant (Step 3) — entertaining and visceral
2. The filtered assessment (Step 4) — pragmatic and actionable
3. Tickets created (Step 5) — with links
4. Screenshots of key issues
## Tips
- **One persona per session.** Don't mix perspectives.
- **Stay in character during Steps 2-3.** Break character only at Step 4.
- **Test the CORE WORKFLOW first.** Don't get distracted by settings pages.
- **Empty states are gold.** New user experience reveals the most friction.
- **The best findings are RED items the persona found accidentally** while trying to do something else.
- **If the persona has zero complaints, your persona is too tech-savvy.** Make them older, less patient, more set in their ways.
- **Run this before demos, launches, or after shipping a batch of features.**
- **Register as a NEW user when possible.** Don't use pre-seeded admin accounts — the cold start experience is where most friction lives.
- **Zero WHITE items is a signal, not a failure.** If the pragmatism filter finds no noise, your product has real UX problems, not just a grumpy persona.
- **Check known issues in project docs AFTER the test.** If the persona found a bug that's already in the known issues list, that's actually the most damning finding — it means the team knew about it but never felt the user's pain.
- **Subscription/paywall testing is critical.** Test with expired accounts, not just active ones. The "what happens when you can't pay" experience reveals whether the product respects users or holds their data hostage.
- **Count the clicks to accomplish the persona's ONE task.** If it's more than 5, that's almost always a RED finding regardless of persona tech level.
## Example Personas by Industry
These are starting points — customize for your specific product:
| Product Type | Persona | Age | Key Trait |
|-------------|---------|-----|-----------|
| CRM | Retirement home director | 68 | Filing cabinet is the current CRM |
| Photography SaaS | Rural wedding photographer | 62 | Books clients by phone, invoices on paper |
| AI/ML Tool | Department store buyer | 55 | Burned by 3 failed tech startups |
| Fitness App | Old-school gym coach | 58 | Paper notebook, thick fingers, bad eyes |
| Accounting | Family bakery owner | 64 | Shoebox of receipts, hates subscriptions |
| E-commerce | Market stall vendor | 60 | Cash only, smartphone is for calls |
| Healthcare | Senior GP | 63 | Dictates notes, nurse handles the computer |
| Education | Veteran teacher | 57 | Chalk and talk, worksheets in ring binders |
## Rules
- Stay in character during Steps 2-3
- Be genuinely mean but fair — find real problems, not manufactured ones
- The pragmatism filter (Step 4) is **MANDATORY**
- Screenshots required for every complaint
- Max 10 tickets per session
- Test on staging/deployed app, not local dev
- One persona, one session, one report

@@ -1,199 +0,0 @@
---
name: maps
description: >
  Location intelligence: geocode a place, reverse-geocode coordinates,
  find nearby places (46 POI categories), driving/walking/cycling
  distance + time, turn-by-turn directions, timezone lookup, bounding
  box + area for a named place, and POI search within a rectangle.
  Uses OpenStreetMap + Overpass + OSRM. Free, no API key.
version: 1.2.0
author: Mibayy
license: MIT
metadata:
  hermes:
    tags: [maps, geocoding, places, routing, distance, directions, nearby, location, openstreetmap, nominatim, overpass, osrm]
    category: productivity
    requires_toolsets: [terminal]
    supersedes: [find-nearby]
---
# Maps Skill
Location intelligence using free, open data sources. 8 commands, 46 POI
categories, zero dependencies (Python stdlib only), no API key required.
Data sources: OpenStreetMap/Nominatim, Overpass API, OSRM, TimeAPI.io.
This skill supersedes the old `find-nearby` skill — all of find-nearby's
functionality is covered by the `nearby` command below, with the same
`--near "<place>"` shortcut and multi-category support.
## When to Use
- User sends a Telegram location pin (latitude/longitude in the message) → `nearby`
- User wants coordinates for a place name → `search`
- User has coordinates and wants the address → `reverse`
- User asks for nearby restaurants, hospitals, pharmacies, hotels, etc. → `nearby`
- User wants driving/walking/cycling distance or travel time → `distance`
- User wants turn-by-turn directions between two places → `directions`
- User wants timezone information for a location → `timezone`
- User wants to search for POIs within a geographic area → `area` + `bbox`
## Prerequisites
Python 3.8+ (stdlib only — no pip installs needed).
Script path: `~/.hermes/skills/maps/scripts/maps_client.py`
## Commands
```bash
MAPS=~/.hermes/skills/maps/scripts/maps_client.py
```
### search — Geocode a place name
```bash
python3 $MAPS search "Eiffel Tower"
python3 $MAPS search "1600 Pennsylvania Ave, Washington DC"
```
Returns: lat, lon, display name, type, bounding box, importance score.
### reverse — Coordinates to address
```bash
python3 $MAPS reverse 48.8584 2.2945
```
Returns: full address breakdown (street, city, state, country, postcode).
### nearby — Find places by category
```bash
# By coordinates (from a Telegram location pin, for example)
python3 $MAPS nearby 48.8584 2.2945 restaurant --limit 10
python3 $MAPS nearby 40.7128 -74.0060 hospital --radius 2000
# By address / city / zip / landmark — --near auto-geocodes
python3 $MAPS nearby --near "Times Square, New York" --category cafe
python3 $MAPS nearby --near "90210" --category pharmacy
# Multiple categories merged into one query
python3 $MAPS nearby --near "downtown austin" --category restaurant --category bar --limit 10
```
46 categories: restaurant, cafe, bar, hospital, pharmacy, hotel, guest_house,
camp_site, supermarket, atm, gas_station, parking, museum, park, school,
university, bank, police, fire_station, library, airport, train_station,
bus_stop, church, mosque, synagogue, dentist, doctor, cinema, theatre, gym,
swimming_pool, post_office, convenience_store, bakery, bookshop, laundry,
car_wash, car_rental, bicycle_rental, taxi, veterinary, zoo, playground,
stadium, nightclub.
Each result includes: `name`, `address`, `lat`/`lon`, `distance_m`,
`maps_url` (clickable Google Maps link), `directions_url` (Google Maps
directions from the search point), and promoted tags when available —
`cuisine`, `hours` (opening_hours), `phone`, `website`.
### distance — Travel distance and time
```bash
python3 $MAPS distance "Paris" --to "Lyon"
python3 $MAPS distance "New York" --to "Boston" --mode driving
python3 $MAPS distance "Big Ben" --to "Tower Bridge" --mode walking
```
Modes: driving (default), walking, cycling. Returns road distance, duration,
and straight-line distance for comparison.
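For reference, a straight-line figure like the one mentioned above is commonly computed with the haversine formula. The sketch below assumes that approach and a spherical Earth; it is not taken from `maps_client.py`, and `haversine_m` is a hypothetical name.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Approximate city-center coordinates for Paris and Lyon.
    km = haversine_m(48.8566, 2.3522, 45.7640, 4.8357) / 1000
    print(f"{km:.0f} km straight-line")
```

The road distance from OSRM will normally be longer than this value, which is why the two are useful side by side.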
### directions — Turn-by-turn navigation
```bash
python3 $MAPS directions "Eiffel Tower" --to "Louvre Museum" --mode walking
python3 $MAPS directions "JFK Airport" --to "Times Square" --mode driving
```
Returns numbered steps with instruction, distance, duration, road name, and
maneuver type (turn, depart, arrive, etc.).
### timezone — Timezone for coordinates
```bash
python3 $MAPS timezone 48.8584 2.2945
python3 $MAPS timezone 35.6762 139.6503
```
Returns timezone name, UTC offset, and current local time.
### area — Bounding box and area for a place
```bash
python3 $MAPS area "Manhattan, New York"
python3 $MAPS area "London"
```
Returns bounding box coordinates, width/height in km, and approximate area.
Useful as input for the bbox command.
### bbox — Search within a bounding box
```bash
python3 $MAPS bbox 40.75 -74.00 40.77 -73.98 restaurant --limit 20
```
Finds POIs within a geographic rectangle. Use `area` first to get the
bounding box coordinates for a named place.
## Working With Telegram Location Pins
When a user sends a location pin, the message contains `latitude:` and
`longitude:` fields. Extract those and pass them straight to `nearby`:
```bash
# User sent a pin at 36.17, -115.14 and asked "find cafes nearby"
python3 $MAPS nearby 36.17 -115.14 cafe --radius 1500
```
Present results as a numbered list with names, distances, and the
`maps_url` field so the user gets a tap-to-open link in chat. For "open
now?" questions, check the `hours` field; if missing or unclear, verify
with `web_search` since OSM hours are community-maintained and not always
current.
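A numbered list like the one described above could be rendered with a helper along these lines. It assumes the results have already been parsed into a list of dicts carrying the documented `name`, `distance_m`, and `maps_url` fields; the surrounding output envelope of the script is not shown here, and `format_results` is a hypothetical name.

```python
from typing import Iterable, Mapping


def format_results(results: Iterable[Mapping]) -> str:
    """Render places as a numbered list with distance and a tap-to-open link."""
    lines = []
    for i, place in enumerate(results, start=1):
        name = place.get("name") or "(unnamed)"
        distance = place.get("distance_m")
        dist_text = f"{distance:.0f} m" if distance is not None else "distance unknown"
        line = f"{i}. {name} ({dist_text})"
        if place.get("maps_url"):
            # Put the link on its own line so chat clients make it tappable.
            line += f"\n   {place['maps_url']}"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"name": "Cafe A", "distance_m": 120.4, "maps_url": "https://example.com/a"},
        {"name": None, "distance_m": None},
    ]
    print(format_results(sample))
```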
## Workflow Examples
**"Find Italian restaurants near the Colosseum":**
1. `nearby --near "Colosseum Rome" --category restaurant --radius 500`
— one command, auto-geocoded
**"What's near this location pin they sent?":**
1. Extract lat/lon from the Telegram message
2. `nearby LAT LON cafe --radius 1500`
**"How do I walk from hotel to conference center?":**
1. `directions "Hotel Name" --to "Conference Center" --mode walking`
**"What restaurants are in downtown Seattle?":**
1. `area "Downtown Seattle"` → get bounding box
2. `bbox S W N E restaurant --limit 30`
## Pitfalls
- Nominatim ToS: max 1 req/s (handled automatically by the script)
- `nearby` requires lat/lon OR `--near "<address>"` — one of the two is needed
- OSRM routing coverage is best for Europe and North America
- Overpass API can be slow during peak hours; the script automatically
falls back between mirrors (overpass-api.de → overpass.kumi.systems)
- `distance` and `directions` use `--to` flag for the destination (not positional)
- If a zip code alone gives ambiguous results globally, include country/state
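The rate limit and mirror fallback noted above can be sketched like this. It is an illustration under assumptions, not the script's code: `Throttle` and `query_with_fallback` are hypothetical names, the `/api/interpreter` endpoint paths are assumed, and the HTTP call is injected as `fetch` so the sketch makes no network requests.

```python
import time
from typing import Callable, Optional, Sequence

# Mirror hosts come from the pitfall above; the endpoint paths are an assumption.
MIRRORS = (
    "https://overpass-api.de/api/interpreter",
    "https://overpass.kumi.systems/api/interpreter",
)


class Throttle:
    """Enforce a minimum interval between calls (1 s for the Nominatim policy)."""

    def __init__(self, min_interval_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def query_with_fallback(query: str, mirrors: Sequence[str],
                        fetch: Callable[[str, str], dict]) -> dict:
    """Try each mirror in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for url in mirrors:
        try:
            return fetch(url, query)
        except Exception as exc:  # any failure moves on to the next mirror
            last_error = exc
    raise RuntimeError("all Overpass mirrors failed") from last_error
```

Call `Throttle().wait()` before each geocoding request, and pass `MIRRORS` plus a real HTTP function as `fetch` when querying Overpass.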
## Verification
```bash
python3 ~/.hermes/skills/maps/scripts/maps_client.py search "Statue of Liberty"
# Should return lat ~40.689, lon ~-74.044
python3 ~/.hermes/skills/maps/scripts/maps_client.py nearby --near "Times Square" --category restaurant --limit 3
# Should return a list of restaurants within ~500m of Times Square
```

File diff suppressed because it is too large.

@@ -1,135 +0,0 @@
"""Regression tests for the bundled maps skill."""

from __future__ import annotations

import importlib.util
from pathlib import Path
from types import SimpleNamespace

SCRIPT_PATH = (
    Path(__file__).resolve().parents[2]
    / "skills/productivity/maps/scripts/maps_client.py"
)
SKILL_PATH = (
    Path(__file__).resolve().parents[2]
    / "skills/productivity/maps/SKILL.md"
)


def load_module():
    assert SCRIPT_PATH.exists(), f"missing maps client script: {SCRIPT_PATH}"
    spec = importlib.util.spec_from_file_location("maps_client_test", SCRIPT_PATH)
    module = importlib.util.module_from_spec(spec)
    assert spec.loader is not None
    spec.loader.exec_module(module)
    return module


def test_maps_skill_files_exist():
    assert SCRIPT_PATH.exists()
    assert SKILL_PATH.exists()


def test_category_tags_cover_guest_house_camp_site_and_dual_key_bakery():
    module = load_module()

    assert module.CATEGORY_TAGS["guest_house"] == ("tourism", "guest_house")
    assert module.CATEGORY_TAGS["camp_site"] == ("tourism", "camp_site")
    assert module.CATEGORY_TAGS["bakery"] == [
        ("shop", "bakery"),
        ("amenity", "bakery"),
    ]
    assert module._tags_for("bakery") == [
        ("shop", "bakery"),
        ("amenity", "bakery"),
    ]


def test_build_overpass_queries_include_all_supported_tags():
    module = load_module()

    bakery_query = module.build_overpass_nearby(
        None,
        None,
        40.0,
        -74.0,
        500,
        10,
        tag_pairs=module._tags_for("bakery"),
    )
    assert 'node["shop"="bakery"]' in bakery_query
    assert 'way["shop"="bakery"]' in bakery_query
    assert 'node["amenity"="bakery"]' in bakery_query
    assert 'way["amenity"="bakery"]' in bakery_query

    guest_house_query = module.build_overpass_nearby(
        None,
        None,
        40.0,
        -74.0,
        500,
        10,
        tag_pairs=module._tags_for("guest_house"),
    )
    assert 'node["tourism"="guest_house"]' in guest_house_query
    assert 'way["tourism"="guest_house"]' in guest_house_query

    camp_site_bbox = module.build_overpass_bbox(
        None,
        None,
        39.0,
        -75.0,
        41.0,
        -73.0,
        10,
        tag_pairs=module._tags_for("camp_site"),
    )
    assert 'node["tourism"="camp_site"]' in camp_site_bbox
    assert 'way["tourism"="camp_site"]' in camp_site_bbox


def test_cmd_nearby_dedupes_dual_tag_bakery_results(monkeypatch, capsys):
    module = load_module()

    duplicate_bakery = {
        "elements": [
            {
                "type": "node",
                "id": 101,
                "lat": 40.0,
                "lon": -74.0,
                "tags": {"name": "Wild Flour", "shop": "bakery"},
            },
            {
                "type": "node",
                "id": 101,
                "lat": 40.0,
                "lon": -74.0,
                "tags": {"name": "Wild Flour", "amenity": "bakery"},
            },
        ]
    }
    monkeypatch.setattr(module, "overpass_query", lambda query: duplicate_bakery)

    args = SimpleNamespace(
        lat="40.0",
        lon="-74.0",
        near=None,
        category="bakery",
        category_list=[],
        radius=500,
        limit=10,
    )
    module.cmd_nearby(args)

    out = capsys.readouterr().out
    assert '"count": 1' in out
    assert '"Wild Flour"' in out


def test_skill_doc_lists_new_categories_and_supersession():
    text = SKILL_PATH.read_text(encoding="utf-8")

    assert "guest_house" in text
    assert "camp_site" in text
    assert "bakery" in text
    assert "supersedes: [find-nearby]" in text

@@ -0,0 +1,25 @@
from pathlib import Path

from tools.skills_hub import OptionalSkillSource


REPO_ROOT = Path(__file__).resolve().parents[1]


def test_optional_skill_source_scans_adversarial_ux_test():
    source = OptionalSkillSource()
    metas = {meta.identifier: meta for meta in source._scan_all()}

    assert "official/dogfood/adversarial-ux-test" in metas
    assert metas["official/dogfood/adversarial-ux-test"].name == "adversarial-ux-test"
    assert "tech-resistant user" in metas["official/dogfood/adversarial-ux-test"].description


def test_optional_skill_catalog_docs_list_adversarial_ux_test():
    optional_catalog = (REPO_ROOT / "website" / "docs" / "reference" / "optional-skills-catalog.md").read_text(encoding="utf-8")
    bundled_catalog = (REPO_ROOT / "website" / "docs" / "reference" / "skills-catalog.md").read_text(encoding="utf-8")

    assert "**adversarial-ux-test**" in optional_catalog
    assert "official/dogfood/adversarial-ux-test" in optional_catalog
    assert "`adversarial-ux-test`" in bundled_catalog
    assert "dogfood/adversarial-ux-test" in bundled_catalog

@@ -16,6 +16,7 @@ For example:
```bash
hermes skills install official/blockchain/solana
hermes skills install official/dogfood/adversarial-ux-test
hermes skills install official/mlops/flash-attention
```
@@ -56,6 +57,12 @@ hermes skills uninstall <skill-name>
| **blender-mcp** | Control Blender directly from Hermes via socket connection to the blender-mcp addon. Create 3D objects, materials, animations, and run arbitrary Blender Python (bpy) code. |
| **meme-generation** | Generate real meme images by picking a template and overlaying text with Pillow. Produces actual `.png` meme files. |
## Dogfood
| Skill | Description |
|-------|-------------|
| **adversarial-ux-test** | Roleplay the most difficult, tech-resistant user for a product — browse in-persona, rant, then filter through a RED/YELLOW/WHITE/GREEN pragmatism layer so only real UX friction becomes tickets. |
## DevOps
| Skill | Description |

@@ -59,9 +59,12 @@ DevOps and infrastructure automation skills.
## dogfood
Internal dogfooding and QA skills used to test Hermes Agent itself.
| Skill | Description | Path |
|-------|-------------|------|
| `dogfood` | Systematic exploratory QA testing of web applications — find bugs, capture evidence, and generate structured reports. | `dogfood/dogfood` |
| `adversarial-ux-test` | Roleplay the most difficult, tech-resistant user for a product — browse in-persona, rant, then filter through a RED/YELLOW/WHITE/GREEN pragmatism layer so only real UX friction becomes tickets. | `dogfood/adversarial-ux-test` |
| `hermes-agent-setup` | Help users configure Hermes Agent — CLI usage, setup wizard, model/provider selection, tools, skills, voice/STT/TTS, gateway, and troubleshooting. | `dogfood/hermes-agent-setup` |
## email