hermes-agent/skills/software-development/test-driven-development/SKILL.md
kaos35 2595d81733 feat: Add Superpowers software development skills
Add 5 new skills for professional software development workflows,
adapted from the Superpowers project (obra/superpowers):

- test-driven-development: RED-GREEN-REFACTOR cycle enforcement
- systematic-debugging: 4-phase root cause investigation
- subagent-driven-development: Structured delegation with two-stage review
- writing-plans: Comprehensive implementation planning
- requesting-code-review: Systematic code review process

These skills provide structured development workflows that transform
Hermes from a general assistant into a professional software engineer
with defined processes for quality assurance.

Skills are organized under software-development category and follow
Hermes skill format with proper frontmatter, examples, and integration
guidance with existing skills.
2026-02-27 15:32:58 +01:00


---
name: test-driven-development
description: Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
  hermes:
    tags:
      - testing
      - tdd
      - development
      - quality
      - red-green-refactor
    related_skills:
      - systematic-debugging
      - writing-plans
      - subagent-driven-development
---

Test-Driven Development (TDD)

Overview

Write the test first. Watch it fail. Write minimal code to pass.

Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.

Violating the letter of the rules is violating the spirit of the rules.

When to Use

Always:

  • New features
  • Bug fixes
  • Refactoring
  • Behavior changes

Exceptions (ask your human partner):

  • Throwaway prototypes
  • Generated code
  • Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization.

The Iron Law

NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

Write code before the test? Delete it. Start over.

No exceptions:

  • Don't keep it as "reference"
  • Don't "adapt" it while writing tests
  • Don't look at it
  • Delete means delete

Implement fresh from tests. Period.

Red-Green-Refactor Cycle

RED - Write Failing Test

Write one minimal test showing what should happen.

Good Example:

def test_retries_failed_operations_3_times():
    attempts = 0
    def operation():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise Exception('fail')
        return 'success'
    
    result = retry_operation(operation)
    
    assert result == 'success'
    assert attempts == 3

  • Clear name, tests real behavior, one thing

Bad Example:

def test_retry_works():
    mock = MagicMock()
    mock.side_effect = [Exception(), Exception(), 'success']
    
    result = retry_operation(mock)
    
    assert result == 'success'  # What about retry count? Timing?

  • Vague name, mocks behavior not reality, unclear what it tests

Verify RED

Run the test. It MUST fail.

If it passes:

  • Test is wrong (not testing what you think)
  • Code already exists (delete it, start over)
  • Wrong test file running

What to check:

  • Error message makes sense
  • Fails for expected reason
  • Stack trace points to right place

GREEN - Minimal Code

Write just enough code to pass. Nothing more.

Good:

def add(a, b):
    return a + b  # Nothing extra

Bad:

import logging

def add(a, b):
    result = a + b
    logging.info(f"Adding {a} + {b} = {result}")  # Extra!
    return result

Cheating is OK in GREEN:

  • Hardcode return values
  • Copy-paste
  • Duplicate code
  • Skip edge cases

We'll fix it in refactor.
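To make the hardcoding cheat concrete, here is a minimal sketch with a hypothetical fib() function (not from this skill): the one failing test is satisfied by a hardcoded return, and only the next test would force a general implementation.

```python
# RED: one failing test for a hypothetical fib().
def test_fib_of_5_is_5():
    assert fib(5) == 5

# GREEN: hardcoding the answer is a legitimate cheat at this stage.
def fib(n):
    return 5  # Wrong in general; the next test forces a real implementation.
```

The cheat is honest because it is temporary: a second test such as `fib(7) == 13` makes the hardcoded value fail, driving the real code.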

Verify GREEN

Run tests. All must pass.

If fails:

  • Fix minimal code
  • Don't expand scope
  • Stay in GREEN

REFACTOR - Clean Up

Now improve the code while keeping tests green.

Safe refactorings:

  • Rename variables/functions
  • Extract helper functions
  • Remove duplication
  • Simplify expressions
  • Improve readability

Golden rule: Tests stay green throughout.

If tests fail during refactor:

  • Undo immediately
  • Smaller refactoring steps
  • Check you didn't change behavior
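As an illustration of a safe extract-helper refactoring (hypothetical report()/order_total() names), behavior is unchanged and the same assertion passes before and after:

```python
# Before: the total computation lives inside the formatting function.
def report(orders):
    total = sum(o["price"] * o["qty"] for o in orders)
    return f"Total: {total:.2f}"

# After: the computation is extracted into a helper; behavior is identical.
def order_total(orders):
    return sum(o["price"] * o["qty"] for o in orders)

def report_refactored(orders):
    return f"Total: {order_total(orders):.2f}"

def test_report_formats_total():
    orders = [{"price": 2.0, "qty": 3}, {"price": 1.5, "qty": 2}]
    assert report(orders) == "Total: 9.00"
    assert report_refactored(orders) == "Total: 9.00"
```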

Implementation Workflow

1. Create Test File

# Python
touch tests/test_feature.py

# JavaScript
touch tests/feature.test.js

# Rust
touch tests/feature_tests.rs

2. Write First Failing Test

# tests/test_calculator.py
def test_adds_two_numbers():
    calc = Calculator()
    result = calc.add(2, 3)
    assert result == 5

3. Run and Verify Failure

pytest tests/test_calculator.py -v
# Expected: FAIL - Calculator not defined

4. Write Minimal Implementation

# src/calculator.py
class Calculator:
    def add(self, a, b):
        return a + b  # Minimal!

5. Run and Verify Pass

pytest tests/test_calculator.py -v
# Expected: PASS

6. Commit

git add tests/test_calculator.py src/calculator.py
git commit -m "feat: add calculator with add method"

7. Next Test

def test_adds_negative_numbers():
    calc = Calculator()
    result = calc.add(-2, -3)
    assert result == -5

Repeat cycle.

Testing Anti-Patterns

Mocking What You Don't Own

Bad: Mocking the database, HTTP client, or file system directly.
Good: Abstract them behind an interface you own, then test against that interface.
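A minimal sketch of "abstract behind interface", using hypothetical Storage/save_greeting names: the file system is hidden behind an interface you own, and the test exercises an in-memory implementation of that interface instead of mocking file APIs.

```python
# An interface you own; a real implementation would wrap the file system.
class Storage:
    def read(self, key): ...
    def write(self, key, value): ...

# Test double: a fully functional in-memory implementation, not a mock.
class InMemoryStorage(Storage):
    def __init__(self):
        self._data = {}
    def read(self, key):
        return self._data[key]
    def write(self, key, value):
        self._data[key] = value

# Production code depends only on the interface.
def save_greeting(storage, name):
    storage.write("greeting", f"Hello, {name}!")

def test_saves_greeting():
    storage = InMemoryStorage()
    save_greeting(storage, "Ada")
    assert storage.read("greeting") == "Hello, Ada!"
```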

Testing Implementation Details

Bad: Testing that a function was called.
Good: Testing the result/behavior.
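A sketch of the difference, with a hypothetical normalize() function: the good test asserts the observable result, so it survives any internal rewrite.

```python
def normalize(email):
    return email.strip().lower()

def test_normalizes_email():
    # Good: assert the observable behavior.
    assert normalize("  Ada@Example.COM ") == "ada@example.com"
    # Bad (avoid): mock_strip.assert_called_once()
    # -- that couples the test to how normalize() happens to be written.
```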

Happy Path Only

Bad: Testing only expected inputs.
Good: Testing edge cases, errors, and boundaries.
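For example (hypothetical divide() function), the error path gets its own test alongside the happy path. Plain try/except is used here to keep the sketch dependency-free; with pytest you would normally write `with pytest.raises(ValueError):`.

```python
def divide(a, b):
    # Raise a clear error instead of letting ZeroDivisionError propagate.
    if b == 0:
        raise ValueError("division by zero")
    return a / b

def test_divides_happy_path():
    assert divide(10, 2) == 5

def test_handles_negative_numbers():
    assert divide(-10, 2) == -5

def test_raises_on_zero_divisor():
    try:
        divide(1, 0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```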

Brittle Tests

Bad: Tests that break when you refactor.
Good: Tests that verify behavior, not structure.

Common Pitfalls

"I'll Write Tests After"

No, you won't. Write them first.

"This is Too Simple to Test"

Simple bugs cause complex problems. Test everything.

"I Need to See It Work First"

Temporary code becomes permanent. Test first.

"Tests Take Too Long"

Untested code takes longer. Invest in tests.

Language-Specific Commands

Python (pytest)

# Run all tests
pytest

# Run specific test
pytest tests/test_feature.py::test_name -v

# Run with coverage
pytest --cov=src --cov-report=term-missing

# Watch mode (requires the pytest-watch package)
ptw

JavaScript (Jest)

# Run all tests
npm test

# Run tests matching a name
npm test -- -t "test name"

# Watch mode
npm test -- --watch

# Coverage
npm test -- --coverage

TypeScript (Jest with ts-jest)

# Run tests
npx jest

# Run specific file
npx jest tests/feature.test.ts

Go

# Run all tests
go test ./...

# Run specific test
go test -run TestName

# Verbose
go test -v

# Coverage
go test -cover

Rust

# Run tests
cargo test

# Run specific test
cargo test test_name

# Show output
cargo test -- --nocapture

Integration with Other Skills

With writing-plans

Every plan task should specify:

  • What test to write
  • Expected test failure
  • Minimal implementation
  • Refactoring opportunities

With systematic-debugging

When fixing bugs:

  1. Write test that reproduces bug
  2. Verify test fails (RED)
  3. Fix bug (GREEN)
  4. Refactor if needed
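The bug-fix cycle above, sketched with a hypothetical slugify() bug: the regression test is written first and verified to fail against the buggy version, then the fix makes it pass.

```python
# Hypothetical bug: slugify("") used to return "" instead of a fallback.
def slugify(title):
    # The `or "untitled"` clause is the fix.
    return "-".join(title.lower().split()) or "untitled"

# Written first (RED): fails against the buggy version, passes after the fix.
def test_slugify_falls_back_on_empty_title():
    assert slugify("") == "untitled"

# Existing behavior stays covered.
def test_slugify_joins_lowercased_words():
    assert slugify("Hello World") == "hello-world"
```

The regression test stays in the suite, so the bug can never silently return.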

With subagent-driven-development

Subagent implements one test at a time:

  1. Write failing test
  2. Minimal code to pass
  3. Commit
  4. Next test

Success Indicators

You're doing TDD right when:

  • Tests fail before code exists
  • You write <10 lines between test runs
  • Refactoring feels safe
  • Bugs are caught immediately
  • Code is simpler than expected

Red flags:

  • Writing 50+ lines without running tests
  • Tests always pass
  • Fear of refactoring
  • "I'll test later"

Remember

  1. RED - Write failing test
  2. GREEN - Minimal code to pass
  3. REFACTOR - Clean up safely
  4. REPEAT - Next behavior

If you didn't see it fail, it doesn't test what you think.