Add 5 new skills for professional software development workflows, adapted from the Superpowers project (obra/superpowers):

- test-driven-development: RED-GREEN-REFACTOR cycle enforcement
- systematic-debugging: 4-phase root cause investigation
- subagent-driven-development: Structured delegation with two-stage review
- writing-plans: Comprehensive implementation planning
- requesting-code-review: Systematic code review process

These skills provide structured development workflows that transform Hermes from a general assistant into a professional software engineer with defined quality-assurance processes. They are organized under the software-development category and follow the Hermes skill format, with proper frontmatter, examples, and integration guidance for existing skills.
| name | description | version | author | license | metadata |
| --- | --- | --- | --- | --- | --- |
| test-driven-development | Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach. | 1.0.0 | Hermes Agent (adapted from Superpowers) | MIT | |
# Test-Driven Development (TDD)

## Overview
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.
## When to Use
Always:
- New features
- Bug fixes
- Refactoring
- Behavior changes
Exceptions (ask your human partner):
- Throwaway prototypes
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.
## The Iron Law

**NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST**
Write code before the test? Delete it. Start over.
No exceptions:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Implement fresh from tests. Period.
## Red-Green-Refactor Cycle

### RED - Write Failing Test
Write one minimal test showing what should happen.
Good Example:
```python
def test_retries_failed_operations_3_times():
    attempts = 0

    def operation():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise Exception('fail')
        return 'success'

    result = retry_operation(operation)
    assert result == 'success'
    assert attempts == 3
```
- Clear name, tests real behavior, one thing
Bad Example:
```python
from unittest.mock import MagicMock

def test_retry_works():
    mock = MagicMock()
    mock.side_effect = [Exception(), Exception(), 'success']
    result = retry_operation(mock)
    assert result == 'success'  # What about retry count? Timing?
```
- Vague name, mocks behavior not reality, unclear what it tests
### Verify RED
Run the test. It MUST fail.
If it passes:
- Test is wrong (not testing what you think)
- Code already exists (delete it, start over)
- Wrong test file running
What to check:
- Error message makes sense
- Fails for expected reason
- Stack trace points to right place
### GREEN - Minimal Code
Write just enough code to pass. Nothing more.
Good:
```python
def add(a, b):
    return a + b  # Nothing extra
```
Bad:
```python
import logging

def add(a, b):
    result = a + b
    logging.info(f"Adding {a} + {b} = {result}")  # Extra!
    return result
```
Cheating is OK in GREEN:
- Hardcode return values
- Copy-paste
- Duplicate code
- Skip edge cases
We'll fix it in refactor.
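A quick illustration using the calculator example from the workflow below: hardcoding is honest GREEN as long as a later test forces the general solution.

```python
# GREEN for a single test add(2, 3) == 5: hardcode the expected value.
def add(a, b):
    return 5  # passes the one existing test, nothing more

# A second test, e.g. add(10, 20) == 30, fails and forces the real code:
def add(a, b):
    return a + b
```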
### Verify GREEN
Run tests. All must pass.
If a test fails:
- Fix minimal code
- Don't expand scope
- Stay in GREEN
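For the retry test above, a minimal GREEN implementation might look like this (a sketch: the fixed count of 3 is all the current test demands, not something this skill prescribes):

```python
def retry_operation(operation):
    # Hardcoded three attempts: exactly what the one existing test requires.
    # Making the count configurable is a concern for REFACTOR or a later test.
    for attempt in range(3):
        try:
            return operation()
        except Exception:
            if attempt == 2:  # final attempt: let the error propagate
                raise
```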
### REFACTOR - Clean Up
Now improve the code while keeping tests green.
Safe refactorings:
- Rename variables/functions
- Extract helper functions
- Remove duplication
- Simplify expressions
- Improve readability
Golden rule: Tests stay green throughout.
If tests fail during refactor:
- Undo immediately
- Smaller refactoring steps
- Check you didn't change behavior
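An illustrative refactor of this kind (the functions are hypothetical), removing duplication while behavior stays identical so every test stays green:

```python
# Before: name-formatting logic duplicated in two places
def report_line(user):
    return user["first"].title() + " " + user["last"].title()

def greeting(user):
    return "Hello, " + user["first"].title() + " " + user["last"].title()

# After: duplication extracted into a helper; same inputs produce the
# same outputs, so the existing green tests keep passing
def full_name(user):
    return f"{user['first'].title()} {user['last'].title()}"

def report_line(user):
    return full_name(user)

def greeting(user):
    return f"Hello, {full_name(user)}"
```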
## Implementation Workflow

### 1. Create Test File

```bash
# Python
touch tests/test_feature.py

# JavaScript
touch tests/feature.test.js

# Rust
touch tests/feature_tests.rs
```
### 2. Write First Failing Test

```python
# tests/test_calculator.py
def test_adds_two_numbers():
    calc = Calculator()
    result = calc.add(2, 3)
    assert result == 5
```
### 3. Run and Verify Failure

```bash
pytest tests/test_calculator.py -v
# Expected: FAIL - Calculator not defined
```
### 4. Write Minimal Implementation

```python
# src/calculator.py
class Calculator:
    def add(self, a, b):
        return a + b  # Minimal!
```
### 5. Run and Verify Pass

```bash
pytest tests/test_calculator.py -v
# Expected: PASS
```
### 6. Commit

```bash
git add tests/test_calculator.py src/calculator.py
git commit -m "feat: add calculator with add method"
```
### 7. Next Test

```python
def test_adds_negative_numbers():
    calc = Calculator()
    result = calc.add(-2, -3)
    assert result == -5
```

Run it and repeat the cycle. Note: if a new test passes immediately (this one will, since `return a + b` already handles negatives), it documents behavior but doesn't drive new code; follow it with a test that genuinely fails.
## Testing Anti-Patterns

### Mocking What You Don't Own

Bad: Mock the database, HTTP client, or file system.
Good: Abstract the dependency behind an interface you own; test against the interface.
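A minimal sketch of that approach (the `UserGateway` protocol and fake are hypothetical, standing in for a wrapper around a real HTTP client):

```python
from typing import Protocol

class UserGateway(Protocol):
    """Interface we own; production code wraps the real HTTP client."""
    def fetch_user(self, user_id: int) -> dict: ...

class FakeUserGateway:
    """In-memory test double implementing the same interface."""
    def fetch_user(self, user_id: int) -> dict:
        return {"id": user_id, "name": "Test User"}

def display_name(gateway: UserGateway, user_id: int) -> str:
    return gateway.fetch_user(user_id)["name"]

def test_display_name_uses_gateway_result():
    assert display_name(FakeUserGateway(), 42) == "Test User"
```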
### Testing Implementation Details

Bad: Test that a function was called.
Good: Test the result or behavior.
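For instance (a hypothetical `mymodule.normalize` with a private `_trim` helper):

```python
from unittest.mock import patch
from mymodule import normalize  # hypothetical module

# Bad: pins an internal helper; renaming or inlining _trim breaks
# the test even though observable behavior is unchanged
def test_normalize_calls_trim_helper():
    with patch("mymodule._trim") as trim:
        normalize("  Hello  ")
        assert trim.called

# Good: asserts only on the observable result
def test_normalize_trims_and_lowercases():
    assert normalize("  Hello  ") == "hello"
```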
### Happy Path Only

Bad: Only test expected inputs.
Good: Test edge cases, errors, and boundaries.
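Using the calculator from the workflow above, edge-case tests might look like this (the exact error contract is an assumption for illustration):

```python
import pytest
# Calculator as defined in the workflow above (src/calculator.py)

def test_add_handles_zero():
    assert Calculator().add(0, 0) == 0

def test_add_rejects_mixed_types():
    # "2" + 3 raises TypeError in Python, so add("2", 3) should too
    with pytest.raises(TypeError):
        Calculator().add("2", 3)
```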
### Brittle Tests

Bad: Tests break when you refactor.
Good: Tests verify behavior, not structure.
## Common Pitfalls

### "I'll Write Tests After"
No, you won't. Write them first.
"This is Too Simple to Test"
Simple bugs cause complex problems. Test everything.
"I Need to See It Work First"
Temporary code becomes permanent. Test first.
"Tests Take Too Long"
Untested code takes longer. Invest in tests.
## Language-Specific Commands

### Python (pytest)

```bash
# Run all tests
pytest

# Run specific test
pytest tests/test_feature.py::test_name -v

# Run with coverage
pytest --cov=src --cov-report=term-missing

# Watch mode
pytest-watch
```
### JavaScript (Jest)

```bash
# Run all tests
npm test

# Run specific test
npm test -- test_name

# Watch mode
npm test -- --watch

# Coverage
npm test -- --coverage
```
### TypeScript (Jest with ts-jest)

```bash
# Run tests
npx jest

# Run specific file
npx jest tests/feature.test.ts
```
### Go

```bash
# Run all tests
go test ./...

# Run specific test
go test -run TestName

# Verbose
go test -v

# Coverage
go test -cover
```
### Rust

```bash
# Run tests
cargo test

# Run specific test
cargo test test_name

# Show output
cargo test -- --nocapture
```
## Integration with Other Skills

### With writing-plans
Every plan task should specify:
- What test to write
- Expected test failure
- Minimal implementation
- Refactoring opportunities
### With systematic-debugging
When fixing bugs:
- Write a test that reproduces the bug (see the sketch after this list)
- Verify test fails (RED)
- Fix bug (GREEN)
- Refactor if needed
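For example, a regression test for a hypothetical reported bug in the earlier `retry_operation` (the test comes first and must fail before the fix):

```python
import pytest
# retry_operation as introduced in the RED/GREEN examples above

# Hypothetical bug report: retry_operation swallows the final error and
# returns None instead of raising. Reproduce it before touching the code:
def test_retry_raises_after_exhausting_attempts():
    def always_fails():
        raise ValueError("permanent failure")

    with pytest.raises(ValueError):
        retry_operation(always_fails)
```

If this test passes immediately, the bug isn't reproduced yet; refine the test before fixing anything.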
### With subagent-driven-development
Subagent implements one test at a time:
- Write failing test
- Minimal code to pass
- Commit
- Next test
## Success Indicators
You're doing TDD right when:
- Tests fail before code exists
- You write <10 lines between test runs
- Refactoring feels safe
- Bugs are caught immediately
- Code is simpler than expected
Red flags:
- Writing 50+ lines without running tests
- Tests always pass
- Fear of refactoring
- "I'll test later"
## Remember
- RED - Write failing test
- GREEN - Minimal code to pass
- REFACTOR - Clean up safely
- REPEAT - Next behavior
If you didn't see it fail, it doesn't test what you think.