Merge PR #137: feat: Add Superpowers software development skills

Authored by kaos35. Adds 5 software development workflow skills adapted
from obra/superpowers: test-driven-development, systematic-debugging,
subagent-driven-development, writing-plans, requesting-code-review.
This commit is contained in:
teknium1
2026-03-03 04:00:00 -08:00
5 changed files with 2048 additions and 0 deletions


@@ -0,0 +1,393 @@
---
name: requesting-code-review
description: Use when completing tasks, implementing major features, or before merging. Validates work meets requirements through systematic review process.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
hermes:
tags: [code-review, quality, validation, workflow, review]
related_skills: [subagent-driven-development, writing-plans, test-driven-development]
---
# Requesting Code Review
## Overview
Systematic code review catches issues before they cascade. Review early, review often.
**Core principle:** Fresh perspective finds issues you'll miss.
## When to Request Review
**Mandatory Reviews:**
- After each task in subagent-driven development
- After completing major features
- Before merge to main
- After bug fixes
**Optional but Valuable:**
- When stuck (fresh perspective)
- Before refactoring (baseline check)
- After complex logic implementation
- When touching critical code (auth, payments, data)
**Don't skip because:**
- "It's simple" (simple bugs compound)
- "I'm in a hurry" (reviews save time)
- "I tested it" (you have blind spots)
## Review Process
### Step 1: Prepare Context
Gather:
- What was implemented
- Original requirements/plan
- Files changed
- Test results
```bash
# Get changed files
git diff --name-only HEAD~1
# Get diff summary
git diff --stat HEAD~1
# Get commit messages
git log --oneline -5
```
### Step 2: Self-Review First
Before requesting external review:
**Checklist:**
- [ ] Code follows project conventions
- [ ] All tests pass
- [ ] No debug print statements
- [ ] No hardcoded secrets
- [ ] Error handling in place
- [ ] Documentation updated
- [ ] Commit messages are clear
```bash
# Run full test suite
pytest
# Check for debug code
grep -r "print(" src/ --include="*.py"
grep -r "debugger" src/ --include="*.js"
# Check for TODOs
grep -r "TODO" src/ --include="*.py"
```
### Step 3: Request Review
Use `delegate_task` to dispatch a reviewer subagent:
```python
delegate_task(
    goal="Review implementation for quality and correctness",
    context="""
WHAT WAS IMPLEMENTED: [Brief description]
ORIGINAL REQUIREMENTS: [From plan or issue]
FILES CHANGED:
- src/feature.py (added X function)
- tests/test_feature.py (added tests)
COMMIT RANGE: [SHA range or branch]
CHECK FOR:
- Correctness (does it do what it should?)
- Edge cases handled?
- Error handling adequate?
- Code quality and style
- Test coverage
- Security issues
- Performance concerns
OUTPUT FORMAT:
- Summary: [brief assessment]
- Critical Issues: [must fix]
- Important Issues: [should fix]
- Minor Issues: [nice to have]
- Verdict: [APPROVE / REQUEST_CHANGES / NEEDS_WORK]
""",
    toolsets=['file']
)
```
### Step 4: Act on Feedback
**Critical Issues (Block merge):**
- Security vulnerabilities
- Broken functionality
- Data loss risk
- Test failures
**Action:** Fix immediately before proceeding
**Important Issues (Should fix):**
- Missing edge case handling
- Poor error messages
- Unclear code
- Missing tests
**Action:** Fix before merge if possible
**Minor Issues (Nice to have):**
- Style preferences
- Refactoring suggestions
- Documentation improvements
**Action:** Note for later or quick fix
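The triage order above can be sketched as a small sorting step. This is a hypothetical illustration, not part of the skill's API; the `severity` field and issue dicts are assumed shapes:

```python
def triage(issues):
    """Return review issues in fix order: critical, then important, then minor."""
    order = {"critical": 0, "important": 1, "minor": 2}
    return sorted(issues, key=lambda issue: order[issue["severity"]])

# Example: critical issues surface first, regardless of reported order
issues = [
    {"title": "style nit", "severity": "minor"},
    {"title": "SQL injection", "severity": "critical"},
    {"title": "missing edge-case test", "severity": "important"},
]
for issue in triage(issues):
    print(issue["severity"], "-", issue["title"])
```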
## Review Dimensions
### Correctness
**Questions:**
- Does it implement the requirements?
- Are there logic errors?
- Do edge cases work?
- Are there race conditions?
**Check:**
- Read implementation against requirements
- Trace through edge cases
- Check boundary conditions
### Code Quality
**Questions:**
- Is code readable?
- Are names clear?
- Is it too complex?
- Is there duplication?
**Check:**
- Function length (aim <20 lines)
- Cyclomatic complexity
- DRY violations
- Naming clarity
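The function-length check can be automated with the standard library; a minimal sketch (the `<20 lines` threshold comes from the guideline above, `long_functions` is a hypothetical helper name):

```python
import ast

def long_functions(source, max_lines=20):
    """Return (name, length) for functions whose source spans more than max_lines lines."""
    tree = ast.parse(source)
    offenders = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1  # inclusive line span
            if length > max_lines:
                offenders.append((node.name, length))
    return offenders
```

Run it over each changed file's source to flag candidates for extraction before requesting review.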
### Testing
**Questions:**
- Are there tests?
- Do they cover edge cases?
- Are they meaningful?
- Do they pass?
**Check:**
- Test coverage
- Edge case coverage
- Test clarity
- Assertion quality
### Security
**Questions:**
- Any injection vulnerabilities?
- Proper input validation?
- Secrets handled correctly?
- Access control in place?
**Check:**
- Input sanitization
- Authentication/authorization
- Secret management
- SQL/query safety
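The SQL/query safety check usually comes down to one pattern: string interpolation versus placeholder binding. A minimal sketch with `sqlite3` (the `users` schema is hypothetical):

```python
import sqlite3

def find_user_unsafe(conn, email):
    # BAD: interpolation lets input like "' OR '1'='1" rewrite the query
    return conn.execute(
        f"SELECT id FROM users WHERE email = '{email}'").fetchall()

def find_user_safe(conn, email):
    # GOOD: placeholder binding treats the input strictly as data
    return conn.execute(
        "SELECT id FROM users WHERE email = ?", (email,)).fetchall()
```

In review, flag any query built with f-strings, `%` formatting, or concatenation of user input.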
### Performance
**Questions:**
- Any N+1 queries?
- Unnecessary computation?
- Memory leaks?
- Scalability concerns?
**Check:**
- Database queries
- Algorithm complexity
- Resource usage
- Caching opportunities
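The N+1 query pattern mentioned above looks like this in miniature. A hedged sketch using in-memory `sqlite3` with a hypothetical `users`/`orders` schema:

```python
import sqlite3

def emails_n_plus_one(conn, order_ids):
    emails = []
    for oid in order_ids:  # one round trip PER order id: the N+1 smell
        row = conn.execute(
            "SELECT email FROM users JOIN orders ON users.id = orders.user_id "
            "WHERE orders.id = ?", (oid,)).fetchone()
        emails.append(row[0])
    return emails

def emails_batched(conn, order_ids):
    # one round trip total: same result, O(1) queries instead of O(N)
    placeholders = ",".join("?" * len(order_ids))
    rows = conn.execute(
        "SELECT email FROM users JOIN orders ON users.id = orders.user_id "
        f"WHERE orders.id IN ({placeholders})", order_ids).fetchall()
    return [r[0] for r in rows]
```

In review, any database call inside a loop over query results deserves a second look.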
## Review Output Format
Standard review format:
```markdown
## Review Summary
**Assessment:** [Brief overall assessment]
**Verdict:** [APPROVE / REQUEST_CHANGES / NEEDS_WORK]
---
## Critical Issues (Fix Required)
1. **[Issue title]**
- Location: `file.py:45`
- Problem: [Description]
- Suggestion: [How to fix]
---
## Important Issues (Should Fix)
1. **[Issue title]**
- Location: `file.py:67`
- Problem: [Description]
- Suggestion: [How to fix]
---
## Minor Issues (Optional)
1. **[Issue title]**
- Suggestion: [Improvement idea]
---
## Strengths
- [What was done well]
```
## Integration with Other Skills
### With subagent-driven-development
Review after EACH task:
1. Subagent implements task
2. Request code review
3. Fix issues
4. Proceed to next task
### With test-driven-development
Review checks:
- Tests written first?
- Tests are meaningful?
- Edge cases covered?
- All tests pass?
### With writing-plans
Review validates:
- Implementation matches plan?
- All tasks completed?
- Quality standards met?
## Common Review Patterns
### Pre-Merge Review
Before merging feature branch:
```bash
# Create review checkpoint
git log --oneline main..feature-branch
# Get summary of changes
git diff --stat main..feature-branch
# Request review (delegate_task call shown for illustration)
delegate_task(
    goal="Pre-merge review of feature branch",
    context="[changes, requirements, test results]"
)
# Address feedback
# Merge when approved
```
### Continuous Review
During subagent-driven development:
```python
# After each task
if task_complete:
    review_result = request_review(task)
    while review_result.has_critical_issues():
        fix_issues(review_result.critical)
        review_result = request_review(task)  # re-review until clean
    proceed_to_next_task()
```
### Emergency Review
When fixing production bugs:
1. Fix with tests
2. Self-review
3. Quick peer review
4. Deploy
5. Full review post-deploy
## Review Best Practices
### As Review Requester
**Do:**
- Provide complete context
- Highlight areas of concern
- Ask specific questions
- Be responsive to feedback
- Fix issues promptly
**Don't:**
- Rush the reviewer
- Argue without evidence
- Ignore feedback
- Take criticism personally
### As Reviewer (via subagent)
**Do:**
- Be specific about issues
- Suggest improvements
- Acknowledge what works
- Prioritize issues
**Don't:**
- Nitpick style (unless project requires)
- Make vague comments
- Block without explanation
- Be overly critical
## Quality Gates
**Must pass before merge:**
- [ ] No critical issues
- [ ] All tests pass
- [ ] Review approved
- [ ] Requirements met
**Should pass before merge:**
- [ ] No important issues
- [ ] Documentation updated
- [ ] Performance acceptable
**Nice to have:**
- [ ] No minor issues
- [ ] Extra polish
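The must-pass gates can be checked mechanically. A minimal sketch, assuming a hypothetical review dict shape (none of these field names come from the skill itself):

```python
def merge_gate(review):
    """Evaluate the must-pass gates; return (ok, names_of_failed_gates)."""
    gates = {
        "no critical issues": not review["critical_issues"],
        "all tests pass": review["tests_pass"],
        "review approved": review["verdict"] == "APPROVE",
        "requirements met": review["requirements_met"],
    }
    failed = [name for name, passed in gates.items() if not passed]
    return (not failed, failed)
```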
## Remember
```
Review early
Review often
Be specific
Fix critical issues first
Quality over speed
```
**A good review catches what you missed.**


@@ -0,0 +1,375 @@
---
name: subagent-driven-development
description: Use when executing implementation plans with independent tasks. Dispatches fresh delegate_task per task with two-stage review (spec compliance then code quality).
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
hermes:
tags: [delegation, subagent, implementation, workflow, parallel]
related_skills: [writing-plans, requesting-code-review, test-driven-development]
---
# Subagent-Driven Development
## Overview
Execute implementation plans by dispatching fresh subagents per task with systematic two-stage review.
**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
## When to Use
Use this skill when:
- You have an implementation plan (from writing-plans skill)
- Tasks are mostly independent
- You want to stay in the current session
- Quality and spec compliance are important
**vs. Manual execution:**
- Parallel task execution possible
- Automated review process
- Consistent quality checks
- Better for complex multi-step plans
## The Process
### 1. Read and Parse Plan
```markdown
[Read plan file once: docs/plans/feature-plan.md]
[Extract all tasks with full text and context]
[Create todo list with all tasks]
```
**Action:** Read plan, extract tasks, create todo list.
### 2. Per-Task Workflow
For EACH task in the plan:
#### Step 1: Dispatch Implementer Subagent
Use `delegate_task` with:
- **goal:** Implement [specific task from plan]
- **context:** Full task description from plan, project structure, relevant files
- **toolsets:** ['terminal', 'file', 'web'] (or as needed)
**Example:**
```python
# Task: Add user authentication middleware
delegate_task(
    goal="Implement JWT authentication middleware as specified in Task 3 of the plan",
    context="""
Task from plan:
- Create: src/middleware/auth.py
- Validate JWT tokens from Authorization header
- Return 401 for invalid tokens
- Attach user info to request object

Project structure:
- Flask app in src/app.py
- Uses PyJWT library
- Existing middleware pattern in src/middleware/
""",
    toolsets=['terminal', 'file']
)
```
#### Step 2: Implementer Subagent Works
The subagent will:
1. Ask questions if needed (you answer)
2. Implement the task following TDD
3. Write tests
4. Run tests to verify
5. Self-review
6. Report completion
**Your role:** Answer questions, provide context.
#### Step 3: Spec Compliance Review
Dispatch reviewer subagent:
```python
delegate_task(
    goal="Review if implementation matches spec from plan",
    context="""
Original task spec: [copy from plan]
Implementation: [file paths and key code]
Check:
- All requirements from spec implemented?
- File paths match spec?
- Behavior matches spec?
- Nothing extra added?
""",
    toolsets=['file']
)
```
**If spec issues found:**
- Subagent fixes gaps
- Re-run spec review
- Continue only when spec-compliant
#### Step 4: Code Quality Review
Dispatch quality reviewer:
```python
delegate_task(
    goal="Review code quality and best practices",
    context="""
Code to review: [file paths]
Check:
- Follows project style?
- Proper error handling?
- Good naming?
- Test coverage adequate?
- No obvious bugs?
""",
    toolsets=['file']
)
```
**If quality issues found:**
- Subagent fixes issues
- Re-run quality review
- Continue only when approved
#### Step 5: Mark Complete
Update todo list, mark task complete.
### 3. Final Review
After ALL tasks complete:
```python
delegate_task(
    goal="Review entire implementation for consistency",
    context="All tasks completed, review for integration issues",
    toolsets=['file']
)
```
### 4. Branch Cleanup
Use `finishing-a-development-branch` skill:
- Verify all tests pass
- Present merge options
- Clean up worktree
## Task Granularity
**Good task size:** 2-5 minutes of focused work
**Examples:**
**Too big:**
- "Implement user authentication system"
**Right size:**
- "Create User model with email and password fields"
- "Add password hashing function"
- "Create login endpoint"
- "Add JWT token generation"
## Communication Pattern
### You to Subagent
**Provide:**
- Clear task description
- Exact file paths
- Expected behavior
- Success criteria
- Relevant context
**Example:**
```
Task: Add email validation
Files: Create src/validators/email.py
Expected: Function returns True for valid emails, False for invalid
Success: Tests pass for 10 test cases including edge cases
Context: Used in user registration flow
```
### Subagent to You
**Expect:**
- Questions for clarification
- Progress updates
- Completion report
- Self-review summary
**Respond to:**
- Answer questions promptly
- Provide missing context
- Approve approach decisions
## Two-Stage Review Details
### Stage 1: Spec Compliance
**Checks:**
- [ ] All requirements from plan implemented
- [ ] File paths match specification
- [ ] Function signatures match spec
- [ ] Behavior matches expected
- [ ] No scope creep (nothing extra)
**Output:** PASS or list of spec gaps
### Stage 2: Code Quality
**Checks:**
- [ ] Follows language conventions
- [ ] Consistent with project style
- [ ] Clear variable/function names
- [ ] Proper error handling
- [ ] Adequate test coverage
- [ ] No obvious bugs/edge cases missed
- [ ] Documentation if needed
**Output:** APPROVED or list of issues (critical/important/minor)
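The two stages above form a loop: each stage repeats until it passes, and quality review only starts once spec review is clean. A hypothetical sketch where `spec_review`, `quality_review`, and `fix` stand in for `delegate_task` dispatches (each review returns a list of issues, empty meaning pass):

```python
def two_stage_review(task, spec_review, quality_review, fix):
    """Run spec review to a clean pass, then quality review to a clean pass."""
    while issues := spec_review(task):      # Stage 1: spec compliance first
        fix(task, issues)
    while issues := quality_review(task):   # Stage 2: code quality second
        fix(task, issues)
    return "approved"
```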
## Handling Issues
### Critical Issues
**Examples:** Security vulnerability, broken functionality, data loss risk
**Action:** Must fix before proceeding
### Important Issues
**Examples:** Missing tests, poor error handling, unclear code
**Action:** Should fix before proceeding
### Minor Issues
**Examples:** Style inconsistency, minor refactoring opportunity
**Action:** Note for later, optional fix
## Integration with Other Skills
### With test-driven-development
Subagent should:
1. Write failing test first
2. Implement minimal code
3. Verify test passes
4. Commit
### With systematic-debugging
If subagent encounters bugs:
1. Pause implementation
2. Debug systematically
3. Fix root cause
4. Resume
### With writing-plans
This skill EXECUTES plans created by writing-plans skill.
**Sequence:**
1. brainstorming → writing-plans → subagent-driven-development
### With requesting-code-review
After subagent completes task, use requesting-code-review skill for final validation.
## Common Patterns
### Pattern: Fresh Subagent Per Task
**Why:** Prevents context pollution
**How:** New delegate_task for each task
**Result:** Each subagent has clean context
### Pattern: Two-Stage Review
**Why:** Catch issues early, ensure quality
**How:** Spec review → Quality review
**Result:** High-quality, spec-compliant code
### Pattern: Frequent Checkpoints
**Why:** Catch issues before they compound
**How:** Review after each task
**Result:** Issues don't cascade
## Best Practices
1. **Clear Task Boundaries**
- One task = one focused change
- Independent where possible
- Clear success criteria
2. **Complete Context**
- Provide all needed files
- Explain project conventions
- Share relevant examples
3. **Review Discipline**
- Don't skip spec review
- Address critical issues immediately
- Keep quality bar consistent
4. **Communication**
- Answer subagent questions quickly
- Clarify when needed
- Provide feedback on reviews
## Example Workflow
```markdown
User: Implement user authentication
You: I'll use subagent-driven development. Let me create a plan first.
[Uses writing-plans skill]
Plan created with 5 tasks:
1. Create User model
2. Add password hashing
3. Implement login endpoint
4. Add JWT middleware
5. Create registration endpoint
--- Task 1 ---
[Dispatch implementer subagent for User model]
[Subagent asks: "Should email be unique?"]
You: Yes, email must be unique
[Subagent implements]
[Dispatch spec reviewer - PASS]
[Dispatch quality reviewer - APPROVED]
Task 1 complete
--- Task 2 ---
[Dispatch implementer for password hashing]
...
[After all tasks]
[Final review]
[Merge branch]
```
## Remember
```
Fresh subagent per task
Two-stage review every time
Spec compliance first
Code quality second
Never skip reviews
Catch issues early
```
**Quality is not an accident. It's the result of systematic process.**


@@ -0,0 +1,443 @@
---
name: systematic-debugging
description: Use when encountering any bug, test failure, or unexpected behavior. 4-phase root cause investigation process - NO fixes without understanding the problem first.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
hermes:
tags: [debugging, troubleshooting, problem-solving, root-cause, investigation]
related_skills: [test-driven-development, writing-plans, subagent-driven-development]
---
# Systematic Debugging
## Overview
Random fixes waste time and create new bugs. Quick patches mask underlying issues.
**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
**Violating the letter of this process is violating the spirit of debugging.**
## The Iron Law
```
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
```
If you haven't completed Phase 1, you cannot propose fixes.
## When to Use
Use for ANY technical issue:
- Test failures
- Bugs in production
- Unexpected behavior
- Performance problems
- Build failures
- Integration issues
**Use this ESPECIALLY when:**
- Under time pressure (emergencies make guessing tempting)
- "Just one quick fix" seems obvious
- You've already tried multiple fixes
- Previous fix didn't work
- You don't fully understand the issue
**Don't skip when:**
- Issue seems simple (simple bugs have root causes too)
- You're in a hurry (rushing guarantees rework)
- Manager wants it fixed NOW (systematic is faster than thrashing)
## The Four Phases
You MUST complete each phase before proceeding to the next.
---
## Phase 1: Root Cause Investigation
**BEFORE attempting ANY fix:**
### 1. Read Error Messages Carefully
- Don't skip past errors or warnings
- They often contain the exact solution
- Read stack traces completely
- Note line numbers, file paths, error codes
**Action:** Copy full error message to your notes.
### 2. Reproduce Consistently
- Can you trigger it reliably?
- What are the exact steps?
- Does it happen every time?
- If not reproducible → gather more data, don't guess
**Action:** Write down exact reproduction steps.
### 3. Check Recent Changes
- What changed that could cause this?
- Git diff, recent commits
- New dependencies, config changes
- Environmental differences
**Commands:**
```bash
# Recent commits
git log --oneline -10
# Uncommitted changes
git diff
# Changes in specific file
git log -p --follow src/problematic_file.py
```
### 4. Gather Evidence in Multi-Component Systems
**WHEN system has multiple components (CI pipeline, API service, database layer):**
**BEFORE proposing fixes, add diagnostic instrumentation:**
For EACH component boundary:
- Log what data enters component
- Log what data exits component
- Verify environment/config propagation
- Check state at each layer
Then:
1. Run once to gather evidence showing WHERE it breaks
2. Analyze the evidence to identify the failing component
3. Investigate that specific component
**Example (multi-layer system):**
```python
# Layer 1: Entry point
def entry_point(input_data):
    print(f"DEBUG: Input received: {input_data}")
    result = process_layer1(input_data)
    print(f"DEBUG: Layer 1 output: {result}")
    return result

# Layer 2: Processing
def process_layer1(data):
    print(f"DEBUG: Layer 1 received: {data}")
    result = data  # ... actual processing here ...
    print(f"DEBUG: Layer 1 returning: {result}")
    return result
```
**Action:** Add logging, run once, analyze output.
### 5. Isolate the Problem
- Comment out code until problem disappears
- Binary search through recent changes
- Create minimal reproduction case
- Test with fresh environment
**Action:** Create minimal reproduction case.
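Binary search through recent changes is the same idea `git bisect` automates. A minimal sketch, assuming the history flips cleanly from good to bad at one commit (`first_bad_commit` and `is_bad` are hypothetical names):

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest commit list for the first bad one."""
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # bad here: first bad commit is at mid or earlier
        else:
            lo = mid + 1    # good here: first bad commit is after mid
    return commits[lo]
```

With N commits this needs only about log2(N) test runs instead of N.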
### Phase 1 Completion Checklist
- [ ] Error messages fully read and understood
- [ ] Issue reproduced consistently
- [ ] Recent changes identified and reviewed
- [ ] Evidence gathered (logs, state)
- [ ] Problem isolated to specific component/code
- [ ] Root cause hypothesis formed
**STOP:** Do not proceed to Phase 2 until you understand WHY it's happening.
---
## Phase 2: Solution Design
**Given the root cause, design the fix:**
### 1. Understand the Fix Area
- Read relevant code thoroughly
- Understand data flow
- Identify affected components
- Check for similar issues elsewhere
**Action:** Read all relevant code files.
### 2. Design Minimal Fix
- Smallest change that fixes root cause
- Avoid scope creep
- Don't refactor while fixing
- Fix one issue at a time
**Action:** Write down the exact fix before coding.
### 3. Consider Side Effects
- What else could this change affect?
- Are there dependencies?
- Will this break other functionality?
**Action:** Identify potential side effects.
### Phase 2 Completion Checklist
- [ ] Fix area code fully understood
- [ ] Minimal fix designed
- [ ] Side effects identified
- [ ] Fix approach documented
---
## Phase 3: Implementation
**Make the fix:**
### 1. Write Test First (if possible)
```python
def test_should_handle_empty_input():
    """Regression test for bug #123"""
    result = process_data("")
    assert result == ""
```
### 2. Implement Fix
```python
# Before (buggy)
def process_data(data):
    return data.split(",")[0]

# After (fixed)
def process_data(data):
    if not data:
        return ""
    return data.split(",")[0]
```
### 3. Verify Fix
```bash
# Run the specific test
pytest tests/test_data.py::test_should_handle_empty_input -v
# Run all tests to check for regressions
pytest
```
### Phase 3 Completion Checklist
- [ ] Test written that reproduces the bug
- [ ] Minimal fix implemented
- [ ] Test passes
- [ ] No regressions introduced
---
## Phase 4: Verification
**Confirm it's actually fixed:**
### 1. Reproduce Original Issue
- Follow original reproduction steps
- Verify issue is resolved
- Test edge cases
### 2. Regression Testing
```bash
# Full test suite
pytest
# Integration tests
pytest tests/integration/
# Check related areas
pytest -k "related_feature"
```
### 3. Monitor After Deploy
- Watch logs for related errors
- Check metrics
- Verify fix in production
### Phase 4 Completion Checklist
- [ ] Original issue cannot be reproduced
- [ ] All tests pass
- [ ] No new warnings/errors
- [ ] Fix documented (commit message, comments)
---
## Debugging Techniques
### Root Cause Tracing
Ask "why" 5 times:
1. Why did it fail? → Null pointer
2. Why was it null? → Function returned null
3. Why did function return null? → Missing validation
4. Why was validation missing? → Assumed input always valid
5. Why was that assumption wrong? → API changed
**Root cause:** API change not accounted for
### Defense in Depth
Don't fix just the symptom:
**Bad:** Add null check at crash site
**Good:**
1. Add validation at API boundary
2. Add null check at crash site
3. Add test for both
4. Document API behavior
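The layered fix above can be sketched in a few lines; `api_boundary` and `render_greeting` are hypothetical names for the boundary and the original crash site:

```python
def api_boundary(payload):
    """Step 1: reject malformed input where it enters the system."""
    if payload is None or "user_id" not in payload:
        raise ValueError("payload must include user_id")
    return payload

def render_greeting(user):
    """Step 2: guard at the crash site too, in case a caller bypasses the boundary."""
    if user is None:
        return "Hello, guest"
    return f"Hello, {user['name']}"
```

Steps 3 and 4 then add tests for both layers and document the expected API shape.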
### Condition-Based Waiting
For timing/race conditions:
```python
# Bad - arbitrary sleep
import time
time.sleep(5) # "Should be enough"
# Good - wait for condition (assumes tenacity and requests are installed)
import requests
from tenacity import retry, wait_exponential, stop_after_attempt

health_url = "http://localhost:8080/health"  # hypothetical endpoint

@retry(wait=wait_exponential(multiplier=1, min=4, max=10),
       stop=stop_after_attempt(5))
def wait_for_service():
    response = requests.get(health_url)
    assert response.status_code == 200
```
---
## Common Debugging Pitfalls
### Fix Without Understanding
**Symptom:** "Just add a try/catch"
**Problem:** Masks the real issue
**Solution:** Complete Phase 1 before any fix
### Shotgun Debugging
**Symptom:** Change 5 things at once
**Problem:** Don't know what fixed it
**Solution:** One change at a time, verify each
### Premature Optimization
**Symptom:** Rewrite while debugging
**Problem:** Introduces new bugs
**Solution:** Fix first, refactor later
### Assuming Environment
**Symptom:** "Works on my machine"
**Problem:** Environment differences
**Solution:** Check environment variables, versions, configs
---
## Language-Specific Tools
### Python
```python
# Add debugger
import pdb; pdb.set_trace()
# Or use ipdb for better experience
import ipdb; ipdb.set_trace()
# Log state
import logging
logging.debug(f"Variable state: {variable}")
# Stack trace
import traceback
traceback.print_exc()
```
### JavaScript/TypeScript
```javascript
// Debugger
debugger;
// Console with context
console.log("State:", { var1, var2, var3 });
// Stack trace
console.trace("Here");
// Error with context
throw new Error(`Failed with input: ${JSON.stringify(input)}`);
```
### Go
```go
// Print state
fmt.Printf("Debug: variable=%+v\n", variable)
// Stack trace
import "runtime/debug"
debug.PrintStack()
// Panic with context
if err != nil {
panic(fmt.Sprintf("unexpected error: %v", err))
}
```
---
## Integration with Other Skills
### With test-driven-development
When debugging:
1. Write test that reproduces bug
2. Debug systematically
3. Fix root cause
4. Test passes
### With writing-plans
Include debugging tasks in plans:
- "Add diagnostic logging"
- "Create reproduction test"
- "Verify fix resolves issue"
### With subagent-driven-development
If subagent gets stuck:
1. Switch to systematic debugging
2. Analyze root cause
3. Provide findings to subagent
4. Resume implementation
---
## Remember
```
PHASE 1: Investigate → Understand WHY
PHASE 2: Design → Plan the fix
PHASE 3: Implement → Make the fix
PHASE 4: Verify → Confirm it's fixed
```
**No shortcuts. No guessing. Systematic always wins.**


@@ -0,0 +1,384 @@
---
name: test-driven-development
description: Use when implementing any feature or bugfix, before writing implementation code. Enforces RED-GREEN-REFACTOR cycle with test-first approach.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
hermes:
tags: [testing, tdd, development, quality, red-green-refactor]
related_skills: [systematic-debugging, writing-plans, subagent-driven-development]
---
# Test-Driven Development (TDD)
## Overview
Write the test first. Watch it fail. Write minimal code to pass.
**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
**Violating the letter of the rules is violating the spirit of the rules.**
## When to Use
**Always:**
- New features
- Bug fixes
- Refactoring
- Behavior changes
**Exceptions (ask your human partner):**
- Throwaway prototypes
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.
## The Iron Law
```
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
```
Write code before the test? Delete it. Start over.
**No exceptions:**
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Implement fresh from tests. Period.
## Red-Green-Refactor Cycle
### RED - Write Failing Test
Write one minimal test showing what should happen.
**Good Example:**
```python
def test_retries_failed_operations_3_times():
    attempts = 0

    def operation():
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise Exception('fail')
        return 'success'

    result = retry_operation(operation)
    assert result == 'success'
    assert attempts == 3
```
- Clear name, tests real behavior, one thing
**Bad Example:**
```python
from unittest.mock import MagicMock

def test_retry_works():
    mock = MagicMock()
    mock.side_effect = [Exception(), Exception(), 'success']
    result = retry_operation(mock)
    assert result == 'success'  # What about retry count? Timing?
```
- Vague name, mocks behavior not reality, unclear what it tests
### Verify RED
Run the test. It MUST fail.
**If it passes:**
- Test is wrong (not testing what you think)
- Code already exists (delete it, start over)
- Wrong test file running
**What to check:**
- Error message makes sense
- Fails for expected reason
- Stack trace points to right place
### GREEN - Minimal Code
Write just enough code to pass. Nothing more.
**Good:**
```python
def add(a, b):
    return a + b  # Nothing extra
```
**Bad:**
```python
def add(a, b):
    result = a + b
    logging.info(f"Adding {a} + {b} = {result}")  # Extra!
    return result
```
Cheating is OK in GREEN:
- Hardcode return values
- Copy-paste
- Duplicate code
- Skip edge cases
**We'll fix it in refactor.**
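Hardcoding in GREEN works because the next test "triangulates" you toward the real implementation. A minimal sketch with a hypothetical `fizz` function (not from the skill itself):

```python
# First test only checks fizz(3), so GREEN can legally cheat:
def fizz(n):
    return "Fizz"  # hardcoded; passes test_fizz_for_3

# A second test (fizz(4) == "4") exposes the cheat and forces the general code:
def fizz(n):
    return "Fizz" if n % 3 == 0 else str(n)
```

Each new test removes one way to cheat until only the real behavior passes.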
### Verify GREEN
Run tests. All must pass.
**If fails:**
- Fix minimal code
- Don't expand scope
- Stay in GREEN
### REFACTOR - Clean Up
Now improve the code while keeping tests green.
**Safe refactorings:**
- Rename variables/functions
- Extract helper functions
- Remove duplication
- Simplify expressions
- Improve readability
**Golden rule:** Tests stay green throughout.
**If tests fail during refactor:**
- Undo immediately
- Smaller refactoring steps
- Check you didn't change behavior
## Implementation Workflow
### 1. Create Test File
```bash
# Python
touch tests/test_feature.py
# JavaScript
touch tests/feature.test.js
# Rust
touch tests/feature_tests.rs
```
### 2. Write First Failing Test
```python
# tests/test_calculator.py
def test_adds_two_numbers():
    calc = Calculator()
    result = calc.add(2, 3)
    assert result == 5
```
### 3. Run and Verify Failure
```bash
pytest tests/test_calculator.py -v
# Expected: FAIL - NameError: name 'Calculator' is not defined
```
### 4. Write Minimal Implementation
```python
# src/calculator.py
class Calculator:
    def add(self, a, b):
        return a + b  # Minimal!
```
### 5. Run and Verify Pass
```bash
pytest tests/test_calculator.py -v
# Expected: PASS
```
### 6. Commit
```bash
git add tests/test_calculator.py src/calculator.py
git commit -m "feat: add calculator with add method"
```
### 7. Next Test
```python
def test_adds_negative_numbers():
    calc = Calculator()
    result = calc.add(-2, -3)
    assert result == -5
```
Repeat cycle.
## Testing Anti-Patterns
### Mocking What You Don't Own
**Bad:** Mock database, HTTP client, file system
**Good:** Abstract behind interface, test interface
### Testing Implementation Details
**Bad:** Test that function was called
**Good:** Test the result/behavior
### Happy Path Only
**Bad:** Only test expected inputs
**Good:** Test edge cases, errors, boundaries
### Brittle Tests
**Bad:** Test breaks when refactoring
**Good:** Tests verify behavior, not structure
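The behavior-versus-structure distinction in one small example; `unique_sorted` is a hypothetical function used only for illustration:

```python
def unique_sorted(items):
    return sorted(set(items))

# Good: asserts observable behavior. This test survives any internal
# rewrite (different dedup strategy, different sort) that keeps the contract.
def test_unique_sorted_behavior():
    assert unique_sorted([3, 1, 3, 2]) == [1, 2, 3]

# Brittle (avoid): asserting HOW it's built, e.g. patching sorted() to check
# it was called, or asserting on an intermediate set. Such tests break on
# every refactor even when behavior is unchanged.
```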
## Common Pitfalls
### "I'll Write Tests After"
No, you won't. Write them first.
### "This is Too Simple to Test"
Simple bugs cause complex problems. Test everything.
### "I Need to See It Work First"
Temporary code becomes permanent. Test first.
### "Tests Take Too Long"
Untested code takes longer. Invest in tests.
## Language-Specific Commands
### Python (pytest)
```bash
# Run all tests
pytest
# Run specific test
pytest tests/test_feature.py::test_name -v
# Run with coverage
pytest --cov=src --cov-report=term-missing
# Watch mode
pytest-watch
```
### JavaScript (Jest)
```bash
# Run all tests
npm test
# Run specific test
npm test -- test_name
# Watch mode
npm test -- --watch
# Coverage
npm test -- --coverage
```
### TypeScript (Jest with ts-jest)
```bash
# Run tests
npx jest
# Run specific file
npx jest tests/feature.test.ts
```
### Go
```bash
# Run all tests
go test ./...
# Run specific test
go test -run TestName
# Verbose
go test -v
# Coverage
go test -cover
```
### Rust
```bash
# Run tests
cargo test
# Run specific test
cargo test test_name
# Show output
cargo test -- --nocapture
```
## Integration with Other Skills
### With writing-plans
Every plan task should specify:
- What test to write
- Expected test failure
- Minimal implementation
- Refactoring opportunities
### With systematic-debugging
When fixing bugs:
1. Write test that reproduces bug
2. Verify test fails (RED)
3. Fix bug (GREEN)
4. Refactor if needed
### With subagent-driven-development
Subagent implements one test at a time:
1. Write failing test
2. Minimal code to pass
3. Commit
4. Next test
## Success Indicators
**You're doing TDD right when:**
- Tests fail before code exists
- You write <10 lines between test runs
- Refactoring feels safe
- Bugs are caught immediately
- Code is simpler than expected
**Red flags:**
- Writing 50+ lines without running tests
- Tests always pass
- Fear of refactoring
- "I'll test later"
## Remember
1. **RED** - Write failing test
2. **GREEN** - Minimal code to pass
3. **REFACTOR** - Clean up safely
4. **REPEAT** - Next behavior
**If you didn't see it fail, it doesn't test what you think.**


@@ -0,0 +1,453 @@
---
name: writing-plans
description: Use when you have a spec or requirements for a multi-step task. Creates comprehensive implementation plans with bite-sized tasks, exact file paths, and complete code examples.
version: 1.0.0
author: Hermes Agent (adapted from Superpowers)
license: MIT
metadata:
hermes:
tags: [planning, design, implementation, workflow, documentation]
related_skills: [subagent-driven-development, test-driven-development, requesting-code-review]
---
# Writing Implementation Plans
## Overview
Transform specifications into actionable implementation plans. Write comprehensive plans that any developer can follow, even with zero context about your codebase.
**Core principle:** Document everything: exact file paths, complete code, test commands, verification steps.
**Assume the implementer:**
- Is a skilled developer
- Knows almost nothing about your codebase
- Has questionable taste in code style
- Needs explicit guidance
## When to Use
**Always use this skill:**
- Before implementing multi-step features
- After design approval (from brainstorming)
- When breaking down complex requirements
- Before delegating to subagents
**Don't skip when:**
- Feature seems simple (assumptions cause bugs)
- You plan to implement yourself (future you needs guidance)
- Working alone (documentation matters)
## Plan Document Structure
### Header (Required)
Every plan MUST start with:
```markdown
# [Feature Name] Implementation Plan
> **For Hermes:** Use subagent-driven-development or executing-plans skill to implement this plan.
**Goal:** One sentence describing what this builds
**Architecture:** 2-3 sentences about approach
**Tech Stack:** Key technologies/libraries
---
```
### Task Structure
Each task follows this format:
````markdown
### Task N: [Descriptive Name]
**Objective:** What this task accomplishes (one sentence)
**Files:**
- Create: `exact/path/to/new_file.py`
- Modify: `exact/path/to/existing.py:45-67` (line numbers if known)
- Test: `tests/path/to/test_file.py`
**Implementation Steps:**
**Step 1: [Action description]**
```python
# Complete code to write
class NewClass:
    def __init__(self):
        self.value = None

    def process(self, input):
        return input.upper()
```
**Step 2: [Action description]**
```bash
# Command to run
pytest tests/test_new.py -v
```
Expected: Tests pass with 3 green dots
**Step 3: [Action description]**
```python
# More code if needed
```
**Verification:**
- [ ] Test passes
- [ ] Function returns expected output
- [ ] No syntax errors
**Commit:**
```bash
git add src/new_file.py tests/test_new.py
git commit -m "feat: add new feature component"
```
````
## Task Granularity
**Each task = 2-5 minutes of work**
**Break down:**
- "Write failing test" - one task
- "Run test to verify it fails" - one task
- "Implement minimal code" - one task
- "Run test to verify pass" - one task
- "Commit" - one task
**Examples:**
**Too big:**
```markdown
### Task 1: Build authentication system
[50 lines of code across 5 files]
```
**Right size:**
```markdown
### Task 1: Create User model with email field
[10 lines, 1 file]
### Task 2: Add password hash field to User
[8 lines, 1 file]
### Task 3: Create password hashing utility
[15 lines, 1 file]
```
## Principles
### DRY (Don't Repeat Yourself)
**Bad:** Copy-paste validation in 3 places
**Good:** Extract validation function, use everywhere
```python
# Good - DRY
def validate_email(email):
if not re.match(r'^[^@]+@[^@]+$', email):
raise ValueError("Invalid email")
return email
# Use everywhere
validate_email(user_input)
validate_email(config_email)
validate_email(imported_data)
```
### YAGNI (You Aren't Gonna Need It)
**Bad:** Add "flexibility" for future requirements
**Good:** Implement only what's needed now
```python
# Bad - YAGNI violation
class User:
def __init__(self, name, email):
self.name = name
self.email = email
self.preferences = {} # Not needed yet!
self.metadata = {} # Not needed yet!
self.settings = {} # Not needed yet!
# Good - YAGNI
class User:
def __init__(self, name, email):
self.name = name
self.email = email
```
### TDD (Test-Driven Development)
Every task that produces code should include:
1. Write failing test
2. Run to verify failure
3. Write minimal code
4. Run to verify pass
See `test-driven-development` skill for details.
### Frequent Commits
Commit after every task:
```bash
git add [files]
git commit -m "type: description"
```
## Writing Process
### Step 1: Understand Requirements
Read and understand:
- Feature requirements
- Design documents
- Acceptance criteria
- Constraints
### Step 2: Explore Codebase
```bash
# Understand project structure
find src -type f -name "*.py" | head -20
# Look at similar features
grep -r "similar_pattern" src/
# Check existing tests
ls tests/
```
### Step 3: Design Approach
Decide:
- Architecture pattern
- File organization
- Dependencies needed
- Testing strategy
### Step 4: Write Tasks
Create tasks in order:
1. Setup/infrastructure
2. Core functionality
3. Edge cases
4. Integration
5. Cleanup
### Step 5: Add Details
For each task, add:
- Exact file paths
- Complete code examples
- Exact commands
- Expected outputs
- Verification steps
### Step 6: Review Plan
Check:
- [ ] Tasks are sequential and logical
- [ ] Each task is bite-sized (2-5 min)
- [ ] File paths are exact
- [ ] Code examples are complete
- [ ] Commands are exact with expected output
- [ ] No missing context
### Step 7: Save Plan
```bash
# Create plans directory
mkdir -p docs/plans
# Save plan
cat > docs/plans/YYYY-MM-DD-feature-name.md << 'EOF'
[plan content]
EOF
# Commit plan
git add docs/plans/YYYY-MM-DD-feature-name.md
git commit -m "docs: add implementation plan for feature"
```
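The dated filename can be generated instead of typed by hand. A small stdlib sketch (the `plan_path` helper and its defaults are assumptions, following the naming convention above):

```python
# Build the docs/plans/YYYY-MM-DD-feature-name.md path used by this skill.
from datetime import date
from pathlib import Path

def plan_path(feature_name, plans_dir="docs/plans"):
    # Lowercase the name and join words with hyphens for the slug.
    slug = "-".join(feature_name.lower().split())
    return Path(plans_dir) / f"{date.today():%Y-%m-%d}-{slug}.md"

print(plan_path("User Authentication"))
```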
## Example Plan
````markdown
# User Authentication Implementation Plan
> **For Hermes:** Use subagent-driven-development to implement this plan.
**Goal:** Add JWT-based user authentication to the Flask API
**Architecture:** Use PyJWT for tokens, bcrypt for hashing. Middleware validates tokens on protected routes.
**Tech Stack:** Python, Flask, PyJWT, bcrypt
---
### Task 1: Create User model
**Objective:** Define User model with email and hashed password
**Files:**
- Create: `src/models/user.py`
- Test: `tests/models/test_user.py`
**Step 1: Write failing test**
```python
def test_user_creation():
    user = User(email="test@example.com", password="secret123")
    assert user.email == "test@example.com"
    assert user.password_hash is not None
    assert user.password_hash != "secret123"  # Should be hashed
```
**Step 2: Run to verify failure**
```bash
pytest tests/models/test_user.py -v
```
Expected: FAIL - User class not defined
**Step 3: Implement User model**
```python
import bcrypt

class User:
    def __init__(self, email, password):
        self.email = email
        self.password_hash = bcrypt.hashpw(
            password.encode(),
            bcrypt.gensalt()
        )
```
**Step 4: Run to verify pass**
```bash
pytest tests/models/test_user.py -v
```
Expected: PASS
**Commit:**
```bash
git add src/models/user.py tests/models/test_user.py
git commit -m "feat: add User model with password hashing"
```
### Task 2: Create login endpoint
**Objective:** Add POST /login endpoint that returns JWT
**Files:**
- Modify: `src/app.py`
- Test: `tests/test_login.py`
[Continue...]
````
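For orientation only, since Task 2 is elided: a stdlib-only sketch of the kind of token helper it would introduce, with HMAC-signed tokens standing in for PyJWT (all names and the secret are hypothetical; real code would call `jwt.encode`/`jwt.decode`):

```python
# Stdlib stand-in for the PyJWT token logic Task 2 would add.
import base64
import hashlib
import hmac
import json

SECRET = b"change-me"  # hypothetical; load from config in real code

def issue_token(payload: dict) -> str:
    # Serialize the payload and sign it so tampering is detectable.
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"

def verify_token(token: str) -> dict:
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Invalid token signature")
    return json.loads(base64.urlsafe_b64decode(body))

token = issue_token({"email": "test@example.com"})
assert verify_token(token) == {"email": "test@example.com"}
```

The shape matters more than the crypto here: a protected route decodes the token from the request header and rejects anything whose signature does not verify.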
## Common Mistakes
### Vague Tasks
**Bad:**
```markdown
### Task 1: Add authentication
```
**Good:**
```markdown
### Task 1: Create User model with email and password_hash fields
```
### Incomplete Code
**Bad:**
```markdown
Step 1: Add validation function
```
**Good:**
````markdown
Step 1: Add validation function
```python
def validate_email(email):
    """Validate email format."""
    import re
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    if not re.match(pattern, email):
        raise ValueError(f"Invalid email: {email}")
    return email
```
````
### Missing Verification
**Bad:**
```markdown
Step 3: Test it works
```
**Good:**
````markdown
Step 3: Verify authentication
```bash
curl -X POST http://localhost:5000/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"secret123"}'
```
Expected: Returns 200 with JWT token in response
````
## Integration with Other Skills
### With brainstorming
**Sequence:**
1. brainstorming → Explore and refine design
2. writing-plans → Create implementation plan
3. subagent-driven-development → Execute plan
### With subagent-driven-development
Plans feed into subagent-driven-development:
- Subagents implement each task
- Two-stage review ensures quality
- Plan provides context and requirements
### With test-driven-development
Every code-producing task should follow TDD:
1. Write failing test
2. Verify failure
3. Write minimal code
4. Verify pass
## Success Checklist
Before considering a plan complete:
- [ ] Header with goal, architecture, tech stack
- [ ] All tasks are bite-sized (2-5 min each)
- [ ] Exact file paths for every file
- [ ] Complete code examples (not partial)
- [ ] Exact commands with expected output
- [ ] Verification steps for each task
- [ ] Commit commands included
- [ ] DRY, YAGNI, TDD principles applied
- [ ] Tasks are sequential and logical
- [ ] Plan saved to docs/plans/
## Remember
```
Bite-sized tasks
Exact file paths
Complete code
Exact commands
Verification steps
DRY, YAGNI, TDD
```
**A good plan makes implementation obvious.**