CI Troubleshooting Quick Reference
Common CI failure patterns and how to diagnose them from the logs.
Reading CI Logs
# With gh
gh run view <RUN_ID> --log-failed
# With curl: download the log archive and extract it
curl -sL -H "Authorization: token $GITHUB_TOKEN" \
https://api.github.com/repos/$GH_OWNER/$GH_REPO/actions/runs/<RUN_ID>/logs \
-o /tmp/ci-logs.zip && unzip -o /tmp/ci-logs.zip -d /tmp/ci-logs
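Once the archive is extracted, the logs can be scanned for the signatures listed in this reference. The following is a minimal sketch, assuming the archive unpacks into plain .txt files; find_failure_lines and FAILURE_MARKERS are hypothetical names, and the marker list is only a starting point.

```python
from pathlib import Path

# Substrings drawn from the signatures in this reference (assumption: extend
# or trim this list to suit the project).
FAILURE_MARKERS = ("FAILED ", "ERROR", "error:", "fatal:", "npm ERR!")


def find_failure_lines(log_dir, markers=FAILURE_MARKERS):
    """Return (file name, line number, text) for each log line containing a marker."""
    hits = []
    # Assumption: the extracted archive holds one .txt file per job or step.
    for path in sorted(Path(log_dir).rglob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(marker in line for marker in markers):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits
```

Calling find_failure_lines("/tmp/ci-logs") after the unzip step returns the candidate lines to inspect first.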
Common Failure Patterns
Test Failures
Signatures in logs:
FAILED tests/test_foo.py::test_bar - AssertionError
E assert 42 == 43
ERROR tests/test_foo.py - ModuleNotFoundError
Diagnosis:
- Find the test file and line number from the traceback
- Use read_file to read the failing test
- Check if it's a logic error in the code or a stale test assertion
- Look for ModuleNotFoundError, which usually means a missing dependency in CI
Common fixes:
- Update assertion to match new expected behavior
- Add missing dependency to requirements.txt / pyproject.toml
- Fix flaky test (add retry, mock external service, fix race condition)
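To locate the failing test quickly, the summary lines shown above can be parsed into their parts. This is a sketch, not pytest's own API: parse_pytest_summary is a hypothetical helper, and the pattern assumes test files end in .py.

```python
import re

# Matches pytest short-summary lines such as
# "FAILED tests/test_foo.py::test_bar - AssertionError".
# search() is used rather than match() so a leading CI timestamp is tolerated.
SUMMARY_RE = re.compile(
    r"(?P<status>FAILED|ERROR) (?P<path>\S+?\.py)"
    r"(?:::(?P<test>\S+))?(?: - (?P<reason>.*))?$"
)


def parse_pytest_summary(line):
    """Return a dict with status, path, test, and reason, or None if no match."""
    match = SUMMARY_RE.search(line.rstrip())
    return match.groupdict() if match else None
```

A collection error such as "ERROR tests/test_foo.py - ModuleNotFoundError" has no test name, so the test field comes back as None.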
Lint / Formatting Failures
Signatures in logs:
src/auth.py:45:1: E302 expected 2 blank lines, got 1
src/models.py:12:89: E501 line too long (95 > 88 characters)
error: would reformat src/utils.py
Diagnosis:
- Read the specific file:line numbers mentioned
- Check which linter is complaining (flake8, ruff, black, isort, mypy)
Common fixes:
- Run the formatter locally: black ., isort ., ruff check --fix .
- Fix the specific style violation by editing the file
- If using patch, make sure to match existing indentation style
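When a lint job reports many violations, grouping them by file shows which files to open. The sketch below assumes flake8-style "path:line:col: CODE message" output and black's "would reformat" line, as in the signatures above; group_lint_findings is a hypothetical name.

```python
import re
from collections import defaultdict

# flake8-style diagnostics: "path:line:col: CODE message".
LINT_RE = re.compile(
    r"(?P<path>[^\s:]+):(?P<line>\d+):(?P<col>\d+): (?P<code>[A-Z]+\d+) (?P<message>.*)$"
)
# black --check output: "would reformat path".
BLACK_RE = re.compile(r"would reformat (?P<path>\S+)")


def group_lint_findings(log_lines):
    """Map each file path to a list of (line number, code, message) findings."""
    findings = defaultdict(list)
    for raw in log_lines:
        line = raw.rstrip()
        lint = LINT_RE.search(line)
        if lint:
            findings[lint["path"]].append(
                (int(lint["line"]), lint["code"], lint["message"])
            )
            continue
        black = BLACK_RE.search(line)
        if black:
            # black reports whole files, so there is no line number to record.
            findings[black["path"]].append((None, "black", "would reformat"))
    return dict(findings)
```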
Type Check Failures (mypy / pyright)
Signatures in logs:
src/api.py:23: error: Argument 1 to "process" has incompatible type "str"; expected "int"
src/models.py:45: error: Missing return statement
Diagnosis:
- Read the file at the mentioned line
- Check the function signature and what's being passed
Common fixes:
- Add type cast or conversion
- Fix the function signature
- Add a # type: ignore comment as a last resort (with an explanation)
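The difference between a conversion and a cast matters for the first fix. The sketch below uses a hypothetical process function standing in for the one named in the log above; raw_value and payload are illustrative only.

```python
from typing import cast


def process(count: int) -> int:
    """Hypothetical stand-in for the "process" function named in the log."""
    return count * 2


raw_value = "21"  # e.g. read from an environment variable or a query string

# Option 1: convert at the call site, so the value really is an int at runtime.
converted = process(int(raw_value))

# Option 2: cast() only informs the type checker and performs no conversion,
# so use it only when the value is already an int at runtime.
payload: object = 21
casted = process(cast(int, payload))

print(converted, casted)  # prints: 42 42
```

Passing raw_value through cast(int, ...) instead would silence the checker while leaving a str at runtime, which is why conversion is usually the safer fix.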
Build / Compilation Failures
Signatures in logs:
ModuleNotFoundError: No module named 'some_package'
ERROR: Could not find a version that satisfies the requirement foo==1.2.3
npm ERR! Could not resolve dependency
Diagnosis:
- Check requirements.txt / package.json for the missing or incompatible dependency
- Compare local vs CI Python/Node version
Common fixes:
- Add missing dependency to requirements file
- Pin compatible version
- Update the lockfile or pinned requirements (pip freeze > requirements.txt, npm install)
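The names of missing or unsatisfiable dependencies can be pulled straight from the log text. This is a sketch under the assumption that the messages follow the two Python signatures above; missing_dependencies is a hypothetical helper.

```python
import re

MISSING_MODULE_RE = re.compile(r"ModuleNotFoundError: No module named '([^']+)'")
UNSATISFIED_RE = re.compile(
    r"Could not find a version that satisfies the requirement (\S+)"
)


def missing_dependencies(log_text):
    """Return (missing top-level modules, unsatisfiable requirement specifiers)."""
    # Keep only the top-level package: 'pkg.sub' is installed as part of 'pkg'.
    modules = sorted({name.split(".")[0] for name in MISSING_MODULE_RE.findall(log_text)})
    requirements = sorted(set(UNSATISFIED_RE.findall(log_text)))
    return modules, requirements
```

The import name is not always the name to install (for example, the yaml module is provided by the PyYAML distribution), so treat the first list as a hint rather than a ready-made requirements entry.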
Permission / Auth Failures
Signatures in logs:
fatal: could not read Username for 'https://github.com': No such device or address
Error: Resource not accessible by integration
403 Forbidden
Diagnosis:
- Check if the workflow needs special permissions (token scopes)
- Check if secrets are configured (missing GITHUB_TOKEN or custom secrets)
Common fixes:
- Add a permissions: block to the workflow YAML
- Verify secrets exist: gh secret list, or check repo settings
- For fork PRs: some secrets aren't available by design
Timeout Failures
Signatures in logs:
Error: The operation was canceled.
The job running on runner ... has exceeded the maximum execution time
Diagnosis:
- Check which step timed out
- Look for infinite loops, hung processes, or slow network calls
Common fixes:
- Add a timeout to the specific step: timeout-minutes: 10
- Fix the underlying performance issue
- Split into parallel jobs
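To find which command hangs, it can help to reproduce the step locally under a hard time limit. The following is a sketch using the standard library; run_with_timeout is a hypothetical helper, and the limit is whatever suits the step being investigated.

```python
import subprocess


def run_with_timeout(command, seconds):
    """Run command; return its exit code, or None if it exceeded the time limit."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        return subprocess.run(command, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None
```

For example, run_with_timeout(["pytest", "tests/test_foo.py"], 600) returns None if that test file runs for more than ten minutes.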
Docker / Container Failures
Signatures in logs:
docker: Error response from daemon
failed to solve: ... not found
COPY failed: file not found in build context
Diagnosis:
- Check Dockerfile for the failing step
- Verify the referenced files exist in the repo
Common fixes:
- Fix path in COPY/ADD command
- Update base image tag
- If the missing file is excluded by .dockerignore, remove its entry or add a ! exception for it
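Checking that COPY and ADD sources exist can be automated roughly as below. This is a simplified sketch, not a Dockerfile parser: missing_copy_sources is a hypothetical helper that handles only single-line, shell-form instructions and ignores .dockerignore, wildcards, URLs, and copies from other build stages.

```python
from pathlib import Path


def missing_copy_sources(dockerfile_text, context_dir):
    """Return COPY/ADD source paths that do not exist under context_dir."""
    missing = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0].upper() not in ("COPY", "ADD"):
            continue
        if parts[1].startswith("["):
            continue  # JSON-array form is not handled in this sketch
        if any(part.startswith("--from=") for part in parts[1:]):
            continue  # copies from another build stage, not the build context
        paths = [part for part in parts[1:] if not part.startswith("--")]
        for source in paths[:-1]:  # the last path is the destination
            if source.startswith(("http://", "https://")):
                continue
            if any(char in source for char in "*?["):
                continue
            # A leading "/" in a source is still relative to the context root.
            if not (Path(context_dir) / source.lstrip("/")).exists():
                missing.append(source)
    return missing
```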
Auto-Fix Decision Tree
CI Failed
├── Test failure
│ ├── Assertion mismatch → update test or fix logic
│ └── Import/module error → add dependency
├── Lint failure → run formatter, fix style
├── Type error → fix types
├── Build failure
│ ├── Missing dep → add to requirements
│ └── Version conflict → update pins
├── Permission error → update workflow permissions (needs user)
└── Timeout → investigate perf (may need user input)
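The tree above can be approximated with an ordered list of patterns. This is a sketch only: RULES and classify are hypothetical names, the patterns come from the signatures in this reference, and real logs will need more rules.

```python
import re

# (category, pattern, suggested action). The first matching rule wins.
RULES = [
    ("permission", r"Resource not accessible by integration|403 Forbidden|could not read Username",
     "update workflow permissions (needs user)"),
    ("timeout", r"exceeded the maximum execution time|The operation was canceled",
     "investigate perf (may need user input)"),
    ("version conflict", r"Could not find a version that satisfies|Could not resolve dependency",
     "update pins"),
    ("missing dependency", r"ModuleNotFoundError", "add dependency"),
    ("assertion mismatch", r"AssertionError|\bE\s+assert ", "update test or fix logic"),
    ("lint", r":\d+:\d+: [A-Z]+\d+ |would reformat", "run formatter, fix style"),
    ("type error", r":\d+: error: ", "fix types"),
]


def classify(line):
    """Return (category, suggested action) for a single log line."""
    for category, pattern, action in RULES:
        if re.search(pattern, line):
            return category, action
    return "unknown", "inspect the log manually"
```

Rule order carries the judgment calls: permission and timeout rules come first because they need user input, and "The operation was canceled" also appears when a run is cancelled by hand, so a timeout match should be confirmed against the job log.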
Re-running After Fix
git add <fixed_files> && git commit -m "fix: resolve CI failure" && git push
# Then monitor
gh pr checks --watch 2>/dev/null || \
echo "Poll with: curl -s -H 'Authorization: token ...' https://api.github.com/repos/.../commits/$(git rev-parse HEAD)/status"