Refs #57
Metal Shader Security Tests (Issue #57)
Bounds Checking Verification
This document describes the security fixes applied to ggml-metal-turbo.metal and how to verify them.
Changes Made
1. kernel_fwht_128
- Before: No bounds checking on tid
- After: Added n_elements parameter, early exit if tid >= n_elements
- Impact: Prevents GPU out-of-bounds reads when grid size exceeds data elements
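The early-exit guard can be emulated on the CPU to see which threads it filters out. This is a minimal sketch, not the actual kernel: the helper names guarded_thread and emulate_dispatch are hypothetical, and the doubling step is only a placeholder for the per-thread work, which this document does not show.

```cpp
#include <cstdint>
#include <vector>

// CPU stand-in for one GPU thread. The real kernel body is not shown in this
// document; doubling the element is a placeholder for the per-thread work.
inline void guarded_thread(std::vector<float>& data, uint32_t tid, uint32_t n_elements) {
    if (tid >= n_elements) {
        return; // thread lies outside the data: touch nothing
    }
    data[tid] *= 2.0f;
}

// Emulates a dispatch of grid_size threads over n_elements valid elements.
// data must hold at least n_elements values.
// Returns how many threads took the early exit.
inline uint32_t emulate_dispatch(std::vector<float>& data, uint32_t n_elements, uint32_t grid_size) {
    uint32_t skipped = 0;
    for (uint32_t tid = 0; tid < grid_size; ++tid) {
        if (tid >= n_elements) {
            ++skipped;
        }
        guarded_thread(data, tid, n_elements);
    }
    return skipped;
}
```

Placing canary values after the first n_elements entries and checking that they are unchanged after the emulated dispatch gives a quick check that no thread wrote or read past the valid range.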
2. kernel_turbo4_dequant
- Before: No validation of src or norms buffer sizes
- After: Added src_size and norms_size parameters, validates all buffer accesses
- Impact: Prevents reading beyond allocated buffers
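The size validation can be sketched on the CPU as follows. The data layout here is an assumption made only for illustration and is not taken from the shader: two 4-bit codes per byte, one float norm per block of 32 elements, src_size counted in bytes and norms_size counted in floats. The function name dequant_checked and the constant kBlockSize are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical block size: one norm per 32 elements (assumption, not from the shader).
constexpr size_t kBlockSize = 32;

// Dequantizes element i into out. Returns false, without reading either
// buffer, if the access would fall outside src or norms.
inline bool dequant_checked(const uint8_t* src, size_t src_size,
                            const float* norms, size_t norms_size,
                            size_t i, float& out) {
    const size_t byte_idx = i / 2;          // two 4-bit codes per byte (assumed)
    const size_t norm_idx = i / kBlockSize; // one norm per block (assumed)
    if (byte_idx >= src_size || norm_idx >= norms_size) {
        return false; // validate both indices before any load
    }
    const uint8_t byte = src[byte_idx];
    // Even elements use the low nibble, odd elements the high nibble (assumed).
    const int code = (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
    // Map the unsigned code 0..15 to a signed value and scale by the block norm.
    out = static_cast<float>(code - 8) * norms[norm_idx];
    return true;
}
```

The point of the sketch is the ordering: both indices are derived and compared against the buffer sizes before the first load, so a failing check never dereferences either pointer.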
3. kernel_attention_turbo4
- Before: Empty body (incomplete implementation)
- After: Fully implemented with bounds checking for Q, K, and scores buffers
- Impact: The kernel now computes attention, and every Q, K, and scores access is bounds-checked
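A CPU sketch of what bounding the Q, K, and scores accesses might look like for one thread is given below. It assumes plain float buffers with row-major layout and sizes counted in floats, and it omits scaling and any quantized K format; the real kernel may differ on all of these points. The function name attention_score_checked and its parameters are hypothetical.

```cpp
#include <cstddef>

// One emulated thread computes scores[q * n_k + k] = dot(Q[q], K[k]).
// Returns false, and writes nothing, if any access would leave its buffer.
// Sizes are in floats. Index arithmetic is assumed not to overflow size_t.
inline bool attention_score_checked(const float* Q, size_t q_size,
                                    const float* K, size_t k_size,
                                    float* scores, size_t scores_size,
                                    size_t q, size_t k,
                                    size_t n_k, size_t head_dim) {
    if (k >= n_k) return false;                     // column outside the score row
    if ((q + 1) * head_dim > q_size) return false;  // Q row must fit in Q
    if ((k + 1) * head_dim > k_size) return false;  // K row must fit in K
    if (q * n_k + k >= scores_size) return false;   // output slot must exist
    float acc = 0.0f;
    for (size_t d = 0; d < head_dim; ++d) {
        acc += Q[q * head_dim + d] * K[k * head_dim + d];
    }
    scores[q * n_k + k] = acc;
    return true;
}
```

Checking the last element of each row, rather than only the first, covers the whole inner loop with one comparison per buffer.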
4. kernel_softmax (new)
- Added softmax kernel for attention pipeline
- Includes bounds checking for row/column indices
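For reference, a row-wise softmax with a row-index guard can be sketched on the CPU like this. It assumes one thread per row and a row-major scores buffer, and it uses the usual max-subtraction for numerical stability; whether the shader does the same is not stated here. The name softmax_row_checked is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Applies softmax in place to one row of a row-major n_rows x n_cols buffer.
// Returns false, without touching the buffer, if the row index is out of range.
inline bool softmax_row_checked(float* scores, size_t n_rows, size_t n_cols, size_t row) {
    if (row >= n_rows || n_cols == 0) {
        return false; // emulated thread has no row to process
    }
    float* x = scores + row * n_cols;
    // Subtract the row maximum so exp() cannot overflow.
    float max_v = x[0];
    for (size_t c = 1; c < n_cols; ++c) {
        if (x[c] > max_v) max_v = x[c];
    }
    float sum = 0.0f;
    for (size_t c = 0; c < n_cols; ++c) {
        x[c] = std::exp(x[c] - max_v);
        sum += x[c];
    }
    for (size_t c = 0; c < n_cols; ++c) {
        x[c] /= sum; // sum >= 1 because the max element contributes exp(0)
    }
    return true;
}
```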
Testing
Manual Testing (Metal)
// Test with oversized grid
MTLSize grid = MTLSizeMake(1000, 1, 1); // More threads than elements
[encoder dispatchThreads:grid threadsPerThreadgroup:MTLSizeMake(128, 1, 1)];
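An oversized grid also arises without deliberately requesting one. When work is dispatched in whole threadgroups (for example with dispatchThreadgroups:threadsPerThreadgroup:), the thread count is rounded up to a multiple of the threadgroup size. The arithmetic is sketched below; the helper names are hypothetical and threads_per_group is assumed to be nonzero.

```cpp
#include <cstdint>

// Number of threadgroups needed to cover n_elements (ceiling division).
inline uint32_t threadgroups_needed(uint32_t n_elements, uint32_t threads_per_group) {
    return (n_elements + threads_per_group - 1) / threads_per_group;
}

// Threads launched beyond the last valid element. These are the threads
// that a tid >= n_elements guard has to turn away.
inline uint32_t excess_threads(uint32_t n_elements, uint32_t threads_per_group) {
    return threadgroups_needed(n_elements, threads_per_group) * threads_per_group - n_elements;
}
```

With 1000 elements and 128 threads per threadgroup, 8 threadgroups launch 1024 threads, so 24 threads have no element to process.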
Validation Checklist
- kernel_fwht_128: Threads beyond n_elements return immediately
- kernel_turbo4_dequant: Buffer sizes validated before access
- kernel_attention_turbo4: All buffer accesses bounded
- kernel_softmax: Row index validated
- No GPU crashes with oversized dispatch grids
- No memory contents disclosed through OOB reads
Performance Impact
Bounds checking adds minimal overhead:
- One comparison per thread (negligible)
- Early exit saves computation for invalid threads
- Beyond that single comparison, valid (within-bounds) execution paths are unchanged
Security Assessment
Before: Medium risk. GPU OOB reads could disclose memory contents or cause a crash.
After: Low risk. All buffer accesses are validated.
Severity: Medium → Low