# Metal Shader Security Tests (Issue #57)

## Bounds Checking Verification

This document describes the security fixes applied to `ggml-metal-turbo.metal` and how to verify them.

## Changes Made
### 1. `kernel_fwht_128`
- **Before:** No bounds checking on `tid`
- **After:** Added an `n_elements` parameter; a thread exits early if `tid >= n_elements`
- **Impact:** Prevents GPU out-of-bounds reads when the dispatch grid contains more threads than there are data elements
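
The early-exit guard can be illustrated with a CPU-side C++ model of a 1-D dispatch. This is only a sketch under stated assumptions: `run_guarded_grid` is a hypothetical name, not code from `ggml-metal-turbo.metal`, and the per-element work is reduced to a placeholder increment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of a 1-D dispatch: each loop iteration plays the
// role of one GPU thread with index `tid`. Returns how many threads did work.
std::size_t run_guarded_grid(std::vector<float>& data, std::size_t grid_size) {
    const std::size_t n_elements = data.size();
    std::size_t active = 0;
    for (std::size_t tid = 0; tid < grid_size; ++tid) {
        // Early exit: threads past the end of the data do nothing.
        if (tid >= n_elements) {
            continue;  // stands in for `return` inside the kernel
        }
        assert(tid < data.size());  // invariant the guard establishes
        data[tid] += 1.0f;          // placeholder for the real per-element work
        ++active;
    }
    return active;
}
```

With 128 elements and a grid of 1000, only the first 128 modelled threads touch the buffer; the remaining 872 fall through the guard.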
### 2. `kernel_turbo4_dequant`
- **Before:** No validation of `src` or `norms` buffer sizes
- **After:** Added `src_size` and `norms_size` parameters; every `src` and `norms` access is checked against them
- **Impact:** Prevents reads beyond the allocated buffers
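
The size validation might look roughly like the following C++ model. The data layout here is an assumption made purely for illustration (two 4-bit codes per byte, one float scale per block of 32 values, codes centered on 8); the real turbo4 format may differ. `dequant_checked` and `kBlock` are hypothetical names, `src_size` is taken to be in bytes, and `norms_size` in elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// ASSUMED layout, for illustration only: two 4-bit codes per byte in `src`,
// and one float scale in `norms` per block of kBlock values.
constexpr std::size_t kBlock = 32;

// Dequantizes element `idx` into *out. Returns false, without reading either
// buffer, if the element's byte or scale lies outside the given sizes.
bool dequant_checked(const std::uint8_t* src, std::size_t src_size,
                     const float* norms, std::size_t norms_size,
                     std::size_t idx, float* out) {
    assert(out != nullptr);
    const std::size_t byte_idx = idx / 2;       // two codes per byte
    const std::size_t norm_idx = idx / kBlock;  // one scale per block
    if (byte_idx >= src_size || norm_idx >= norms_size) {
        return false;  // would read past an allocation
    }
    const std::uint8_t byte = src[byte_idx];
    const int code = (idx % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
    // Assumed mapping: codes 0..15 are centered on 8 and scaled by the norm.
    *out = static_cast<float>(code - 8) * norms[norm_idx];
    return true;
}
```

Both indices are checked before either buffer is dereferenced, so a size mismatch in one buffer cannot be masked by a valid index into the other.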
### 3. `kernel_attention_turbo4`
- **Before:** Empty body (incomplete implementation)
- **After:** Fully implemented, with bounds checking for the Q, K, and scores buffers
- **Impact:** Attention scores are actually computed, and every Q, K, and scores access stays inside its buffer
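
As a rough C++ model of what "bounds checking for Q, K, and scores" means for a single thread, consider the sketch below. It assumes plain row-major float buffers and one dot product per thread; the real kernel may read K in quantized form and organize its threads differently. `attention_score_checked` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of one attention-score thread: computes
// scores[q_row * n_k + k_row] = dot(Q[q_row], K[k_row]) over head_dim values.
// Returns false, writing nothing, if any access would leave its buffer.
bool attention_score_checked(const std::vector<float>& Q,
                             const std::vector<float>& K,
                             std::vector<float>& scores,
                             std::size_t q_row, std::size_t k_row,
                             std::size_t n_k, std::size_t head_dim) {
    // Reject k_row first so the score index cannot spill into the next row.
    if (k_row >= n_k) return false;
    const std::size_t q_off = q_row * head_dim;
    const std::size_t k_off = k_row * head_dim;
    const std::size_t s_idx = q_row * n_k + k_row;
    if (q_off + head_dim > Q.size()) return false;  // Q row must fit
    if (k_off + head_dim > K.size()) return false;  // K row must fit
    if (s_idx >= scores.size()) return false;       // output slot must exist
    float acc = 0.0f;
    for (std::size_t d = 0; d < head_dim; ++d) {
        acc += Q[q_off + d] * K[k_off + d];
    }
    assert(s_idx < scores.size());  // invariant established by the checks
    scores[s_idx] = acc;
    return true;
}
```

The three buffers are checked independently because each is indexed with a different stride.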
### 4. `kernel_softmax` (new)
- New softmax kernel for the attention pipeline
- Bounds-checks row and column indices
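
A bounds-checked row softmax can be modelled on the CPU as follows. This is a sketch, not the kernel itself: it assumes a row-major `n_rows x n_cols` float matrix normalized in place with one thread per row, and `softmax_row_checked` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU model of one softmax thread handling a single row.
// Returns false, leaving the data untouched, when the row index or the
// claimed matrix extent does not fit the buffer.
bool softmax_row_checked(std::vector<float>& m, std::size_t row,
                         std::size_t n_rows, std::size_t n_cols) {
    if (row >= n_rows || n_cols == 0) return false;  // row guard
    if (n_rows * n_cols > m.size()) return false;    // columns must fit the buffer
    float* x = m.data() + row * n_cols;
    // Subtract the row maximum before exponentiating for numerical stability.
    float max_v = x[0];
    for (std::size_t c = 1; c < n_cols; ++c) {
        if (x[c] > max_v) max_v = x[c];
    }
    float sum = 0.0f;
    for (std::size_t c = 0; c < n_cols; ++c) {
        x[c] = std::exp(x[c] - max_v);
        sum += x[c];
    }
    assert(sum > 0.0f);  // the maximum element contributes exp(0) = 1
    for (std::size_t c = 0; c < n_cols; ++c) {
        x[c] /= sum;
    }
    return true;
}
```

Rejecting the call before any element is read means an out-of-range row index neither reads nor partially overwrites neighbouring data.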
## Testing

### Manual Testing (Metal)

```objc
// Host-side dispatch: launch more threads than the buffer has elements.
// Assumes `encoder` is a compute command encoder with the kernel and buffers already set.
MTLSize grid = MTLSizeMake(1000, 1, 1);  // 1000 threads; pass a smaller n_elements, e.g. 128
[encoder dispatchThreads:grid threadsPerThreadgroup:MTLSizeMake(128, 1, 1)];
```
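
Oversized grids are not only a test artifact. When work is dispatched as whole threadgroups rather than as an exact thread count, the number of threads is rounded up to a multiple of the threadgroup size, so some threads usually fall past the data. The arithmetic can be sketched in C++; the function names are illustrative and not part of the project.

```cpp
#include <cassert>
#include <cstddef>

// Number of threadgroups needed to cover n_elements (ceiling division).
std::size_t threadgroups_needed(std::size_t n_elements,
                                std::size_t threads_per_group) {
    assert(threads_per_group > 0);
    return (n_elements + threads_per_group - 1) / threads_per_group;
}

// Threads in the last threadgroup that lie past the data and must be
// rejected by the kernel's bounds check.
std::size_t excess_threads(std::size_t n_elements,
                           std::size_t threads_per_group) {
    return threadgroups_needed(n_elements, threads_per_group) * threads_per_group
           - n_elements;
}
```

For example, 1000 elements with 128-thread groups need 8 threadgroups, which launches 1024 threads; the 24 extra threads are exactly the ones the `n_elements` guard has to stop.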
### Validation Checklist

- [ ] `kernel_fwht_128`: Threads with `tid >= n_elements` return immediately
- [ ] `kernel_turbo4_dequant`: Buffer sizes validated before access
- [ ] `kernel_attention_turbo4`: All Q, K, and scores accesses bounded
- [ ] `kernel_softmax`: Row and column indices validated
- [ ] No GPU crashes with oversized dispatch grids
- [ ] No memory disclosure from OOB reads
## Performance Impact

Bounds checking adds minimal overhead:
- One comparison per thread (negligible)
- Early exit skips all remaining work for out-of-range threads
- In-bounds threads pay only that comparison; the rest of their execution path is unchanged
## Security Assessment

**Before:** Medium risk. GPU OOB reads could disclose adjacent memory or cause a crash
**After:** Low risk. All buffer accesses are validated
**Severity:** Medium → Low