# Metal Shader Security Tests (Issue #57)

## Bounds Checking Verification

This document describes the security fixes applied to `ggml-metal-turbo.metal` and how to verify them.

## Changes Made

### 1. `kernel_fwht_128`
- **Before:** No bounds checking on `tid`
- **After:** Added `n_elements` parameter, early exit if `tid >= n_elements`
- **Impact:** Prevents GPU out-of-bounds reads when the dispatch grid contains more threads than there are data elements
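
The early-exit guard can be illustrated with a small CPU-side model. This is only a sketch, not the shader itself: `guarded_thread` and `run_grid` are hypothetical names, and the doubling step is a placeholder for the real transform work. It shows that only threads with `tid < n_elements` touch the buffer, however large the grid is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU stand-in for one GPU thread of a guarded kernel.
// Returns true if the thread did work, false if it exited early.
bool guarded_thread(std::vector<float>& data, std::size_t n_elements, std::size_t tid) {
    if (tid >= n_elements) {
        return false;  // early exit: this thread index lies past the data
    }
    data[tid] *= 2.0f;  // placeholder for the real per-element work
    return true;
}

// Runs grid_size simulated threads and counts how many touched the buffer.
std::size_t run_grid(std::vector<float>& data, std::size_t n_elements, std::size_t grid_size) {
    std::size_t active = 0;
    for (std::size_t tid = 0; tid < grid_size; ++tid) {
        if (guarded_thread(data, n_elements, tid)) {
            ++active;
        }
    }
    return active;
}
```

With a 128-element buffer and a grid of 1000 simulated threads, `run_grid` reports 128 active threads; the remaining 872 return before any memory access.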

### 2. `kernel_turbo4_dequant`
- **Before:** No validation of `src` or `norms` buffer sizes
- **After:** Added `src_size` and `norms_size` parameters, validates all buffer accesses
- **Impact:** Prevents reading beyond allocated buffers
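
A minimal CPU-side sketch of size-checked dequantization follows. The data layout here is an assumption made for illustration, not taken from the shader: two 4-bit codes per byte, one norm per block of 128 elements, and codes mapped to the range [-8, 7]. The function name `dequant_checked` is also hypothetical. The point is the order of operations: compute every index first, compare each against its buffer size, and only then read.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlock = 128;  // assumed elements per norm; not from the shader

// Dequantizes element `tid` only if every read it needs is in range.
// Returns false and leaves `out` untouched otherwise.
bool dequant_checked(const std::vector<std::uint8_t>& src,
                     const std::vector<float>& norms,
                     std::size_t tid, float& out) {
    const std::size_t byte_idx = tid / 2;       // two 4-bit codes per byte (assumed)
    const std::size_t norm_idx = tid / kBlock;  // one norm per block (assumed)
    if (byte_idx >= src.size() || norm_idx >= norms.size()) {
        return false;  // would read past src or norms
    }
    const std::uint8_t byte = src[byte_idx];
    const std::uint8_t code = (tid % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
    // Map code 0..15 to a signed value in [-8, 7], then scale by the block norm.
    out = static_cast<float>(static_cast<int>(code) - 8) * norms[norm_idx];
    return true;
}
```

In the shader, `src_size` and `norms_size` would play the role that `src.size()` and `norms.size()` play here.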

### 3. `kernel_attention_turbo4`
- **Before:** Empty body (incomplete implementation)
- **After:** Fully implemented with bounds checking for Q, K, and scores buffers
- **Impact:** The kernel now computes attention scores, and every access to the Q, K, and scores buffers is bounds-checked

### 4. `kernel_softmax` (new)
- Adds a softmax kernel for the attention pipeline
- Includes bounds checking for row and column indices
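
As a rough CPU-side reference for what a bounds-checked softmax does, the sketch below normalizes one row of a row-major matrix in place. The function name `softmax_row`, the row-major layout, and the max-subtraction step for numerical stability are assumptions for illustration; the actual kernel may be organized differently.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over one row of a row-major rows x cols matrix, in place.
// Returns false without touching memory when the row index is out of
// range or the buffer is too small to hold rows * cols values.
bool softmax_row(std::vector<float>& scores, std::size_t rows,
                 std::size_t cols, std::size_t row) {
    // scores.size() / cols < rows is equivalent to size < rows * cols
    // but cannot overflow.
    if (cols == 0 || row >= rows || scores.size() / cols < rows) {
        return false;
    }
    float* r = scores.data() + row * cols;
    float max_v = r[0];
    for (std::size_t c = 1; c < cols; ++c) {
        if (r[c] > max_v) max_v = r[c];
    }
    float sum = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
        r[c] = std::exp(r[c] - max_v);  // subtract the max for stability
        sum += r[c];
    }
    for (std::size_t c = 0; c < cols; ++c) {
        r[c] /= sum;
    }
    return true;
}
```

Running the same inputs through this reference and through the GPU kernel gives a simple way to compare results for in-range rows, and to confirm that out-of-range rows leave the buffer unchanged.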

## Testing

### Manual Testing (Metal)

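```objc
// Host-side test: dispatch more threads than the buffer has elements.
// Assumes `encoder` is an id<MTLComputeCommandEncoder> with the pipeline and buffers already set.
MTLSize grid = MTLSizeMake(1000, 1, 1);  // More threads than elements
[encoder dispatchThreads:grid threadsPerThreadgroup:MTLSizeMake(128, 1, 1)];
```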
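
How many threads actually run past the data depends on how the grid is dispatched. `dispatchThreads:` launches exactly the requested grid, while dispatching whole threadgroups rounds the thread count up to a multiple of the threadgroup size. The helpers below are a hypothetical sketch of that arithmetic (the names are not from the project), useful for working out how many guard exits to expect in a test.

```cpp
#include <cstddef>

// Number of threadgroups needed to cover n threads.
constexpr std::size_t groups_for(std::size_t n, std::size_t tg_size) {
    return (n + tg_size - 1) / tg_size;
}

// Threads launched when dispatching whole threadgroups.
constexpr std::size_t launched(std::size_t n, std::size_t tg_size) {
    return groups_for(n, tg_size) * tg_size;
}

// Threads that fall past the data and must hit the bounds guard.
constexpr std::size_t excess(std::size_t n_elements, std::size_t launched_threads) {
    return launched_threads > n_elements ? launched_threads - n_elements : 0;
}
```

For example, assuming a 128-element buffer, the 1000-thread grid above leaves `excess(128, 1000)` = 872 threads for the guard to reject. Covering 1000 threads with whole 128-thread groups would instead launch `launched(1000, 128)` = 1024 threads.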

### Validation Checklist

- [ ] `kernel_fwht_128`: Threads beyond `n_elements` return immediately
- [ ] `kernel_turbo4_dequant`: Buffer sizes validated before access
- [ ] `kernel_attention_turbo4`: All buffer accesses bounded
- [ ] `kernel_softmax`: Row and column indices validated
- [ ] No GPU crashes with oversized dispatch grids
- [ ] No memory contents disclosed through OOB reads

## Performance Impact

Bounds checking adds minimal overhead:
- One comparison per guarded index in each thread (negligible)
- Out-of-range threads exit early and skip all further work
- In-bounds threads pay only the guard comparisons; the rest of their execution path is unchanged

## Security Assessment

**Before:** Medium risk — GPU OOB reads could leak memory or crash  
**After:** Low risk — All buffer accesses validated  
**Severity:** Medium → Low