This directory contains organized experiment runs. Each experiment tests a specific set of variables while keeping the system environment constant.
Core principle: Test everything systematically. Failures are valuable data - they tell you what’s incompatible with your kernel/firmware/distro combination.
experiments/
├── README.md                    # This file
├── TEMPLATE/                    # Template for new experiments
│   ├── README.md                # Experiment documentation template
│   ├── record-environment.sh    # Record non-mutable system variables
│   ├── run-all-benchmarks.sh    # Master runner for all benchmarks
│   └── raw_results/             # Created when benchmarks run
└── YYYY-MM-DD_description/      # Your experiment runs
    ├── ENVIRONMENT.txt          # Recorded system state
    ├── README.md                # What variables you're testing
    ├── run-all-benchmarks.sh    # Modified from template
    ├── raw_results/             # Raw benchmark outputs
    └── FINDINGS.md              # LLM-analyzed summary
cp -r experiments/TEMPLATE experiments/$(date +%Y-%m-%d)_my-experiment
cd experiments/$(date +%Y-%m-%d)_my-experiment
./record-environment.sh > ENVIRONMENT.txt
Edit run-all-benchmarks.sh and set the IMAGES array to the images you want to test, then run it:
./run-all-benchmarks.sh
This will create raw_results/ with all test outputs.
The non-mutable system variables, such as the kernel, firmware, and distro, MUST remain constant within a single experiment.
Record them with record-environment.sh before starting.
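As a rough illustration of what such a recording could contain, here is a minimal sketch. It is not the template's actual record-environment.sh: the function name `record_environment` and the field labels are assumptions, and the firmware lookup assumes an RPM-based distro such as Fedora.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the real record-environment.sh.
# Prints one "Label: value" line per recorded system variable.
record_environment() {
    local fw="unknown"

    echo "Date: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    echo "Kernel: $(uname -r)"

    # /etc/os-release is the standard distro identification file.
    # The command substitution runs in a subshell, so sourcing it
    # does not leak variables into the caller.
    if [ -r /etc/os-release ]; then
        echo "Distro: $(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
    else
        echo "Distro: unknown"
    fi

    # Firmware package version. Assumption: rpm is the package manager.
    if command -v rpm >/dev/null 2>&1; then
        fw=$(rpm -q linux-firmware 2>/dev/null) || fw="unknown"
    fi
    echo "Firmware: $fw"
}
```

Calling `record_environment > ENVIRONMENT.txt` would then produce a small text file that can be diffed between runs to confirm nothing changed mid-experiment.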
Each experiment should vary ONE or more of:
- pytorch-simple.sh - Can PyTorch access GPU?
- llama-simple.sh - Can llama.cpp load model and generate tokens?
- whisper-simple.sh - Can whisper.cpp transcribe audio?

Run these FIRST to identify which configurations work at all.
- pytorch-gemm.sh - Matrix multiplication TFLOPS
- llama-bench.sh - Prompt processing and token generation t/s
- whisper-bench.sh - Transcription speed (realtime multiplier)

Run these ONLY on working configurations.
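The ordering described here (simple tests first, benchmarks only for configurations that passed) could be enforced in a runner roughly as follows. This is a sketch under assumptions: `run_tiers` is a hypothetical helper, and it takes the simple test and the benchmark as command names rather than assuming how the template invokes its scripts. Each command is called with the configuration label as its only argument.

```shell
# Hypothetical helper: run a simple test, then the benchmark only if it passed.
# Usage: run_tiers <label> <simple-command> <bench-command>
run_tiers() {
    local label="$1" simple_cmd="$2" bench_cmd="$3"

    if "$simple_cmd" "$label"; then
        echo "PASS simple: $label"
        # Only working configurations reach the benchmark.
        "$bench_cmd" "$label"
    else
        echo "FAIL simple: $label (benchmark skipped)"
        return 1
    fi
}
```

A failed simple test returns a non-zero status without running the benchmark, so the caller can still log the failure as a result.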
These are expensive. Run on promising configurations only.
# Create experiment
cp -r experiments/TEMPLATE experiments/2026-01-26_rocm-version-comparison
cd experiments/2026-01-26_rocm-version-comparison
# Record environment
./record-environment.sh > ENVIRONMENT.txt
# Edit run-all-benchmarks.sh
# Set IMAGES to:
# softab:llama-hip-rocm644
# softab:llama-hip-rocm711
# softab:llama-hip-rocm72
# Run
./run-all-benchmarks.sh
# Results in raw_results/:
# softab__llama-hip-rocm644_llama_simple.log
# softab__llama-hip-rocm644_llama_bench.log
# softab__llama-hip-rocm711_llama_simple.log (might fail!)
# softab__llama-hip-rocm72_llama_simple.log
# softab__llama-hip-rocm72_llama_bench.log
# Analyze with Claude:
# "Analyze these benchmark results. Which ROCm version works best?"
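The log names in the example appear to follow a pattern: the image tag with `:` replaced by `__`, followed by the benchmark script name with `.sh` removed and `-` replaced by `_`. The sketch below reproduces that mapping. It is inferred from the listing, not taken from run-all-benchmarks.sh, and the function name `log_name` is hypothetical.

```shell
# Hypothetical helper: derive a log file name from an image tag and a
# benchmark script name, following the pattern seen in raw_results/.
# Usage: log_name <image> <script>
log_name() {
    local image_part bench_part

    # "softab:llama-hip-rocm644" -> "softab__llama-hip-rocm644"
    image_part=$(printf '%s' "$1" | sed 's/:/__/g')
    # "llama-simple.sh" -> "llama_simple"
    bench_part=$(basename "$2" .sh | tr '-' '_')

    echo "${image_part}_${bench_part}.log"
}
```

A predictable naming scheme like this lets later analysis map each log back to its image and benchmark without extra metadata.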
Good failure documentation helps everyone:
# FINDINGS.md example
## Environment
- Kernel: 6.18.6-200.fc43
- Firmware: linux-firmware-20260110
## Results
### ✅ Working Configurations
- softab:llama-vulkan-radv - 45 t/s prompt processing
- softab:llama-therock - 38 t/s prompt processing
### ❌ Failed Configurations
- softab:llama-hip-rocm644
  - Error: "invalid device function"
  - Cause: Standard ROCm packages don't support kernel 6.18.6
- softab:pytorch-rocm72-fedora
  - Error: Build fails with missing gfx1151 kernels
  - Cause: Fedora ROCm packages lack gfx1151 support
## Conclusion
Kernel 6.18.6 requires TheRock nightlies or Vulkan backend.
Standard ROCm packages are incompatible.
This documentation prevents others from wasting time on known-broken combinations.
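To find candidates for the failed-configurations section, one option is to list configurations that have a simple-test log but no benchmark log. The sketch below assumes the runner skips the benchmark when the simple test fails and that logs end in `_simple.log` and `_bench.log`. The function name `list_unbenchmarked` is hypothetical. A missing benchmark log is only a hint, so the simple-test log should still be read for the actual error.

```shell
# Hypothetical helper: print configurations that have a *_simple.log
# but no matching *_bench.log in the given directory.
# Usage: list_unbenchmarked [results-dir]   (default: raw_results)
list_unbenchmarked() {
    local dir="${1:-raw_results}" simple_log bench_log

    for simple_log in "$dir"/*_simple.log; do
        # If the glob matched nothing, the literal pattern is returned.
        [ -e "$simple_log" ] || continue

        bench_log="${simple_log%_simple.log}_bench.log"
        if [ ! -e "$bench_log" ]; then
            basename "$simple_log" _simple.log
        fi
    done
}
```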