An active community maintains the Strix Halo AI software stack. Key resources:

| Resource | URL | Focus |
|---|---|---|
| Strix Halo Wiki | https://strixhalo.wiki | Comprehensive setup guides |
| lhl’s testing repo | https://github.com/lhl/strix-halo-testing | Benchmarks, build scripts |
| llm-tracker.info | https://llm-tracker.info/_TOORG/Strix-Halo | Deep technical documentation |
| kyuz0 toolboxes | https://github.com/kyuz0/amd-strix-halo-toolboxes | Docker containers |

| Repository | Purpose | Best For |
|---|---|---|
| kyuz0/amd-strix-halo-toolboxes | llama.cpp containers | LLM inference |
| kyuz0/amd-strix-halo-vllm-toolboxes | vLLM containers | Production serving |
| scottt/rocm-TheRock | PyTorch wheels | CV/ML research |
| lhl/strix-halo-testing | Comprehensive test scripts | Benchmarking |
| llm-tracker.info | Ongoing performance notes | Latest findings |
| lemonade-sdk/llamacpp-rocm | Pre-built llama.cpp binaries | Quick setup |
| hjc4869/llama.cpp | Optimized fork | David Huang’s optimizations |
| nix-strix-halo | Nix flake | Reproducible builds |
For Computer Vision/PyTorch work: use the scottt/rocm-TheRock path. The kyuz0 llama.cpp containers are LLM-focused and won’t help with EfficientLoFTR, diffusion models, etc.

| Resource | URL |
|---|---|
| Main site | https://lemonade-server.ai/ |
| FAQ/Docs | https://lemonade-server.ai/docs/faq/ |
| GitHub | https://github.com/lemonade-sdk/lemonade |
| ROCm llama.cpp | https://github.com/lemonade-sdk/llamacpp-rocm |
| AMD Tutorial | https://www.amd.com/en/developer/resources/technical-articles/2025/ryzen-ai-radeon-llms-with-lemonade.html |

| Resource | URL |
|---|---|
| Ryzen AI SDK docs | https://ryzenai.docs.amd.com/en/latest/relnotes.html |
| Level1Techs benchmarks | https://forum.level1techs.com/t/strix-halo-ryzen-ai-max-395-llm-benchmark-results/233796 |
| ROCm Issue Tracker | https://github.com/ROCm/ROCm/issues |
| AMD Developer Forums | https://community.amd.com/t5/ai/ct-p/amd-ai |
Critical: Containers are NOT virtual machines. They share the host kernel.

| Component | Location | Can Vary in Container? |
|---|---|---|
| Kernel | Host | ❌ NO - shared by all containers |
| Kernel Modules (amdgpu, etc.) | Host | ❌ NO - loaded in host kernel |
| Device Files (/dev/kfd, /dev/dri) | Host | ❌ NO - bind-mounted from host |
| ROCm Userspace (rocBLAS, hipBLAS) | Container | ✅ YES - part of container image |
| Python Version | Container | ✅ YES - isolated per container |
| Application Binaries | Container | ✅ YES - isolated per container |
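The split above is what makes GPU containers work: only the device files come from the host, so a container runtime just needs to bind-mount them in. A minimal sketch of the required flags — the image name and command are illustrative assumptions, not any specific toolbox's documented run line:

```shell
# Bind-mount the host GPU device files into the container.
# /dev/kfd = ROCm compute interface; /dev/dri = render nodes (see table above).
GPU_FLAGS="--device=/dev/kfd --device=/dev/dri --group-add=video --security-opt=seccomp=unconfined"

# Hypothetical invocation (image name and workload are illustrative):
# podman run -it --rm $GPU_FLAGS ghcr.io/example/rocm-llm:latest llama-bench -m model.gguf
echo "$GPU_FLAGS"
```

The same flags work with Docker; the ROCm userspace inside the image must still match the host kernel's amdgpu ABI, per the table above.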
The host kernel directly impacts GPU performance and compatibility:

| Kernel Feature | Impact on Strix Halo |
|---|---|
| amdgpu driver version | Determines GPU support, bug fixes |
| VRAM visibility | 6.16.9+ fixes a bug where only ~15.5 GB was visible |
| Kernel ABI | 6.18.4+ may need matching ROCm userspace |
| Memory management | GTT/GART configuration via kernel params |
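The GTT/GART row refers to amdgpu module parameters passed on the kernel command line. A hedged example — the sizes below are illustrative assumptions for a 128 GB machine, not recommendations:

```shell
# /etc/default/grub — illustrative values only.
# amdgpu.gttsize is in MiB (98304 MiB = 96 GiB of GTT);
# ttm.pages_limit is in 4 KiB pages (25165824 pages = 96 GiB).
GRUB_CMDLINE_LINUX="... amdgpu.gttsize=98304 ttm.pages_limit=25165824"

# Apply and reboot (Fedora path shown):
# sudo grub2-mkconfig -o /boot/grub2/grub.cfg && sudo reboot
```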
Container Image Contains: the ROCm userspace (rocBLAS, hipBLAS), the Python version, and application binaries.
Host Provides: the kernel, kernel modules (amdgpu, etc.), and device files (/dev/kfd, /dev/dri).
To test different kernel versions, you must install each candidate kernel on the host, reboot into it, and re-run the full test suite.
Recommendation for ablation studies: Treat kernel as a constant, not a variable. Testing kernels requires host reboots and serialized testing, which is expensive and out of scope for software stack configuration testing.
```bash
# Check current kernel
uname -r

# List installed kernels
rpm -qa kernel | sort -V     # Fedora
dpkg -l | grep linux-image   # Ubuntu

# List boot entries
sudo grubby --info=ALL | grep -E "(title|kernel)"

# Set kernel for next boot
sudo grubby --set-default /boot/vmlinuz-VERSION

# Reboot
sudo reboot
```
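Since several fixes above are gated on minimum kernel versions (e.g. the VRAM-visibility fix in 6.16.9), a small script can sanity-check the running kernel before benchmarking. A sketch — the threshold comes from the table above, and the simple three-number parsing is an assumption:

```python
import platform
import re

def kernel_tuple(release: str) -> tuple:
    """Parse a release string like '6.16.9-200.fc42.x86_64' into (6, 16, 9)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    return tuple(map(int, m.groups())) if m else (0, 0, 0)

# 6.16.9 carries the fix for the ~15.5 GB VRAM-visibility bug noted above.
MIN_KERNEL = (6, 16, 9)

if kernel_tuple(platform.release()) >= MIN_KERNEL:
    print("kernel OK for full VRAM visibility")
else:
    print("kernel predates the VRAM-visibility fix; consider upgrading")
```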

| Term | Definition |
|---|---|
| gfx1151 | AMD GPU architecture identifier for Strix Halo |
| RDNA 3.5 | GPU architecture (between RDNA 3 and RDNA 4) |
| GTT | Graphics Translation Table - dynamic GPU memory |
| GART | Graphics Address Remapping Table - fixed GPU memory |
| rocWMMA | ROCm Wave Matrix Multiply-Accumulate library |
| hipBLASLt | AMD’s lightweight BLAS library for HIP |
| AOTriton | AMD’s ahead-of-time (AOT) compiled Triton kernel library |
| TheRock | AMD’s ROCm nightly build system |
| pp/tg | Prompt processing / token generation (llama.cpp metrics) |
| FA | Flash Attention |
| Toolbox | Fedora’s containerized development environment |
| Distrobox | Universal container manager supporting multiple distros |
| VRAM estimator | Tool to calculate GPU memory needs for GGUF models |
| RPC | Remote Procedure Call - enables distributed inference |
| MES | Micro Engine Scheduler (GPU job scheduler) |
| SDMA | System DMA engine (can cause artifacts if enabled) |
| HSA | Heterogeneous System Architecture (AMD’s programming model) |
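The VRAM estimator entry above concerns sizing GGUF models against available GPU memory. The core arithmetic can be sketched as weights-plus-overhead — the bits-per-weight figure and overhead constant below are illustrative assumptions, not the wiki tool's exact formula:

```python
def estimate_vram_gib(params_b: float, bits_per_weight: float, overhead_gib: float = 1.5) -> float:
    """Rough GGUF memory estimate: quantized weights plus a fixed overhead.
    KV cache and context length are deliberately ignored in this sketch."""
    weights_gib = params_b * 1e9 * bits_per_weight / 8 / 2**30
    return weights_gib + overhead_gib

# e.g. a 70B model at ~4.8 bits/weight (roughly a Q4 quant):
print(round(estimate_vram_gib(70, 4.8), 1))  # → 40.6
```

In practice the KV cache grows with context length, so real tools add a context-dependent term on top of this.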
Back to: KNOWLEDGE_BASE.md