Introducing IGT InferenceIQ GPU Tuner. Built for the full NVIDIA GPU family.

An open-source Python + C++ tool that configures, validates, and exports GPU optimization profiles for LLM inference, from edge T4s to Blackwell Ultra B300 clusters.

  • 14 GPU models supported
  • 60+ tuning knobs across 7 layers
  • 0–100 inference readiness score
  • 2 languages: Python + C++

Supported GPU families

From edge accelerators through Blackwell Ultra, one workflow for the GPUs you already run.

T4 · L4 · A100 · A800 · L40S · RTX 6000 Ada · H100 · H200 · GH200 · B100 · B200 · GB200 NVL72 · B300 ✦ · GB300 NVL72 ✦

(✦ Blackwell Ultra)

60+ knobs spanning 7 optimization layers

Tuning spans the stack from driver and CUDA through PyTorch, NCCL, vLLM, Hugging Face, and the host OS; a toy layer-to-knob mapping follows the list below.

  • L1 NVML / Driver: 9 knobs
  • L2 CUDA Runtime: 9 knobs
  • L3 PyTorch: 8 knobs
  • L4 NCCL: 14 knobs
  • L5 vLLM: 8 knobs
  • L6 Hugging Face: 5 knobs
  • L7 System / OS: 7 knobs
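
To make the layering concrete, here is a minimal Python sketch of a layer-to-knob registry. The layer labels mirror the list above; the example knobs are real, commonly used settings for each layer, but the selection is illustrative and not IGT's actual 60+ knob registry.

```python
# Illustrative layer -> knob mapping in the spirit of IGT's 7-layer model.
# The knob names are real, commonly used settings for each layer, but the
# selection is an assumption, not IGT's actual registry.
KNOB_LAYERS = {
    "L1 NVML / Driver": ["gpu_locked_clocks", "power_limit_w", "compute_mode"],
    "L2 CUDA Runtime":  ["CUDA_VISIBLE_DEVICES", "CUDA_DEVICE_MAX_CONNECTIONS"],
    "L3 PyTorch":       ["PYTORCH_CUDA_ALLOC_CONF", "TORCH_CUDNN_V8_API_ENABLED"],
    "L4 NCCL":          ["NCCL_DEBUG", "NCCL_IB_DISABLE", "NCCL_P2P_LEVEL"],
    "L5 vLLM":          ["VLLM_ATTENTION_BACKEND", "VLLM_WORKER_MULTIPROC_METHOD"],
    "L6 Hugging Face":  ["HF_HOME", "HF_HUB_ENABLE_HF_TRANSFER"],
    "L7 System / OS":   ["transparent_hugepage", "cpu_frequency_governor"],
}

for layer, knobs in KNOB_LAYERS.items():
    print(f"{layer}: {', '.join(knobs)}")
```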

What IGT does

  • Pre-flight readiness check

    Scores every GPU 0–100 before each deploy (sketched below)

  • One-shot LLM optimization profile

    Clocks, power, compute mode via NVML in one command (sketch below)

  • Full env script generation

    Auto-exports all 53 env vars across CUDA, PyTorch, NCCL, vLLM, HF (example below)

  • Live telemetry monitor

    Real-time SM clock, VRAM, power, P-state per GPU (toy version below)

  • HBM bandwidth benchmark

    Actual vs. theoretical memory throughput via C++ NVML (measurement sketch below)

  • InferenceIQ registry export

    Feeds GPU config into the Inference Knowledge Repository
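
Feature sketches

The snippets below are minimal, hedged illustrations of the ideas behind these features, not IGT's implementation. They use the real pynvml bindings (pip install nvidia-ml-py); all thresholds, weights, and values are invented for illustration. First, a toy 0–100 readiness score:

```python
"""Toy 0-100 readiness score using the real pynvml bindings.

The checks, weights, and thresholds are invented for illustration;
they are not IGT's scoring model.
"""
import pynvml

def readiness_score(index: int = 0) -> int:
    pynvml.nvmlInit()
    try:
        h = pynvml.nvmlDeviceGetHandleByIndex(index)
        score = 100
        # Thermal headroom: the 80 C threshold is an assumption.
        if pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU) > 80:
            score -= 30
        # VRAM already in use means less room for weights and KV cache.
        mem = pynvml.nvmlDeviceGetMemoryInfo(h)
        score -= int(40 * mem.used / mem.total)
        # Pre-existing compute processes: the GPU is not free for deploy.
        if pynvml.nvmlDeviceGetComputeRunningProcesses(h):
            score -= 20
        return max(score, 0)
    finally:
        pynvml.nvmlShutdown()

print(f"GPU0 readiness: {readiness_score(0)}/100")
```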
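
The one-shot optimization profile boils down to three NVML setters. The calls are real pynvml APIs (they generally require root); the clock and power values below are placeholders, not IGT's per-GPU profiles:

```python
"""Toy one-shot NVML tuning pass (illustrative; typically requires root).

The setter calls are real pynvml APIs; the values are placeholders.
"""
import pynvml

def apply_profile(index: int, sm_clock_mhz: int, power_w: int) -> None:
    pynvml.nvmlInit()
    try:
        h = pynvml.nvmlDeviceGetHandleByIndex(index)
        # Pin SM clocks to a fixed frequency for stable inference latency.
        pynvml.nvmlDeviceSetGpuLockedClocks(h, sm_clock_mhz, sm_clock_mhz)
        # Cap board power (NVML takes milliwatts).
        pynvml.nvmlDeviceSetPowerManagementLimit(h, power_w * 1000)
        # One process per GPU avoids context thrash under serving load.
        pynvml.nvmlDeviceSetComputeMode(
            h, pynvml.NVML_COMPUTEMODE_EXCLUSIVE_PROCESS
        )
    finally:
        pynvml.nvmlShutdown()

apply_profile(0, sm_clock_mhz=1410, power_w=300)  # e.g. A100-class values
```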
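
Env script generation is, at its core, rendering a knob table into shell exports. The variable names below are real knobs from several of the seven layers; the selection and values are illustrative, not IGT's 53-variable profile:

```python
"""Toy env-script exporter. Variable names are real knobs from each layer;
the selection and values are illustrative, not IGT's 53-var profile."""

ENV_PROFILE = {
    # CUDA runtime
    "CUDA_DEVICE_MAX_CONNECTIONS": "32",
    # PyTorch allocator
    "PYTORCH_CUDA_ALLOC_CONF": "expandable_segments:True",
    # NCCL collectives
    "NCCL_DEBUG": "WARN",
    "NCCL_P2P_LEVEL": "NVL",
    # vLLM engine
    "VLLM_ATTENTION_BACKEND": "FLASH_ATTN",
    # Hugging Face hub
    "HF_HUB_ENABLE_HF_TRANSFER": "1",
    "TOKENIZERS_PARALLELISM": "false",
}

def write_env_script(path: str = "igt_env.sh") -> None:
    with open(path, "w") as f:
        f.write("#!/usr/bin/env bash\n# Generated inference env profile\n")
        for key, value in ENV_PROFILE.items():
            f.write(f"export {key}={value}\n")

write_env_script()  # then: source igt_env.sh
```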
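
A bare-bones version of the telemetry loop, polling the same four fields the monitor reports (the output format is an assumption):

```python
"""Toy live telemetry loop via pynvml: SM clock, VRAM, power, P-state."""
import time
import pynvml

pynvml.nvmlInit()
count = pynvml.nvmlDeviceGetCount()
try:
    while True:
        for i in range(count):
            h = pynvml.nvmlDeviceGetHandleByIndex(i)
            sm = pynvml.nvmlDeviceGetClockInfo(h, pynvml.NVML_CLOCK_SM)
            mem = pynvml.nvmlDeviceGetMemoryInfo(h)
            watts = pynvml.nvmlDeviceGetPowerUsage(h) / 1000  # mW -> W
            pstate = pynvml.nvmlDeviceGetPerformanceState(h)
            print(f"GPU{i}: {sm} MHz | "
                  f"{mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB | "
                  f"{watts:.0f} W | P{pstate}")
        time.sleep(1)
except KeyboardInterrupt:
    pass
finally:
    pynvml.nvmlShutdown()
```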
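
IGT's bandwidth benchmark is C++; this PyTorch stand-in only illustrates the achieved-vs-theoretical measurement. The theoretical peak must come from the GPU's datasheet (the A100 80GB SXM figure is used as an example):

```python
"""PyTorch stand-in for an achieved-vs-theoretical HBM bandwidth check.

IGT's benchmark is C++; this sketch only illustrates the measurement idea.
THEORETICAL_GBS must come from the GPU's datasheet (A100 80GB SXM shown).
"""
import torch

def measured_bandwidth_gbs(size_mb: int = 1024, iters: int = 20) -> float:
    src = torch.empty(size_mb * 2**20, dtype=torch.uint8, device="cuda")
    dst = torch.empty_like(src)
    dst.copy_(src)  # warm-up
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dst.copy_(src)  # device-to-device copy: one read + one write
    end.record()
    torch.cuda.synchronize()
    seconds = start.elapsed_time(end) / 1000  # elapsed_time is in ms
    bytes_moved = 2 * src.numel() * src.element_size() * iters
    return bytes_moved / seconds / 1e9

THEORETICAL_GBS = 2039  # A100 80GB SXM datasheet peak, as an example
actual = measured_bandwidth_gbs()
print(f"{actual:.0f} / {THEORETICAL_GBS} GB/s "
      f"({100 * actual / THEORETICAL_GBS:.0f}% of peak)")
```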

Capability coverage

Feature comparison: nvidia-smi, DCGM, Nsight, and IGT
Capability            nvidia-smi   DCGM   Nsight   IGT
Readiness score           ✗          ✗       ✗      ✓
LLM opt. profile          ✗          ✗       ✗      ✓
60+ env knobs             ✗          ✗       ✗      ✓
Pre-flight gate           ✗          ✗       ✗      ✓
Clock + power lock        ✓          ✗       ✗      ✓
Fleet monitoring          ✗          ✓       ✗      ✓
HBM benchmark             ✗          ✗       ~      ✓
Kernel profiling          ✗          ✗       ✓      ✗
Registry export           ✗          ✗       ✗      ✓

(✓ supported, ✗ not supported, ~ partial)