- 14 GPU models supported
- 60+ tuning knobs across 7 layers
- 0–100 inference readiness score
- 2 languages: Python + C++
## Supported GPU families
From edge accelerators through Blackwell Ultra, one workflow for the GPUs you already run.
## 60+ knobs spanning 7 optimization layers
Tuning spans the stack from driver and CUDA through PyTorch, NCCL, vLLM, Hugging Face, and the host OS.
- L1 NVML / Driver: 9 knobs
- L2 CUDA Runtime: 9 knobs
- L3 PyTorch: 8 knobs
- L4 NCCL: 14 knobs
- L5 vLLM: 8 knobs
- L6 Hugging Face: 5 knobs
- L7 System / OS: 7 knobs
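As a sketch of how a layered knob registry can be organized, the mapping below groups a handful of real, widely used settings under the seven layers; these samples are illustrative and are not IGT's actual knob list:

```python
# Illustrative layered knob registry. Each layer holds a small sample of
# well-known settings (env vars or driver-level toggles), not all 60+.
KNOB_LAYERS = {
    "L1 NVML / Driver": ["persistence_mode", "gpu_locked_clocks", "power_limit"],
    "L2 CUDA Runtime":  ["CUDA_DEVICE_MAX_CONNECTIONS", "CUDA_LAUNCH_BLOCKING"],
    "L3 PyTorch":       ["PYTORCH_CUDA_ALLOC_CONF", "OMP_NUM_THREADS"],
    "L4 NCCL":          ["NCCL_DEBUG", "NCCL_P2P_DISABLE", "NCCL_IB_DISABLE"],
    "L5 vLLM":          ["VLLM_WORKER_MULTIPROC_METHOD"],
    "L6 Hugging Face":  ["HF_HOME", "TOKENIZERS_PARALLELISM"],
    "L7 System / OS":   ["transparent_hugepages", "cpu_frequency_governor"],
}

def knobs_for(layer_prefix: str) -> list[str]:
    """Return the sample knobs registered under layers matching a prefix."""
    return [knob
            for name, knobs in KNOB_LAYERS.items()
            if name.startswith(layer_prefix)
            for knob in knobs]
```

Keying knobs by layer keeps driver-level settings (applied via NVML calls) cleanly separated from settings that are exported as environment variables.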
## What IGT does
Pre-flight readiness check
Scores every GPU 0–100 before each deploy
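A 0–100 readiness score of this kind can be computed as a weighted rubric over per-GPU health checks. The checks and weights below are a hypothetical sketch, not IGT's actual scoring rules:

```python
def readiness_score(gpu: dict) -> int:
    """Score one GPU 0-100 from pre-flight health checks.

    `gpu` holds measured values (e.g. queried via NVML):
      ecc_errors, free_vram_frac, temp_c, persistence_mode, power_headroom_frac
    The weights are illustrative, not IGT's real rubric.
    """
    score = 0
    score += 30 if gpu["ecc_errors"] == 0 else 0           # no uncorrected ECC errors
    score += round(30 * min(gpu["free_vram_frac"], 1.0))   # share of VRAM free
    score += 20 if gpu["temp_c"] < 80 else 0               # below throttle territory
    score += 10 if gpu["persistence_mode"] else 0          # driver kept resident
    score += round(10 * min(gpu["power_headroom_frac"], 1.0))  # power budget slack
    return score

healthy = {"ecc_errors": 0, "free_vram_frac": 1.0, "temp_c": 45,
           "persistence_mode": True, "power_headroom_frac": 1.0}
print(readiness_score(healthy))  # → 100
```

A deploy gate then reduces to a single threshold comparison, e.g. refusing to schedule on any GPU scoring below 70.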
One-shot LLM optimization profile
Clocks, power, compute mode via NVML in one command
Full env script generation
Auto-exports all 53 env vars across CUDA, PyTorch, NCCL, vLLM, HF
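Generating such a script is mechanically simple: render a profile of name/value pairs as a sourceable shell file. A minimal sketch, with a few real, widely used variables standing in for the full set:

```python
# Sketch of env-script generation: render a profile dict as a shell script
# that can be sourced before launching an inference server. The variables
# shown are a small real-world sample, not the full 53.
def render_env_script(profile: dict[str, str]) -> str:
    lines = ["#!/usr/bin/env bash", "# Generated inference environment profile"]
    lines += [f'export {name}="{value}"' for name, value in sorted(profile.items())]
    return "\n".join(lines) + "\n"

profile = {
    "PYTORCH_CUDA_ALLOC_CONF": "expandable_segments:True",  # PyTorch CUDA allocator
    "NCCL_DEBUG": "WARN",                                   # NCCL log verbosity
    "CUDA_DEVICE_MAX_CONNECTIONS": "1",                     # CUDA work-queue count
    "TOKENIZERS_PARALLELISM": "false",                      # Hugging Face tokenizers
}
print(render_env_script(profile))
```

Sorting the names makes regenerated scripts diff cleanly between runs, which helps when the profile is checked into version control.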
Live telemetry monitor
Real-time SM clock, VRAM, power, P-state per GPU
HBM bandwidth benchmark
Actual vs theoretical memory throughput via C++ NVML
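The theoretical side of that comparison is plain arithmetic on the memory clock and bus width that NVML reports: peak bandwidth is clock rate times bus bytes times transfers per clock. A minimal Python sketch, using published A100 40GB numbers for the example:

```python
def theoretical_bw_gbs(mem_clock_mhz: float, bus_width_bits: int,
                       transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s: clock * bus bytes * transfers/clock."""
    return mem_clock_mhz * 1e6 * (bus_width_bits / 8) * transfers_per_clock / 1e9

def efficiency_pct(measured_gbs: float, theoretical_gbs: float) -> float:
    """Measured throughput as a percentage of the theoretical peak."""
    return 100.0 * measured_gbs / theoretical_gbs

# A100 40GB: 1215 MHz HBM2 clock, 5120-bit bus -> ~1555 GB/s peak
peak = theoretical_bw_gbs(1215, 5120)
print(round(peak, 1))                      # → 1555.2
print(round(efficiency_pct(1400, peak)))   # → 90
```

The measured side comes from timing large device-to-device copies; well-tuned HBM parts typically land within 80–90% of the theoretical figure, so a much lower ratio is a useful red flag.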
InferenceIQ registry export
Feeds GPU config into the Inference Knowledge Repository
## Capability coverage
| Capability | nvidia-smi | DCGM | Nsight | IGT |
|---|---|---|---|---|
| Readiness score | ✗ | ✗ | ✗ | ✓ |
| LLM opt. profile | ✗ | ✗ | ✗ | ✓ |
| 60+ env knobs | ✗ | ✗ | ✗ | ✓ |
| Pre-flight gate | ✗ | ✗ | ✗ | ✓ |
| Clock + power lock | ✓ | ✗ | ✗ | ✓ |
| Fleet monitoring | ✗ | ✓ | ✗ | ✓ |
| HBM benchmark | ✗ | ✗ | ~ | ✓ |
| Kernel profiling | ✗ | ✗ | ✓ | ✗ |
| Registry export | ✗ | ✗ | ✗ | ✓ |