Introducing IGT InferenceIQ GPU Tuner. Built for the full NVIDIA GPU family.

An open-source Python + C++ tool that configures, validates, and exports GPU optimization profiles for LLM inference, from edge T4s to Blackwell Ultra B300 clusters.

  • 14 GPU models supported
  • 60+ tuning knobs across 7 layers
  • 0–100 inference readiness score
  • 2 languages: Python + C++

Supported GPU families

From edge accelerators through Blackwell Ultra, one workflow for the GPUs you already run.

T4 · L4 · A100 · A800 · L40S · RTX 6000 Ada · H100 · H200 · GH200 · B100 · B200 · GB200 NVL72 · B300 ✦ · GB300 NVL72 ✦

(✦ Blackwell Ultra)

60+ knobs spanning 7 optimization layers

Tuning spans the stack from driver and CUDA through PyTorch, NCCL, vLLM, Hugging Face, and the host OS; a toy layer-to-knob mapping follows the list below.

  • L1 NVML / Driver: 9 knobs
  • L2 CUDA Runtime: 9 knobs
  • L3 PyTorch: 8 knobs
  • L4 NCCL: 14 knobs
  • L5 vLLM: 8 knobs
  • L6 Hugging Face: 5 knobs
  • L7 System / OS: 7 knobs
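
To make the layering concrete, here is a minimal Python sketch of a layer-to-knob registry. The layer labels mirror the list above; the example knobs are real, commonly used settings for each layer, but the selection is illustrative and not IGT's actual 60+ knob registry.

```python
# Illustrative layer -> knob mapping in the spirit of IGT's 7-layer model.
# The knob names are real, commonly used settings for each layer, but the
# selection is an assumption, not IGT's actual registry.
KNOB_LAYERS = {
    "L1 NVML / Driver": ["gpu_locked_clocks", "power_limit_w", "compute_mode"],
    "L2 CUDA Runtime":  ["CUDA_VISIBLE_DEVICES", "CUDA_DEVICE_MAX_CONNECTIONS"],
    "L3 PyTorch":       ["PYTORCH_CUDA_ALLOC_CONF", "TORCH_CUDNN_V8_API_ENABLED"],
    "L4 NCCL":          ["NCCL_DEBUG", "NCCL_IB_DISABLE", "NCCL_P2P_LEVEL"],
    "L5 vLLM":          ["VLLM_ATTENTION_BACKEND", "VLLM_WORKER_MULTIPROC_METHOD"],
    "L6 Hugging Face":  ["HF_HOME", "HF_HUB_ENABLE_HF_TRANSFER"],
    "L7 System / OS":   ["transparent_hugepage", "cpu_frequency_governor"],
}

for layer, knobs in KNOB_LAYERS.items():
    print(f"{layer}: {', '.join(knobs)}")
```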

What IGT does

  • Pre-flight readiness check

    Scores every GPU 0–100 before each deploy (sketched below)

  • One-shot LLM optimization profile

    Clocks, power, compute mode via NVML in one command (sketch below)

  • Full env script generation

    Auto-exports all 53 env vars across CUDA, PyTorch, NCCL, vLLM, HF (example below)

  • Live telemetry monitor

    Real-time SM clock, VRAM, power, P-state per GPU (toy version below)

  • HBM bandwidth benchmark

    Actual vs. theoretical memory throughput via C++ NVML (measurement sketch below)

  • InferenceIQ registry export

    Feeds GPU config into the Inference Knowledge Repository
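
Feature sketches

The snippets below are minimal, hedged illustrations of the ideas behind these features, not IGT's implementation. They use the real pynvml bindings (pip install nvidia-ml-py); all thresholds, weights, and values are invented for illustration. First, a toy 0–100 readiness score:

```python
"""Toy 0-100 readiness score using the real pynvml bindings.

The checks, weights, and thresholds are invented for illustration;
they are not IGT's scoring model.
"""
import pynvml

def readiness_score(index: int = 0) -> int:
    pynvml.nvmlInit()
    try:
        h = pynvml.nvmlDeviceGetHandleByIndex(index)
        score = 100
        # Thermal headroom: the 80 C threshold is an assumption.
        if pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU) > 80:
            score -= 30
        # VRAM already in use means less room for weights and KV cache.
        mem = pynvml.nvmlDeviceGetMemoryInfo(h)
        score -= int(40 * mem.used / mem.total)
        # Pre-existing compute processes: the GPU is not free for deploy.
        if pynvml.nvmlDeviceGetComputeRunningProcesses(h):
            score -= 20
        return max(score, 0)
    finally:
        pynvml.nvmlShutdown()

print(f"GPU0 readiness: {readiness_score(0)}/100")
```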
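
The one-shot optimization profile boils down to three NVML setters. The calls are real pynvml APIs (they generally require root); the clock and power values below are placeholders, not IGT's per-GPU profiles:

```python
"""Toy one-shot NVML tuning pass (illustrative; typically requires root).

The setter calls are real pynvml APIs; the values are placeholders.
"""
import pynvml

def apply_profile(index: int, sm_clock_mhz: int, power_w: int) -> None:
    pynvml.nvmlInit()
    try:
        h = pynvml.nvmlDeviceGetHandleByIndex(index)
        # Pin SM clocks to a fixed frequency for stable inference latency.
        pynvml.nvmlDeviceSetGpuLockedClocks(h, sm_clock_mhz, sm_clock_mhz)
        # Cap board power (NVML takes milliwatts).
        pynvml.nvmlDeviceSetPowerManagementLimit(h, power_w * 1000)
        # One process per GPU avoids context thrash under serving load.
        pynvml.nvmlDeviceSetComputeMode(
            h, pynvml.NVML_COMPUTEMODE_EXCLUSIVE_PROCESS
        )
    finally:
        pynvml.nvmlShutdown()

apply_profile(0, sm_clock_mhz=1410, power_w=300)  # e.g. A100-class values
```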
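
Env script generation is, at its core, rendering a knob table into shell exports. The variable names below are real knobs from several of the seven layers; the selection and values are illustrative, not IGT's 53-variable profile:

```python
"""Toy env-script exporter. Variable names are real knobs from each layer;
the selection and values are illustrative, not IGT's 53-var profile."""

ENV_PROFILE = {
    # CUDA runtime
    "CUDA_DEVICE_MAX_CONNECTIONS": "32",
    # PyTorch allocator
    "PYTORCH_CUDA_ALLOC_CONF": "expandable_segments:True",
    # NCCL collectives
    "NCCL_DEBUG": "WARN",
    "NCCL_P2P_LEVEL": "NVL",
    # vLLM engine
    "VLLM_ATTENTION_BACKEND": "FLASH_ATTN",
    # Hugging Face hub
    "HF_HUB_ENABLE_HF_TRANSFER": "1",
    "TOKENIZERS_PARALLELISM": "false",
}

def write_env_script(path: str = "igt_env.sh") -> None:
    with open(path, "w") as f:
        f.write("#!/usr/bin/env bash\n# Generated inference env profile\n")
        for key, value in ENV_PROFILE.items():
            f.write(f"export {key}={value}\n")

write_env_script()  # then: source igt_env.sh
```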
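
A bare-bones version of the telemetry loop, polling the same four fields the monitor reports (the output format is an assumption):

```python
"""Toy live telemetry loop via pynvml: SM clock, VRAM, power, P-state."""
import time
import pynvml

pynvml.nvmlInit()
count = pynvml.nvmlDeviceGetCount()
try:
    while True:
        for i in range(count):
            h = pynvml.nvmlDeviceGetHandleByIndex(i)
            sm = pynvml.nvmlDeviceGetClockInfo(h, pynvml.NVML_CLOCK_SM)
            mem = pynvml.nvmlDeviceGetMemoryInfo(h)
            watts = pynvml.nvmlDeviceGetPowerUsage(h) / 1000  # mW -> W
            pstate = pynvml.nvmlDeviceGetPerformanceState(h)
            print(f"GPU{i}: {sm} MHz | "
                  f"{mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB | "
                  f"{watts:.0f} W | P{pstate}")
        time.sleep(1)
except KeyboardInterrupt:
    pass
finally:
    pynvml.nvmlShutdown()
```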
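
IGT's bandwidth benchmark is C++; this PyTorch stand-in only illustrates the achieved-vs-theoretical measurement. The theoretical peak must come from the GPU's datasheet (the A100 80GB SXM figure is used as an example):

```python
"""PyTorch stand-in for an achieved-vs-theoretical HBM bandwidth check.

IGT's benchmark is C++; this sketch only illustrates the measurement idea.
THEORETICAL_GBS must come from the GPU's datasheet (A100 80GB SXM shown).
"""
import torch

def measured_bandwidth_gbs(size_mb: int = 1024, iters: int = 20) -> float:
    src = torch.empty(size_mb * 2**20, dtype=torch.uint8, device="cuda")
    dst = torch.empty_like(src)
    dst.copy_(src)  # warm-up
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        dst.copy_(src)  # device-to-device copy: one read + one write
    end.record()
    torch.cuda.synchronize()
    seconds = start.elapsed_time(end) / 1000  # elapsed_time is in ms
    bytes_moved = 2 * src.numel() * src.element_size() * iters
    return bytes_moved / seconds / 1e9

THEORETICAL_GBS = 2039  # A100 80GB SXM datasheet peak, as an example
actual = measured_bandwidth_gbs()
print(f"{actual:.0f} / {THEORETICAL_GBS} GB/s "
      f"({100 * actual / THEORETICAL_GBS:.0f}% of peak)")
```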

Capability coverage

Feature comparison: nvidia-smi, DCGM, Nsight, and IGT
Capability            nvidia-smi   DCGM   Nsight   IGT
Readiness score           ✗          ✗       ✗      ✓
LLM opt. profile          ✗          ✗       ✗      ✓
60+ env knobs             ✗          ✗       ✗      ✓
Pre-flight gate           ✗          ✗       ✗      ✓
Clock + power lock        ✓          ✗       ✗      ✓
Fleet monitoring          ✗          ✓       ✗      ✓
HBM benchmark             ✗          ✗       ~      ✓
Kernel profiling          ✗          ✗       ✓      ✗
Registry export           ✗          ✗       ✗      ✓

(✓ supported, ✗ not supported, ~ partial)