Serving and Inference Gateway User Guide

Use the Serving pages to manage inference endpoints and API keys, inspect requests, monitor metrics and traces, handle incidents, and control policies, rollouts, snapshots, routing, and failover.

Who This Guide Is For

Page	Use It For
`/serving`	Serving overview.
`/serving/endpoints`	Endpoint catalog.
`/serving/keys`	Serving API keys.
`/serving/requests`	Request inspection.
`/serving/observability`	Serving metrics and traces.
`/serving/policies`	Routing and access policies.
`/serving/rollouts`	Canary, blue/green, and rollout controls.
`/serving/incidents`	Serving incidents.
`/serving/snapshots`	Endpoint and routing snapshots.
`/serving/rim`	Runtime intelligence and routing signals.

Concept	Meaning
Endpoint	A stable serving interface for applications.
Inference Gateway	The request entry point that authenticates, routes, and observes inference calls.
Neural Router	The routing decision engine for deployment selection, traffic split, failover, and policy enforcement.
Rollout	A controlled release pattern such as canary, blue/green, or shadow.
Snapshot	A saved view of endpoint or routing configuration for review and rollback.
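
The routing concepts above (traffic split during a canary rollout, plus failover away from an unhealthy deployment) can be illustrated with a minimal sketch. This is not the Neural Router's actual implementation or API. The `Deployment` class, its `weight` and `healthy` fields, and the `pick_deployment` function are hypothetical names introduced only for illustration, and the 90/10 split is an assumed example value.

```python
import random
from dataclasses import dataclass


@dataclass
class Deployment:
    # Hypothetical model of one deployment behind an endpoint.
    name: str
    weight: float          # assumed share of traffic, e.g. 0.9 stable / 0.1 canary
    healthy: bool = True   # assumed health flag supplied by some health check


def pick_deployment(deployments, rng=random):
    """Pick one deployment by weight, skipping unhealthy ones (simple failover)."""
    candidates = [d for d in deployments if d.healthy and d.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy deployment available")
    weights = [d.weight for d in candidates]
    # random.choices performs a weighted draw; k=1 returns a one-item list.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Canary rollout: roughly 90% of requests go to stable, 10% to canary.
stable = Deployment("stable", weight=0.9)
canary = Deployment("canary", weight=0.1)
chosen = pick_deployment([stable, canary])

# Failover: once the canary is marked unhealthy, every request goes to stable.
canary.healthy = False
print(pick_deployment([stable, canary]).name)  # prints "stable"
```

In this sketch, failover falls out of the same weighted draw: an unhealthy deployment is removed from the candidate list, so its share of traffic moves to the remaining healthy deployments. A real gateway would likely also apply the policy checks and observability hooks described in the table, which are omitted here.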