Self-host with Docker, Helm, or a single binary.
# Docker (preview, image lands with the first release)
docker run -p 8080:8080 ghcr.io/vaultplane/gateway:latestSee the full install guide One surface for commercial and self-hosted models, with streaming pass-through.
Route by policy, fail over automatically, and keep traffic flowing.
Attribute spend by app and team. The FinOps hook for AI traffic.
Cut cost and latency with exact-match caching; semantic caching on Cloud.
Every call emits traces and metrics. Plug into the observability you already run.
Extend the data plane with Wasm and gRPC sidecar plugins.
We publish numbers only with context. These land when the runtime stabilizes.