AI Gateway Analytics & Monitoring
The AI Gateway records every request that passes through it and presents usage, cost, and operational status on three pages: Overview, Statistics, and Monitoring. This article explains what each page shows and how to use it.
Overview
The Overview page gives you a quick snapshot of the entire AI Gateway status:
- KPI Cards — Active channel count, virtual key count, today’s requests, today’s cost (and token usage).
- Channel Health Summary — Real-time health status of each channel (Normal / Degraded / Unavailable).
- Recent Events — Stream of recent routing events (event type, model, channel, latency, timestamp).
Use the Overview page for daily checks: confirm that the gateway is operational and track today’s spending and request volume.
Statistics
The Statistics page enables in-depth analysis of usage and costs.
KPI Summary
At the top, essential metrics are shown: Total Requests, Total Token Usage, Cost (USD), Average Latency (ms).
Aggregation & Grouping
- Overall Summary — Total / Successful / Failed request counts, input / output / total token counts, average tokens per request, total cost, and distribution by event type (Routed / Failover / Failed / No Channel).
- Group by Dimensions — Switch between grouped insights:
  - By Model: Request counts, tokens, cost, and latency per model.
  - By Key: Consumption by each virtual key.
  - By Date: Daily usage trends.
  - By Request Type: Usage grouped by request type.
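The summary and grouping described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the gateway's implementation: the `GatewayEvent` record, its field names, and the lowercase event type strings are all hypothetical, and treating Failover events as successful requests is an assumption.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class GatewayEvent:
    # Hypothetical record shape; field names are assumptions, not a ServBay schema.
    event_type: str  # "routed", "failover", "failed", or "no_channel"
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float


def overall_summary(events):
    """Compute totals similar to the Overall Summary."""
    total = len(events)
    # Assumption: "routed" and "failover" both ended in a successful response.
    successful = sum(1 for e in events if e.event_type in ("routed", "failover"))
    total_tokens = sum(e.input_tokens + e.output_tokens for e in events)
    return {
        "total": total,
        "successful": successful,
        "failed": total - successful,
        "total_tokens": total_tokens,
        "avg_tokens_per_request": total_tokens / total if total else 0.0,
        "total_cost_usd": sum(e.cost_usd for e in events),
        "by_event_type": dict(Counter(e.event_type for e in events)),
    }


def group_by_model(events):
    """Accumulate request count, tokens, and cost per model."""
    groups = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost_usd": 0.0})
    for e in events:
        g = groups[e.model]
        g["requests"] += 1
        g["tokens"] += e.input_tokens + e.output_tokens
        g["cost_usd"] += e.cost_usd
    return dict(groups)


# Made-up sample data for illustration only.
events = [
    GatewayEvent("routed", "model-a", 100, 50, 0.25),
    GatewayEvent("failover", "model-a", 200, 100, 0.5),
    GatewayEvent("failed", "model-b", 0, 0, 0.0),
    GatewayEvent("no_channel", "model-b", 0, 0, 0.0),
]
summary = overall_summary(events)
per_model = group_by_model(events)
```

Grouping by key or by date follows the same pattern with a different grouping field.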
Multimodal Usage
In addition to text-based usage, the Statistics page reports multimodal usage separately: image generations, voice input units, and voice output units. This makes it easier to account for image and voice calls.
Trends
Line charts visualize day-by-day aggregate request trends, helping you spot usage changes over time.
Budget Management (VIP)
- Gauge charts show the amount and percentage of budget used for each budget control, with alerts as usage approaches the threshold.
- Budget management is a premium feature; without a subscription, this section is locked and shows an upgrade prompt.
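As a rough illustration of what a budget gauge represents, the sketch below computes the percentage of a budget used and whether an alert threshold has been reached. The function name, its parameters, and the 80% default threshold are assumptions made for this example; the actual thresholds are whatever you configure in the budget controls.

```python
def budget_usage(spent_usd, limit_usd, alert_ratio=0.8):
    """Return the percentage of budget used and simple alert flags.

    alert_ratio is a hypothetical default, not a documented gateway setting.
    """
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    ratio = spent_usd / limit_usd
    return {
        "percent_used": round(ratio * 100, 1),
        "alert": ratio >= alert_ratio,   # usage is approaching the limit
        "exceeded": ratio >= 1.0,        # usage has reached or passed the limit
    }


usage = budget_usage(spent_usd=42.5, limit_usd=50.0)
```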
Monitoring
The Monitoring page provides a live view of operations:
- Live Event Logs — Every routing event (request ID, model, channel, latency, etc.) shown in real time.
- Event Type Counts — Categorized counts for Routed, Failed, Failover, No Channel, etc. for quick anomaly detection.
Prerequisites
- You are logged into your ServBay account with channels and virtual keys configured.
- Actual requests have passed through the gateway (otherwise statistics will be empty).
Usage Tips
- Control Costs — Group by "Model" or "Key" to pinpoint major cost drivers. Where needed, set rate limits for virtual keys or quotas for channels.
- Troubleshoot Issues — If success rates drop, review the share of failed / no_channel events and use channel health to locate problematic channels.
- Optimize Latency — Monitor average latency KPIs and per-model latency to assess the responsiveness of different vendors or regional endpoints.
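For the troubleshooting tip above, the share of each event type is simply its count divided by the total. A minimal sketch, assuming you have the event type counts from the Monitoring page available as a list of strings (the lowercase names used here are illustrative):

```python
from collections import Counter


def event_shares(event_types):
    """Return each event type's fraction of all events."""
    counts = Counter(event_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Made-up sample: 100 events with a handful of failures.
sample = ["routed"] * 90 + ["failover"] * 4 + ["failed"] * 4 + ["no_channel"] * 2
shares = event_shares(sample)

# Combined share of requests that did not get a response from any channel.
problem_share = shares.get("failed", 0.0) + shares.get("no_channel", 0.0)
```

A rising `problem_share` alongside a Degraded or Unavailable channel in the health summary points to that channel as the likely cause.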
FAQ
- Q: Why is there no data on the Statistics page?
- A: Statistics are based on real requests routed through the gateway. Make sure your applications or integrated tools actually send requests through the gateway.
- Q: Why is the budget management section locked?
- A: Budget management is a premium feature. Subscribe to unlock these controls.
- Q: How is cost calculated?
- A: Costs are estimated based on each channel’s pricing (including your multiplier settings) and actual token/call volume. Pricing parameters can be adjusted under Channel Advanced Settings.
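The cost answer above can be illustrated with a small calculation. This sketch assumes token prices are quoted per million tokens and that the multiplier scales the whole amount; both are assumptions for the example, and the function and parameter names are hypothetical rather than gateway settings.

```python
def estimate_cost_usd(
    input_tokens,
    output_tokens,
    input_price_per_million,
    output_price_per_million,
    multiplier=1.0,
):
    """Estimate the cost of a request from token counts and channel pricing."""
    base = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000
    # Assumption: the channel multiplier is applied to the full base cost.
    return base * multiplier


# Made-up prices: 2.00 USD per million input tokens, 8.00 USD per million output tokens.
cost = estimate_cost_usd(200_000, 50_000, 2.0, 8.0, multiplier=1.5)
```

Calls billed per unit rather than per token (for example image generations) would need a per-call price in place of the token terms.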
Summary
The Overview, Statistics, and Monitoring pages make every AI Gateway call observable, from quick daily spend checks to detailed analysis by model, key, or date, real-time event streams, and budget management. Use these pages to keep improving the cost and reliability of your AI development.
