Latency Issues Across Various Models

Resolved · Degraded performance

This incident has been resolved.

Mon, Aug 25, 2025, 11:42 PM


Affected components

API: Aug 25, 2025, 10:13 PM to 11:25 PM

Updates

Resolved

This incident has been resolved.

Mon, Aug 25, 2025, 11:42 PM

Monitoring

The issue causing latency across various models has been resolved.

Mon, Aug 25, 2025, 11:32 PM (10 minutes earlier)

Monitoring

Performance has normalized across qwen/qwen3-32b, openai/gpt-oss-20b, openai/gpt-oss-120b, and llama-3.1-8B-instance. The team is continuing to monitor.

Mon, Aug 25, 2025, 11:25 PM

Monitoring

Latency is starting to recover across openai/gpt-oss-20b, llama-3.1-8B-instance, and qwen/qwen3-32b following a fix implemented by the team. We continue to monitor to ensure the issue is fully remediated.

Mon, Aug 25, 2025, 10:41 PM (44 minutes earlier)

Investigating

The models currently experiencing latency issues are: openai/gpt-oss-20b, llama-3.1-8B-instance, and qwen/qwen3-32b. The team is continuing to implement a fix.

Mon, Aug 25, 2025, 10:27 PM (14 minutes earlier)

Investigating

We are currently investigating elevated latency across various models. Time to first token (TTFT) may be impacted. Our engineers are analyzing current system performance and dependencies to isolate the issue.

Mon, Aug 25, 2025, 10:13 PM (13 minutes earlier)
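
The first update notes that TTFT may be impacted. For readers who want to check this on their own traffic, the following is a minimal sketch of how TTFT could be measured client-side for any streamed response. It assumes the response is available as a Python iterable of text chunks; the names measure_ttft and fake_stream are hypothetical and are not part of any provider SDK.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Return (ttft_seconds, total_seconds, non_empty_chunk_count).

    ttft_seconds is None if the stream produced no non-empty chunk.
    """
    start = clock()
    ttft: Optional[float] = None
    count = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty keep-alive chunks
        if ttft is None:
            ttft = clock() - start  # time until the first real token
        count += 1
    total = clock() - start
    return ttft, total, count


def fake_stream(first_delay: float, n: int) -> Iterator[str]:
    """Hypothetical stand-in for a streamed model response."""
    time.sleep(first_delay)  # simulated wait before the first token
    for i in range(n):
        yield f"tok{i} "


if __name__ == "__main__":
    ttft, total, n = measure_ttft(fake_stream(0.05, 5))
    print(f"TTFT: {ttft:.3f}s, total: {total:.3f}s, chunks: {n}")
```

With a real client, the iterable would be whatever chunk stream the SDK exposes. The timer starts when measure_ttft is called, so the request should be issued lazily (on first iteration) or immediately beforehand for the figure to be meaningful.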