Long Inference Request Queues

Incident Report for Modelbit

Resolved

A bug in Modelbit allowed a customer running a slow, GPU-requiring model to consume all available GPUs when sending a large batch of inferences. As a result, inference requests from all other customers timed out while Modelbit processed that batch. The team is fixing this issue now; a sketch illustrating the failure mode and a common mitigation follows below.
Posted May 06, 2024 - 19:41 UTC
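
Modelbit has not published the details of its fix, but the failure mode described above, one tenant's batch saturating a shared GPU pool, is commonly mitigated with a per-customer concurrency cap. Below is a minimal Python sketch of that idea; the GPU count, per-customer limit, and the _run_on_gpu helper are hypothetical assumptions, not Modelbit's actual implementation.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Illustrative values; the real fleet size and per-customer cap are assumptions.
TOTAL_GPUS = 8
MAX_GPUS_PER_CUSTOMER = 2

# One pool-wide semaphore for the shared GPU fleet, plus a per-customer
# semaphore so a single tenant's batch cannot hold every GPU at once.
_gpu_pool = threading.Semaphore(TOTAL_GPUS)
_per_customer = defaultdict(lambda: threading.Semaphore(MAX_GPUS_PER_CUSTOMER))
_registry_lock = threading.Lock()

def run_inference(customer_id: str, payload: dict) -> dict:
    """Run one inference while holding at most MAX_GPUS_PER_CUSTOMER GPUs per tenant."""
    with _registry_lock:
        customer_sem = _per_customer[customer_id]
    with customer_sem:      # queues only this customer's excess requests
        with _gpu_pool:     # then takes a GPU from the shared pool
            return _run_on_gpu(payload)

def _run_on_gpu(payload: dict) -> dict:
    # Placeholder for the actual GPU-backed model execution.
    return {"result": "ok", "input": payload}

if __name__ == "__main__":
    # A large batch from one customer waits behind its own cap instead of
    # starving requests from other customers.
    with ThreadPoolExecutor(max_workers=16) as pool:
        futures = [pool.submit(run_inference, "customer-a", {"i": i}) for i in range(100)]
        futures += [pool.submit(run_inference, "customer-b", {"i": 0})]
        for f in futures:
            f.result()
```

Under this scheme, "customer-a" never occupies more than two GPUs at a time, so "customer-b" still gets scheduled promptly even while the large batch drains.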

Update

Inferences are running normally at present, but the team is still investigating the cause of the spike in queue length and timeouts.
Posted May 06, 2024 - 19:33 UTC

Investigating

We are currently investigating long inference request queues leading to delayed inferences and timeouts.
Posted May 06, 2024 - 19:23 UTC
This incident affected: app.modelbit.com (Running Models).