Long Inference Request Queues
Incident Report for Modelbit
Resolved
A bug in Modelbit allowed a single customer, by sending a large batch of inferences to a slow, GPU-requiring model, to consume all available GPUs. As a result, all other customers' inferences timed out while Modelbit processed that batch. The team is working on a fix now (an illustrative sketch of the general mitigation pattern follows below).
Posted May 06, 2024 - 19:41 UTC
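
For context, here is a minimal sketch of one common mitigation for this class of problem: capping how many GPUs any single customer can hold at once, so one tenant's large batch queues behind itself rather than starving other tenants. This is not Modelbit's actual fix or internals; the names (run_inference, model.predict) and the GPU counts are hypothetical.

# Hypothetical per-customer GPU concurrency cap (not Modelbit's real implementation).
import threading

TOTAL_GPUS = 8          # assumed size of the shared GPU pool
PER_CUSTOMER_CAP = 3    # assumed limit on GPUs any one customer may hold at once

_gpu_pool = threading.BoundedSemaphore(TOTAL_GPUS)
_per_customer = {}
_lock = threading.Lock()

def _customer_sem(customer_id: str) -> threading.BoundedSemaphore:
    # Lazily create a semaphore guarding this tenant's share of the pool.
    with _lock:
        if customer_id not in _per_customer:
            _per_customer[customer_id] = threading.BoundedSemaphore(PER_CUSTOMER_CAP)
        return _per_customer[customer_id]

def run_inference(customer_id: str, batch, model):
    # Acquire the tenant's slot first, then a GPU from the shared pool.
    # A batch larger than PER_CUSTOMER_CAP waits on its own semaphore,
    # leaving the remaining GPUs free for other customers' requests.
    with _customer_sem(customer_id):
        with _gpu_pool:
            return model.predict(batch)
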
Update
Inferences are running normally at present, but the team is still investigating the cause of the spike in queue length and timeouts.
Posted May 06, 2024 - 19:33 UTC
Investigating
We are currently investigating long inference request queues leading to delayed inferences and timeouts.
Posted May 06, 2024 - 19:23 UTC
This incident affected: app.modelbit.com (Running Models).