tag:status.modelbit.com,2005:/history
Modelbit Status - Incident History
2024-03-29T05:23:19Z
Modelbit
tag:status.modelbit.com,2005:Incident/19969218
2024-02-12T19:00:00Z
2024-02-13T20:56:22Z
Cold starts and GPU inference outage
<p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>19:00</var> UTC</small><br><strong>Resolved</strong> - Due to a bad SSL cert, containers running models rebooted across Modelbit at about 11:13am US Pacific Time. Most models restarted instantly and experienced a cold start. Large models that require GPUs took longer to reboot, with the longest taking about 15 minutes. For those models, inference requests during that window errored or timed out. <br /><br />Inferences are currently running normally. The Modelbit team will perform a full post-mortem.</p>
tag:status.modelbit.com,2005:Incident/19237430
2023-11-28T21:54:12Z
2023-11-28T21:54:12Z
Modelbit web application and Python API outage
<p><small>Nov <var data-var='date'>28</var>, <var data-var='time'>21:54</var> UTC</small><br><strong>Resolved</strong> - The outage has been resolved.</p><p><small>Nov <var data-var='date'>28</var>, <var data-var='time'>21:42</var> UTC</small><br><strong>Monitoring</strong> - Modelbit's systems have recovered. We are monitoring the web application and Python API for any lingering effects.</p><p><small>Nov <var data-var='date'>28</var>, <var data-var='time'>21:38</var> UTC</small><br><strong>Investigating</strong> - Modelbit's web application and Python API are experiencing an outage in the Ohio region. Other regions, and all customer production deployments, are running normally.</p>
tag:status.modelbit.com,2005:Incident/17183278
2023-05-08T18:22:58Z
2023-05-08T18:22:58Z
New deployments returning errors
<p><small>May <var data-var='date'> 8</var>, <var data-var='time'>18:22</var> UTC</small><br><strong>Resolved</strong> - The issue has been fixed. Models deployed during the outage returned errors in production.
Those models have now been redeployed and are working as intended.</p><p><small>May <var data-var='date'> 8</var>, <var data-var='time'>17:47</var> UTC</small><br><strong>Identified</strong> - We have identified the cause of the issue and are validating a fix.</p><p><small>May <var data-var='date'> 8</var>, <var data-var='time'>17:27</var> UTC</small><br><strong>Investigating</strong> - We are investigating an issue in which newly deployed models return errors at inference time. Previously deployed, currently running models are unaffected.</p>