Monitoring and web system outage

Incident Report for NTP Pool

Resolved

This incident has been resolved.
Posted Sep 01, 2023 - 08:39 UTC

Monitoring

Most or all services should be back up again.
Posted Sep 01, 2023 - 04:12 UTC

Identified

We're rebooting some servers to get the Ceph cluster storage working again.
Posted Sep 01, 2023 - 03:47 UTC

Investigating

Many containers across the central cluster restarted; monitoring and web services are unavailable or working only sporadically. The DNS / NTP service is unaffected.
Posted Sep 01, 2023 - 03:46 UTC

This incident affected: Management Portal, Public website, and DNS updates.