Microsoft System Resolve outage caused increased application load times and telemetry pipeline issues
Incident Report for CustomerSuccessBox Status
Resolved
This incident has been resolved.
Posted Aug 30, 2022 - 15:08 UTC
Update
We are continuing to monitor for any further issues.
Posted Aug 30, 2022 - 15:07 UTC
Monitoring
Issue:
Health processing is delayed and application load times have increased.
No application data has been lost during this time. System processing should be caught up shortly.
Product telemetry data sent to us during the ~8-hour window from 06:55 to 14:48 UTC has been permanently lost.
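For anyone reconciling their own telemetry against the loss window quoted above, the length of that window can be checked with a short sketch. The calendar date (Aug 30, 2022) is assumed from the posting dates of this report, and window_length is an illustrative helper, not part of any CustomerSuccessBox API.

```python
from datetime import datetime, timedelta, timezone


def window_length(start: datetime, end: datetime) -> timedelta:
    """Return the time elapsed between two timezone-aware timestamps."""
    return end - start


# Window boundaries as stated in the report; the date is an assumption.
start = datetime(2022, 8, 30, 6, 55, tzinfo=timezone.utc)
end = datetime(2022, 8, 30, 14, 48, tzinfo=timezone.utc)

elapsed = window_length(start, end)
hours, remainder = divmod(int(elapsed.total_seconds()), 3600)
minutes = remainder // 60
print(f"{hours} h {minutes} min")  # 7 h 53 min, i.e. roughly 8 hours
```

Telemetry events with timestamps inside this window are the ones affected.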
Cause:
Microsoft is facing a global System Resolve (systemd-resolved) outage, i.e. an outage of its internal name server resolution service.
This is caused by a bug in the latest Ubuntu release:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1988119
https://status.azure.com/status
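A name resolution failure of this kind can be detected from an affected host with a small check like the one below. This is a minimal sketch, not the tooling we used: can_resolve and check_hosts are hypothetical helpers, and the hostname in the usage lines is only an example. The resolver is passed in as a parameter so the check can be exercised without network access.

```python
import socket
from typing import Callable, Dict, Iterable


def can_resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the resolver yields at least one address for hostname."""
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # Raised by socket.getaddrinfo when name resolution fails.
        return False


def check_hosts(hostnames: Iterable[str],
                resolver: Callable = socket.getaddrinfo) -> Dict[str, bool]:
    """Map each hostname to whether it currently resolves."""
    return {name: can_resolve(name, resolver) for name in hostnames}


if __name__ == "__main__":
    # Example hostname only; substitute the services your host depends on.
    for name, ok in check_hosts(["status.azure.com"]).items():
        print(f"{name}: {'resolves' if ok else 'resolution failed'}")
```

Run periodically, a check like this flags a resolver outage even while the application itself still appears to be up.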
Resolution:
The load balancer was able to distribute the load to the primary server, so there was no downtime because of this issue, apart from degraded performance. A temporary fix has been applied to bring application performance back to normal. We will return to full performance once the Microsoft service is fully operational.
Posted Aug 30, 2022 - 06:55 UTC
This incident affected: CustomerSuccessBox Application and Feature Ingestion (Telemetry) - API/JS/Streaming Data.