Resolved: Service Interruption

This notice relates to the following platforms: VIP Go

Service has been restored following a 4-minute interruption (08:23 – 08:27 UTC) caused by a maintenance deploy to platform-level code. The change was reverted as quickly as possible.

Apologies for the inconvenience. Please let us know if you have any questions.


Incident Report

(published February 19, 2019 at 22:30 UTC)

Summary
On January 31, 2019 at 08:22 UTC, a maintenance deployment caused five minutes of downtime for applications hosted on the VIP Go platform. The impact was platform-wide.

Timeline

  • 08:22 UTC: VIP team deploys changeset.
  • 08:23 UTC: Monitoring indicates increased error rates. VIP team begins investigation.
  • 08:27 UTC: VIP team deploys revert.
  • 08:42 UTC: VIP Lobby post published. Because the outage was resolved so quickly, no update was posted to Twitter.

Root Cause
A maintenance release was deployed to VIP Go that removed some deprecated helper functions from the platform. One of these functions was still relied upon by wpcom_vip_load_plugin(), the function used to load plugins in code. As a result, all sites using wpcom_vip_load_plugin() experienced PHP fatal errors because of the missing function, and the origins therefore returned 500 errors to incoming requests.

Future Prevention
We’re looking at two areas of improvement:

  • Improving our integration and pre-deploy testing to account for this specific use case so that similar issues can be caught earlier in the development process.
  • Improving our time-to-revert to reduce impact if similar issues come up again in the future.
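
As an illustration of the first point, a pre-deploy check could scan the codebase for remaining callers of any function a changeset removes. The following is a minimal sketch only, not the VIP team's actual tooling: the function find_remaining_callers, the file names, and the helper name _hypothetical_deprecated_helper are all assumptions made for this example, and the line-based matching does not account for comments or strings.

```python
import re


def find_remaining_callers(removed_functions, sources):
    """Return {function_name: [(path, line_number), ...]} for each call site
    of a removed function that is still present in `sources`.

    `sources` maps a file path to that file's text. (Hypothetical helper,
    written for illustration.)
    """
    hits = {}
    for name in removed_functions:
        # A bare call: the name followed by "(", not preceded by an
        # identifier character, "$", "->" or "::".
        call = re.compile(r"(?<![\w$>:])" + re.escape(name) + r"\s*\(")
        # The definition itself is the code being deleted, so skip it.
        definition = re.compile(r"\bfunction\s+" + re.escape(name) + r"\s*\(")
        for path, text in sources.items():
            for number, line in enumerate(text.splitlines(), start=1):
                if call.search(line) and not definition.search(line):
                    hits.setdefault(name, []).append((path, number))
    return hits


# Made-up sources mirroring the situation in the root cause: a helper is
# slated for removal while the plugin loader still calls it.
sources = {
    "vip-helpers.php": "\n".join([
        "function wpcom_vip_load_plugin( $plugin ) {",
        "    $path = _hypothetical_deprecated_helper( $plugin );",
        "    include $path;",
        "}",
    ]),
    "deprecated.php": "\n".join([
        "function _hypothetical_deprecated_helper( $plugin ) {",
        "    return $plugin;",
        "}",
    ]),
}

report = find_remaining_callers(["_hypothetical_deprecated_helper"], sources)
```

A non-empty report would block the deploy until the remaining callers are updated or the removal is postponed.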