Resolved: WordPress VIP Availability Issues

Update (20:24 UTC): On Thursday, June 3, beginning at 09:55 UTC (5:55 a.m. ET), the VIP Platform experienced degraded performance that lasted approximately 100 minutes.

We take incidents like this very seriously and would like to outline what happened, as well as the steps we have taken to help prevent future occurrences.

What Happened

The issue slowed response times in a single data center on the West Coast of the United States, affecting a subset of sites on the WordPress VIP platform.

The cause was two memcached hosts that developed problems when the network switches (hardware) they are connected to were upgraded and rebooted during regular maintenance operations.

Impact

Sites in the West Coast data center experienced slower response times and, in some cases, 503 errors. The issue was apparent between 09:55 UTC and 11:35 UTC. Busier sites with more memcached traffic were more adversely impacted.

Not Impacted

Sites hosted in other data centers were not affected.

Timeline

  • At 10:20 UTC, we received the first reports of issues.
  • At 11:20 UTC, we identified that the issue was related to a specific data center.
  • At 11:35 UTC, performance was restored.
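
As a rough illustration, the elapsed times implied by this timeline can be worked out from the timestamps given in this post. The sketch below assumes all events fall on the same UTC day; the function and variable names are illustrative only and are not part of any VIP tooling.

```python
from datetime import datetime

def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

# Start of degraded performance, as stated in this post (UTC).
INCIDENT_START = "09:55"

# Timeline entries from this post (UTC).
events = {
    "first reports received": "10:20",
    "data center identified": "11:20",
    "performance restored": "11:35",
}

# Minutes elapsed since the start of the incident for each event.
elapsed = {label: minutes_between(INCIDENT_START, t) for label, t in events.items()}

for label, minutes in elapsed.items():
    print(f"{label}: {minutes} min after start")
```

The last figure matches the approximately 100 minutes of degraded performance reported above.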

Future Prevention

We have already implemented, or are currently implementing, additional safeguards and process improvements designed to prevent similar issues from happening again. These include (but are not limited to):

  • Additional architecture changes to prevent memcached oversubscription.
  • Internal monitoring to more quickly identify and resolve any issues.

Questions?

If you have any questions related to this incident, please open a support ticket and we will be happy to assist.


Update (12:22 UTC): The issue affecting response times in a single West Coast data center has been resolved. Only some VIP sites were affected. More detailed information will be posted here in the VIP Lobby after the investigation has been completed.


Update (11:54 UTC): We’re currently seeing improvements and are continuing to monitor the situation.


We are currently troubleshooting issues with the WordPress VIP platform affecting a single data center on the West Coast of the United States. A limited number of sites may experience slow load times or errors in the meantime.

Sorry for the trouble! We are working on the issue and will follow up with another alert once this is resolved.

We will continue to update this post and tweet out status updates from @wpvipstatus until the issue is resolved.

If you have any questions, please open a support ticket and we will be happy to assist.