SnapEngage Service Status





RESOLVED: Service disruption for the Visitor Chat API*
All systems are back to normal

 

Postmortem: Our error monitoring did not detect the increased error rate on the API endpoints, so customers had to report the issue before we could escalate it to our hosting provider. As a corrective action, we have reconfigured our alert policies so that our technical team is notified as soon as the error rate increases on this component. Google Cloud Platform is still working on a full resolution of the deployment process that introduced the configuration issue yesterday.
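For readers who want a concrete picture of this kind of alert, the following is a minimal sketch of a sliding-window error-rate check. It is not our actual alert policy: the class name `ErrorRateMonitor`, the window size, and the threshold are all hypothetical values chosen for illustration.

```python
from collections import deque


class ErrorRateMonitor:
    """Track recent HTTP status codes and flag an elevated 5xx rate.

    Hypothetical illustration only; window_size and threshold are
    assumed values, not a real production configuration.
    """

    def __init__(self, window_size=100, threshold=0.05):
        # Keep only the most recent `window_size` observations.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, status_code):
        # Store True for server errors (HTTP 5xx), False otherwise.
        self.window.append(500 <= status_code <= 599)

    def error_rate(self):
        # Fraction of recent responses that were server errors.
        if not self.window:
            return 0.0
        return sum(self.window) / len(self.window)

    def should_alert(self):
        # Alert only when the rate is strictly above the threshold.
        return self.error_rate() > self.threshold
```

A caller would invoke `record()` for every API response and notify the on-call team whenever `should_alert()` returns true. A windowed rate, rather than a raw error count, keeps the alert meaningful at both low and high traffic volumes.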

6:59 am Mountain time We have applied a workaround provided by Google to stop the Chat API errors. Google is working on a permanent solution. Customers using the Chat API in their mobile applications should see the API performing at normal levels again.

6:26 am Mountain time Google Cloud Platform, our hosting provider, has identified a configuration problem on their infrastructure that appears to be the root cause of the Chat API returning HTTP 500 errors. Google’s system reliability engineers are working to restore the proper configuration. We are now waiting for a resolution, or an ETA for one, from Google.

6:03 am Mountain time We are still actively working on the issue in parallel with our hosting provider. We are trying several actions to resolve it, but it appears that a correction from our hosting provider will be necessary. We will post an update as soon as we have additional feedback, or in an hour at the latest.

5:04 am Mountain time The API developers are still working to identify the root cause of the elevated error rate and resolve the issue. We are working with our hosting provider to help isolate the root cause. We will post an update within an hour.

4:00 am Mountain time We are seeing a high error rate (HTTP 500) on the Chat API. This API is used by some of our clients to add live chat functionality to their own mobile applications. The API developers are working on resolving this as soon as possible. We will post an update within an hour. The elevated error rate on the API endpoint started a few hours ago; we are still investigating exactly when it began.

*Please note that normal chats inside web browsers, desktop or mobile, are not impacted.


Published February 18, 2016