Postmortem: December 15th & 16th

Overview:

Hi all,

I wanted to follow up on the intermittent errors you may have experienced while using Streak on Dec 15 and 16. These errors involved the same database discussed in our previous postmortem, but had a somewhat different cause: Gmail’s recent outage had a number of unexpected effects on our system that overloaded that database.

It was widely publicized that Gmail experienced errors in email processing. Those errors had two effects on our system, each of which increased load:

  1. Email deliveries arrived in bursts, producing a spiky load pattern that temporarily overwhelmed our database.
  2. Intermittent errors in the Gmail API forced us to refresh some customers’ email indexes to ensure that we didn’t miss data, which added load well above normal levels.

Because we process each incoming message to ensure that emails are properly boxed and that magic columns are properly updated, both the delivery bursts and the index refreshes translated into significant additional load on the system.

These were certainly an unusual two days for Gmail's stability. At the same time, they are a wake-up call: to provide a reliable service for our customers, we need to engineer this system to weather much heavier load than we see during normal times.

The majority of our backend engineering team is now focused on reworking the system so that, even during periods of heavy load, we can continue providing the core Streak experience. We expect the bulk of this work to be finished by the end of the year.

We apologize for the continued issues and are committed to fixing their underlying causes so that we can provide the reliability you and your business expect in the new year.

-Fred
Engineering @ Streak
