Twitter - ANOMALY DETECTION
Twitter developed novel statistical techniques for automatically detecting anomalies in cloud infrastructure data. Specifically, the techniques employ statistical learning to detect anomalies in both application and system metrics.
• They employ time series decomposition to filter the trend and seasonal components of the time series.
• They use robust statistical metrics – median and median absolute deviation (MAD) – to accurately detect anomalies, even in the presence of seasonal spikes.
The techniques that Arun Kejariwal presents were evaluated with a wide variety of time series (system and application metrics obtained from production, as well as stock data) and have been deployed in production at Twitter. Arun demonstrates the efficacy of the proposed techniques using production data.
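To make the two ideas in the description above concrete, here is a minimal Python sketch. It is not Twitter's production implementation: the function name `detect_anomalies` and the parameters `period` and `threshold` are hypothetical, the per-phase median is a deliberately simple stand-in for a real time series decomposition, and a fixed cutoff on a robust score replaces any formal statistical test. The constant 1.4826 is the usual factor that makes MAD comparable to a standard deviation for normally distributed data.

```python
from statistics import median


def detect_anomalies(series, period, threshold=3.0):
    """Return indices of points whose residual is a robust outlier.

    Illustrative sketch only: `period` is the assumed season length and
    `threshold` is an assumed cutoff on the MAD-based score.
    """
    # Step 1: remove the overall level, using the median as a robust
    # stand-in for the trend component.
    level = median(series)
    centered = [x - level for x in series]

    # Step 2: estimate the seasonal component as the median of all
    # observations that share the same position within the period.
    seasonal = [median(centered[phase::period]) for phase in range(period)]

    # Step 3: the residual is what remains after removing level and season.
    residual = [x - seasonal[i % period] for i, x in enumerate(centered)]

    # Step 4: score each residual by its distance from the median,
    # measured in units of the scaled median absolute deviation (MAD).
    center = median(residual)
    mad = median([abs(r - center) for r in residual])
    if mad == 0:
        # No measurable spread in the residual; this sketch flags nothing.
        return []
    scale = 1.4826 * mad
    return [i for i, r in enumerate(residual)
            if abs(r - center) / scale > threshold]


if __name__ == "__main__":
    # Synthetic data: a repeating 24-point pattern with a small
    # deterministic wobble, plus one injected spike at index 100.
    pattern = [10.0 if 8 <= h < 18 else 2.0 for h in range(24)]
    data = [pattern[i % 24] + ((i * 37) % 11 - 5) * 0.05 for i in range(168)]
    data[100] += 8.0
    print(detect_anomalies(data, period=24))
```

Because the seasonal pattern is subtracted before scoring, the regular daily jump from 2 to 10 in the synthetic data is not treated as unusual, while the injected spike stands out in the residual. Using the median and MAD rather than the mean and standard deviation keeps the spike itself from inflating the scale it is judged against.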
In 2013, Jordan Hochenbaum and I developed an automated anomaly detection algorithm for Twitter. Specifically, we were attempting to automate the detection of anomalies within user metrics that exhibit a heavy seasonal component. We worked with Arun Kejariwal and developed a method called Seasonal Hybrid ESD (S-H-ESD), which removes the seasonal components and then robustly detects outliers in the residual. We presented the work at USENIX HotCloud 2014 and published a paper on the detection of anomalies in long-term time series. The project has been open-sourced as an R package and is available on Twitter's GitHub. The following video shows Arun presenting the work and its integration into Twitter's anomaly detection system.