Before anyone starts to panic, you need to triage the problem and estimate the size of the issue. We only have a week's worth of data. Could this just be an anomaly?

Learn

When you see something abnormal in the data, it may just be random noise that will revert to the mean...

Excel experience recommended.

1. Scenario

BANK OF YU OFFICE – MORNING

Apple pushed its iOS 14 update live last week, and you're now looking at the data. What you see isn't pretty: it's immediately clear that tracking is broken…

Ashton Donaghy at Bank of Yu

Hey just sending you the latest data: it looks like around 20-30% of our conversions are being attributed to direct instead of Facebook.

Can you take a look?

This course is a work of fiction. Unless otherwise indicated, all the names, characters, businesses, data, places, events and incidents in this course are either the product of the author's imagination or used in a fictitious manner. Any resemblance to actual persons, living or dead, or actual events is purely coincidental.

2. Brief

One simple way to determine whether an event was significant or just noise is to do some basic anomaly detection. This works by finding the mean and standard deviation of the data prior to the event, then seeing how many standard deviations the new value is from the mean. Using this statistical technique instead of guessing or eyeballing the data gives you a reliable, consistent method for judging how important a worrisome new data point really is. You find the upper bound by adding 1x, 2x, or 3x the standard deviation to the mean, and the lower bound by subtracting the same amount. If the value falls outside those bounds, it's an anomaly; if it sits between them, it's within the normal range.
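The bounds check described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own solution: the function names (`anomaly_bounds`, `is_anomaly`) and the daily conversion counts are made up for the example, and it assumes the sample standard deviation is the one you want.

```python
from statistics import mean, stdev


def anomaly_bounds(baseline, k=2.0):
    """Return (lower, upper) bounds at k standard deviations around the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the pre-event data
    return mu - k * sigma, mu + k * sigma


def is_anomaly(baseline, value, k=2.0):
    """True if value falls outside the k-standard-deviation bounds."""
    lower, upper = anomaly_bounds(baseline, k)
    return value < lower or value > upper


# Hypothetical daily conversion counts from before the event (illustrative only)
baseline = [100, 104, 98, 101, 97, 103, 99, 102, 96, 100]

lower, upper = anomaly_bounds(baseline, k=2)
print(round(lower, 2), round(upper, 2))
print(is_anomaly(baseline, 75, k=3))   # far below the lower bound
print(is_anomaly(baseline, 103, k=2))  # inside the bounds
```

A wider multiplier (3x) flags fewer points and a narrower one (1x) flags more, so the choice of `k` is a judgment call about how many false alarms you can tolerate.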

This is key if you want to avoid continuously chasing your tail as an analyst, because it can tell you when to dive in to solve a problem and when it makes sense to just relax and wait for more data. If something isn't a true anomaly, it will likely revert to the mean given a few more days or weeks. This technique is closely related to quartiles. For example, if you divide the data into 4 buckets using the QUARTILE function in GSheets, you'll see that the difference between the 3rd and 1st quartile is approximately 1.35 times the standard deviation, provided the data is roughly normally distributed.
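The quartile relationship can be checked numerically. The sketch below assumes a normal distribution (the roughly 1.35 ratio does not hold for heavily skewed data) and uses Python's standard-library `NormalDist` rather than the GSheets QUARTILE function; the helper name `iqr_to_sigma_ratio` and the example mean and standard deviation are hypothetical.

```python
from statistics import NormalDist


def iqr_to_sigma_ratio(mu=0.0, sigma=1.0):
    """Ratio of the interquartile range to the standard deviation for a normal distribution."""
    dist = NormalDist(mu, sigma)
    q1 = dist.inv_cdf(0.25)  # 1st quartile
    q3 = dist.inv_cdf(0.75)  # 3rd quartile
    return (q3 - q1) / sigma


# The ratio does not depend on the mean or the scale of the data
ratio = iqr_to_sigma_ratio(mu=100.0, sigma=15.0)
print(round(ratio, 3))  # 1.349
```

With real data, the ratio you get from QUARTILE will only land near 1.35 when the sample is reasonably large and close to bell-shaped.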

4. Exercises

5. Certificate

Complete all of the exercises first to receive your certificate!
