External variables often affect your performance more than anything you do in marketing. Yet most marketing reports miss the impact of macro factors like pandemics, the economy, and seasonality. Failing to control for significant variables means missing your targets in downturns and overcompensating your team in periods of high demand.


Estimating the impact of one or more variables on performance can be done with linear regression, by correlating spikes and dips between variables...


1. Scenario
It’s time to set next quarter’s goals, but how do we account for COVID? Performance was down and everyone missed their targets, but if we set the goal too low as the economy recovers, we’ll go bankrupt from all the bonus payouts! How do we model what’s realistic?
We’re setting quarterly goals by the end of the week and we need your submission

Leadership won’t approve anything they deem ‘unrealistic’

That’s based on their view that we’re exiting the pandemic and shouldn’t see further lockdowns

Though I have an agreement that bonus schedules will be revised accordingly if we do miss target because of COVID again

Of course the last few quarters have been unnaturally down thanks to the pandemic

What I want to see from you is what numbers would we have hit if we control for COVID?

If you can run a regression analysis and get back to me end of day that’d be great

You can get the Google Mobility data here:

2. Brief

The workhorse of analytics, Linear Regression, is a tool every marketer should have in their toolbox. It’s unreasonably effective for its simplicity and ease of use. Despite what you might fear, you don’t need to know a lot of advanced statistics to make use of it. Excel and GSheets have you covered with the LINEST function, which can correlate multiple variables and give you an estimate of their relative impact on performance.

This is done using the ordinary least squares method, which is relatively easy to visualize with two variables. If you had height and weight, and plotted each on its own axis, you’d see they were correlated and could draw a line of best fit through the points. LINEST does the same thing but can handle multiple variables, providing coefficients that can be plugged into an equation to estimate what performance would have been, controlling for each significant variable.
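To make the line-of-best-fit idea concrete, here is a minimal sketch of ordinary least squares in Python with NumPy. The height and weight figures are made up for illustration, and `fit_ols` is a hypothetical helper name, not something from Excel or GSheets. It returns the intercept first, so the ordering of its output differs from LINEST's; match coefficients to variables by name rather than by position.

```python
import numpy as np

# Made-up illustrative data: heights (cm) and weights (kg) for six people.
heights = np.array([150.0, 160.0, 165.0, 172.0, 180.0, 190.0])
weights = np.array([52.0, 58.0, 63.0, 70.0, 77.0, 88.0])


def fit_ols(X, y):
    """Return [intercept, coef_1, coef_2, ...] minimizing the squared error.

    X may be a single variable (1-D) or several variables (one per column).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    # Prepend a column of ones so the model can learn an intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coefs


intercept, slope = fit_ols(heights, weights)
print(f"weight = {intercept:.1f} + {slope:.2f} * height")
```

Passing a two-dimensional array with one column per variable fits several variables at once, which is the multi-variable case described above.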

One common use of linear regression is to estimate the impact of one aggregated dataset on another. For example, if we want to know the impact of COVID on the number of sales leads we captured, there’s no way to track that impact directly. Using Linear Regression, we can associate the spikes and dips of COVID with spikes and dips in leads captured, to draw a picture of the relationship between the two.
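As a rough sketch of what "controlling for COVID" could look like, the example below regresses weekly leads on a mobility index and then removes the estimated mobility effect. Every number is invented for illustration, and it assumes the mobility figure is a percent change from a pre-pandemic baseline, so that 0 means "normal". A real analysis would likely include more variables, such as seasonality or ad spend.

```python
import numpy as np

# Hypothetical weekly data, invented for illustration only.
# mobility: percent change from a pre-pandemic baseline (0 = normal).
mobility = np.array([-5.0, -40.0, -55.0, -35.0, -20.0, -30.0, -15.0, -10.0])
leads = np.array([480.0, 310.0, 240.0, 330.0, 410.0, 350.0, 430.0, 460.0])

# Fit leads = intercept + slope * mobility by ordinary least squares.
design = np.column_stack([np.ones_like(mobility), mobility])
(intercept, slope), *_ = np.linalg.lstsq(design, leads, rcond=None)

# Estimated leads gained or lost each week due to the mobility change.
covid_effect = slope * mobility

# Estimate of what leads would have been with mobility at baseline.
adjusted_leads = leads - covid_effect

print(f"Each 1-point drop in mobility is associated with {slope:.1f} fewer leads")
print(f"Actual total: {leads.sum():.0f}, COVID-adjusted total: {adjusted_leads.sum():.0f}")
```

The slope describes an association between the two series, not proof of cause, so the adjusted figures are best treated as an estimate to sanity-check against other evidence.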

3. Tutorial

Hey, I'm going to help you calculate the impact of COVID. So what we're looking at here is Google's COVID mobility data, which comes from Google Maps. I downloaded the global CSV, but it's a huge file. So what I recommend is you open it in a text editor rather than in Excel or Google Sheets, and then just delete all the lines that you don't need.
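If the file is too big to edit comfortably by hand, the same trimming can be scripted. This is a minimal sketch, not part of the original walkthrough: `filter_csv` is a hypothetical helper, and the file names and the `country_region_code` column in the usage comment are assumptions you should check against the header of the file you actually downloaded.

```python
import csv


def filter_csv(src_path, dst_path, column, value):
    """Copy the header plus every row where `column` equals `value`.

    Reads one row at a time, so the whole file never has to fit in memory.
    Returns the number of data rows kept.
    """
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[column] == value:
                writer.writerow(row)
                kept += 1
    return kept


# Example usage (file names and column name are assumptions):
# filter_csv("global_mobility.csv", "us_mobility.csv", "country_region_code", "US")
```

The smaller output file should then open without trouble in Excel or Google Sheets.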

