"Have you ever thought about modeling? You recently started as an intern at Vexnomics and are eager for more responsibility. Finally your boss comes to you with an exciting opportunity to build your first marketing mix model!
GoolyBib Office – Morning
It's late on Thursday and you're waiting for your boss to leave for the day. She rises to go, but to your surprise she heads over to your desk on the way out of the door.
Hey have you got a sec?\nWe just got a new client, but the analysts are all at capacity.\nI know you are looking for more responsibility: would you want to take this on?\nI can walk you through how it works. It should be a pretty simple model and the client is an old friend of mine – it'll be a good project to learn on.\nSo what do you say? Are you interested?
Marketing Mix Modeling You're the newest intern at Vexnomics, a marketing science and research consultancy, and you just landed your first big opportunity: building a marketing mix model for a new client. The client is an old friend of your boss' – a fast-growing startup in the wearable tech for babies space called GoolyBib – so don't mess this up!\nYou know what marketing mix modeling is, but not how to do it. You'll get help from your boss, but you want to impress, and do as much of the heavy lifting proactively yourself. You'll have to learn about data collection and cleaning, selecting variables for the model and how to present data to the client.\nYou've been assured this will be a relatively simple modeling exercise, so it's the perfect project to learn on for your first model. Once you've built your model you should have a good grounding to tackle more advanced techniques in future projects, so you can't wait to get started!
However beautiful the strategy, you should occasionally look at the results.
Most attribution projects begin long after the strategy has already been set. So it's important to understand what the client did, why they did it, and what they expected to happen. Only then can you answer their questions in a way they'll be happy with. Remember they hired you because the results weren't what they expected... or because they never thought about how to measure them in the first place.
Your boss asks the client for a brief explaining the context around the project, and shares it with you. You should read the brief before starting work on the project!
Making a list, and checking it twice First things first: we need to collect the data from the client before we can start modeling. Let's make a checklist of anything that might be relevant so we can make sure we've got everything we need.
Friday Morning – Vexnomics Office
Arriving bright and early, you're ready to get started on your new project. You open your inbox to a kickoff email from your boss, outlining the first task.
New Project: GoolyBib\nHey, thanks for helping me out on this\nWe need to get a data request file across to the client today if possible.\nI've attached a template: let me know if any questions!
Data Requests This is your first task: putting a spreadsheet together full of requests for data. You need to make sure you include anything that would be helpful for the model. However you also don't want to include too many irrelevant requests that will overwhelm the client. Take the template from your boss's email and get started crafting your own copy to send to the client, based on what you learned in the brief. You've been advised that this should be a simple model, so greatly simplify the requests you see in the template.
It is a capital mistake to theorize before one has data.
Share a link to your copy of the spreadsheet, and a brief description of what you kept, added or removed.
Clean data equals a clear mind Now we've got the data, it's time to get it in the right format for modeling. We want to save our boss some time and build the 'analytical base table' for them – a spreadsheet with all the clean data in one place.
Wednesday Afternoon – Vexnomics Office
The client came back with the data you needed, and thankfully they spent some time getting it into the right format for modeling. All that's left is to stitch the data together into an 'analytical base table', ready for modeling.
We got the data back from GoolyBib\nThey actually spent some time getting it in the right format which is nice!\nThe only thing I need you to do is get all the clean data into one sheet, one column per variable\nWe probably also want to add some dummy variables for day or week and payday\nLet me know if any problems
Analytical Base Table (ABT) Before modeling can begin, it helps to get all the clean data merged together into one big spreadsheet. This is called the 'analytical base table' or ABT, and having one saved means you can separate the modeling stage from the data cleaning and collection stage. If an analyst ever wants to recreate your model or build a better one, they can use the ABT without having to worry about the data cleaning stage.\nIn order to get your ABT ready, you want to set up a spreadsheet with one column per variable that may (or may not) be used in the model. You also typically want one unique row per date or time period, which is what you join all of the different data sources together on.
You can use a vlookup formula to join two datasets together by date in Excel or GSheets. The formula is VLOOKUP(search_key, range, index, [is_sorted]). In our case the search key would be the date for that row, the range would be the data we're looking up to merge in, index would be the column you want to pull from, and is_sorted should be set to FALSE.
Share a link to your analytical base table and explain your process for merging the data together in one spreadsheet.
Dummy Variables As well as modeling continuous variables like ad spend, you will often need to denote in your model that an event happened on a specific day or week. For this dummy variables are quite useful: a simple 1 or 0 in each row to show when something was on or off will give you an answer to what impact that event had on sales.\nWe need to add dummy variables for each day of the week and for the 15th and 30th when most people get paid. This will help us capture the impact of the weekend on sales and see if there's a 'payday' effect.
The best way in Excel or GSheets to create a dummy variable, is to use an IF statement that results in a 1 if true, 0 if false. For example if you wanted to add a dummy variable for every Friday, you could use the formula IF(TEXT(A1,\"DDDD\")='friday',1,0), where the date was in cell A1. Don't add a dummy for every day of the week: for example leave off Tuesdays otherwise you'll run into a 'linearly correlated variable' error.
How many Fridays are there in the data set?
It's time to build the model Everything's ready and there's no time like the present. We need to select our variables, generate our model then interpret the results for the client. Let's get started!
Thursday Morning – Vexnomics Office
Your boss was happy with the ABT you built for her. Now we can finally begin modeling! Not knowing the math, you ask her for a template.
Yep, don't worry, you won't need to do any math!\nTemplate is here: I'm sure you'll figure it out\nStart with a 'kitchen sink' model - basically include every variable and see where that gets us
The LINEST function in Excel and GSheets gives you everything you need to make a simple marketing mix model. It works as follows LINEST(known_data_y, [known_data_x], [calculate_b], [verbose]). In our case we put the variable we want to predict (i.e. sales) first, then all the X variables (any columns from our ABT we expect to drive sales), and put TRUE for the next two options.
Time to brush up on Linear Regression. Do some Googling and explain in your own words how it works.
The R2 number is a measure of what?
Create a linear regression model with LINEST following the template provided. Include all variables in your model. Explain what modifications you had to make to the template.
Which variables are statistically significant (P Value is less than 0.05)
Thanks for making the first pass at the model\nYou're right, affiliates looks insignificant, let's drop it along with the other insignificant variables.
Final Model Your boss has finalizes the model by dropping the insignificant variables. The output of the model has been provided to you including the coefficients, which have been used to calculate the contribution of each channel. Now we need to ask some questions we know the client will want answers to.
What share of total revenue for the time period did Facebook contribute?
Was TV profitable based on return on ad spend? (ROAS > 1)
What days had the highest and lowest impact on sales, according to the model?
From 'what happened' to 'what if'? Data without presentation is like cake without icing: this is the part the client is really paying for! This is a big topic, but let's keep it simple and start with the basics.
The Next Week – Vexnomics Office
The deadline is looming, so we need to put the finishing touches on our presentation. Your boss messages you to ask for help putting the presentation together.
Hi\nHope you have some time today to help?\nJust pulling together the final presentation\nWe need charts for model accuracy and channel contribution.\nWe could also do with a forecast - say $10k spend on FB and $5k on TV?\nI added charts to the template and a simple forecast so you know what to do.
Data Visualization The main chart that every client wants to see, is predicted vs actual. This shows them visually in an easy to understand manner, how well the model did in predicting historical sales. It reassures them that they can trust the results.\nOnce they're comfortable with the model, the next best chart to show them is channel contribution: according to the model, what channels drove sales? This is the whole point of doing the analysis, because it's how relative budget decisions are made.\nFinally it makes sense to use your model coefficients to do a forecast. This can help the client picture what the future would look like if they reallocated budget: will they hit their goals?
Make a predicted vs actual chart, and explain your process.
Create a channel contribution chart, and explain how you did it.
Scenario: what would daily revenue look like if the client spent spend $10k per day on Facebook and $5k per day on TV?
Finished Model\nHi there,\nAs promised, attached are the slides for the presentation.\nTop level summary:\n- The model could explain 75% of the movements in revenue, and was only out by 5% on a daily average across the period\n- TV returned $0.40 cents for every dollar invested, but this doesn't capture long term effects on the brand (likely to be substantial)\n- Affiliates doesn't look to be driving incremental revenue. You should investigate their behavior further\n- Facebook looks to be driving a high return at $4.20 ROAS, but we should validate this with an incrementality test\n- There are improvements to be made with the model, for example the numbers for Facebook are likely inflated\n- According to this model, with $10,000 spent on Facebook and $5,000 on TV per day, we'd average $57k revenue\nPlease let me know if you have any questions – let's hop on a call to discuss.\nBest,
"