" Hoping for a heatwave
You've just landed a new client – a rapidly growing fast-casual hot dog restaurant chain in NYC. They want to measure the impact of their marketing campaigns, but weather plays a big role in driving sales. We need to account for that – but how?
The Original Dog Shed Restaurant, Tompkins Square Park – Mid Afternoon
The owner of the Dog Shed restaurant chain, Danielle Oscar, has invited you to her flagship restaurant to talk marketing. She wants to know if their new campaign is actually working, or if a recent heat wave deserves the credit.
Thanks for coming\nI like to take most of my meetings here, back where we started.\nIt reminds me of a time when I didn't have to have to take meetings.\nAnyway, you're here, so you can help me with my problem.\nWe hired this expensive new marketing consultant who tells me what a great job he's doing... but I don't know.\nSince I can remember we always sold more hot dogs when it gets hotter\nPeople like to come out to the park, and a hot dog's got the advantage of being particularly mobile\nI wanna know if I'm selling more dogs from the heat, or if this guy is actually doing as well as he says.
Back at the Vexnomics office – Late Afternoon
I heard it went well with Dani?\nIt sounds like a standard modeling project but we'll need weather data.\nI found that NASA has pretty decent data that doesn't cost anything\nLet me find the link\nhttps://power.larc.nasa.gov/data-access-viewer/
We spend a lot of time thinking about what actions in marketing drove sales, but often external factors play a bigger role!\nOne great example is Weather Data – many product categories see temperature affecting sales. For example are you generating more sales on a hot day? Or does a cold day mean more people stay indoors and shop online for your product?\nWithout including this you might be giving too much credit to marketing and that means making mistakes when allocating budgets. Getting weather data can be a little messy – there are lots of options to download this data (few are free), and the methods can be a little antiquated and repetitive, or require manual cleaning or in some cases programming skill to pull data from an API.
It is best to read the weather forecast before praying for rain
Just caught up with Dani - she said she'll send the data tomorrow\nWhile we wait can you make a start on gathering the weather data, just temperature and precipitation?\nThen we can join it together and work on the model
Leave it cleaner than you found it
The client knows she sells more hot dogs in the summer, but what's driving that behavior, and by how much? Before we can figure this out, we need to clean the data and get it in one place.
Vexnomics Office – The Next Day
You client sends across their data - it's your job to download relevant weather data and clean it, before adding it to the existing client data, ready for analysis. Let's get to work!
Your task is to download weather data from NASA, then clean the data before joining it to the existing client dataset ready. We need to download temperature and precipitation, which are labeled as 'Temperature at 2 Meters' and 'Precipitation Corrected' when you export the data from the POWER viewer. You need daily data for 2018, 2019, 2020, and up to June 2021, which you need to roll up into weekly data in the format 'YYYY-WW' using a pivot table. You also need to convert the temperature into Fahrenheit from Celsius.
If you want to download weather data for free, the best method we've found is as follows: visit the NASA Power Viewer (link below), choose a point on the map, select a start date and end date, select the CSV file format, then under '6. Select Parameters' choose 'Temperatures' and 'Humidity/Precipitation' then submit to download the CSV.
What columns did you keep?
What's -10.02 in Celsius converted to Farenheit
Marketing Data\nHey all,\nHere's the data you requested - let me know when you've done your thing.\nThanks\n
Join this data together so the Week columns line up. Note that the date range is shorter than your weather data. Make sure your date format is YYYY-WW. Share your file and explain your process.
Hey thanks for this\nI also added daylight hours - important to know if people buy hot dogs because it's sunny, or because the day is longer.\nI'll be in the office tomorrow we can build the model then
Correlation isn't causation but it helps
Now we've got the data, let's figure out what drives hot dog demand. Seeing what's correlated with sales is key to understanding what drives it.
Vexnomics Office – One Day Later
Now you have the data, we need to see if there are any correlations that stand out. The main thing we want to know is if there's a strong correlation between weather and hot dogs sold.
You may have heard the phrase \"correlation isn't causation\", and that's true! For example ice cream sales and shark attacks happen at the same time, but that doesn't mean one causes the other! They're both caused by a third variable, the increase in sunshine.\nIn addition, causation doesn't mean correlation either: take famous economist Milton Friedman's example of a thermostat. It turns on the heat when the outside temperature dips, and we know that the heater acts to increase inside temperature, but there would be no correlation between the activity of the heater and internal temperate, which is kept constant by the thermostat!\nWith all that in mind, doing exploratory data analysis is important for any business to understand what's driving sales – knowing what's correlated (and what isn't) is valuable information that may contain a few surprises. Unexpected information is the perfect place to look to form new hypothesis to test.
Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'
In Excel and Google Sheets, you can find the correlation of two variables by using the CORREL function. Input the data you care most about first (sales), then the variable you think is correlated second (temperature), separated by a comma.\nIt can also be helpful to create a line chart to show visually if each variable is moving together. Put one variable on the left Y axis and one on the right Y axis to see them both on similar scales.\nFinally, it can be useful to create a scatterplot chart and add a trendline with an equation, to get the R squared of the relationship, a measure of how much changes in one variable explains the other.
Create a line chart of temperature and hot dogs sold, with temperature on the right axis (sales on the left). Do they look correlated?
Correlation strength can be described as follows:\n1.0 = Perfect Positive Correlation\n0.7 = Strong Positive Correlation\n0.5 = Moderate Positive Correlation\n0.3 = Weak Positive Correlation\n0 = No Correlation\n-0.3 = Weak Negative Correlation\n-0.5 = Moderate Negative Correlation\n-0.7 = Strong Negative Correlation\n-1.0 = Perfect Negative Correlation
Calculate the correlation between temperature and hot dog sales. How would you describe the strength of the correlation?
To see the correlations of all variables at once, you can create a correlation matrix. Create a grid with every variable name on the left as rows, and also on the top as column. Use the CORREL function to find the correlation betwen each pair in the grid. You actually only need to find half (on the diagonal) because you'd otherwise just be repeating yourself. More info in the article below.
Build a correlation matrix. What variable is most strongly correlated with sales?
Take a look at the weather variables correlation - which one has the strongest correlation with sales?
Create a scatterplot chart with store locations on the x axis and hot dogs sold on the y axis. Add a trendline with an equation to the chart. How much of the changes in hot dogs sold can be explained by the number of store locations?
Assuming the client is right about high temperature increasing sales, explain why temperature might be negatively correlated with sales.
Took a look at the data you sent\nYou're right, store locations is dominating any other effects\nTry looking at the correltion since the 5th location launched?
Calculate the correlation between temperature and hot dog sales after the launch of the 5th store. Also calculate the correlation between TikTok impressions and sales. What can you say from this data?
Thanks so much for all the insight you've given me\nIt's amazing what you can see in the data - I'm gonna have to open another store!\nI woulda axed that marketing guy for being slimy if it wasn't for your analyis\nLooks like I'm stuck with him afterall\nI thought there's no way the kids watching TikToks was selling dogs, but I guess we just learned a new trick"