Learn how to perform regression analysis using R and how to interpret the results.
- [Instructor] So let's get started with our regression analysis in R. I've already got the application opened, so RStudio is here on our desktop. I'm going to navigate over to our exercise files and open up zero three zero two, and I'm going to select regression.R. At the outset of any analysis, we need to connect to our data. We've already done that here, but let's select that line and click Run so we bring that data into RStudio.
Let's go ahead and take a look at our data. I'm just going to double-click on my regression analysis in the Environment pane, on the right side of your screen. This helps us see what we have to work with in terms of our case study. So it looks like we have some data related to date and geography, plus out of home, print and broadcast, which look like marketing channels. And then we have net sales and, it looks like, a few other indicators here.
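If you'd rather inspect the data from the console than from the Environment pane, a minimal sketch follows. The exercise file's real variable and column names are not shown in this transcript, so `myRegressionData`, its columns, and every value below are hypothetical stand-ins.

```r
# Hypothetical stand-in for the course data; names and values are assumptions.
myRegressionData <- data.frame(
  Date      = as.Date("2020-01-06") + 7 * 0:5,
  Geography = c("East", "West", "East", "West", "East", "West"),
  Broadcast = c(2, 4, 5, 7, 9, 10),
  NetSales  = c(158000, 181000, 195000, 217000, 243000, 254000)
)

str(myRegressionData)               # column names and types
head(myRegressionData, 3)           # first three rows
summary(myRegressionData$NetSales)  # range and quartiles of the sales column
```

`str()` gives roughly the same overview as the Environment pane, and `head()` shows the same rows you'd see at the top of the data viewer.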
For our particular case study, I think it makes sense to analyze the broadcast column and try to determine the correlation between broadcast advertising and sales for the organization. So I'm going to go ahead and close out of this pane by clicking on the X in the tab above it. And now we're going to plot that data, broadcast and sales. So I'm going to type plot and then I'm simply going to use my variable name here. I just copied and pasted that in.
We want to connect specifically to the broadcast column in that data. If you recall, the way we pull a single column out of our data is to follow the variable name directly with a dollar sign. RStudio is nice enough to show us our column names here, so I'm just going to scroll down and hit Enter on broadcast. Next we're going to look at sales. I recall from looking at the data that the column name was something like net sales, so again we input the variable name for our data frame, enter that dollar sign, and yup, there it is, net sales.
'Kay, let's go ahead and run that. And that gives us a scatter plot. So in essence what we're looking at here is the broadcast dimension running across the X-axis, and on the Y-axis you can see our net sales. So we can begin to make some inferences from this visualization alone. This is the type of visualization that indicates there's some pattern here, one that a line might fit well.
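The plotting step described above can be sketched as follows. The data frame name, column names, and numbers are assumptions made up for illustration, not the course data.

```r
# Hypothetical data: broadcast units and net sales (assumed names and values).
myRegressionData <- data.frame(
  Broadcast = c(1, 2, 3, 4, 5, 6, 7, 8),
  NetSales  = c(146000, 156500, 170200, 181000, 194800, 205100, 219000, 229500)
)

# First argument goes on the X-axis, second on the Y-axis.
plot(myRegressionData$Broadcast, myRegressionData$NetSales,
     xlab = "Broadcast units", ylab = "Net sales")

# A numeric check on what the scatter plot suggests visually.
r <- cor(myRegressionData$Broadcast, myRegressionData$NetSales)
print(r)
```

The `cor()` call is an addition to what the video shows. It puts a number on the pattern you see in the plot before any line is fitted.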
Let's walk forward and see what we find there. So now we want to fit a line by running the lm function. The way we do that is we create a variable again, and I'm just going to call that myLm. I'm going to call the lm function here, and inside it goes my regression data, dollar sign, net sales. Then I'm going to type a tilde (~), followed by my regression data, dollar sign, and the broadcast column.
Okay, so what this is going to do is fit a line to that data, so we can see how closely these data points follow that line. So run that. That calculates the line, but we still have to visualize it. You'll remember that in our earlier video about regression analysis, we discussed independent and dependent variables, or the X and the Y data.
So really that's what we're plotting. In the previous step, we calculated the line with that lm function, and now we're going to visualize it. So we're going to call lines with my regression data and broadcast, and then we're going to feed it the fitted values from myLm, which we just created.
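Written out, the fitting step looks something like the sketch below. The response goes to the left of the tilde and the predictor to the right. `myLm` is the name used in the video; the data frame name, column names, and values are assumptions.

```r
# Hypothetical data (assumed names and values).
myRegressionData <- data.frame(
  Broadcast = c(1, 2, 3, 4, 5, 6, 7, 8),
  NetSales  = c(146000, 156500, 170200, 181000, 194800, 205100, 219000, 229500)
)

# Dependent variable ~ independent variable.
myLm <- lm(myRegressionData$NetSales ~ myRegressionData$Broadcast)

# Printing the model shows the call plus the intercept and slope.
print(myLm)
```

An equivalent and often more convenient form is `lm(NetSales ~ Broadcast, data = myRegressionData)`, which keeps the coefficient names short and works with `predict()` on new data.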
Now that we have this written, we can go ahead and hit Run as well, and we can see that the line has drawn itself in. What I want to point out is that this is a line of best fit. It's not going to hit every point. But in essence it shows us A, which way the data is trending, and B, the relationship between our dependent variable, net sales, and our independent variable, broadcast. Well, how do we quantify this relationship? We do that with coefficients.
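Drawing the fitted line over the scatter plot can be sketched like this. As before, the data frame, column names, and values are hypothetical, and `fitted()` is used here as the standard accessor for the model's fitted values.

```r
# Hypothetical data (assumed names and values).
myRegressionData <- data.frame(
  Broadcast = c(1, 2, 3, 4, 5, 6, 7, 8),
  NetSales  = c(146000, 156500, 170200, 181000, 194800, 205100, 219000, 229500)
)
myLm <- lm(myRegressionData$NetSales ~ myRegressionData$Broadcast)

# Draw the scatter plot first; lines() adds to an existing plot.
plot(myRegressionData$Broadcast, myRegressionData$NetSales,
     xlab = "Broadcast units", ylab = "Net sales")

# X values are the observed broadcast units, Y values are the model's
# fitted net sales at those same points.
lines(myRegressionData$Broadcast, fitted(myLm))
```

For a single-predictor model, `abline(myLm)` is a common alternative that draws the same line across the full width of the plot.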
What a coefficient is, and how and why it matters, will make a little more sense in a moment. For now, just know that a coefficient is a multiplicative value, a multiplier. It may be a concept you recall from algebra class: it's a number used to multiply a variable. For example, 4Z means four times Z, where Z is the variable and four is the coefficient. So I'm going to generate the coefficients of our fitted line, and this is how I do that. I'm going to type the command myLm$coeff.
And run that. That generates a couple of numbers for us down here in our console. First there's the intercept. This is the model's estimate of net sales without any broadcast advertising at all: $133,108.78. The second number is our slope, which is the coefficient for broadcast media. For each one-unit increase in broadcast, there's an estimated increase of $12,141.94 in net sales.
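To make the intercept and slope concrete, here is a small sketch that pulls both out and uses them to estimate sales at a chosen broadcast level. The data is hypothetical, so the numbers it prints will not match the figures quoted in the video, and the choice of 10 units is arbitrary.

```r
# Hypothetical data (assumed names and values).
myRegressionData <- data.frame(
  Broadcast = c(1, 2, 3, 4, 5, 6, 7, 8),
  NetSales  = c(146000, 156500, 170200, 181000, 194800, 205100, 219000, 229500)
)
myLm <- lm(NetSales ~ Broadcast, data = myRegressionData)

coefs     <- coef(myLm)             # same values as myLm$coeff
intercept <- coefs[["(Intercept)"]] # estimated sales at zero broadcast
slope     <- coefs[["Broadcast"]]   # estimated sales gained per extra unit

# Estimated net sales at 10 broadcast units: intercept + slope * units.
units      <- 10
byHand     <- intercept + slope * units
byPredict  <- predict(myLm, newdata = data.frame(Broadcast = units))[[1]]

cat("Intercept:", round(intercept, 2), "\n")
cat("Slope:    ", round(slope, 2), "\n")
cat("Estimate at", units, "units:", round(byHand, 2), "\n")
```

`byHand` and `byPredict` should agree, which is a quick way to confirm you are reading the coefficients the right way round.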
Now your broadcast unit could be a GRP, it could be an impression or some other measure. It just depends on how you gathered that data in the first place, but the point remains: there is a positive correlation between our broadcast media and net sales. So with this information you can determine the cost of each broadcast unit, establish your ROI, and decide whether this is a good investment. We can run this sort of analysis on each of our channels and each of our campaigns, so we can compare them across the board.
Let's go ahead and shut down R and move on to Python to look at a similar analysis.
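One way to run the same analysis per channel and turn the slopes into a rough ROI is sketched below. Everything here is an assumption for illustration: the column names, the data values, the `costPerUnit` figures, and the ROI definition of (incremental sales per unit minus cost per unit) divided by cost per unit.

```r
# Hypothetical data for three channels (assumed names and values).
myRegressionData <- data.frame(
  OutOfHome = c(3, 1, 4, 1, 5, 9, 2, 6),
  Print     = c(2, 7, 1, 8, 2, 8, 1, 8),
  Broadcast = c(1, 2, 3, 4, 5, 6, 7, 8),
  NetSales  = c(146000, 156500, 170200, 181000, 194800, 205100, 219000, 229500)
)
channels <- c("OutOfHome", "Print", "Broadcast")

# Fit one simple regression per channel and keep only the slope.
slopes <- sapply(channels, function(ch) {
  fit <- lm(reformulate(ch, response = "NetSales"), data = myRegressionData)
  coef(fit)[[ch]]
})

# Hypothetical cost of one unit of each channel, in dollars.
costPerUnit <- c(OutOfHome = 3000, Print = 2500, Broadcast = 8000)

# Assumed ROI definition: net return per unit relative to its cost.
roi <- (slopes - costPerUnit[channels]) / costPerUnit[channels]
print(round(roi, 2))
```

Fitting each channel separately mirrors the single-channel approach in the video, but it ignores overlap between channels, so treat the comparison as a first pass rather than a final attribution.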
In this course, discover how to gain valuable insights from large data sets using specific languages and tools. Follow Chris DallaVilla as he walks through how to use R, Python, and Tableau to perform data modeling and assess performance. As Chris dives into these concepts, he shares specific case studies that come directly from his own work with clients. Plus, he shares three essential—and practical—best practices for data-driven marketing that you can use to bolster your organization's marketing performance.
- Installing R, Python, and Tableau
- Navigating the UI for R, Python, and Tableau
- Using R, Python, and Tableau
- Exploratory analysis
- Performing regression analysis
- Performing a cluster analysis
- Performing a conjoint assessment
- Stakeholder alignment