- [Instructor] The R language has quite a lot of built-in graphics capabilities: not only the ability to export to different file formats, but also the ability to create different types of plots and graphs. A conditional density plot is one of those, so let's take a look at how we can create one using standard base R functionality. First of all we need some data, so let's grab the ChickWeight dataset, and let's say that, given a certain amount of time, I want to figure out how much a chick should weigh.
So a conditional density plot requires a factor; that's the first thing we need to create. I'm going to create a factor called ThreeWeights, and to build it I'm going to use the cut function that we talked about earlier to cut ChickWeight$weight into three buckets. I'm going to label those buckets as 34, 148, and 260; those are values I came across earlier while experimenting with how to properly label the resulting graph.
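The factor described here could be built roughly as follows. This is a minimal sketch, assuming cut is given a bucket count of 3 and the three labels as strings; the exact call typed in the video is not part of the transcript.

```r
# ChickWeight ships with base R in the datasets package
data(ChickWeight)

# Cut weight into three equal-width buckets, using the labels named in the narration
ThreeWeights <- cut(ChickWeight$weight, breaks = 3, labels = c("34", "148", "260"))

# Confirm the result is a factor with three levels, and see how many rows fall in each
is.factor(ThreeWeights)
levels(ThreeWeights)
table(ThreeWeights)
```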
So you can experiment around a little bit. So now I've got a vector called ThreeWeights that has three levels, and again, it's a factor, not a numeric vector. Now I'm ready to do a plot, so let's call cdplot, which draws a conditional density plot, and I'm gonna ask it to plot ChickWeight$Time against the ThreeWeights factor and produce a plot for that.
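As a sketch of the call being described, assuming the numeric Time column goes first and the factor second (which is the argument order of cdplot's default method):

```r
# Rebuild the three-level factor so this snippet stands on its own
ThreeWeights <- cut(ChickWeight$weight, breaks = 3, labels = c("34", "148", "260"))

# x must be numeric and y must be a factor
cdplot(ChickWeight$Time, ThreeWeights)
```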
And you can see over here in the lower right-hand corner, we now have a plot that plots the weights against time. The way to read these conditional density plots is a little bit confusing at first, but the height of each shaded band at a given point on the bottom axis is the probability of that weight bucket. So if you go across the bottom axis to, say, 20 days in, you can ask: what's the probability that a chick will weigh 148? And you can see that at the 20-day mark, there is approximately an 80% chance that the chick should weigh 148.
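If you would rather read the numbers than eyeball the bands, one possible approach is to use the value cdplot returns invisibly. The sketch below assumes that return value is a list of conditional density functions that are cumulative over the factor levels, as described in the base R documentation; the variable names dens, cum20, and prob20 are made up for illustration.

```r
# Rebuild the three-level factor so this snippet stands on its own
ThreeWeights <- cut(ChickWeight$weight, breaks = 3, labels = c("34", "148", "260"))

# plot = FALSE skips the drawing and only returns the fitted functions
dens <- cdplot(ChickWeight$Time, ThreeWeights, plot = FALSE)

# Evaluate each cumulative function at day 20
cum20 <- sapply(dens, function(f) f(20))

# Differences between successive cumulative values give one probability per bucket
prob20 <- diff(c(0, cum20, 1))
names(prob20) <- levels(ThreeWeights)
round(prob20, 2)
```

The three values correspond to the heights of the three bands above day 20 in the plot.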
Let's take a look at how we can make this graph a little bit more understandable. So I'm gonna close this particular graph out, and we can add things to it to make things clearer. So there is our cdplot call, which we originally created, and let's add some labels to that. The first thing that I'd like to do is add a main title with main, and we'll call it How much should a chick weigh? I'm going to label the Y axis with ylab as Probable weight, and I'm going to label the X axis with xlab as Days.
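The labeled version of the call might look like this. It is a sketch built from the title and axis labels given in the narration, using cdplot's main, ylab, and xlab arguments.

```r
# Rebuild the three-level factor so this snippet stands on its own
ThreeWeights <- cut(ChickWeight$weight, breaks = 3, labels = c("34", "148", "260"))

# Same plot as before, now with a title and axis labels
cdplot(ChickWeight$Time, ThreeWeights,
       main = "How much should a chick weigh?",
       ylab = "Probable weight",
       xlab = "Days")
```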
Now let's see what kind of a graph we get. So now you can see that the X and Y axes are labeled, and it's a little clearer what the numbers mean in this particular graph. cdplot provides an alternative way to describe the plot that I want to generate, and that's using formulas. So let's take a look at how that works. Here's cdplot again, and I'm going to say that I would like to plot factor(weight) against Time, and I'll use a tilde to signify against.
Now, the question you might have is, well, how does cdplot know where to get weight and where to get Time? The way you do that is you specify where the data comes from: data is equal to ChickWeight. So cdplot will now pull weight and Time from the ChickWeight dataset, and if I hit Return and run, you'll see that we get, well, something that looks similar to the previous one. But you can see that it has way over-plotted our particular graph, because factor(weight) turns every distinct weight into its own level.
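A sketch of the formula version described here, assuming the formula is written as factor(weight) ~ Time with the data argument pointing at ChickWeight:

```r
# How many distinct weights are there? Each one becomes a level of the factor.
nlevels(factor(ChickWeight$weight))

# Formula interface: the factor goes on the left of the tilde, the numeric variable on the right
cdplot(factor(weight) ~ Time, data = ChickWeight)
```

The level count printed first gives a sense of why the resulting plot is so crowded.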
So let's fix our cdplot line, and we can do that with cut. I'll go over here to the weight term and use cut again, with parentheses around weight. I need to say that I wanna cut weight into six buckets, and I'm going to label those buckets as one through six times 62, that is, 62, 124, and so on up to 372.
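One way the corrected line could be written is shown below. This is a sketch that assumes cut is placed directly on the left side of the formula; SixWeights is a hypothetical name used only to inspect the labels.

```r
# 1:6 * 62 evaluates to 62, 124, 186, 248, 310, 372 because ":" binds tighter than "*"
SixWeights <- cut(ChickWeight$weight, 6, labels = 1:6 * 62)
levels(SixWeights)

# Same bucketing applied inside the formula, so cdplot draws six bands
cdplot(cut(weight, 6, labels = 1:6 * 62) ~ Time, data = ChickWeight)
```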
The number 62 is a value that I found after some experimentation to decide how I wanted to label the graph. So now when I run cdplot, what I get is a much more understandable graph that has fewer buckets to put things into and gives me a really clear picture of the probability of a chick weighing a certain amount on a certain day. So cdplot is one of the many plotting functions available in R, and it's useful for showing how the probabilities of a set of categories change across a range of values.
The five minutes you spend each week will provide you with a building block you can use in the next two hours at work. Review language basics, discover methods to improve existing R code, explore new and interesting features, and learn about useful development tools and libraries that will make your time programming with R that much more productive.
All series code samples can be downloaded at https://github.com/mnr/five-minutes-of-R. Note: Because this is an ongoing series, viewers will not receive a certificate of completion.