From the course: R for Data Science: Lunch Break Lessons

by is like tapply

- [Instructor] You will often want to take a dataset, split it up by one of its variables, and then apply some function to each resulting piece. For that, you can use several commands. I'm going to compare by against lapply, split, and tapply. I prefer by because it's so much easier, so let's take a look at why. First, I'm going to create a variable called chicweightbytime, and to fill it I'm going to use by, which is a function. The data that I'm going to use is ChickWeight, which is a collection of chick weights recorded over time, and I'm going to select the weight column of that data. For the indices, I'm going to select ChickWeight again and split by Time, so this is what I'm going to split by. Then the function, which is FUN equals, is max, so what I'm going to do here is find the max weight for each time. I hit Command + Return and that function is run, and we can now take a look at chicweightbytime, so let's take a look at that…
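
The call described above can be sketched roughly as follows. This is a minimal reconstruction from the narration, not the instructor's exact script: it assumes the built-in ChickWeight dataset with its weight and Time columns, and the name via_tapply is a hypothetical one added here only for the comparison with tapply.

```r
# ChickWeight ships with R in the datasets package, so no import is needed.

# Split the weight column by Time and take the maximum of each group.
chicweightbytime <- by(data = ChickWeight$weight,
                       INDICES = ChickWeight$Time,
                       FUN = max)

# Printing shows one labelled result per value of Time.
print(chicweightbytime)

# For comparison (via_tapply is a hypothetical name, not from the course):
# tapply does the same grouping and returns a plain named array.
via_tapply <- tapply(ChickWeight$weight, ChickWeight$Time, max)
print(via_tapply)
```

Both calls group the same values. The main visible difference is that by returns an object of class "by" that prints each group under its own label, while tapply returns a named array.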
