From the course: Analyzing Big Data with Hive

Simple aggregations

- [Narrator] Now let's take a look at simple aggregations in Hive. There are two categories here. On the left, we have SUM(), which gives us the total of the values in a column. We have MIN() and MAX(), which give us the lowest and the highest values, and AVG() and COUNT(), which do what they sound like they'll do. And then on the advanced side, we have the standard deviation functions, STDDEV_POP() and STDDEV_SAMP(), and the variance functions, VAR_POP() and VAR_SAMP(). I'm just going to focus on the simple ones, because even though you can do the advanced ones in Hive, when you're doing hardcore data analysis work today, a lot of those things are better handled outside of Hive itself, in a tool like R, Tableau, or even Python. So first, we're just going to go through and calculate some simple aggregations. Then we'll review the additional ones. So here in my Cloudera sandbox, I'm going to open up my scripts. And we're going to go for 4-1.sql. And I'm…
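
As a rough illustration of the five simple aggregates named above, here is a minimal sketch that runs them with Python's built-in sqlite3 module. SQLite stands in for Hive only so the example can run anywhere; the `sales` table, its columns, and its values are hypothetical and are not taken from the course's 4-1.sql script. The SELECT statement uses only SUM, MIN, MAX, AVG, and COUNT, which Hive also provides.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
# SQLite is an assumption here, used as a stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 20.0), ("west", None)],
)

# The five simple aggregates in one pass over the table.
# COUNT(*) counts every row; COUNT(amount) skips NULLs,
# and SUM, MIN, MAX, and AVG ignore NULLs as well.
query = """
    SELECT SUM(amount), MIN(amount), MAX(amount), AVG(amount),
           COUNT(amount), COUNT(*)
    FROM sales
"""
total, lowest, highest, average, non_null, rows = conn.execute(query).fetchone()
print(total, lowest, highest, average, non_null, rows)
```

Note the difference between COUNT(amount) and COUNT(*): the row with a missing amount is left out of the first but included in the second, which also means AVG() divides by the number of non-NULL values rather than the number of rows.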
