By the end of this video, the student will know how to group results using aggregate functions and the GROUP BY clause.
- [Narrator] Now let's take a look at simple aggregations in Hive. There are two categories here. On the left, we have SUM(), which gives us a total, a summation of the values in a column. We have MIN() and MAX(), which give us the lowest and the highest values, and AVG() and COUNT(), which do what they sound like they'll do. And then on the advanced side, we have STDEV() and VAR(), to get the standard deviation and the variance of something. I'm just going to focus on the simple ones because, while you can do the advanced ones in Hive, when you're doing hardcore data analysis work today, a lot of those things are often better handled outside of Hive itself, in a tool like R or Tableau, or in any of the other data analysis tools, even Python.
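As a rough illustration of the simple aggregates the narrator lists, the sketch below runs SUM(), MIN(), MAX(), AVG(), and COUNT() with a GROUP BY clause. It is not the course's 4-1.sql script. The `sales` table, its `region` and `amount` columns, and the sample rows are all hypothetical. Python's built-in `sqlite3` module stands in for Hive here only so the example can run anywhere. The query uses plain SQL that should also be accepted by HiveQL, but that has not been checked against a Hive installation.

```python
import sqlite3

# Hypothetical sample data: the table and column names are illustrative
# assumptions, not taken from the course's scripts.
rows = [
    ("east", 10.0),
    ("east", 30.0),
    ("west", 5.0),
    ("west", 15.0),
    ("west", 40.0),
]

# In-memory SQLite database used as a stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# One output row per region; each aggregate is computed within its group.
QUERY = """
SELECT region,
       SUM(amount) AS total_amount,
       MIN(amount) AS min_amount,
       MAX(amount) AS max_amount,
       AVG(amount) AS avg_amount,
       COUNT(*)    AS num_rows
FROM sales
GROUP BY region
ORDER BY region
"""

results = conn.execute(QUERY).fetchall()
for row in results:
    print(row)
```

Every column in the SELECT list is either the grouping column (`region`) or wrapped in an aggregate function, which is the usual rule when GROUP BY is present.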
So first, we're just going to go through and calculate some simple aggregations. Then we'll review the additional ones. So here in my Cloudera sandbox, I'm going to open up my scripts, and we're going to go for 4-1.sql. And I'm going to copy this over to our Hive editor,…
This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. Using the newly configured option, he then demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way.
- Defining data structures in Hive
- Selecting data
- Joining tables
- Manipulating data
- Filtering results
- Aggregating data
- Using built-in aggregate functions
- Mastering built-in table-generating functions
- Using CUBE and ROLLUP
- Using clauses: WHERE and HAVING
- Using LIKE, JOIN, and SEMI JOIN
- Using functions: String, math, date, and conditional