Join Ben Sullins for an in-depth discussion in this video, "What you should know before watching this course," part of Analyzing Big Data with Hive.
- [Instructor] For this course, you should have a basic understanding of what databases are and of working with flat file data sources, such as comma-separated values. Any experience with Hadoop would be helpful. If you want to brush up on that beforehand, I recommend checking out Lynn Langit's course, Hadoop Fundamentals. This is a beginner course, so it's not necessary to have any previous knowledge of Hadoop, Hive, or even databases for that matter. I'll walk you through all the concepts and terminology as we go through this course.
This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. With that setup in place, he demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way.
- Defining data structures in Hive
- Selecting data
- Joining tables
- Manipulating data
- Filtering results
- Aggregating data
- Using built-in aggregate functions
- Mastering built-in table-generating functions
- Using CUBE and ROLLUP
- Using clauses: WHERE and HAVING
- Using LIKE, JOIN, and SEMI JOIN
- Using functions: String, math, date, and conditional