This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. Using the newly configured option, he then demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way.
- Defining data structures in Hive
- Selecting data
- Joining tables
- Manipulating data
- Filtering results
- Aggregating data
- Using built-in aggregate functions
- Mastering built-in table-generating functions
- Using CUBE and ROLLUP
- Using clauses: WHERE and HAVING
- Using LIKE, JOIN, and SEMI JOIN
- Using functions: String, math, date, and conditional
Skill Level: Intermediate
- From the early days of Big Data, it has been a challenge to find ways that allow many different types of people and professions to work with the data. That was until Facebook invented Hive, which is a SQL-like language that actually processes and analyzes data in Hadoop. Hi, I'm Ben Sullins, and I've been a Data Geek since the late '90s, focused on helping organizations get the most out of their data. In this course, we're going to take a look at how to use Hive to actually get the most out of your Hadoop data. I'll start by showing you how to structure your data in Hive and optimize it for fast query performance.
Then, we'll take a look at using the Hadoop user interface, which is called Hue, and we'll use that to analyze our data with the HiveQL language. We'll finish by walking through the built-in analytical functions that you need to answer those tough data science questions. We'll be covering all of these topics to get you up to speed with Hive and help you start analyzing your Big Data in Hadoop. Let's dive in.