Join Ben Sullins for an in-depth discussion in this video, How Hive works, part of Analyzing Big Data with Hive.
- [Narrator] Before we move on to using Hive, I think it's important to understand how Hive actually works, at least at a high level. So let's start here by understanding just how this modern data ecosystem typically goes. First, you have data providers that generate data. These are external events on things like social networks, business systems that handle all of your organization's processes, and internal apps that may be home-grown and designed with a very different storage and access method in mind. So from these systems, all the data flows into HDFS, that's the Hadoop Distributed File System.
And it literally is like placing files on a shared drive. It's really not much different, other than how Hadoop actually handles those files and distributes them across multiple different nodes to create that reliable platform. There are a number of ways to do this and literally dozens of platforms to make it happen. But in the end, your data typically lives in HDFS. The format can be anything from tab-delimited text files, comma-separated values, or compressed .zip files…
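To make the point about file formats concrete, here is a minimal Python sketch, not Hive itself and not code from the course. It assumes two hypothetical in-memory "files" holding the same records, one tab-delimited and one comma-separated, plus hypothetical column names (`user_id`, `name`, `signup_date`). It shows that the records are identical once parsed, and only the delimiter differs.

```python
import csv
import io

# Hypothetical sample data: the same two records in two delimited text formats.
TSV_TEXT = "1\talice\t2015-01-01\n2\tbob\t2015-01-02\n"
CSV_TEXT = "1,alice,2015-01-01\n2,bob,2015-01-02\n"

# Hypothetical column names, applied only when the text is read.
COLUMNS = ["user_id", "name", "signup_date"]


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by COLUMNS."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(COLUMNS, row)) for row in reader]


tsv_rows = read_delimited(TSV_TEXT, "\t")
csv_rows = read_delimited(CSV_TEXT, ",")

# Both formats yield the same records; only the on-disk delimiter differs.
print(tsv_rows == csv_rows)
```

The files themselves carry no column names or types in this sketch. The structure is supplied by whoever reads them, which is why plain delimited text dropped onto a shared location can still be treated as rows and columns later.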
This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. Using the newly configured option, he then demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way.
- Defining data structures in Hive
- Selecting data
- Joining tables
- Manipulating data
- Filtering results
- Aggregating data
- Using built-in aggregate functions
- Mastering built-in table-generating functions
- Using CUBE and ROLLUP
- Using clauses: WHERE and HAVING
- Using LIKE, JOIN, and SEMI JOIN
- Using functions: String, math, date, and conditional