At the end of this video the learner will have created a partitioned managed table from the Bash shell.
- [Voiceover] Now let's talk about a different way that we can set up our tables, known as partitioning. A partitioned table in Hive is a defined structure that separates a typically large table into smaller subsets, usually based on some conditional logic. A common way to do this is by year: say our data set consists of sales over many years, and we put each year of that data into a separate partition. This helps with performance, especially when we're dealing with large data. If we're only looking for one year's sales, probably the most recent year or the most recent two years, I'm going to have a much smaller data set to actually scan when I want to analyse that data.
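To make the performance argument concrete, here is a minimal Python sketch of the idea, not of Hive itself. It groups some made-up sales rows under Hive-style partition keys (Hive stores each partition as a subdirectory named like `year=2016`) and then answers a one-year question by reading only the matching group. The rows, the function names, and the `scanned` counter are all assumptions introduced for illustration.

```python
from collections import defaultdict

# Hypothetical sales rows as (year, amount); the values are invented for this sketch.
rows = [(2014, 120.0), (2014, 80.0), (2015, 200.0), (2016, 50.0), (2016, 75.0)]


def partition_by_year(rows):
    """Group rows under a Hive-style partition key such as 'year=2015'."""
    partitions = defaultdict(list)
    for year, amount in rows:
        partitions[f"year={year}"].append(amount)
    return dict(partitions)


def total_sales(partitions, years):
    """Sum sales for the requested years, touching only those partitions.

    Returns the total and the number of rows that had to be read.
    """
    scanned = 0
    total = 0.0
    for year in years:
        amounts = partitions.get(f"year={year}", [])
        scanned += len(amounts)
        total += sum(amounts)
    return total, scanned


partitions = partition_by_year(rows)
total, scanned = total_sales(partitions, [2016])
print(total, scanned)
```

Asking for 2016 alone reads two of the five rows; the 2014 and 2015 groups are never touched, which is the saving a filter on the partition column is meant to give on a large table.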
In our demo, we're going to create a table with a script, so we're actually going to type it out and run the script instead of using the user interface as we have been doing before. Then we'll upload some files, the different partitioned files, we'll add those partitions to our table, and then we'll go take a look…
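The demo's actual script is not shown in this excerpt, so the following is only a sketch of what such a script could look like. It is a small Python helper that renders HiveQL text: one `CREATE TABLE ... PARTITIONED BY` statement, then one `ALTER TABLE ... ADD PARTITION` statement per year. The table name `sales`, the columns `order_id` and `amount`, the comma-delimited format, and the `/user/demo/sales` directory are hypothetical placeholders, not values from the course.

```python
def render_partition_script(table, years, base_dir):
    """Render HiveQL that creates a year-partitioned table and registers partitions.

    All names passed in are illustrative; the column list is a made-up example.
    """
    create = (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "  order_id INT,\n"
        "  amount DOUBLE\n"
        ")\n"
        "PARTITIONED BY (year INT)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';"
    )
    # One ADD PARTITION per year, pointing at the directory holding that year's files.
    adds = [
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION (year={year}) "
        f"LOCATION '{base_dir}/year={year}';"
        for year in years
    ]
    return "\n".join([create, *adds]) + "\n"


script = render_partition_script("sales", [2015, 2016], "/user/demo/sales")
print(script)
```

Saved to a file, text like this is typically run from the shell with `hive -f <file>`, after the per-year data files have been uploaded to the matching directories. Note that the partition column `year` appears only in `PARTITIONED BY`, not in the regular column list.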
This course shows how to use Hive to process data. Instructor Ben Sullins starts by showing you how to structure and optimize your data. Next, he explains how to get Hue, the Hadoop user interface, to leverage HiveQL when analyzing data. Using the newly configured option, he then demonstrates how to load data, create aggregate tables for fast query access, and run advanced analytics. He also takes you through managing tables and putting functions to use. This course is designed to help you find new ways to work with datasets so you can answer the tough data science questions that come your way.
- Defining data structures in Hive
- Selecting data
- Joining tables
- Manipulating data
- Filtering results
- Aggregating data
- Using built-in aggregate functions
- Mastering built-in table-generating functions
- Using CUBE and ROLLUP
- Using clauses: WHERE and HAVING
- Using LIKE, JOIN, and SEMI JOIN
- Using functions: String, math, date, and conditional