Join Ben Sullins for an in-depth discussion in this video, "What you should know," part of Hadoop for Data Science Tips, Tricks, & Techniques.
- [Instructor] To be successful in this course, you should have a basic understanding of the SQL database language. Some experience with Hadoop and Hive would also be helpful, and you should definitely be comfortable with the Linux terminal, that is, the command line. If you need to get up to speed, we have some courses for you. First, I would recommend SQL Essential Training, so you can get familiar with SQL. Then, I would dive into Analyzing Big Data in Hive. And if you aren't familiar with the command line, we have a good course for that: Learn the Linux Command Line: The Basics.
- Working with files
- Organizing files in HDFS
- Connecting to Hadoop
- Exploring Hive through Beeline
- Accessing Hive from Python
- Creating aggregates in Hive
- Selecting partitions in Hive
- Complex data structures in Hive
- Mapping data in Hive
- Creating flat tables for Impala
- Deconstructing Impala queries