Get up to speed with Hadoop. Learn tips and tricks for doing data science work in this popular big data platform.
- [Instructor] Hadoop is becoming the standard for many companies looking to warehouse their data and then analyze it. There are so many components and different parts of the ecosystem that you can easily become a specialist in just one area of Hadoop. Over the years, I've learned some of the most common tips and tricks to help you get going in Hadoop, and that's what we're going to take a look at in this course. Hi, I'm Ben Sullins, and I'm going to start in this course by walking you through some of the basic file management techniques in Hadoop.
Then, we'll take a look at how to access and analyze that data from Hive, the Hadoop SQL engine, and lastly, we'll dive into some of the techniques for running fast queries inside of that Hive engine. We'll be covering all of these topics and more to get you up to speed with Apache Hadoop. Let's dive in.
- Explain which commands are used to make changes in HDFS.
- Identify the commands used to upload data from the command line to HDFS.
- Recognize two operations HDFS performs when a user moves files.
- Summarize how to remove files recursively in HDFS.
- Recall how to select and implement partitions.
- Explain how to flatten a Struct data type in HiveQL.
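The objectives above can be sketched with a few commands. This is a minimal, illustrative walkthrough, assuming a running Hadoop cluster with the `hdfs` and `hive` clients on the PATH; the paths, table name `customers`, struct column `address`, and partition column `sale_date` are hypothetical examples, not from the course itself.

```shell
# Assumption: a working HDFS cluster; all paths below are illustrative.

# Create directories and upload a local file to HDFS
hdfs dfs -mkdir -p /user/demo/raw /user/demo/staged
hdfs dfs -put sales.csv /user/demo/raw/

# Moving a file within one cluster updates namespace metadata only;
# the underlying blocks are not copied
hdfs dfs -mv /user/demo/raw/sales.csv /user/demo/staged/sales.csv

# Remove a directory and its contents recursively
hdfs dfs -rm -r /user/demo/raw

# In HiveQL, pruning by a partition column limits the files scanned,
# and dot notation flattens a Struct column into its fields
# (table and column names here are assumptions for illustration)
hive -e "SELECT address.city, address.zip
         FROM customers
         WHERE sale_date = '2024-01-01';"
```

Restricting the `WHERE` clause to the partition column is what makes the query fast: Hive reads only the matching partition's directory instead of the whole table.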