From the course: Analyzing Big Data with Hive
Handling CSV files in Hive - Hive Tutorial
Handling CSV files in Hive
- [Instructor] Now let's take a look at handling CSV files in Hive. CSV files have a quirk: if a value in the file needs to include a comma, say the name of a company, the file puts quotes around that value. This poses a challenge when we're working with the data in Hive. In fact, we have to use a custom engine known as a SerDe, short for serializer/deserializer, just to handle those files properly. So what I'm going to do is open up the HUE Metastore Manager and upload a file that has these quoted strings. Then I'll show you how to use the custom SerDe, and we'll apply it to our table settings and see what it does. So here in HUE, the first thing I want to do is open the Metastore Manager and create a new table out of a file. I'm going to call this sales_withcomma. I'll click on the ellipsis and upload the file. Under Exercise Files and Data, we have the one that ends with WithCommas.csv, and we'll…
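To make the quoting problem concrete, here is a minimal sketch in Python, outside of Hive. It is only an illustration: the sample row and its values are hypothetical and are not taken from the course's exercise file. It compares splitting a line on every comma, which is roughly what plain comma-delimited parsing does, with quote-aware parsing, which is the job the SerDe takes on.

```python
import csv
import io

# Hypothetical sample row (not from the exercise file): the company
# name contains a comma, so the CSV writer wrapped it in quotes.
line = '1001,"Acme, Inc.",250.00'

# Naive approach: split on every comma and ignore the quotes.
# The company name is broken into two fields.
naive_fields = line.split(",")

# Quote-aware approach: the csv module treats the quoted value as a
# single field and strips the surrounding quotes.
quoted_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)
print(quoted_fields)
```

The naive split yields four fields for a three-column row, so every column after the company name shifts to the right. The quote-aware parse returns three fields with the company name intact.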