Join Ben Sullins for an in-depth discussion in this video, What you should know before watching this course, part of Data Science Foundations: Data Engineering.
- [Instructor] For this course, you should have a basic understanding of working with data in Hadoop, as well as a good understanding of SQL. If you want to brush up on Hadoop beforehand, check out Lynn Langit's course, Hadoop Fundamentals. If you need help with SQL, you can also check out my SQL Tips and Tricks for Data Science course. If you're still new to Hadoop but are familiar with databases and SQL, you should feel comfortable with the pacing of this course.
- Working with systems and schemas
- Managing a good data pipeline
- Setting up an environment
- Loading and profiling data
- Testing quality
- Adding data types
- Handling missing values and inferred members
- Performing master data lookups
- Loading schemas and tables
- Creating views