From the course: Data Science Foundations: Data Engineering
Environment setup
- [Instructor] Before we get going with actually staging, cleansing, and conforming our data, we need to set up our environment so we have a clean place to work. On my desktop here, I have the exercise files that I downloaded, as well as the VirtualBox VM Manager. This is the software we're going to be using. We'll be running Cloudera, with their sandbox 5.8. First, install the VirtualBox software and download the sandbox image from Cloudera. Then import the image: go to File, Import Appliance, look for the Cloudera file you downloaded, and start it from there. We'll need to be able to access the exercise files from within our virtual machine, so go into Settings, then Shared Folders, and add the folder where you downloaded those files. When you do, make sure that both Auto-mount and Make Permanent are selected. Then, within the VM, we should be able to access these exercise files. So here in the…
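For anyone who prefers a script to the menus, the import and shared-folder steps described above can also be sketched with VirtualBox's `VBoxManage` command-line tool. This is a minimal sketch, not the instructor's method: the appliance file name, VM name, share name, and host path are hypothetical placeholders, and the functions only build the commands rather than run them. Leaving out the `--transient` flag is what makes the share permanent.

```python
import shlex
from typing import List


def import_appliance_cmd(appliance_path: str) -> List[str]:
    """Build the command equivalent to File > Import Appliance."""
    return ["VBoxManage", "import", appliance_path]


def shared_folder_cmd(vm_name: str, share_name: str, host_path: str) -> List[str]:
    """Build the command equivalent to Settings > Shared Folders > Add.

    --automount matches the Auto-mount checkbox. The share is permanent
    because --transient is deliberately not passed.
    """
    return [
        "VBoxManage", "sharedfolder", "add", vm_name,
        "--name", share_name,
        "--hostpath", host_path,
        "--automount",
    ]


def render(cmd: List[str]) -> str:
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # All four values below are placeholders (assumptions); substitute your own.
    print(render(import_appliance_cmd("cloudera-sandbox.ova")))
    print(render(shared_folder_cmd("cloudera-sandbox", "exercise_files",
                                   "/path/to/Exercise Files")))
```

To actually execute a command, pass the list to `subprocess.run(cmd, check=True)` on a host where VirtualBox is installed. On a Linux guest with the Guest Additions installed, an auto-mounted share typically appears under `/media/sf_<share name>`.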