From the course: Data Science Tools of the Trade: First Steps

Hadoop: Preparation

- [Instructor] The first step in installing Hadoop is to ensure that you're using the right operating system. What you need is a Linux distribution, and we're using Ubuntu here. Create a new Ubuntu VM on Proxmox with at least eight gigabytes of disk storage. Use as much memory as you can afford. One gigabyte may be fine, but I'm using eight gigabytes, as you can see here. We also need to prepare the Ubuntu operating system so that we have all the underlying software Hadoop depends on: rsync and the JDK. Rsync is a utility program used to transfer and synchronize files among computers connected through a network. JDK stands for Java Development Kit. Hadoop is mostly written in Java, which is why we need the JDK for it to run. Type sudo apt-get install rsync and press Enter. Looks like rsync is already installed. Next, type java -version to see if Java is already installed. Looks like Java is not installed yet. Now, type sudo apt-get install openjdk-8-jdk to install the JDK. Press…
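
The prerequisite checks described above could be scripted roughly as follows. This is a minimal sketch and not part of the course: the helper names `needs_install` and `install_hint` are hypothetical, and the script only prints the `apt-get` command named in the transcript for a missing tool rather than running it.

```shell
#!/bin/sh
# Sketch: check for the two Hadoop prerequisites named in the transcript
# (rsync and a JDK) and print the install command for whichever is missing.
# Nothing is installed automatically.

# needs_install CMD: succeeds (exit status 0) when CMD is NOT found on PATH.
# (Hypothetical helper, introduced for illustration.)
needs_install() {
    ! command -v "$1" >/dev/null 2>&1
}

# install_hint CMD PKG: prints the apt-get command for PKG if CMD is missing,
# otherwise reports that CMD is already installed.
# (Hypothetical helper, introduced for illustration.)
install_hint() {
    if needs_install "$1"; then
        echo "sudo apt-get install $2"
    else
        echo "$1 is already installed"
    fi
}

install_hint rsync rsync
install_hint java openjdk-8-jdk
```

On the VM shown in the video, where rsync is present and Java is not, the two calls would report rsync as installed and suggest the openjdk-8-jdk install command. Note that finding `java` on the PATH only indicates that some Java runtime is present; it does not confirm the version, so running java -version by hand is still worthwhile.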
