From the course: Big Data Analytics with Hadoop and Apache Spark

Unlock the full course today

Join today to access over 22,600 courses taught by industry experts or purchase this course individually.

Apache Hadoop overview

Apache Hadoop overview

From the course: Big Data Analytics with Hadoop and Apache Spark

Start my 1-month free trial

Apache Hadoop overview

- [Instructor] In this video, I will review the key features and the current state of technology for Apache Hadoop. Hadoop is an open-source technology that started the big data wave. It provides distributed data storage and computing using low-cost hardware. It can scale to petabytes of data and can run on clusters with hundreds of nodes. Hadoop mainly consists of two components, the Hadoop Distributed File System, or HDFS, that provides data storage. MapReduce is a programming model and implementation that provides distributed computing capabilities with data stored in HDFS. Where does Hadoop stand today? Let's look at HDFS and MapReduce separately. HDFS is still a very good option for cheap storage of large quantities of data. It provides scaling, security, and cost benefits that help in its adoption. It is most suitable for enterprises with in-house data centers who want to host the data within their network.…

Contents