Learn about the concept and importance of distributed systems.
- The term distributed systems refers to a set of independent computers connected through either a local area network or wide area network. These computers share resources, such as CPUs or storage devices. They provide an infrastructure for distributed processing. Computers in distributed systems don't share main memory, and their only way of communicating is through network connections.
Another characteristic of distributed systems is the fact that the time to exchange messages between computers is significantly longer when compared with the time between events occurring on CPUs. Since the communication overhead is high, it doesn't make any sense to use a distributed system when you can finish a job in a short time frame, like seconds. Let's say that you need to process a big dataset, and it takes several hours.
Now it's worth using a distributed system since the time spent on messaging is negligible when compared to the total time it takes to finish a job. In addition, you can handle bigger data sizes and reduce the overall processing time by dividing the work across multiple machines. That is, you may be able to finish a task in minutes instead of hours. Based on this definition, Hadoop Distributed File System, also known as HDFS, is a distributed system because it allows to seamlessly store and access large datasets across a network.
Distributed processing leverages a distributed system, such as HDFS, to complete a task or related tasks. Distributed processing requires sophisticated software, like MapReduce, to partition a relatively big job into smaller tasks and to eventually combine the results from the completed tasks.
- Enabling technologies in data science
- Cloud computing and virtualization
- Installing and working with Proxmox, Hadoop, Spark, and Weka
- Managing virtual machines on Proxmox
- Distributed processing with Spark
- Fundamental applications of machine learning
- Distributed systems and distributed processing
- How Hadoop, Spark, and Weka can work together