From the course: Data Science Tools of the Trade: First Steps

Distributed systems and distributed processing

- The term distributed system refers to a set of independent computers connected through either a local area network or a wide area network. These computers share resources, such as CPUs or storage devices, and together they provide an infrastructure for distributed processing. Computers in a distributed system don't share main memory; their only way of communicating is by exchanging messages over network connections. Another characteristic of distributed systems is that exchanging a message between computers takes significantly longer than the interval between events on a single CPU. Because this communication overhead is high, it doesn't make sense to use a distributed system for a job that finishes in a short time frame, such as a few seconds. Now let's say that you need to process a big dataset, and it takes several hours. Here it's worth using a distributed system, since the time spent on messaging is negligible compared to the total time it takes to finish the job. In addition, you can…
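To make the trade-off between messaging time and total job time concrete, here is a minimal sketch of a back-of-the-envelope cost model. It is not from the course: the function names, the fixed 30-second overhead, and the assumption that work divides evenly across machines are all hypothetical simplifications chosen for illustration.

```python
def distributed_time(serial_seconds: float, workers: int, overhead_seconds: float) -> float:
    """Estimate job time on a distributed system.

    Assumption (hypothetical model): the work splits evenly across
    `workers` machines, and all network messaging adds one fixed
    `overhead_seconds` cost on top of the compute time.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return serial_seconds / workers + overhead_seconds


def speedup(serial_seconds: float, workers: int, overhead_seconds: float) -> float:
    """Ratio of single-machine time to estimated distributed time.

    A value below 1 means the distributed run is slower than one machine.
    """
    return serial_seconds / distributed_time(serial_seconds, workers, overhead_seconds)


if __name__ == "__main__":
    OVERHEAD = 30.0  # assumed messaging overhead in seconds (illustrative only)
    WORKERS = 10     # assumed number of machines (illustrative only)

    # A job that takes seconds versus a job that takes several hours.
    for label, serial in [("5-second job", 5.0), ("4-hour job", 4 * 3600.0)]:
        ratio = speedup(serial, WORKERS, OVERHEAD)
        print(f"{label}: speedup {ratio:.2f}x on {WORKERS} machines")
```

Under these assumed numbers, the short job comes out slower on ten machines than on one, because the fixed overhead dominates, while the multi-hour job gets close to a tenfold speedup, because the same overhead is a tiny fraction of the total. Real systems have overhead that grows with the number of messages and machines, so treat this only as intuition for the point made in the transcript.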
