From the course: Deploying Scalable Machine Learning for Data Science

Running services in clusters

- [Instructor] We've examined how we can wrap our machine learning models inside of services, and that makes them programmatically accessible from other applications. We've also looked at how we can use containers to facilitate deploying our models along with supporting software. So now, let's turn our attention to running our machine learning code in multiple containers, and for that, we're going to look at what it's like to run services within a cluster. If we deploy our machine learning code in a single container and run one instance of that container on a server, that service will be able to respond to some number of requests per minute or per second. If your use case is well understood and the load on the system does not vary much, you could size a single server with the appropriate number of CPUs and the right amount of memory to match your load. In many situations, though, load is dynamic. Sometimes the load will steadily increase. For example, if your business is…
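
To make the setup concrete, here is a minimal sketch, in Python with Flask, of what wrapping a model inside a service can look like; the model file name (model.pkl), the /predict route, and the JSON payload shape are assumptions for illustration rather than the course's exact code.

import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load a previously trained model from disk (hypothetical file name).
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"features": [1.0, 2.0, 3.0]}.
    payload = request.get_json(force=True)
    prediction = model.predict([payload["features"]])
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    # One container typically runs one instance of this service;
    # a cluster runs many copies of the same container to add capacity.
    app.run(host="0.0.0.0", port=5000)

Running a single instance of this service in a single container gives a fixed request-handling capacity; running additional copies of the same container across a cluster is how that capacity grows as load changes.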
