- [Instructor] What does a data science process or pipeline look like? What are the various stages? Let us explore in detail. The various stages in a data science process are acquisition, transport, storage, processing, and servicing. The first stage in a data science pipeline is data acquisition, which focuses on the sources of data: specifically, the format of data in the data sources, the interfaces to acquire data, security measures and working through them, the reliability of the acquisition process, and latency requirements. Interfaces might provide data in real-time and batch modes.
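To make the stage order and the acquisition concerns concrete, here is a minimal Python sketch. It is only an illustration, not something shown in the video: the names `Stage`, `Mode`, `SourceProfile`, and `next_stage`, and the example source values, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import Optional


class Stage(IntEnum):
    """The five pipeline stages, in the order named in the transcript."""
    ACQUISITION = 1
    TRANSPORT = 2
    STORAGE = 3
    PROCESSING = 4
    SERVICING = 5


class Mode(Enum):
    """How an acquisition interface delivers data."""
    REAL_TIME = "real-time"
    BATCH = "batch"


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical record of what to note about a data source."""
    name: str
    data_format: str            # e.g. "csv" or "json"
    interface: str              # e.g. "rest-api" or "file-drop"
    mode: Mode
    max_latency_seconds: float  # latency requirement for this source


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the last one."""
    members = list(Stage)
    position = members.index(stage)
    if position + 1 < len(members):
        return members[position + 1]
    return None


# Example values are made up for illustration.
clickstream = SourceProfile(
    name="web-clickstream",
    data_format="json",
    interface="rest-api",
    mode=Mode.REAL_TIME,
    max_latency_seconds=5.0,
)
```

Writing down a profile like this for each source is one way to keep format, interface, mode, and latency requirements in view before designing the later stages.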
The next stage of a data pipeline is data transport. Data needs to be moved from its origin systems to the destination systems or data warehouses or data centers. Depending upon the use case, this movement needs to happen within a LAN, WAN, or even wireless networks. The key engineering challenges here are reliability and integrity of data as it moves over networks and organizational boundaries, security concerns, latency of data, time taken to move data from the source to the destination, …
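One common way to check the integrity of data as it moves between systems is to send a checksum along with the payload and verify it on arrival. The sketch below assumes SHA-256 from Python's standard `hashlib` module; the `package` and `unpack` helpers and the message layout are hypothetical and are not taken from the video.

```python
import hashlib


def checksum(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def package(payload: bytes) -> dict:
    """Source side: attach a checksum before the data leaves the origin system."""
    return {"payload": payload, "sha256": checksum(payload)}


def unpack(message: dict) -> bytes:
    """Destination side: recompute the checksum and reject altered data."""
    if checksum(message["payload"]) != message["sha256"]:
        raise ValueError("payload does not match its checksum")
    return message["payload"]


# Round trip with an example record (made-up data).
message = package(b"sensor_id=7,temp=21.5")
received = unpack(message)
```

A plain checksum like this detects accidental corruption only. Guarding against deliberate tampering across organizational boundaries would need a keyed construction such as an HMAC or a digital signature.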
Released 6/19/2018
Kumaran Ponnambalam begins by discussing the roles of databases in data science, as well as the key feature and performance requirements for databases in this field. Next, Kumaran goes over different database types, sharing the strengths and weaknesses of each one. To wrap up, he walks through specific use cases and shows how to select the best database technology for each situation.
Video: Data science process