Learn why big data analytics is important and what value-added services it can provide.
- [Voiceover] Big data analytics leverages distributed computing technologies and data analytics techniques to overcome the computational challenges presented by big data sets. Distributed computing is an approach in computer science that breaks a task down into smaller pieces that are easier to process. "Divide and conquer" is the philosophy behind this classic technique. Once the task is partitioned into smaller chunks, each chunk is assigned to a processor, and those processors can be geographically dispersed.
For example, a fragment of your task can be processed in Seoul, South Korea, while another piece can be worked on in New York. Cloud computing provides a platform on which distributed computing can be implemented at low cost and in a scalable way. Simply put, cloud computing offers a bunch of computers housed in data centers. In addition to the hardware, a software solution is necessary to manage various aspects of distributed computing.
This is why we need software tools such as Hadoop and NoSQL databases. Once you have both the hardware and the software infrastructure to store, manage, and process big data sets, you're finally ready to run data analytics programs that ask your specific questions of those data sets. These questions can touch on applications like fraud detection, online dating, network security, disease control, and climate change.
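The "divide and conquer" idea described above can be illustrated with a minimal Python sketch. It is not how Hadoop or any cloud platform works internally. It assumes the worker threads on one machine stand in for processors in different locations, and the function names (`split_into_chunks`, `count_words`, `distributed_word_count`) are hypothetical names chosen for this illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Partition a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """Work done on one chunk: count the words in its lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, n_workers=4):
    """Divide the lines, process each chunk on a worker, then merge."""
    chunks = split_into_chunks(lines, n_workers)
    # Threads are an assumption standing in for remote processors.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Combine the partial results into one final answer.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["big data is big", "data analytics", "big questions"]
    print(distributed_word_count(sample, n_workers=2))
```

Each worker sees only its own chunk, and the partial counts are merged at the end. A real system adds what the sketch leaves out: moving data between machines, scheduling, and recovering from failed workers, which is the role of the software layer mentioned above.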
Jungwoo Ryoo is a professor of information science and technology at Penn State. Here he reviews the history of data science and analytics, explores which markets are using big data the most, and reveals the five main skill areas: data mining, machine learning, natural language processing (NLP), statistics, and visualization. This leads to a discussion of the five biggest career opportunities, the four leading industry-recognized certifications available, and the most exciting emerging technologies. Along the way, Jungwoo discusses the importance of ethics and professional development, and provides pointers to online resources for learning more.
- A history of data science
- Why analytics is important
- How data science is used in social media, climate research, and more
- Data science skills
- Data science certifications
- The future of big data