Learn how big data analytics is used for fraud detection by defining what fraud detection is and analyzing how data science technologies are used to discover fraud. Jungwoo will compare and contrast conventional fraud detection techniques with new, big data-driven ones. In particular, he will explain the role of machine learning in newly emerging fraud detection algorithms. Big financial companies such as PayPal are actively adopting this approach to detect fraud cases more effectively.
- [Voiceover] The data science marketplace is diverse. For example, one of its key markets is fraud detection. As we move toward the digital economy, criminals and crooks are finding various and ingenious ways to commit fraud against the banking sector. The stakes are high. The loss due to unauthorized credit card transactions alone is estimated to be billions of dollars each year.
Therefore, banks are extremely interested in figuring out what's fraudulent and what's not as quickly as possible, ideally as the transactions occur. Until very recently, fraud detection involved significant human intervention. Suspicious activities would be flagged for additional scrutiny. Then a fraud detection specialist would look into the case more closely. One of the major challenges in this approach has been the number of false positives: there tend to be too many flagged cases for a human operator to review, and a significant number of them turn out to be normal transactions anyway.
Therefore, improving the accuracy of fraud detection is a key to success in this case. Machine learning and big data analytics are revolutionizing the fraud detection market by drastically reducing the number of legitimate customer events falsely identified as fraud attempts. What machine learning brings to the table is its ability to learn on its own the best way to detect fraud through repeated trial and error.
Big data contributes to this process by providing rich data sets that machine learning algorithms can use to train themselves. In general, the more data points there are, the more accurate the outcome tends to become.
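To make the false-positive problem concrete, here is a minimal sketch in Python. It is not a method from the course: the transaction amounts, the fraud labels, the `FIXED_RULE` cutoff, and the function names are all hypothetical, and the exhaustive threshold search is only a toy stand-in for the trial and error that a real machine learning algorithm performs on far richer data.

```python
# Hypothetical toy data: (transaction amount in dollars, is_fraud label).
transactions = [
    (12.0, False), (25.0, False), (40.0, False), (75.0, False),
    (120.0, False), (310.0, False), (450.0, False), (520.0, False),
    (610.0, True), (880.0, True), (950.0, True), (1400.0, True),
]


def count_errors(threshold, data):
    """Return (false_positives, false_negatives) for the rule
    'flag a transaction if its amount is >= threshold'."""
    false_positives = sum(
        1 for amount, fraud in data if amount >= threshold and not fraud
    )
    false_negatives = sum(
        1 for amount, fraud in data if amount < threshold and fraud
    )
    return false_positives, false_negatives


def learn_threshold(data):
    """Try every observed amount as a threshold (trial and error)
    and keep the one that makes the fewest total mistakes."""
    candidates = sorted(amount for amount, _ in data)
    return min(candidates, key=lambda t: sum(count_errors(t, data)))


# Hand-picked rule (an assumption for illustration): flag $100 or more.
FIXED_RULE = 100.0
learned = learn_threshold(transactions)

fixed_fp, fixed_fn = count_errors(FIXED_RULE, transactions)
learned_fp, learned_fn = count_errors(learned, transactions)

print(f"fixed rule   (>= {FIXED_RULE}): "
      f"{fixed_fp} false positives, {fixed_fn} missed frauds")
print(f"learned rule (>= {learned}): "
      f"{learned_fp} false positives, {learned_fn} missed frauds")
```

On this toy data, the hand-picked rule flags four legitimate transactions for review, while the threshold learned from the labeled examples flags none and still catches every fraud. Real systems use many more features than the amount and evaluate on data the model has not seen, but the principle of letting labeled data choose the rule is the same.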
Jungwoo Ryoo is a professor of information science and technology at Penn State. Here he reviews the history of data science and analytics, explores which markets are using big data the most, and reveals the five main skill areas: data mining, machine learning, natural language processing (NLP), statistics, and visualization. This leads to a discussion of the five biggest career opportunities, the four leading industry-recognized certifications available, and the most exciting emerging technologies. Along the way, Jungwoo discusses the importance of ethics and professional development, and provides pointers to online resources for learning more.
- A history of data science
- Why analytics is important
- How data science is used in social media, climate research, and more
- Data science skills
- Data science certifications
- The future of big data