From the course: Data Science on Google Cloud Platform: Predictive Analytics

Understanding input data - Google Cloud Tutorial

Understanding input data

- [Instructor] Let's start building predictive analytics solutions in GCP. The first step in this process is to build and test models locally. I'm going to create sample Python code that predicts whether a website visitor will make a purchase based on their browsing data. We will use the scikit-learn library in Python to build this model. Then I will show you how to set up this model in GCP Cloud ML and use it for predictions. For these predictions, we will use a CSV file called web-browsing-data.csv, which can be found in the exercise files. It contains information about visitor sessions on an e-commerce website, recorded as a set of flags that indicate whether the visitor performed a specific activity. For example, the first column is IMAGES. If the visitor looked at the images of a product, it is marked one. Otherwise, it's marked zero. There are similar columns for reviews, FAQs, specifications, etc. Finally, there is a BUY attribute that shows whether the visitor actually made a purchase. We will use this file as input for our model building work in future videos.
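
To make the file layout described above concrete, here is a minimal Python sketch that reads session flags and checks how often each activity coincides with a purchase. Only the column names IMAGES and BUY come from the video; REVIEWS, FAQ, and SPECS are assumed names, the sample rows are made up for illustration, and the helper functions are hypothetical. The real web-browsing-data.csv may have different headers or additional columns.

```python
import csv
import io

# Hypothetical sample in the shape described in the transcript. Only IMAGES
# and BUY are column names taken from the video; REVIEWS, FAQ, and SPECS are
# assumed names, and every row below is invented for illustration.
SAMPLE_CSV = """IMAGES,REVIEWS,FAQ,SPECS,BUY
1,1,0,1,1
0,0,0,1,0
1,0,1,0,0
1,1,1,1,1
0,1,0,0,0
"""


def load_sessions(source):
    """Read sessions from a file-like object, converting each 0/1 flag to int."""
    reader = csv.DictReader(source)
    return [{name: int(value) for name, value in row.items()} for row in reader]


def purchase_rate_by_flag(sessions, target="BUY"):
    """For each flag column, return the share of sessions with that flag set
    to 1 that ended in a purchase (None if no session has the flag set)."""
    rates = {}
    for column in sessions[0]:
        if column == target:
            continue
        flagged = [s for s in sessions if s[column] == 1]
        rates[column] = (
            sum(s[target] for s in flagged) / len(flagged) if flagged else None
        )
    return rates


sessions = load_sessions(io.StringIO(SAMPLE_CSV))
rates = purchase_rate_by_flag(sessions)
for column, rate in rates.items():
    print(column, rate)
```

To try this on the exercise file instead of the sample, pass an open file handle to the same function, for example `load_sessions(open("web-browsing-data.csv", newline=""))`. This assumes the file has a header row and contains only 0/1 values, which the video implies but does not state outright.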
