Practice creating calculated fields, building and modifying hierarchies, and defining new data groups within the refine interface. Understand the importance of the "rows" field, and demonstrate how to add or remove columns from the refined data set.
- [Instructor] Before we dive into our analysis, let's practice working with calculations, hierarchies, and groups to finish refining our data set. So we'll go ahead and close our column properties and scroll to the left-most column to get started. The first thing I'd like to do here is drill into my column list and enable the Rows column, which was automatically created for us, by simply giving it a check mark to the left of the name. Rows will be a valuable tool that we can use to give us a count of observations under specific criteria.
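To make the idea of Rows as a count of observations concrete, here is a minimal sketch in plain Python. It assumes a tiny hand-made sample; the field names `origin_state` and `departure_delay_min` and the helper `count_rows` are hypothetical and are not part of Watson Analytics.

```python
# Hypothetical sample records; the field names are illustrative only.
flights = [
    {"origin_state": "TX", "departure_delay_min": 0},
    {"origin_state": "TX", "departure_delay_min": 25},
    {"origin_state": "CA", "departure_delay_min": 40},
]


def count_rows(records, predicate):
    """Count the observations (rows) that satisfy the given criteria."""
    return sum(1 for record in records if predicate(record))


# Rows under a specific criterion: flights that left late.
delayed = count_rows(flights, lambda r: r["departure_delay_min"] > 0)

# Rows under a different criterion: flights that originated in Texas.
from_texas = count_rows(flights, lambda r: r["origin_state"] == "TX")

print(delayed, from_texas)
```

Each record contributes exactly one to the count, which is roughly the role an automatically generated Rows field plays when it is combined with a filter.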
If I scroll down, it also looks like we have some work to do with the auto-generated hierarchies here, since Watson is grouping Origin States with Destination Cities and vice versa. If you look closely, you'll see why this is happening: the word origin in the Origin City column is missing the first I. So Watson didn't recognize that it should be paired with the associated Origin State column, which is spelled correctly. But not to worry, all we need to do is select the hierarchy and we can modify it as we see fit.
So let's go ahead and select Destination State Origin City, remove the Origin City level. We'll add a new level for Destination City and then give our hierarchy a meaningful name. So let's call it Destination State slash City. Now we're essentially structuring our data set such that a user can drill down into a particular destination state to reveal the destination cities within that state, a feature which can also be referred to as a drill path.
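A state-to-city drill path like the one described above can be pictured as a mapping from each parent value to the child values beneath it. The following is a rough sketch under assumed data: the sample records, the key names, and the helper `build_drill_path` are hypothetical, and it does not reflect how Watson Analytics stores hierarchies internally.

```python
from collections import defaultdict

# Hypothetical sample records; the key names are illustrative only.
records = [
    {"destination_state": "CA", "destination_city": "Los Angeles"},
    {"destination_state": "CA", "destination_city": "San Diego"},
    {"destination_state": "TX", "destination_city": "Austin"},
    {"destination_state": "CA", "destination_city": "Los Angeles"},
]


def build_drill_path(rows, parent_key, child_key):
    """Map each parent value to the sorted, de-duplicated child values under it."""
    tree = defaultdict(set)
    for row in rows:
        tree[row[parent_key]].add(row[child_key])
    return {parent: sorted(children) for parent, children in tree.items()}


drill_path = build_drill_path(records, "destination_state", "destination_city")

# Drilling into a state reveals the cities within it.
print(drill_path["CA"])
```

Pairing the wrong levels, such as a destination state with origin cities, would still produce a mapping, but the children would not belong to the parent, which is why the hierarchy needs to be corrected by hand.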
So let's go ahead and repeat that same process for the origin hierarchy as well. We'll remove Destination City, add a level for Origin City, and then name this hierarchy Origin State slash City. Last but not least, we can select the check marks to include both of the new fields in our data set. Now let's say we'd like to calculate the total travel time, which we can define as the departure delay plus the total flight time.
To do this, all we need to do is select Calculation, give it a name, let's call it Total Travel Time, and insert the columns and operators that we need. In this case it's pretty simple. We'll select the A and search for departure delay in minutes. We'll keep the plus operator and then click on B to pull in flight time in minutes to finish this off.
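The calculation itself is a row-by-row sum of two columns. As a minimal sketch, assuming each row is a dictionary with hypothetical keys `departure_delay_min` and `flight_time_min`, the same calculated field could be derived like this:

```python
# Hypothetical sample rows; the key names are illustrative only.
rows = [
    {"departure_delay_min": 15, "flight_time_min": 120},
    {"departure_delay_min": 0, "flight_time_min": 95},
]


def add_total_travel_time(row):
    """Return a copy of the row with the calculated field A + B added."""
    total = row["departure_delay_min"] + row["flight_time_min"]
    return {**row, "total_travel_time": total}


# Build a refined copy so the original rows stay unchanged.
refined = [add_total_travel_time(row) for row in rows]

for row in refined:
    print(row["total_travel_time"])
```

Returning a new dictionary rather than mutating the input mirrors the idea of saving a refined copy alongside the original data set.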
And once we've clicked done, our new field populates here at the bottom of our column list, Total Travel Time. Finally, let's build out an example data group to categorize shopping spend amounts. We can simply choose the Data Group option, click on year flight date to select the field that we're interested in, which, in this case, is shopping amount at airport, and customize how we want to bucket or group these values.
Let's keep things pretty simple and create three distinct buckets, and we can customize the thresholds for those who spend less than 100 dollars, between 100 and 300 dollars, or over 300. Next, I can change my bucket labels to low, medium, and high. And finally, I'll name the data group as a whole. In this case, we'll call it Shopping Spend Level.
Now if I click done and scroll all the way to the right of my table, I can see the group that I just created. And now if we hide the column list, we can confirm that our shopping spend labels are populating correctly. So anyone who spent less than 100 dollars shows up with a label of low, anyone who spent between 100 and 300 shows a label of medium, and anyone who spent over 300 shows a label of high.
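The three buckets amount to a simple threshold rule. Here is a minimal sketch of that rule; the function name `shopping_spend_level` is hypothetical, and treating exactly 100 and exactly 300 dollars as medium is an assumption, since the narration does not say how the boundary values are handled.

```python
def shopping_spend_level(amount, low_max=100, medium_max=300):
    """Bucket a shopping amount into low, medium, or high.

    Assumption: amounts of exactly low_max or medium_max count as medium.
    """
    if amount < low_max:
        return "low"
    if amount <= medium_max:
        return "medium"
    return "high"


# Hypothetical spend amounts in dollars.
amounts = [0, 99.99, 100, 250, 300, 300.01]
labels = [shopping_spend_level(a) for a in amounts]

print(labels)
```

Keeping the thresholds as parameters makes it straightforward to try different cut points, much like adjusting the bucket boundaries in the data group dialog.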
Last but not least, let's go ahead and do a save as at the top of our screen, so that we can preserve our original data set as well. And we'll call this one Airline Satisfaction Survey Refined. Because I'm using a professional version of Watson Analytics I have the option to save this in a shared or personal folder. In this case, I'm going to keep it in my personal assets and click the Save button.
- Reviewing key differentiators
- Navigating the 3 Ds of Watson Analytics: data, discovery, and display
- Importing, joining, and refining data
- Using natural language querying
- Understanding key drivers
- Interpreting decision trees
- Displaying insights
- Assembling multitabbed displays and dashboard filters
- Modifying and sharing displays