- [Instructor] Another thing that I do very frequently to convert my data, is I turn numbers into percentiles, because often times when you have a list of things in, let's say, rank order, the rank is helpful. Like in this case, I have GDP, right, the gross domestic product, essentially the size of the economies of 181 countries, and so I can see that I'm sorted by the GDP in this case. So I can see that the United States is number one, with a 16 point something trillion dollar economy.
China is number two with an eight point something trillion dollar economy. Now right there I can see why rank order might fail in certain types of analysis, because China is number two, so it's almost as good as the United States, great. But you know what, it's literally about half the size. These numbers are from 2012 or 2013 by the way. So it's literally half the size, so being two is not close to the same thing as being number one in terms of the overall size of economy. Another example of that is if I take just Germany in this case, which has a three point something trillion dollar economy, and Sudan, which is a 58 billion dollar size economy.
Germany is number four, so they're right up near the top even though it's literally just barely over 20% of the size of the United States. Actually maybe a little less than 20%. Sudan is actually number 68 on the list at 58 point something billion dollars. So, even though I can look up those rank orders, as I said, the rank order may not be that informative, and also with a rank order out of 181 it's hard for me to figure out exactly what that really means.
Now, 181 isn't a crazy number, but what if the number was 4,273, if that was my list count? Now if I had a value that was number 1,267 out of 4,000-whatever, you know it just gets harder to sort of understand what that really means. And so percentiles can help solve that problem. And so I'm just going to show you how I would calculate a percentile in this example. First of all, to do a formula I start with the equal sign, and percentiles are always one minus something. And that's because what we're trying to do is to figure out essentially a decimal out of the maximum value, the count, and then we subtract it from one because the highest value is essentially the closest to 100%, and the lowest value is the closest to zero percent.
So essentially what I'm doing is I'm going to say one minus, and then I'm going to say the row number of this thing that I'm looking at, divided by the count. And so if I hit enter on that, I see that the United States is in the 99th percentile, meaning that it is higher than 99% of other countries. Now, if I change the formatting, I'm going to hit command one, or I can also go to format cells in the menu, and if I change this from a percentage and just say show me as a number, you will see that the number is actually a decimal.
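As a rough sketch of that same calculation outside the spreadsheet, here it is in Python. The function name `percentile_from_rank` is made up for illustration; the only things taken from the example are the rank (1) and the count (181).

```python
def percentile_from_rank(rank: int, count: int) -> float:
    """Return 1 - rank/count: the share of the list ranked below this position."""
    if count <= 0 or not 1 <= rank <= count:
        raise ValueError("rank must be between 1 and count")
    return 1 - rank / count


# United States: row 1 out of 181 countries.
us = percentile_from_rank(1, 181)

print(f"{us:.4f}")  # shown as a plain decimal: 0.9945
print(f"{us:.0%}")  # shown as a percentage with no decimal places: 99%
```

The two print lines mirror the formatting switch in the spreadsheet: the stored value is the same decimal either way, and only the display changes.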
So one minus the row number, which in this case is number one, so it's one minus one, divided by 181. That's sort of what the formula is doing. So if I do command one, or control one on a PC, and turn it back into a percentage, and I'm going to get rid of the decimal places, it's 99%. Now, I've talked about cell locking before, and we're going to see a symptom of the problem here. If I just click and drag this down, I've got problems here because after the first row, instead of dividing the row number by the count, it's dividing the row number by a different cell, which is not what I want to do.
I always want to divide by my count, the count of things in the list, to figure out what percentile this thing is in that list. So I have to do the cell locking, in this case, of the column and the row, just by putting those little dollar signs. I don't even really need to do the column, I could just do the row, because I'm not moving left or right. But now if I click and drag this all the way down, I can see essentially where everything belongs. So if I click and drag all the way to the bottom here, I can see, as I should, that Tuvalu, being the last on the list, is in the zeroth percentile, zero percent of countries are below it in the list.
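The same idea can be sketched in Python, where a single constant plays the role of the locked count cell. This is only an illustration under the assumption of 181 rows ranked 1 through 181; the names `COUNT` and `percentiles` are not from the spreadsheet.

```python
# Assumption: 181 rows ranked 1..181, as in the GDP list in this example.
COUNT = 181  # fixed denominator; plays the role of the locked count cell

# Every row divides its own rank by the same COUNT, so only the rank changes
# from row to row, which is what locking the cell achieves when dragging down.
percentiles = [1 - rank / COUNT for rank in range(1, COUNT + 1)]

print(f"{percentiles[0]:.0%}")   # first row on the list: 99%
print(f"{percentiles[-1]:.0%}")  # last row on the list: 0%
```

Because the denominator never moves, the values fall steadily from the top of the list to exactly zero at the bottom.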
Okay so this is all it's doing, it's telling me where something lives on the list. So Germany, at number four, is in the 98th percentile, Sudan at number 68 is in the 62nd percentile, meaning 62% of countries are lower than Sudan on this list. So percentiles are a great way to sort of get an understanding, in addition to rank order, of where something lives on the list relative to its peers on a sort of number scale, zero to 100, that we're used to thinking in.
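To check those figures by hand, here is a small Python sketch that runs the one-minus-rank-over-count calculation on the ranks mentioned in this example, plus the hypothetical 1,267 out of 4,273 case from earlier. The helper name `percentile_from_rank` is an illustrative assumption, not a spreadsheet function.

```python
def percentile_from_rank(rank: int, count: int) -> float:
    """One minus rank over count, as in the spreadsheet formula."""
    return 1 - rank / count


# Ranks as stated in the example, out of 181 countries.
ranks = {"United States": 1, "Germany": 4, "Sudan": 68, "Tuvalu": 181}

for country, rank in ranks.items():
    pct = percentile_from_rank(rank, 181)
    print(f"{country}: rank {rank} of 181, percentile {pct:.0%}")

# The harder-to-read case: position 1,267 in a list of 4,273.
big_list = percentile_from_rank(1267, 4273)
print(f"rank 1267 of 4273, percentile {big_list:.0%}")  # 70%
```

A rank of 1,267 is hard to place on its own, while 70% reads immediately on the familiar zero to 100 scale.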
- Describe the process by which individuals’ interests are incorporated into data visualizations.
- Differentiate the use of the Ws in data visualization.
- Explain techniques involved in defining your narrative when visualizing data.
- Identify the factors that make data visualizations relatable to an audience’s interests and needs.
- Review the appropriate use of charts in data visualizations.
- Define the process involved in applying interactivity to data visualizations.