(tech noise) - [Instructor] Let's do some cleaning. We import our standard packages and then load the tb.csv file into a pandas DataFrame. Let's look at this table. And let's print out the column names. We need to apply the pandas melt operation. The identifier columns are just country and year.
All the columns that express the sex and age range for an observation will be melted into separate rows. You can just copy their names from above. The melted variable name will be sexage, and the value variable name will be cases. Let's see. We also separate the combined sex/age column into two, using a string slicing operation.
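The melt step described above can be sketched roughly as follows. The tiny table here is a made-up stand-in for tb.csv (its column names and values are assumptions for illustration, not the real file), and only two sex/age columns are shown.

```python
import pandas as pd

# Hypothetical miniature stand-in for tb.csv; names and values are made up.
tb = pd.DataFrame({
    "country": ["AD", "AE"],
    "year": [2000, 2000],
    "m04": [0.0, 2.0],
    "f04": [1.0, None],
})

# id_vars are kept as they are; each listed value column is
# unpivoted into its own row per country/year observation.
melted = tb.melt(
    id_vars=["country", "year"],
    value_vars=["m04", "f04"],
    var_name="sexage",
    value_name="cases",
)
print(melted)
```

Each original row yields one output row per melted column, so two rows and two value columns give four rows.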
We access string operations for a pandas Series by going through .str. For the sex, we just need the first character. For the age range, the characters that follow. This worked. We should also replace the age range values with something more readable. We can do such a replacement with the pandas method map, giving it a dictionary.
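As a minimal sketch of the slicing through .str, assuming the combined codes look like "m04" (one character for sex, the rest for age range); the sample values are hypothetical:

```python
import pandas as pd

# Hypothetical combined sex/age codes.
melted = pd.DataFrame({"sexage": ["m04", "f514", "mu"]})

# .str applies Python-style string slicing to every element of the Series.
melted["sex"] = melted["sexage"].str[0]    # first character
melted["age"] = melted["sexage"].str[1:]   # everything after it
print(melted)
```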
For instance, 04 turns into 0-4, and so on. Let's go to a new line. This is one of those cases where it's a toss-up whether it's faster to write a Python function that does this job or to do it manually.
The last one is u, which stands for age range unknown, which we replace with not a number. Not a number is found in the NumPy module. Let's have a look. Very good. Finally, we drop all the unknown values in the column cases, using dropna.
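The replacement and the dropna step might look something like this. Only the 04 to 0-4 mapping and the u to not-a-number mapping come from the narration; the 514 entry and the sample data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "age": ["04", "514", "u"],
    "cases": [3.0, np.nan, 5.0],
})

# map looks each value up in the dictionary; "u" (unknown) becomes NaN.
age_labels = {"04": "0-4", "514": "5-14", "u": np.nan}
df["age"] = df["age"].map(age_labels)

# Drop only the rows whose cases value is missing.
cleaned = df.dropna(subset=["cases"])
print(cleaned)
```

Passing subset keeps rows where only the age range is unknown; calling dropna without it would drop those rows as well.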
And let's have a look. We can also sort the table by country, year, age, and sex. That is achieved with sort_values. Sex needs to be a string also. And last, we drop the combined sex/age column and reorder the other columns.
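A small sketch of the sorting step, using made-up rows; the column names are passed to sort_values as a list of strings, in order of priority:

```python
import pandas as pd

# Hypothetical rows, deliberately out of order.
df = pd.DataFrame({
    "country": ["B", "A", "A"],
    "year": [2000, 2001, 2000],
    "age": ["0-4", "0-4", "5-14"],
    "sex": ["m", "f", "m"],
})

# Sort by country first, then year, then age, then sex.
final = df.sort_values(["country", "year", "age", "sex"])
print(final)
```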
This is done by subsetting the columns of final. Here we go. If you're curious how to write out a pandas DataFrame to a CSV file, that's easily achieved with the method to_csv. In this case, we do not care to output the index. So let me demonstrate that I can read back the same file.
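Subsetting the columns and the CSV round trip could be sketched as below. The data, the output file name tb_final.csv, and the temporary directory are all assumptions for illustration, not what the video uses.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned table that still carries the combined column.
final = pd.DataFrame({
    "country": ["AD", "AE"],
    "year": [2000, 2000],
    "sexage": ["m04", "f04"],
    "sex": ["m", "f"],
    "age": ["0-4", "0-4"],
    "cases": [3.0, 1.0],
})

# Selecting a list of columns both drops "sexage" and fixes the order.
final = final[["country", "year", "age", "sex", "cases"]]

# Write without the index, then read the same file back.
path = os.path.join(tempfile.mkdtemp(), "tb_final.csv")
final.to_csv(path, index=False)
reloaded = pd.read_csv(path)
print(reloaded)
```

With index=False the file holds only the five data columns, so reading it back does not produce an extra unnamed index column.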
- Installing and setting up Python
- Importing and cleaning data
- Visualizing data
- Describing distributions and categorical variables
- Using basic statistical inference and modeling techniques
- Bayesian inference