From the course: Python Statistics Essential Training (2018)
Describe categorical variables - Python Tutorial
Describe categorical variables
- [Instructor] Moving on to categorical variables. How do we describe variation in those? Using tables, of course. So, leaving gapminder aside for a moment, we will use the Whickham data set, discussed by Daniel Kaplan in his excellent textbook, Statistical Modeling. We import the packages and then the data set. The table records interviews with women in Whickham, England, in 1973, who were asked if they were smokers. The interviews were followed up 20 years later, when it was recorded whether the women were still alive. The categorical variables in this case, smoker and outcome, are both binary, yes or no. We can tally up the explanatory variable, smoker, and the response variable, outcome, separately. We use the method value_counts, and enclosing the results in a DataFrame creates a prettier output. Doing so doesn't tell us much, other than that both pairs of groups are represented fairly well in the table: smokers and nonsmokers, women who survived for 20 years and those who didn't. If you want to see the…
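The separate tallies described in the transcript might look roughly like the sketch below. It assumes pandas is installed and uses a handful of made-up rows in place of the real Whickham table, which is not reproduced here. The column names smoker and outcome come from the transcript; the variable names and the value labels ("Yes"/"No", "Alive"/"Dead") are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in rows (NOT the real Whickham data); the value labels
# are assumed, only the column names come from the transcript.
toy = pd.DataFrame({
    "smoker": ["Yes", "No", "No", "Yes", "No", "Yes"],
    "outcome": ["Alive", "Alive", "Dead", "Dead", "Alive", "Alive"],
})

# Tally each categorical variable on its own with value_counts().
smoker_counts = toy["smoker"].value_counts()
outcome_counts = toy["outcome"].value_counts()

# Wrapping each resulting Series in a DataFrame gives a tabular
# display when shown in a notebook.
smoker_table = pd.DataFrame(smoker_counts)
outcome_table = pd.DataFrame(outcome_counts)

print(smoker_table)
print(outcome_table)
```

Each tally only shows how many rows fall in each group for one variable at a time; it says nothing yet about how smoker and outcome relate to each other.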
Contents
- The power of visualization (7m 12s)
- Describe distributions (5m 3s)
- Plot distributions (7m 34s)
- Plots of two quantitative variables (5m 50s)
- More quantitative variables (7m 58s)
- Describe categorical variables (4m 59s)
- Plot categorical variables (4m 30s)
- Personal email analytics (10m 10s)
- Challenge: More email analytics (22s)
- Solution: More email analytics (1m 45s)