From the course: Python Statistics Essential Training (2018)

Describe categorical variables

- [Instructor] Moving on to categorical variables. How do we describe variation in those? Using tables, of course. So, leaving gapminder aside for a moment, we will use the Whickham data set, discussed by David Kaplan in his excellent textbook, Statistical Modeling. First we import packages, and then the data set. The table records interviews with women in Whickham, England, in 1973, who were asked if they were smokers. The interviews were followed up 20 years later, when it was recorded whether the women were still alive. The categorical variables in this case, smoker and outcome, are both binary, yes or no. We can tally up the explanatory variable, smoker, and the response variable, outcome, separately. We use the method value_counts, and enclosing the results in a data frame creates a prettier output. Doing so doesn't tell us much, other than that both pairs of groups are represented fairly well in the table: smokers and non-smokers, women who survived for 20 years and those who didn't. If you want to see the…
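The separate tallies described above can be sketched roughly as follows. This is a minimal illustration, not the course's own notebook: the six rows are made up as a stand-in for the Whickham table, the name `toy` and the value labels are assumptions, and only the column names `smoker` and `outcome` come from the transcript.

```python
import pandas as pd

# Made-up stand-in for the Whickham table (NOT the real data):
# one row per interviewed woman, two binary categorical columns.
toy = pd.DataFrame({
    "smoker": ["Yes", "No", "No", "Yes", "No", "No"],
    "outcome": ["Alive", "Alive", "Dead", "Dead", "Alive", "Alive"],
})

# value_counts() tallies how often each category occurs in a column.
# Wrapping the resulting Series in a DataFrame only changes how it is displayed.
smoker_counts = pd.DataFrame(toy["smoker"].value_counts())
outcome_counts = pd.DataFrame(toy["outcome"].value_counts())

print(smoker_counts)
print(outcome_counts)
```

Each tally looks at one column in isolation, so it shows how large each group is but says nothing yet about how smoking and survival relate to each other.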