Join Michele Vallisneri for an in-depth discussion in this video, The structure of data, part of Python Statistics Essential Training.
- [Instructor] Let's start by establishing some terminology and by showing you how data should be organized so that it's easiest to deal with and understand using software. The word data is plural, and that is because statistics is about variability. It is concerned with how things are different from each other, so a data set is like a catalog or a collection. For instance, for a planetary scientist, a data set of interest would be the planets of the solar system.
The data set consists of cases, the planets, and each case has attributes called variables, for instance, the mass of a planet or the period of its orbit around the sun. It is standard practice to organize data in a data frame, in effect a table, where each row refers to one case and each column to one variable. Variables can be quantitative, represented by a number, or categorical, a description that can be put in words, selected from a fixed set of labels.
In this case, just yes or no for the presence of rings around the planet. Usually quantitative variables are given as pure numbers and the units are described in a code book or data dictionary for the data frame. For instance, in this table, the masses are given in units of 10 to the 24 kilograms and the diameters are in kilometers. This arrangement, known as case-variable organization, is very simple, but it can accommodate many different sorts of data.
It is also reflected directly in the data structures used by statistical software. In our case, pandas.
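As a rough illustration of the case-variable layout described in this video, here is a minimal sketch that assumes the pandas library is the software in question. The planet values are approximate, rounded figures chosen for illustration and are not copied from the table shown on screen. The names `planets` and `codebook` are hypothetical, and the code book is just a plain dictionary, not a pandas feature.

```python
import pandas as pd

# Each row is one case (a planet); each column is one variable.
# Values are approximate and for illustration only.
planets = pd.DataFrame(
    {
        "mass": [5.97, 0.642, 1898.0, 568.0],         # quantitative
        "diameter": [12756, 6792, 142984, 120536],    # quantitative
        "rings": ["no", "no", "yes", "yes"],          # categorical
    },
    index=pd.Index(["Earth", "Mars", "Jupiter", "Saturn"], name="planet"),
)

# A categorical variable takes its values from a fixed set of labels.
planets["rings"] = pd.Categorical(planets["rings"], categories=["no", "yes"])

# The code book (data dictionary) records units separately from the numbers.
# This is a hypothetical plain-dict representation.
codebook = {
    "mass": "10^24 kg",
    "diameter": "km",
    "rings": "categorical: yes / no",
}

n_cases, n_variables = planets.shape
print(f"{n_cases} cases, {n_variables} variables")
print(planets.dtypes)

# Selecting one case (a row) and one variable (a column).
saturn = planets.loc["Saturn"]
masses_kg = planets["mass"] * 1e24  # apply the code-book unit to get kilograms
print(saturn)
print(masses_kg)
```

Keeping the units in the code book rather than in the cells leaves the quantitative columns as pure numbers, so they can be used directly in arithmetic.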
- Installing and setting up Python
- Importing and cleaning data
- Visualizing data
- Describing distributions and categorical variables
- Using basic statistical inference and modeling techniques
- Bayesian inference
Skill Level Intermediate
1. Installation and Setup
2. Importing and Cleaning Data
3. Visualizing and Describing Data
4. Introduction to Statistical Inference
5. Introduction to Statistical Modeling
Next steps