Learn about cross validation and degrees of freedom.
- [Instructor] In the last video we saw a few classical analytic techniques to evaluate goodness of fit and to compare models based on their explanatory power and simplicity. Now I want to show you how to implement a much simpler strategy, known as cross validation, which is used in machine learning to compare models. We divide the data into a training set, which we use to fit the model, and a testing set, which we use to evaluate the model's prediction error.
So instead of concentrating on in-sample error, as we do with classical techniques, we will look at out-of-sample prediction error. Models cannot look better by overfitting the data they're trained on. Instead, they need to, in some sense, understand something about the world. I have already included code to load our data set and to import packages. And since we will be splitting our data, I have refactored the plotting so it works on arbitrary data.
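To make the point about overfitting concrete, here is a minimal sketch in plain Python. It is not the code from the video: the data are synthetic, and the helper names (`mse`, `fit_memorizer`, `fit_line`) are hypothetical. It compares a model that simply memorizes its training points with an ordinary least-squares line, scoring both on the data they were fit to and on held-out data.

```python
import random

def mse(model, points):
    """Mean squared prediction error of `model` over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

def fit_memorizer(train):
    """Hypothetical overfit 'model': predict the y of the nearest training x."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def fit_line(train):
    """Ordinary least squares for y = a + b * x."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x

# Assumption: synthetic data, a straight line plus Gaussian noise.
rng = random.Random(0)
data = [(x, 2.0 + 0.5 * x + rng.gauss(0, 1)) for x in range(40)]

# Simple fixed split for illustration: even positions train, odd positions test.
train, test = data[0::2], data[1::2]

memorizer = fit_memorizer(train)
line = fit_line(train)

train_err_mem = mse(memorizer, train)   # in-sample error of the memorizer
test_err_mem = mse(memorizer, test)     # out-of-sample error of the memorizer
train_err_line = mse(line, train)
test_err_line = mse(line, test)

print("memorizer: train", train_err_mem, "test", test_err_mem)
print("line:      train", train_err_line, "test", test_err_line)
```

The memorizer reproduces every training point exactly, so its in-sample error is zero, yet it cannot repeat that on the held-out points. Judged in-sample it would look like the better model, which is exactly the illusion that out-of-sample evaluation removes.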
I have also copied the model formulas along from the last two videos. To divide up our data, we first shuffle it…
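The transcript breaks off at the shuffling step, so what follows is only a sketch of one common way to shuffle and split, not the instructor's code. The function name `shuffle_split`, the `test_fraction` parameter, and the seed are assumptions made for illustration.

```python
import random

def shuffle_split(rows, test_fraction=0.5, seed=None):
    """Shuffle a copy of `rows`, then cut it into (train, test) lists.

    Hypothetical helper: the split fraction is an assumption, since the
    transcript does not say how the data are divided.
    """
    rng = random.Random(seed)
    shuffled = list(rows)          # copy, so the caller's data stay in order
    rng.shuffle(shuffled)          # shuffling removes any ordering in the source
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))             # stand-in for the rows of a data set
train, test = shuffle_split(rows, test_fraction=0.3, seed=42)
print(len(train), "training rows,", len(test), "testing rows")
```

Shuffling before the cut matters when the rows arrive sorted, for example by year or by country. Without it, the training and testing sets could cover systematically different parts of the data.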
Released 7/17/2018
- Installing and setting up Python
- Importing and cleaning data
- Visualizing data
- Describing distributions and categorical variables
- Using basic statistical inference and modeling techniques
- Bayesian inference
Skill Level Intermediate
Introduction
- Welcome (1m 9s)
- Using the exercise files (1m 2s)
1. Installation and Setup
2. Importing and Cleaning Data
- The structure of data (1m 52s)
- Create tidy data tables (5m 20s)
- Introducing pandas (7m 28s)
- Data cleaning (12m 6s)
3. Visualizing and Describing Data
- The power of visualization (7m 12s)
- Describe distributions (5m 3s)
- Plot distributions (7m 34s)
- More quantitative variables (7m 58s)
- Plot categorical variables (4m 30s)
- Personal email analytics (10m 10s)
4. Introduction to Statistical Inference
- Statistical inference (1m 27s)
- Confidence intervals (9m 30s)
- Bootstrapping (7m 10s)
- Hypothesis testing (7m 34s)
5. Introduction to Statistical Modeling
- Statistical modeling (1m 35s)
- Fitting models to data (7m 36s)
- Goodness of fit (6m 13s)
- Cross validation (6m 22s)
- Logistic regression (5m 30s)
- Bayesian inference (9m 14s)
Conclusion
- Next steps (1m 55s)
Video: Cross validation