Module 7 - Regularization and Model Selection

Overview

Machine learning models are often over-parameterized, meaning they can have many more parameters than there are data points. By default, over-parameterized models generalize poorly because they overfit. A family of methods called regularization modifies the loss functions used in machine learning, penalizing models whose parameter values are far from 0. Remarkably, these penalties can allow over-parameterized models to generalize out of sample. We will study regularization in the context of the linear model, where the penalized methods are known as ridge regression, the lasso, elastic-net regression, and Huber regression. These regression methods apply broadly throughout machine learning. We will also learn about principal component analysis (PCA), our first unsupervised learning method, along with other tools for working with large numbers of parameters in high-dimensional settings.
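To make the shrinkage idea concrete, here is a minimal sketch using scikit-learn on synthetic data, comparing ordinary least squares with ridge (an L2 penalty) and the lasso (an L1 penalty). The data, the true coefficients, and the `alpha` penalty strengths are all illustrative choices, not values from the course materials.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Synthetic setting with nearly as many features as observations.
rng = np.random.default_rng(0)
n, p = 50, 40
X = rng.normal(size=(n, p))

# Only the first 3 features truly matter; the rest are noise.
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + rng.normal(scale=0.5, size=n)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # L2 penalty shrinks coefficients toward 0
lasso = Lasso(alpha=0.1).fit(X, y)    # L1 penalty sets many coefficients exactly to 0

print("OLS   sum |coef|:", np.abs(ols.coef_).sum())
print("Ridge sum |coef|:", np.abs(ridge.coef_).sum())
print("Lasso nonzero coefficients:", int(np.sum(lasso.coef_ != 0)))
```

Ridge typically shrinks the total coefficient magnitude relative to unpenalized least squares, while the lasso drives many of the noise coefficients all the way to zero, performing variable selection.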

Lab 4 will be due at the end of the week.

Learning Objectives

  • Regularization and overfitting
  • Ridge Regression, The Lasso, and Elastic Net
  • Principal Component Analysis (PCA)
  • Challenges of working in high-dimensional spaces

Readings

  • ISLP (Introduction to Statistical Learning with Python), Chapter 6: Section 6.2 is by far the most important part of this week's reading. Sections 6.3 and 6.4 cover material you will need to know at some point, and I will return to Section 6.3 when we do unsupervised learning. Section 6.1 covers techniques that are common in statistics and that motivate the regularization approaches.

ISLP Videos

I am just going to post the videos from the key sections. You are welcome to watch the other Chapter 6 videos if you have time.

ISLP Coding Videos
