Classification on the Titanic using sklearn

Click here to watch a video on fitting and evaluating logistic regression models on the Titanic dataset using ‘sklearn’
Author

George I. Hagstrom

Published

February 18, 2026

I have recorded a video covering how to implement and evaluate logistic regression models in ‘sklearn’. I used the ‘titanic’ dataset, which contains information on Titanic passengers and whether or not they survived. I show how to build a pipeline that preprocesses data, engineers features, and fits a logistic regression model. I then show how to extract coefficients from that model, plot calibration curves and ROC curves, calculate the log loss and the accuracy score, and calculate a variety of other standard metrics (F1 score, precision, recall, etc.). The video recording process was a bit clunky and the live coding was a bit buggy, so the video runs pretty long. Apologies for that.

You can find the Titanic video vignette here

You can download my Python notebook here: logistic regression vignette