Applied Machine Learning with Scikit-Learn

Location

Cowork Lab

2500 Yale Street

#b

Houston, TX 77008

Refund Policy

Refunds up to 30 days before event

Description

This class introduces machine learning with the Python library Scikit-Learn. By the end of this intensive, daylong program you will understand the most common workflows for building machine learning models: regression and classification problems, categorical and continuous data, missing data imputation, feature engineering, cross-validation, parameter tuning via grid search, and building pipelines.

Expert Instructor

This course is taught by Ted Petrou, an expert at building machine learning models with Scikit-Learn. He is the author of Pandas Cookbook, a thorough step-by-step guide to accomplish a variety of data analysis tasks with Pandas.

Is this course for you?

You should be familiar with the fundamentals of Python and have at least some exposure to the Pandas library, which we will use for basic data manipulation.

Small Class Size

This is a small class with at most 10 participants, so you will be able to ask and get help with specific questions quickly.

Discount with Previous Class

Take the Introduction to Data Science class the day before, on August 25, and get $25 off with code double25. Use the code when registering for that event.

When

Saturday, August 26, 2018: 9 a.m. - 5 p.m.

Syllabus

Part 1: Scikit-Learn Vocabulary and Gotchas

Scikit-Learn requires data to be in a specific format. We will cover the most common "gotchas" as well as build a dummy estimator with three lines of code.
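As an illustration of the format requirement and the three-line pattern mentioned above (not the course's own material), here is a minimal sketch using made-up numbers. It assumes `DummyRegressor` as the dummy estimator; the class may use a different one.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Made-up data. A common gotcha: X must be two-dimensional
# (rows are samples, columns are features), while y is one-dimensional.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])

model = DummyRegressor(strategy="mean")  # 1. instantiate
model.fit(X, y)                          # 2. fit
predictions = model.predict(X)           # 3. predict

print(predictions)
```

The dummy estimator ignores the features and always predicts the mean of `y`, which makes it a useful baseline to compare real models against.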

Part 2: Categorical and Missing Data

Non-numeric data must be transformed into numeric in order to be processed by Scikit-Learn. Similarly, missing data must be handled before any learning can take place.
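One possible way to do both steps is sketched below, assuming `SimpleImputer` for the missing values and `OneHotEncoder` for the categorical column. The column names and values are invented for the example.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: one categorical column and one numeric column with a gap
df = pd.DataFrame({
    "neighborhood": ["Heights", "Midtown", "Heights", "Montrose"],
    "sqft": [1500.0, np.nan, 2100.0, 1800.0],
})

# Replace the missing square footage with the median of the observed values
imputer = SimpleImputer(strategy="median")
sqft_filled = imputer.fit_transform(df[["sqft"]])

# Turn each neighborhood into its own 0/1 column
encoder = OneHotEncoder()
neighborhood_encoded = encoder.fit_transform(df[["neighborhood"]]).toarray()

# Combine into a single all-numeric array that an estimator can consume
X = np.hstack([sqft_filled, neighborhood_encoded])
print(X.shape)
```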

Part 3: Cross-Validation

Cross-validation is one of the best tools we have for estimating how well a model will perform on new or unseen data. Scikit-Learn provides several powerful functions to perform cross-validation.
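A minimal sketch of one such function, `cross_val_score`, is shown here. The dataset (the bundled iris data) and the model are assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Each of the 5 folds is held out once while the model trains on the other 4
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)

print(scores)         # one accuracy value per fold
print(scores.mean())  # a single summary of expected performance
```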

Part 4: Parameter Tuning

All the machine learning models in Scikit-Learn have parameters that can be tuned to optimize performance.
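The grid search mentioned in the description can be sketched as follows, assuming a k-nearest-neighbors classifier on the bundled iris data. The candidate values in `param_grid` are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Candidate values for the parameter we want to tune (arbitrary choices)
param_grid = {"n_neighbors": [1, 3, 5, 7, 9]}

# Every candidate is scored with 5-fold cross-validation
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

After fitting, `grid` behaves like the best model found, so `grid.predict` can be called directly.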

Part 5: Feature Engineering

The features that originally come with your data should not be thought of as fixed. We can significantly improve performance by engineering new features.
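A small synthetic illustration of the idea: the target below depends on the square of the input, so a linear model on the raw feature explains almost nothing, while adding a squared feature lets it fit exactly. The data is made up, and the scores are computed on the training data purely to show the effect.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: y is the square of x, symmetric around zero
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = (x ** 2).ravel()

# Linear model on the raw feature only
raw_score = LinearRegression().fit(x, y).score(x, y)

# Engineer a new feature: PolynomialFeatures adds an x**2 column next to x
x_engineered = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
engineered_score = LinearRegression().fit(x_engineered, y).score(x_engineered, y)

print(raw_score, engineered_score)  # R^2 before and after
```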

Part 6: Building a Pipeline

Scikit-Learn allows you to combine many features of the library into one powerful pipeline for learning.
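As a rough sketch of what such a pipeline can look like, the example below chains imputation, scaling, and a classifier, then cross-validates the whole chain. The step names, the chosen steps, and the artificially introduced missing values are all assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Knock out a few values so the imputation step has something to do
X = X.copy()
X[::10, 0] = np.nan

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# The pipeline is refit inside each fold, so the imputer and scaler
# only ever learn from that fold's training portion
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

Treating the preprocessing and the model as one object avoids leaking information from the held-out fold into the preprocessing steps.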

Part 7: Kaggle Competition

One of the best ways to practice machine learning is to participate in Kaggle competitions.

Post-Course Plan

You will be given a detailed plan on how to master Scikit-Learn. You will also have access to the instructor in a private Slack chatroom.

Instructor

Ted Petrou is the author of Pandas Cookbook and the founder of Dunder Data and the Houston Data Science Meetup group. He worked as a data scientist at Schlumberger, where he spent the vast majority of his time exploring data. Ted received his master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist.
