[QCL Workshop] Web Scraping with Python (Level2-Data/Coding)

Event Information

Location

Claremont McKenna College/Roberts North 12 (RN12)

320 East 9th Street

Claremont, CA 91711

# Web Scraping with Python (Level2-Data/Coding)

## Summary

In this 2-hour workshop, you will learn how to collect data from web pages such as Wikipedia using web scraping functions and data manipulation packages in Python.

Learning objectives of the workshop:

  • Understand robots.txt and HTTP requests.
  • Understand the basic components of a webpage and HTML.
  • Get familiar with the Pandas module.
  • Parse an HTML string into Pandas.
  • Parse URL class into Pandas.
  • Parse tables from Wikipedia into Pandas.
  • Parse non-Wikipedia tables into Pandas.
  • Parse Wiki InfoBoxes.
  • Write parsed HTML tables to a flat CSV file.
  • Develop an advanced understanding of HTML parsing using tags and CSS selectors.
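
As a rough illustration of three of these objectives (checking robots.txt, parsing an HTML table, and writing it out as CSV), here is a minimal sketch. It is not the workshop's own material: it uses only the Python standard library so that it runs offline without extra installs, whereas the workshop itself works with Pandas. The robots.txt rules, the HTML snippet, the `workshop-bot` user agent, the `example.com` URLs, and the helper names `is_allowed`, `TableParser`, and `table_to_csv` are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, inlined so the sketch runs without a network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical HTML standing in for a downloaded page (placeholder data).
HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Parse the table rows in html and return them as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "workshop-bot", "https://example.com/public/page"))
    print(is_allowed(ROBOTS_TXT, "workshop-bot", "https://example.com/private/page"))
    print(table_to_csv(HTML))
```

With Pandas installed, the hand-written table parser would likely be replaced by `pandas.read_html`, and the CSV step by `DataFrame.to_csv`; the standard-library version is used here only to keep the sketch dependency-free.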

## Date and Time

November 8, 2019 from 1 pm to 3 pm (2 hours)

## Location

Roberts North 12 (RN12)

## Prerequisites

Internet Use: Introductory level (search, log-in, navigation of websites, etc.)

Programming: Basic Python programming skills (functions, packages, etc.)

## Participants

CMC Students, Faculty and Staff
