Introduction to Cascading - March 2012

Friday, March 23, 2012 from 9:00 AM to 1:30 PM (PT)

Ticket Information

Ticket Type          Sales End    Price      Fee
Class Participant    Ended        $395.00    $0.00

Event Details

Introduction to Cascading

Creating complex data processing workflows with raw Hadoop is painful, tedious, and error-prone. This half-day class shows you how to use the open source Cascading API to quickly create reliable, scalable, high-performance data processing workflows on top of Hadoop. We combine lab exercises and real-world examples to reinforce lecture content, so that by the end of the class you're ready to start solving Big Data problems using Cascading.

Agenda
  • Overview
  • Thinking in Cascading
  • Log Processing Lab
  • Built-in and Custom Operations
  • Grouping and Joining
  • Taps and Schemes (Text, Sequence files, Solr)
  • Cascading Summary
Requirements
  • Reliable, fast (1.5 Mbps or better) Internet connection
  • Up-to-date web browser

Scale Unlimited Classes

Classes include virtual lab environments, interactive lectures using Adobe Connect and teleconference/VoIP, personalized PDFs of lecture materials, and access to post-class support via a moderated mailing list.

Instructor

All of our classes are taught by active developers with deep, hands-on experience solving real-world problems using Hadoop, Cascading, Solr, Tika, and other powerful open source solutions. Instructor Ken Krugler has been a software developer, consultant, trainer and entrepreneur for over 20 years. He started Krugle in 2005, a pioneer in code search and an early adopter and supporter of Nutch, Hadoop, Lucene and Solr. He is a committer for the Apache Tika project, an author of one of the new Lucene in Action use cases, and an expert in web crawling and data mining.