Monday, June 22, 2009, from 8:00 AM to 4:00 PM (ET)
Cloudera's Basic Hadoop Training provides a solid foundation for those seeking to understand large-scale data processing with MapReduce and Hadoop. This session is appropriate for attendees who are new to Hadoop and may never have used the software before. It is also appropriate for new users who are seeking a deeper understanding of the core principles, the programming API, and basic MapReduce algorithms.
Cloudera will give a series of lectures interleaved with practical, hands-on examples and exercises. Attendees must bring their own laptop with VMware Player (or Fusion for Mac) so they may follow along in a pre-configured virtual machine Cloudera provides.
Attendees seeking more in-depth knowledge may also attend our Intermediate Hadoop Training the following day (a discount code is provided if you miss the early bird registration).
Those wishing to document their skills and receive the Cloudera Certified Hadoop Professional (CCHP) credential may take the certification exam immediately following intermediate training. It covers material from both sessions.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Thinking at Scale: Introduction to Hadoop and Big Data
You know your data is big – you found Hadoop. What implications must you consider when working at this scale? This lecture addresses common challenges and general best practices for scaling with your data.

Lecture: MapReduce and HDFS
These tools provide the core functionality to allow you to store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.

Exercise: Getting Started with Hadoop
If you'd like a more hands-on experience, this is a good time to download our VM and kick the tires a bit. In this activity, using the provided instructions, you'll get a feel for the tools and run some sample jobs.

Lecture: The Hadoop Ecosystem
An introduction to other projects surrounding Hadoop, which complete the greater ecosystem of available large-data processing tools.

Lecture: The Hadoop MapReduce API
Learn how to get started writing programs against Hadoop's API.

Lecture: Introduction to MapReduce Algorithms
Writing programs for MapReduce requires analyzing problems in a new way. This lecture shows how some common functions can be expressed as part of a MapReduce pipeline.

Exercise: Writing MapReduce Programs
Now that you're familiar with the tools and have some ideas about how to write a MapReduce program, this exercise will challenge you to perform a common task when working with big data: building an inverted index. More importantly, it teaches you the basic skills you need to write your own, more interesting data processing jobs.

Lecture: Hadoop Deployment
Once you understand the basics of working with Hadoop and writing MapReduce applications, you'll need to know how to get Hadoop up and running for your own processing (or at least get your ops team pointed in the right direction). Before ending the day, we'll make sure you understand how to deploy Hadoop on servers in your own datacenter or on Amazon's EC2.
Lunch will be provided around noon.
Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.