Hadoop Training for Developers - London - September
Cloudera
Monday, September 6, 2010 at 9:00 AM - Wednesday, September 8, 2010 at 5:00 PM (BST)
London, United Kingdom
Event Details
To pay by credit card (in US dollars), see http://www.eventbrite.com/event/809996722
Cloudera offers a three-day training program for developers who want to learn how to use Hadoop MapReduce to build powerful data processing applications.
The overall schedule is:
Day 1: Basic training
Day 2: Intermediate
Day 3: Advanced and certification
Agendas for each of the three days are provided below.
Basic Training
Cloudera's Basic Hadoop Training provides a solid foundation for those seeking to understand large-scale data processing with MapReduce and Hadoop. This session is appropriate for attendees who are new to Hadoop and may never have used the software before. It is also appropriate for new users who are seeking a deeper understanding of the core principles, the programming API, and basic MapReduce algorithms.
Cloudera will give a series of lectures interleaved with practical, hands-on examples and exercises. Attendees must bring their own laptop with VMware Player (or Fusion for Mac) so they may follow along in a pre-configured virtual machine Cloudera provides.
Attendees seeking more in-depth knowledge may also attend our intermediate and advanced training on the following two days.
Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam immediately following advanced training. It covers material from all three sessions.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Thinking at Scale: Introduction to Hadoop
You know your data is big – you found Hadoop. What implications must you consider when working at this scale? This lecture addresses common challenges and general best practices for scaling with your data.
Lecture: MapReduce and HDFS
These tools provide the core functionality to allow you to store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.
Exercise: Getting Started with Hadoop
If you'd like a more hands-on experience, this is a good time to download our VM and kick the tires a bit. In this activity, using the provided instructions, you'll get a feel for the tools and run some sample jobs.
Lecture: The Hadoop Ecosystem
An introduction to other projects surrounding Hadoop, which complete the greater ecosystem of available large-data processing tools.
Lecture: The Hadoop MapReduce API
Learn how to get started writing programs against Hadoop's API.
Lecture: Introduction to MapReduce Algorithms
Writing programs for MapReduce requires analyzing problems in a new way. This lecture shows how some common functions can be expressed as part of a MapReduce pipeline.
Exercise: Writing MapReduce Programs
Now that you're familiar with the tools, and have some ideas about how to write a MapReduce program, this exercise will challenge you to perform a common task when working with big data - building an inverted index. More importantly, it teaches you the basic skills you need to write your own, more interesting data processing jobs.
Lecture: Hadoop Deployment
Once you understand the basics for working with Hadoop and writing MapReduce applications, you'll need to know how to get Hadoop up and running for your own processing (or at least, get your ops team pointed in the right direction). Before ending the day, we'll make sure you understand how to deploy Hadoop on servers in your own datacenter or on Amazon's EC2.
We will take a one-hour break around noon for lunch.
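For readers who have not met the inverted index task named in the agenda above, here is a minimal sketch of the idea. It assumes plain in-memory Python rather than the Hadoop Java API used in the course, and every name in it (`map_phase`, `shuffle`, `reduce_phase`, the sample documents) is hypothetical and chosen only for illustration. It is not the course's exercise solution.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit a (word, doc_id) pair for every word in one document."""
    for word in text.lower().split():
        yield word, doc_id


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, doc_ids):
    """Collapse the document ids for one word into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def build_inverted_index(documents):
    """Run map, shuffle, and reduce in sequence over an in-memory corpus."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


# Hypothetical sample corpus, not taken from the course material.
documents = {
    "doc1": "hadoop stores big data",
    "doc2": "mapreduce processes big data",
}
index = build_inverted_index(documents)
```

Looking up `index["big"]` returns the ids of both sample documents, while `index["hadoop"]` returns only `doc1`. In a real Hadoop job the grouping step is performed by the framework between the map and reduce tasks rather than by user code.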
Intermediate Training
Cloudera's Intermediate Hadoop Training builds on our basic training, and is appropriate for those who are already familiar with Hadoop basics and the MapReduce programming model.
Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam after advanced training on the following day.
Intermediate training focuses on importing data into Hadoop and building data processing pipelines. We'll cover more advanced topics such as Hive and Pig and show participants how to use each effectively.
Cloudera will give a series of lectures interleaved with practical, hands-on examples and exercises. Attendees must bring their own laptop with VMware Player (or Fusion for Mac) so they may follow along in a pre-configured virtual machine Cloudera provides.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Augmenting Existing Systems with Hadoop
To introduce our intermediate training session, we'll take a step back and look at data systems more generally. Hadoop rarely replaces existing infrastructure, but rather enables you to do more with your data by providing a scalable batch processing system. This lecture helps you understand how it all fits together.
Lecture: Best Practices for Data Processing Pipelines
In order for Hadoop to crunch large volumes of data, first you'll need to get that data into Hadoop. This lecture will help you understand how to import different types of data from various sources into Hadoop for further analysis.
Exercise: Importing Existing Databases with Sqoop
Sqoop is a command line tool developed by Cloudera and contributed to the Hadoop project. It provides an easy way to import data from RDBMSs and enables you to work with that data directly using MapReduce, Hive, or Pig.
Lecture: Introduction to Pig
Pig is a high-level language for large-scale data analysis programs. Pig exposes many common MapReduce constructs in a simplified processing language, and is often used for ad-hoc analysis.
Exercise: Working with Pig
In this exercise, we'll revisit some common tasks and see how you can accomplish them using Pig.
Lecture: Introduction to Hive - A Data Warehouse for Hadoop
Hive is a powerful data warehousing application built on top of Hadoop which allows you to use SQL to access your data. This lecture will give an overview of Hive and the query language.
Exercise: Working with Hive
This exercise will show you exactly how to work with Hive. We'll walk through importing data, creating tables, and making queries.
We will take a one-hour break around noon for lunch.
Advanced Training
Cloudera's Advanced Hadoop Training completes the three-day training course and teaches advanced skills for debugging MapReduce programs and optimizing their performance. This session requires prior experience writing Hadoop programs, or attendance at the basic and intermediate training sessions. Attendees will look deeper into the Hadoop API and learn about programmatic tools that facilitate tighter integration between Hadoop programs and other systems, as well as higher parallel throughput.
Cloudera will give a series of lectures interleaved with practical, hands-on examples and exercises. Attendees must bring their own laptop with VMware Player (or Fusion for Mac) so they may follow along in a pre-configured virtual machine Cloudera provides.
At the end of the training session, those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam. It covers material from all three sessions.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Debugging MapReduce Programs
Debugging in the distributed environment is challenging. This lecture will expose you to best practices for program design to mitigate debugging challenges, as well as local testing tools and techniques for debugging at scale.
Lecture: Advanced Hadoop API
In the basic training session, you learned how to get up and running writing Hadoop MapReduce programs in Java. This lecture probes deeper into the API, covering custom data types and file formats, direct HDFS access, intermediate data partitioning, and other tools such as the DistributedCache.
Lecture: Advanced Algorithms
This lecture introduces some graph algorithms that can be adapted for your needs, as well as more involved examples like PageRank. We'll also look at strategies for implementing joins efficiently, and compare different techniques that are appropriate to different data models.
Exercise: Optimizing MapReduce Programs
We'll use the Cloudera Training VM to work through an example where you write a MapReduce program and improve its performance using techniques explored earlier.
Exam: Cloudera Certified Hadoop Developer Exam
The day will end with the CCHD exam for those wishing to document their Hadoop expertise. This test will assess your knowledge of all areas covered over the course of the three-day training sessions.
We will take a one-hour break around noon for lunch.
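As a rough illustration of the join strategies mentioned in the Advanced Algorithms agenda item, the sketch below shows one common approach, a reduce-side join, in plain in-memory Python. It is an assumption-laden example rather than course material: the record layouts, the `user`/`order` tags, and all function names are hypothetical, and a real job would use the Hadoop Java API with the framework performing the grouping.

```python
from collections import defaultdict


def map_users(record):
    """Tag each user record with its source and key it by user id."""
    user_id, name = record
    yield user_id, ("user", name)


def map_orders(record):
    """Tag each order record with its source and key it by user id."""
    order_id, user_id, amount = record
    yield user_id, ("order", (order_id, amount))


def reduce_join(user_id, tagged_values):
    """Pair every user name with every order that shares the same key."""
    names = [value for tag, value in tagged_values if tag == "user"]
    orders = [value for tag, value in tagged_values if tag == "order"]
    for name in names:
        for order_id, amount in orders:
            yield user_id, name, order_id, amount


def reduce_side_join(users, orders):
    """Map both inputs, group by key, then join within each group."""
    grouped = defaultdict(list)
    for record in users:
        for key, value in map_users(record):
            grouped[key].append(value)
    for record in orders:
        for key, value in map_orders(record):
            grouped[key].append(value)
    result = []
    for key in sorted(grouped):
        result.extend(reduce_join(key, grouped[key]))
    return result


# Hypothetical sample inputs: (user_id, name) and (order_id, user_id, amount).
users = [(1, "ada"), (2, "grace")]
orders = [(100, 1, 25.0), (101, 2, 10.0), (102, 1, 5.0)]
joined = reduce_side_join(users, orders)
```

Because both inputs are keyed by user id, all records for one user arrive at the same reduce call, which is what makes the join possible. The cost is that every record from both inputs passes through the shuffle, which is one reason other join techniques can be preferable when one input is small.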
When & Where
Learning Tree - London
24 Eversholt St
London NW1 1AD
United Kingdom
Monday, September 6, 2010 at 9:00 AM - Wednesday, September 8, 2010 at 5:00 PM (BST)
Organizer
Cloudera
Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.