This event has ended

Hadoop Training for Developers - Berlin - January

Cloudera

Monday, January 17, 2011 at 9:00 AM - Wednesday, January 19, 2011 at 5:00 PM (CET)

Berlin, Germany


Ticket Information

Ticket Type                                               Sales End   Price       Fee
3 Days: Full Program + Certification (€1499 + 19% VAT)    Ended       €1,783.00   €0.00
3 Days: Full Program + Certification                      Ended       €1,499.00   €7.50


Event Details

Information for VAT (UST) Waiver

If a company is purchasing these tickets and can provide a UST ID number, you may manage VAT reporting with your own internal processes. To do this, you may apply the discount code "VAT_waiver" and enter your UST ID number during the registration process.

Cloudera offers a three-day training program targeted toward developers who want to learn how to use Hadoop MapReduce to build powerful data processing applications.

The overall schedule is:

Day 1: Basic training

Day 2: Intermediate

Day 3: Advanced and certification

Agendas for each of the three days are provided below.

This course will be delivered in English.

Basic Training

Cloudera's Basic Hadoop Training provides a solid foundation for those seeking to understand large scale data processing with MapReduce and Hadoop. This session is appropriate for attendees who are new to Hadoop, and may have never used the software before. It is also appropriate for new users who are seeking a deeper understanding of the core principles, programming API and basic MapReduce algorithms.

Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam immediately following advanced training. It covers material from all three sessions. 

During this all-day session, we will cover the following agenda with ample time for questions:


Lecture:

Thinking at Scale: Introduction to Hadoop

You know your data is big – you found Hadoop. What implications must you consider when working at this scale? This lecture addresses common challenges and general best practices for scaling with your data.

Lecture:

MapReduce and HDFS

These tools provide the core functionality to allow you to store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.

Exercise:

Getting Started with Hadoop

If you'd like a more hands-on experience, this is a good time to download our VM and kick the tires a bit. In this activity, using the provided instructions, you'll get a feel for the tools and run some sample jobs.

Lecture:

The Hadoop Ecosystem

An introduction to other projects surrounding Hadoop, which complete the greater ecosystem of available large-data processing tools.

Lecture:

The Hadoop MapReduce API

Learn how to get started writing programs against Hadoop's API.

Lecture:

Introduction to MapReduce Algorithms

Writing programs for MapReduce requires analyzing problems in a new way. This lecture shows how some common functions can be expressed as part of a MapReduce pipeline.
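To give a flavor of what "expressing a function as a MapReduce pipeline" means: the classic example is word counting. The sketch below is not course material, just an illustrative plain-Python simulation of the three phases (map, shuffle, reduce) that the Hadoop framework performs for you:

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the input."""
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    does between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(values) for word, values in groups.items()}

doc = "the quick brown fox jumps over the lazy dog the end"
counts = reduce_phase(shuffle(map_phase(doc)))
```

In real Hadoop the same logic is split across a Mapper and a Reducer class, and the shuffle happens across machines; the decomposition of the problem is the part that carries over.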

Exercise:

Writing MapReduce Programs

Now that you're familiar with the tools, and have some ideas about how to write a MapReduce program, this exercise will challenge you to perform a common task when working with big data - building an inverted index. More importantly, it teaches you the basic skills you need to write your own, more interesting data processing jobs.
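For readers who want a preview of the exercise's core idea (the actual exercise uses the Hadoop Java API), an inverted index simply maps each word to the list of documents containing it. A minimal Python simulation of the map and group steps, with invented document IDs:

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map: emit a (word, doc_id) pair for every word in the document."""
    for word in text.lower().split():
        yield (word, doc_id)

def build_inverted_index(documents):
    """Group by word (the shuffle/reduce step): collect the set of
    documents in which each word appears."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word, d in map_phase(doc_id, text):
            index[word].add(d)
    return {word: sorted(ids) for word, ids in index.items()}

docs = {
    "d1": "hadoop stores big data",
    "d2": "mapreduce processes big data",
}
index = build_inverted_index(docs)
```

The MapReduce version distributes the same grouping across the cluster, which is what makes the technique practical for web-scale corpora.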

Lecture:

Hadoop Deployment

Once you understand the basics for working with Hadoop and writing MapReduce applications, you'll need to know how to get Hadoop up and running for your own processing (or at least, get your ops team pointed in the right direction). Before ending the day, we'll make sure you understand how to deploy Hadoop on servers in your own datacenter or on Amazon's EC2.

We will take a one hour break around noon for lunch.

Intermediate Training

Cloudera's Intermediate Hadoop Training builds on our basic training, and is appropriate for those who are already familiar with Hadoop basics and the MapReduce programming model.

Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam after advanced training on the following day.

Intermediate training focuses on importing data into Hadoop and building data processing pipelines. We'll cover more advanced topics such as Hive and Pig and show participants how to use each effectively.

During this all-day session, we will cover the following agenda with ample time for questions:

Lecture:

Augmenting Existing Systems with Hadoop

To introduce our intermediate training session, we'll take a step back and look at data systems more generally. Hadoop rarely replaces existing infrastructure, but rather enables you to do more with your data by providing a scalable batch processing system. This lecture helps you understand how it all fits together.

Lecture:

Best Practices for Data Processing Pipelines

Before Hadoop can crunch large volumes of data, you'll first need to get that data into Hadoop. This lecture will help you understand how to import different types of data from various sources into Hadoop for further analysis.

Exercise:

Importing Existing Databases with Sqoop

Sqoop is a command-line tool developed by Cloudera and contributed to the Hadoop project. It provides an easy way to import data from RDBMSs and enables you to work with that data directly using MapReduce, Hive, or Pig.
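Sqoop's own command-line usage is covered in the exercise itself; purely to illustrate the underlying idea, here is a toy Python sketch (an in-memory SQLite table stands in for the source RDBMS, and the table and data are invented) that flattens rows into the kind of delimited text records a MapReduce job can consume:

```python
import sqlite3

# A toy in-memory table standing in for the source RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])

def export_table(conn, table):
    """Dump each row as one comma-delimited text record per line --
    roughly the shape of a table import into HDFS."""
    rows = conn.execute(f"SELECT * FROM {table} ORDER BY rowid").fetchall()
    return [",".join(str(col) for col in row) for row in rows]

records = export_table(conn, "employees")
```

Sqoop adds the parts this sketch omits: parallel extraction, type mapping, and direct integration with HDFS, Hive, and Pig.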

Lecture:

Introduction to Pig

Pig is a high-level language for large-scale data analysis programs. Pig exposes many common MapReduce constructs in a simplified processing language, and is often used for ad-hoc analysis.

Exercise:

Working with Pig

In this exercise, we'll revisit some common tasks and see how you can accomplish them using Pig.

Lecture:

Introduction to Hive - A Data Warehouse for Hadoop

Hive is a powerful data warehousing application built on top of Hadoop which allows you to use SQL to access your data. This lecture will give an overview of Hive and the query language.

Exercise:

Working with Hive

This exercise will show you exactly how to work with Hive. We'll walk through importing data, creating tables, and making queries.

We will take a one hour break around noon for lunch.

Advanced Training

Cloudera's Advanced Hadoop Training completes the three-day training course and teaches advanced skills for debugging MapReduce programs and optimizing their performance. This session requires prior experience writing Hadoop programs, or attendance at the basic and intermediate training sessions. Attendees will look deeper into the Hadoop API and learn about programmatic tools that enable tighter integration between Hadoop programs and other systems, as well as higher parallel throughput.

At the end of the training session, those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam. It covers material from all three sessions.

During this all-day session, we will cover the following agenda with ample time for questions:


Lecture:

Debugging MapReduce programs

Debugging in the distributed environment is challenging. This lecture will expose you to best practices for program design to mitigate debugging challenges, as well as local testing tools and techniques for debugging at scale.

Lecture:

Advanced Hadoop API

In the basic training session, you learned how to get up and running writing Hadoop MapReduce programs in Java. This lecture probes deeper into the API, covering custom data types and file formats, direct HDFS access, intermediate data partitioning, and other tools such as the DistributedCache.

Lecture:

Advanced Algorithms

This lecture introduces some graph algorithms that can be adapted for your needs, as well as more involved examples like PageRank. We'll also look at strategies for implementing joins efficiently, and compare different techniques that are appropriate to different data models.
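To make the join discussion concrete, one widely used technique is the reduce-side join: mappers tag each record with its source table, and the reducer pairs up records that share a key. The following is an illustrative plain-Python simulation (table names and data are invented), not the course's reference implementation:

```python
from collections import defaultdict

def reduce_side_join(users, orders):
    """Reduce-side join: group records from both inputs by join key
    (the shuffle), then pair them up per key (the reduce)."""
    groups = defaultdict(lambda: {"users": [], "orders": []})
    for user_id, name in users:       # map phase, records tagged "users"
        groups[user_id]["users"].append(name)
    for user_id, item in orders:      # map phase, records tagged "orders"
        groups[user_id]["orders"].append(item)
    joined = []
    for user_id, g in sorted(groups.items()):   # reduce phase
        for name in g["users"]:
            for item in g["orders"]:
                joined.append((user_id, name, item))
    return joined

users = [(1, "Ada"), (2, "Grace")]
orders = [(1, "book"), (1, "pen"), (2, "lamp")]
result = reduce_side_join(users, orders)
```

When one input is small enough to fit in memory, a map-side join (broadcasting the small table to every mapper) avoids the shuffle entirely; choosing between the two is exactly the kind of trade-off this lecture compares.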

Exercise:

Optimizing MapReduce Programs

We'll use the Cloudera Training VM to work through an example where you write a MapReduce program and improve its performance using techniques explored earlier.

Exam:

Cloudera Certified Hadoop Developer Exam

The day will end with the CCHD exam for those wishing to document their Hadoop expertise. This test will assess your knowledge of all areas covered over the course of the three-day training.

We will take a one hour break around noon for lunch.


When & Where


New Horizons - Berlin
23 Zimmerstrasse
10969 Berlin
Germany

Monday, January 17, 2011 at 9:00 AM - Wednesday, January 19, 2011 at 5:00 PM (CET)


Organizer

Cloudera

Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.

