Hadoop Training for Developers - Berlin - January
Cloudera
Monday, January 17, 2011 at 9:00 AM - Wednesday, January 19, 2011 at 5:00 PM (CET)
Berlin, Germany
Event Details
Information for VAT (UST) Waiver
If your company is purchasing these tickets and can provide a UST ID number, you may handle VAT reporting through your own internal processes. To do so, apply the discount code "VAT_waiver" and enter your UST ID number during registration.
Cloudera offers a three-day training program for developers who want to learn how to use Hadoop MapReduce to build powerful data processing applications.
The overall schedule is:
Day 1: Basic training
Day 2: Intermediate
Day 3: Advanced and certification
Agendas for each of the three days are provided below.
This course will be delivered in English.
Basic Training
Cloudera's Basic Hadoop Training provides a solid foundation for those seeking to understand large-scale data processing with MapReduce and Hadoop. This session is appropriate for attendees who are new to Hadoop and may never have used the software before. It is also appropriate for new users who are seeking a deeper understanding of the core principles, the programming API, and basic MapReduce algorithms.
Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam immediately following advanced training. It covers material from all three sessions.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Thinking at Scale: Introduction to Hadoop
You know your data is big – you found Hadoop. What implications must you consider when working at this scale? This lecture addresses common challenges and general best practices for scaling with your data.

Lecture: MapReduce and HDFS
These tools provide the core functionality to allow you to store, process, and analyze big data. This lecture "lifts the curtain" and explains how the technology works. You'll understand how these components fit together and build on one another to provide a scalable and powerful system.

Exercise: Getting Started with Hadoop
If you'd like a more hands-on experience, this is a good time to download our VM and kick the tires a bit. In this activity, using the provided instructions, you'll get a feel for the tools and run some sample jobs.

Lecture: The Hadoop Ecosystem
An introduction to other projects surrounding Hadoop, which complete the greater ecosystem of available large-data processing tools.

Lecture: The Hadoop MapReduce API
Learn how to get started writing programs against Hadoop's API.

Lecture: Introduction to MapReduce Algorithms
Writing programs for MapReduce requires analyzing problems in a new way. This lecture shows how some common functions can be expressed as part of a MapReduce pipeline.

Exercise: Writing MapReduce Programs
Now that you're familiar with the tools, and have some ideas about how to write a MapReduce program, this exercise will challenge you to perform a common task when working with big data - building an inverted index. More importantly, it teaches you the basic skills you need to write your own, more interesting data processing jobs.

Lecture: Hadoop Deployment
Once you understand the basics for working with Hadoop and writing MapReduce applications, you'll need to know how to get Hadoop up and running for your own processing (or at least, get your ops team pointed in the right direction). Before ending the day, we'll make sure you understand how to deploy Hadoop on servers in your own datacenter or on Amazon's EC2.
We will take a one-hour break around noon for lunch.
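To give a feel for the inverted-index exercise listed in the agenda, here is a minimal sketch of the idea in plain Java. It is not course material and does not use the Hadoop API: the class `InvertedIndexSketch` and its methods are hypothetical names chosen for illustration, and the map, shuffle, and reduce phases are simulated in memory on a single machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative only: simulates map -> shuffle -> reduce in memory, without Hadoop.
public class InvertedIndexSketch {

    // Map phase: emit one (word, documentId) pair per word occurrence.
    static List<Map.Entry<String, String>> map(String docId, String text) {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, docId));
            }
        }
        return pairs;
    }

    // Reduce phase: collapse all document ids seen for one word
    // into a sorted, de-duplicated set.
    static SortedSet<String> reduce(List<String> docIds) {
        return new TreeSet<>(docIds);
    }

    static Map<String, SortedSet<String>> buildIndex(Map<String, String> documents) {
        // Shuffle phase: group the emitted document ids by word.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (Map.Entry<String, String> pair : map(doc.getKey(), doc.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        // Run the reducer once per distinct word.
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            index.put(entry.getKey(), reduce(entry.getValue()));
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> docs = Map.of(
            "doc1", "Hadoop stores big data",
            "doc2", "MapReduce processes big data");
        buildIndex(docs).forEach((word, ids) -> System.out.println(word + " -> " + ids));
    }
}
```

In a real Hadoop job the grouping step is performed by the framework between the map and reduce phases; here it is written out by hand only to make the data flow visible.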
Intermediate Training
Cloudera's Intermediate Hadoop Training builds on our basic training, and is appropriate for those who are already familiar with Hadoop basics and the MapReduce programming model.
Those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam after advanced training on the following day.
Intermediate training focuses on importing data into Hadoop and building data processing pipelines. We'll cover more advanced topics such as Hive and Pig and show participants how to use each effectively.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Augmenting Existing Systems with Hadoop
To introduce our intermediate training session, we'll take a step back and look at data systems more generally. Hadoop rarely replaces existing infrastructure, but rather enables you to do more with your data by providing a scalable batch processing system. This lecture helps you understand how it all fits together.

Lecture: Best Practices for Data Processing Pipelines
In order for Hadoop to crunch large volumes of data, first you'll need to get that data into Hadoop. This lecture will help you understand how to import different types of data from various sources into Hadoop for further analysis.

Exercise: Importing Existing Databases with Sqoop
Sqoop is a command-line tool developed by Cloudera and contributed to the Hadoop project. It provides an easy way to import data from RDBMSs and enables you to work with that data directly using MapReduce, Hive, or Pig.

Lecture: Introduction to Pig
Pig is a high-level language for large-scale data analysis programs. Pig exposes many common MapReduce constructs in a simplified processing language, and is often used for ad-hoc analysis.

Exercise: Working with Pig
In this exercise, we'll revisit some common tasks and see how you can accomplish them using Pig.

Lecture: Introduction to Hive - A Data Warehouse for Hadoop
Hive is a powerful data warehousing application built on top of Hadoop which allows you to use SQL to access your data. This lecture will give an overview of Hive and the query language.

Exercise: Working with Hive
This exercise will show you exactly how to work with Hive. We'll walk through importing data, creating tables, and making queries.
We will take a one-hour break around noon for lunch.
Advanced Training
Cloudera's Advanced Hadoop Training completes the three-day training course and teaches advanced skills for debugging MapReduce programs and optimizing their performance. This session requires prior experience writing Hadoop programs, or attendance at the basic and intermediate training sessions. Attendees will look deeper into the Hadoop API and learn about programmatic tools that facilitate tighter integration between Hadoop programs and other systems, as well as higher parallel throughput.
At the end of the training session, those wishing to document their skills and receive the Cloudera Certified Hadoop Developer (CCHD) credential may take the certification exam. It covers material from all three sessions.
During this all-day session, we will cover the following agenda with ample time for questions:
Lecture: Debugging MapReduce Programs
Debugging in the distributed environment is challenging. This lecture will expose you to best practices for program design to mitigate debugging challenges, as well as local testing tools and techniques for debugging at scale.

Lecture: Advanced Hadoop API
In the basic training session, you learned how to get up and running writing Hadoop MapReduce programs in Java. This lecture probes deeper into the API, covering custom data types and file formats, direct HDFS access, intermediate data partitioning, and other tools such as the DistributedCache.

Lecture: Advanced Algorithms
This lecture introduces some graph algorithms that can be adapted for your needs, as well as more involved examples like PageRank. We'll also look at strategies for implementing joins efficiently, and compare different techniques that are appropriate to different data models.

Exercise: Optimizing MapReduce Programs
We'll use the Cloudera Training VM to work through an example where you write a MapReduce program and improve its performance using techniques explored earlier.

Exam: Cloudera Certified Hadoop Developer Exam
The day will end with the CCHD exam for those wishing to document their Hadoop expertise. This test will assess your knowledge of all areas covered over the course of the three-day training sessions.
We will take a one-hour break around noon for lunch.
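As a small illustration of the "intermediate data partitioning" topic named in the Advanced Hadoop API lecture, the sketch below shows the hash-modulo idea that decides which reducer receives a given key. It is written in plain Java without the Hadoop API; the class `PartitionSketch` and the method `partitionFor` are hypothetical names, and the formula is an assumption modeled on the common hash-partitioning approach rather than a statement of what the course teaches.

```java
// Illustrative only: assigns each key to one of numReducers partitions.
public class PartitionSketch {

    // Mask off the sign bit so that keys with a negative hashCode
    // still map to a valid, non-negative partition number.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        for (String key : new String[] {"hadoop", "hive", "pig", "sqoop"}) {
            System.out.println(key + " -> reducer " + partitionFor(key, numReducers));
        }
    }
}
```

Because the result depends only on the key, all values that share a key reach the same reducer. A custom partitioner replaces this function when keys need to be grouped or balanced differently.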
When & Where
New Horizons - Berlin
23 Zimmerstrasse
10969 Berlin
Germany
Monday, January 17, 2011 at 9:00 AM - Wednesday, January 19, 2011 at 5:00 PM (CET)
Organizer
Cloudera
Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.