Tuesday, June 23, 2009 from 8:00 AM - 3:30 PM (ET)
Cloudera's Intermediate Hadoop Training builds on our Basic Hadoop Training, and is appropriate for those who are already familiar with Hadoop basics and the MapReduce programming model.
Those wishing to document their skills and receive the Cloudera Certified Hadoop Professional (CCHP) credential may take the certification exam immediately following intermediate training.
Intermediate training focuses on importing data into Hadoop and building data processing pipelines. We'll cover more advanced topics such as Hive and Pig and show participants how to use each effectively.
Cloudera will give a series of lectures, interleaved with practical, hands-on examples and exercises. Attendees must bring their own laptops with VMware Player (or Fusion for Mac) so they can follow along in a pre-configured virtual machine that Cloudera provides.
During this all-day session, we will cover the following agenda, with ample time for questions:
Lecture: Augmenting existing systems with Hadoop
To introduce our intermediate training session, we'll take a step back and look at data systems more generally. Hadoop rarely replaces existing infrastructure, but rather enables you to do more with your data by providing a scalable batch processing system. This lecture helps you understand how it all fits together.

Lecture: Best Practices for Data Processing Pipelines
Before Hadoop can crunch large volumes of data, you need to get that data into Hadoop. This lecture will help you understand how to import different types of data from various sources into Hadoop for further analysis.

Exercise: Importing existing databases with Sqoop
Sqoop is a command-line tool developed by Cloudera and contributed to the Hadoop project. It provides an easy way to import data from RDBMSs and enables you to work with that data directly using MapReduce, Hive, or Pig.

Lecture: Introduction to Pig
Pig is a high-level language for large-scale data analysis programs. Pig exposes many common MapReduce constructs in a simplified processing language, and is often used for ad-hoc analysis.

Exercise: Working with Pig
In this exercise, we'll revisit some common tasks and see how you can accomplish them using Pig.

Lecture: Introduction to Hive - A Data Warehouse for Hadoop
Hive is a powerful data warehousing application built on top of Hadoop that allows you to use SQL to access your data. This lecture will give an overview of Hive and the query language.

Exercise: Working with Hive
This exercise will show you exactly how to work with Hive. We'll walk through importing data, creating tables, and making queries.

Lunch will be provided around noon.
Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.