This event has ended

Hadoop Training for Developers - Dallas - Mar 14-16

Cloudera

Monday, March 14, 2011 at 9:00 AM - Wednesday, March 16, 2011 at 5:00 PM (CDT)

Ticket Information

Ticket Type                            Sales End   Price       Fee
3 Days: Full Program + Certification   Ended       $1,999.00   $0.00

Event Details

Hands-On Labs

Throughout the course, hands-on labs help students build their knowledge and apply the concepts being discussed. By the end of the course, participants will be able to import and analyze their own data in Hadoop.

Labs include:

  • Importing flat-file data into HDFS
  • Running MapReduce jobs
  • Writing MapReduce code in Java, or using the Hadoop Streaming API
  • Importing data into HDFS from relational database management systems
  • Implementing an inverted index in Hadoop
  • Manipulating data with Hive and Pig
  • Creating pipelines of MapReduce jobs with Oozie

This three-day training course from Cloudera is for developers who want to learn to use Hadoop MapReduce to build powerful data processing applications.

You will learn:

  • How MapReduce and the Hadoop Distributed File System work
  • How to write MapReduce code in Java or other programming languages
  • What issues to consider when developing MapReduce jobs
  • How to implement common algorithms in Hadoop
  • Best practices for Hadoop development and debugging
  • How to leverage other projects such as Hive, Pig, Sqoop, and Oozie
  • Advanced Hadoop API topics required for real-world data analysis

Certification Exam

Following the training, attendees will have an opportunity to take the Cloudera Certified Hadoop Developer exam.

Course Pre-Requisites

This course is designed for developers with some programming experience (preferably Java). Existing knowledge of Hadoop is not required.


Course Contents

The course covers the following topics:

  • The Motivation For Hadoop
    • Problems with traditional large-scale systems
    • Requirements for a new approach
  • Hadoop: Basic Concepts
    • What is Hadoop?
    • The Hadoop Distributed File System
    • How MapReduce Works
    • Anatomy of a Hadoop Cluster
  • Writing a MapReduce Program
    • Examining a Sample MapReduce Program
    • Basic API Concepts
    • The Driver Code
    • The Mapper
    • The Reducer
    • Hadoop's Streaming API
  • The Hadoop Ecosystem
    • Hive and Pig
    • HBase
    • Flume
    • Other Ecosystem Projects
  • Integrating Hadoop Into The Workflow
    • Relational Database Management Systems
    • Storage Systems
    • Importing Data from RDBMSs With Sqoop
    • Importing Real-Time Data with Flume
  • Delving Deeper Into The Hadoop API
    • Using Combiners
    • The configure and close Methods
    • SequenceFiles
    • Partitioners
    • Counters
    • Directly Accessing HDFS
    • ToolRunner
    • Using The Distributed Cache
  • Common MapReduce Algorithms
    • Sorting and Searching
    • Indexing
    • Classification/Machine Learning
    • Term Frequency - Inverse Document Frequency
    • Word Co-Occurrence
  • Using Hive and Pig
    • Hive Basics
    • Pig Basics
  • Debugging MapReduce Programs
    • Testing with MRUnit
    • Logging
    • Other Debugging Strategies
  • Advanced MapReduce Programming
    • A Recap of the MapReduce Flow
    • Custom Writables and WritableComparables
    • The Secondary Sort
    • Creating InputFormats and OutputFormats
    • Pipelining Jobs With Oozie
  • Joining Data Sets in MapReduce Jobs
    • Map-Side Joins
    • Reduce-Side Joins
  • Graph Manipulation in Hadoop
    • Introduction to graph techniques
    • Representing Graphs in Hadoop
    • Implementing a sample algorithm: Single Source Shortest Path
  • The New Hadoop API
  • Cloudera Certified Hadoop Developer Exam

When & Where


Dallas MicroTek
5430 Lyndon B Johnson Fwy
Suite 300, Three Lincoln Center
Dallas, TX 75240-2601

Monday, March 14, 2011 at 9:00 AM - Wednesday, March 16, 2011 at 5:00 PM (CDT)


Organizer

Cloudera

Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.


 
