This event has ended

Cloudera Training for Apache Hive and Pig - NYC - Feb 6-7

Cloudera

Monday, February 6, 2012 at 9:00 AM - Tuesday, February 7, 2012 at 5:00 PM (EST)

Ticket Information

Ticket Type                  Sales End   Price       Fee
Early Bird ($100 savings)    Ended       $1,695.00   $0.00
Regular Registration         Ended       $1,795.00   $0.00

Event Details

Apache Hive and Apache Pig provide higher-level languages that run on top of Hadoop's MapReduce framework.

Hive makes Hadoop accessible to users who already know SQL; Pig is similar to popular scripting languages.

Cloudera's two-day course on Hive and Pig is designed for people who have a basic understanding of how Hadoop works and want to use these languages to analyze their data.

We'll alternate between instructional sessions and hands-on labs so that participants leave ready to import and analyze their own data with Hive and Pig.

We'll cover the following topics:

  • Intro to Hive and Pig
    • What is Hadoop?
    • Where do Hive and Pig fit in?
  • Getting data into Hive
    • Creating tables
    • Data types
    • Load data
    • SerDe
    • External tables
    • Importing and Exporting Data between HDFS and RDBMS with Sqoop
    • Hands-on Exercise: loading data into Hive
  • Hive Architecture
    • Hive interfaces
    • Hive architecture
    • The Hive CLI
  • HiveQL
    • SQL vs. HiveQL
    • SELECT
    • Functions
    • GROUP BY
    • Custom map/reduce scripts
    • Subqueries
    • Joins
    • Inserting
    • Hands-on Exercise: Writing queries in HiveQL
  • Query Execution
    • Types of query plans
    • EXPLAIN
    • Join execution
    • Using hints
    • Hands-on Exercise: EXPLAIN
  • Partitioning and Bucketing
    • Creating partitions
    • Loading data into partitions
    • Bucketing
    • Sampling
    • Hands-on Exercise: Using partitioning and bucketing
  • Best Practices for Hive
    • Configuring Hive
    • Hive's metastore
    • Handling data in Hive
    • Other recommendations
  • Troubleshooting Hive
    • The JobTracker UI
    • Logging
    • Problems with Derby
    • Using the mailing list
  • Getting Started with Pig development
    • Loading and displaying data
    • Basic data filters
    • Pig Schemas
    • Hands-On Exercise
  • Pig Latin in-depth
    • Pig datatypes
    • More Advanced Dataset Filtering
    • Hands-On Exercise
    • Pig Expressions and Functions
    • Grouping and Sorting Data
    • Hands-On Exercise
    • Joining Multiple Datasets
    • Validating Datasets
    • Hands-On Exercise
    • Storing Data
  • User-Defined Functions
    • Using functions in Pig
    • Hands-On Exercise
  • Best Practices for Pig
    • Achieving Optimal Pig Performance in Production

When & Where


MicroTek - NYC
90 Broad St.
11th Floor
New York, NY 10004-2205

Organizer

Cloudera

Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, and commercial support.