This event has ended

Hadoop World: Analyzing data with Hive & Pig


Wednesday, October 13, 2010 at 9:00 AM - Thursday, October 14, 2010 at 5:00 PM (EDT)

New York, United States

Ticket Information

Ticket Type            Sales End   Price       Fee
Regular Registration   Ended       $1,695.00   $0.00
Groups (5+)            Ended       $1,395.00   $0.00

Event Details

Hive and Pig are high-level languages built on top of Hadoop's MapReduce paradigm.

Hive makes Hadoop accessible to users who already know SQL; Pig is similar to popular scripting languages.

Cloudera's two-day course on Hive and Pig is designed for people who have a basic understanding of how Hadoop works and want to use these languages to analyze their data. Cloudera's one-day Introduction to Hadoop course provides all the necessary background.

Please note that Hive and Pig are under active development and may offer additional functionality by the time of this session. If so, we may modify the agenda below to include the most relevant information available at that time.

This course will alternate between instructional sessions and hands-on labs to ensure participants leave ready to import and analyze their own data with Hive and Pig.

We'll cover the following topics:

  • Intro to Hive and Pig
    • What is Hadoop?
    • Where do Hive and Pig fit in?
  • Getting data into Hive
    • Creating tables
    • Data types
    • Load data
    • SerDe
    • External tables
    • Importing and Exporting Data between HDFS and RDBMSs with Sqoop
    • Hands-on Exercise: loading data into Hive
  • Hive Architecture
    • Hive interfaces
    • Hive architecture
    • The Hive CLI
  • HiveQL
    • SQL vs. HiveQL
    • SELECT
    • Functions
    • GROUP BY
    • Custom map/reduce scripts
    • Subqueries
    • Joins
    • Inserting
    • Hands-on Exercise: Writing queries in HiveQL
  • Query Execution
    • Types of query plans
    • Join execution
    • Using hints
    • Hands-on Exercise: EXPLAIN
  • Partitioning and Bucketing
    • Creating partitions
    • Loading data into partitions
    • Bucketing
    • Sampling
    • Hands-on Exercise: Using partitioning and bucketing
  • Best Practices for Hive
    • Configuring Hive
    • Hive's metastore
    • Handling data in Hive
    • Other recommendations
  • Troubleshooting Hive
    • The JobTracker UI
    • Logging
    • Problems with Derby
    • Using the mailing list
  • Getting Started with Pig development
    • Loading and displaying data
    • Basic data filters
    • Pig Schemas
    • Hands-On Exercise
  • Pig Latin in depth
    • Pig datatypes
    • More Advanced Dataset Filtering
    • Hands-On Exercise
    • Pig Expressions and Functions
    • Grouping and Sorting Data
    • Hands-On Exercise
    • Joining Multiple Datasets
    • Validating Datasets
    • Hands-On Exercise
    • Storing Data
  • Best Practices for Pig
    • Achieving Optimal Pig Performance in Production

When & Where

One New York Plaza
31st Floor
New York, NY 10004




Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.

