Cloudera Training for Apache Hive and Pig - June 30-July 1

Cloudera

Thursday, June 30, 2011 at 9:00 AM - Friday, July 1, 2011 at 5:00 PM (PDT)

San Jose, CA

Ticket Information

Ticket Type            Sales End   Price       Fee
Regular Registration   Ended       $1,695.00   $0.00

Event Details

Apache Hive and Apache Pig provide higher-level languages built on top of Hadoop's MapReduce paradigm.

Hive makes Hadoop accessible to users who already know SQL; Pig's language resembles popular scripting languages.

Cloudera's two-day course on Hive and Pig is designed for people who have a basic understanding of how Hadoop works and want to use these languages to analyze their data.

Please note that Hive and Pig are under active development and may offer additional functionality by the time of this session. If so, we may modify the agenda below to include the most relevant information available at that time.

We'll alternate between instructional sessions and hands-on labs so that participants leave ready to import and analyze their own data with Hive and Pig.

We'll cover the following topics:

  • Intro to Hive and Pig
    • What is Hadoop?
    • Where do Hive and Pig fit in?
  • Getting data into Hive
    • Creating tables
    • Data types
    • Load data
    • SerDe
    • External tables
    • Importing and Exporting Data between HDFS and RDBMS with Sqoop
    • Hands-on Exercise: loading data into Hive
  • Hive Architecture
    • Hive interfaces
    • Hive architecture
    • The Hive CLI
  • HiveQL
    • SQL vs. HiveQL
    • SELECT
    • Functions
    • GROUP BY
    • Custom map/reduce scripts
    • Subqueries
    • Joins
    • Inserting
    • Hands-on Exercise: Writing queries in HiveQL
  • Query Execution
    • Types of query plans
    • EXPLAIN
    • Join execution
    • Using hints
    • Hands-on Exercise: EXPLAIN
  • Partitioning and Bucketing
    • Creating partitions
    • Loading data into partitions
    • Bucketing
    • Sampling
    • Hands-on Exercise: Using partitioning and bucketing
  • Best Practices for Hive
    • Configuring Hive
    • Hive's metastore
    • Handling data in Hive
    • Other recommendations
  • Troubleshooting Hive
    • The JobTracker UI
    • Logging
    • Problems with Derby
    • Using the mailing list
  • Getting Started with Pig development
    • Loading and displaying data
    • Basic data filters
    • Pig Schemas
    • Hands-On Exercise
  • Pig Latin in depth
    • Pig data types
    • More Advanced Dataset Filtering
    • Hands-On Exercise
    • Pig Expressions and Functions
    • Grouping and Sorting Data
    • Hands-On Exercise
    • Joining Multiple Datasets
    • Validating Datasets
    • Hands-On Exercise
    • Storing Data
  • User-Defined Functions
    • Using functions in Pig
    • Hands-On Exercise
  • Best Practices for Pig
    • Achieving Optimal Pig Performance in Production

When & Where



San Jose - ExecuTrain
2025 Gateway Pl.
Suite 390
San Jose, CA 95110

Thursday, June 30, 2011 at 9:00 AM - Friday, July 1, 2011 at 5:00 PM (PDT)


Organizer

Cloudera

Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, and commercial support.


 
