Hadoop Summit 2010: Hive Training

Cloudera

Friday, July 2, 2010 from 9:00 AM to 5:00 PM (PDT)

Santa Clara, CA

Ticket Information

Ticket Type           Sales End   Price     Fee
Regular Registration  Ended       $849.00   $0.00

Event Details

Hive makes Hadoop accessible to users who already know SQL, and Sqoop enables you to automatically import data from existing RDBMS sources.

Cloudera's one-day course on Hive and Sqoop is designed for Hive users and developers with a basic understanding of how Hadoop works. Cloudera's one-day Introduction to Hadoop course provides the necessary background.

Please note that Hive and Sqoop are under active development and may offer additional functionality by the time of this session. In that case, we may modify the agenda below to include the most relevant information available at the time.

We'll alternate between instructional sessions and hands-on labs to ensure participants leave ready to import and analyze their own data with Hive.

We'll cover the following topics:

  • Intro to Hive
    • What is Hive/Hadoop?
  • Getting data into Hive
    • Creating tables
    • Data types
    • Load data
    • SerDe
    • External tables
    • Importing and exporting data between HDFS and RDBMS with Sqoop
    • Hands-on Exercise: loading data into Hive
  • Hive Architecture
    • Hive interfaces
    • Hive architecture
    • The Hive CLI
  • HiveQL
    • SQL vs. HiveQL
    • SELECT
    • Functions
    • GROUP BY
    • Custom map/reduce scripts
    • Subqueries
    • Joins
    • Inserting
    • Hands-on Exercise: Writing queries in HiveQL
  • Query Execution
    • Types of query plans
    • EXPLAIN
    • Join execution
    • Using hints
    • Hands-on Exercise: EXPLAIN
  • Partitioning and Bucketing
    • Creating partitions
    • Loading data into partitions
    • Bucketing
    • Sampling
    • Hands-on Exercise: Using partitioning and bucketing
  • Best Practices
    • Configuring Hive
    • Hive's metastore
    • Handling data in Hive
    • Other recommendations
  • Troubleshooting
    • The JobTracker UI
    • Logging
    • Problems with Derby
    • Using the mailing list

When & Where



Hyatt Regency (same venue as Hadoop Summit)
5101 Great America Parkway
Santa Clara, CA 95054

Organizer

Cloudera

Cloudera brings Hadoop to enterprise users. We provide a certified distribution based on the most recent stable release from Apache, online and live training, as well as commercial support.


 
