$2,824.95

HDP Developer: Apache Pig and Hive - Hortonworks Official Curriculum

Event Information

Location

Hong Kong

Event description

COURSE OVERVIEW

This 4-day training course is designed for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, data analytics on Big Data with Pig and Hive, and an introduction to Spark Core and Spark SQL.

COURSE CONTENT

DAY 1: AN INTRODUCTION TO THE HADOOP DISTRIBUTED FILE SYSTEM

OBJECTIVES

  • Understanding Hadoop
  • The Hadoop Distributed File System
  • Ingesting Data into HDFS
  • The MapReduce Framework

LABS

  • Starting an HDP Cluster

  • Demonstration: Understanding Block Storage

  • Using HDFS Commands

  • Importing RDBMS Data into HDFS

  • Exporting HDFS Data to an RDBMS

  • Importing Log Data into HDFS Using Flume

  • Demonstration: Understanding MapReduce

  • Running a MapReduce Job

DAY 2: AN INTRODUCTION TO APACHE PIG

OBJECTIVES

  • Introduction to Apache Pig

  • Advanced Apache Pig Programming

LABS

  • Demonstration: Understanding Apache Pig

  • Getting Started with Apache Pig

  • Exploring Data with Apache Pig

  • Splitting a Dataset

  • Joining Datasets with Apache Pig

  • Preparing Data for Apache Hive

  • Demonstration: Computing Page Rank

  • Analyzing Clickstream Data

  • Analyzing Stock Market Data Using Quantiles

DAY 3: AN INTRODUCTION TO APACHE HIVE

OBJECTIVES

  • Apache Hive Programming

  • Using HCatalog

  • Advanced Apache Hive Programming

LABS

  • Understanding Hive Tables

  • Understanding Partition and Skew

  • Analyzing Big Data with Apache Hive

  • Demonstration: Computing NGrams

  • Joining Datasets in Apache Hive

  • Computing NGrams of Emails in Avro Format

  • Using HCatalog with Apache Pig

DAY 4: WORKING WITH SPARK CORE, SPARK SQL AND OOZIE

OBJECTIVES

  • Advanced Apache Hive Programming (Continued)

  • Hadoop 2 and YARN

  • Introduction to Spark Core and Spark SQL

  • Defining Workflow with Oozie

LABS

  • Advanced Apache Hive Programming

  • Running a YARN Application

  • Getting Started with Apache Spark

  • Exploring Apache Spark SQL

  • Defining an Apache Oozie Workflow

