Introduction to MapReduce training for beginners in Ithaca, NY | Map Reduce...

Event Information

Date and Time

Location

Entirety Technology

Ithaca, NY 14850

Refund Policy

No Refunds

Event description

This course introduces distributed data processing and shows how to use MapReduce to process large amounts of data. It focuses on practical, hands-on exercises: students will learn to write MapReduce programs, and advanced features of MapReduce are covered as well.


Course Schedule


Prerequisite

Desired but not required: exposure to, or working proficiency in, Java and SQL.


Course Features

  • 4 weeks, 8 sessions, 16 hours of live instruction in total
  • Training material, instructor handouts and access to useful resources on the cloud are provided
  • Practical hands-on lab exercises on cloud workstations are provided
  • Actual code and scripts are provided
  • Real-life scenarios


Course Outline


1. Introduction to MapReduce

  • MapReduce Overview
  • MapReduce in Hadoop
  • History of MapReduce
  • MapReduce applications
  • Data Flow in MapReduce
  • Map and Reduce operations
  • Job submission flow of MapReduce
  • Map Operation
  • Job Initialization
  • Task Assignment
  • Job Completion
  • Job Scheduling
  • Job Failures
  • Shuffle and sort
  • Word Count Problem, Flow and Solution
  • MapReduce Algorithms
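
As a taste of the "Word Count Problem, Flow and Solution" topic above, here is a minimal sketch of the map, shuffle and sort, and reduce steps in plain Java. It is an illustration only, not the course's lab code: it uses no Hadoop classes, and the class name WordCountSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of the word count data flow; no Hadoop classes are used.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle and sort: group all emitted values by key, with keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Drive the three phases in order over a list of input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(List.of("the quick brown fox", "the lazy dog", "The fox")));
    }
}
```

In a real Hadoop job the framework performs the shuffle and sort between the Mapper and Reducer; here it is written out by hand so the data flow is visible in one place.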

2. MapReduce Types and Formats

  • Data Types
  • File Formats
  • Input Formats
  • Output Formats
  • Driver, Mapper and Reducer code explained
  • Configuring the development environment (Eclipse)
  • Writing unit tests
  • Running locally
  • Running on Cluster

3. Understanding MapReduce

  • Data Flow in MapReduce
  • MapReduce example
  • MapReduce Daemons
  • JobTracker
  • TaskTracker
  • Other phases in MapReduce
  • Data Flow in single, multiple and no reduce task

4. MapReduce with YARN

  • Hadoop Architecture
  • Problems with Hadoop 1.x and features of Hadoop 2.x
  • YARN MapReduce Application Execution Flow
  • YARN Workflow
  • Anatomy of MapReduce Program

5. Advanced MapReduce

  • Counters
  • Sorting
  • Input Splits in MapReduce
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • MRUnit
  • Reduce Join
  • Joins - Map Side and Reduce Side
  • Custom Input Format
  • Sequence Input Format
  • Side Data Distribution


Refund Policy

  • There are no refunds. All sales are final.









