This event has ended

Getting Started with RHadoop

Think Big Analytics

Thursday, April 25, 2013 from 9:00 AM to 4:30 PM (MST)

Ticket Information

Ticket Type     Sales End   Price     Fee
Full Program    Ended       $795.00   $0.00

Event Details

Getting Started with RHadoop is designed for data analysts skilled in R who want to apply their existing knowledge in the Big Data environment of Hadoop.

The class is taught by Jeffrey Breen, Principal at Think Big Analytics, and consists of a mixture of lecture and hands-on examples and exercises. Previous knowledge of Hadoop is not necessary, but you should be comfortable using R interactively from a command shell in addition to a GUI. We will work all examples on a Hadoop cluster (provided).


Overview & Introduction

  • R & Hadoop
  • RHadoop Package Overview
  • RHadoop Advantages
  • rmr2 Advantages
  • Installation and configuration
  • RHadoop Prerequisites
  • Downloading RHadoop

Using RHadoop


  • Installation
  • Function Overview

Example: Populate HDFS

Example: Checking the results


  • Installation
  • Installation test: Example 0
  • Function Overview

Example: wordcount: the “hello world” of Hadoop

  • code
  • mapper
  • reducer
  • combiner
  • submit job and fetch results
  • Exercise: get rid of the punctuation

Example: airports

  • about the data
  • selecting meaningful keys and values
  • writing an input formatter
  • mapper
  • reducer
  • submit job and fetch results
  • execution notes
  • Exercise: repeat analysis by year and airline

Example: user-based collaborative filtering

  • about the data
  • the algorithm
  • step 1: input formatter
  • step 1: mapper
  • step 1: reducer
  • step 2: reducer
  • submit the jobs and fetch results
  • Exercise: find nearest neighbors


  • Installation
  • Installation test: Example 0
  • Function Overview

Example: tweets

write Twitter status messages to HBase

Example: Twitter users

store user information for tweet authors

Wrap-up and Q&A

Have questions about Getting Started with RHadoop? Contact Think Big Analytics

When & Where


Denver, CO

Thursday, April 25, 2013 from 9:00 AM to 4:30 PM (MST)