This event has ended

Getting Started with RHadoop

Think Big Analytics

Thursday, April 25, 2013 from 9:00 AM to 4:30 PM (MST)

Ticket Information

Ticket Type     Sales End   Price     Fee
Full Program    Ended       $795.00   $0.00

Event Details

Getting Started with RHadoop is designed for data analysts skilled in R who want to apply their existing knowledge in the Big Data environment of Hadoop.

The class is taught by Jeffrey Breen, Principal at Think Big Analytics, and combines lecture with hands-on examples and exercises. Previous knowledge of Hadoop is not necessary, but you should be comfortable using R interactively from a command shell as well as from a GUI. We will work through all examples on a Hadoop cluster (provided).


Overview & Introduction

  • R & Hadoop
  • RHadoop Package Overview
  • RHadoop Advantages
  • rmr2 Advantages
  • Installation and configuration
  • RHadoop Prerequisites
  • Downloading RHadoop

Using RHadoop

rhdfs


  • Installation
  • Function Overview

Example: Populate HDFS

Example: Checking the results

rmr2


  • Installation
  • Installation test: Example 0
  • Function Overview

Example: wordcount: the “hello world” of Hadoop

  • code
  • mapper
  • reducer
  • combiner
  • submit job and fetch results
  • Exercise: get rid of the punctuation

Example: airports

  • about the data
  • selecting meaningful keys and values
  • writing an input formatter
  • mapper
  • reducer
  • submit job and fetch results
  • execution notes
  • Exercise: repeat analysis by year and airline

Example: user-based collaborative filtering

  • about the data
  • the algorithm
  • step 1: input formatter
  • step 1: mapper
  • step 1: reducer
  • step 2: reducer
  • submit the jobs and fetch results
  • Exercise: find nearest neighbors

rhbase


  • Installation
  • Installation test: Example 0
  • Function Overview

Example: tweets

write Twitter status messages to HBase

Example: Twitter users

store user information for tweet authors

Wrap-up and Q&A

Have questions about Getting Started with RHadoop? Contact Think Big Analytics
