$2,124.95

HDP Analyst: Data Science - Hortonworks Official Curriculum

Event Information

Date and Time

Location

AXA Tower

8 Shenton Way

Singapore, Singapore 068811

Singapore

Event description

COURSE OVERVIEW

This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. It covers tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.

COURSE CONTENT

DAY 1: AN INTRODUCTION TO HADOOP AND DATA SCIENCE

OBJECTIVES

  • Using Hadoop for Data Science

  • The Hadoop Distributed File System

  • The MapReduce Framework

  • Hadoop 2 and YARN

  • Machine Learning from Data

LABS

  • Setting up the Lab Environment

  • Using HDFS Commands

  • Demonstration: Understanding MapReduce

  • Using Apache Mahout for Machine Learning

DAY 2: AN INTRODUCTION TO APACHE PIG AND PYTHON

OBJECTIVES

  • Introduction to Apache Pig

  • Python Programming

  • Analyzing Data with Python

  • Running Python on Hadoop

  • Machine Learning Algorithms

LABS

  • Getting Started with Apache Pig

  • Using the IPython Notebook

  • Demonstration: Understanding the NumPy Package

  • Demonstration: The Pandas Library

  • Performing Data Analysis with Python

  • Interpolating Data Points

  • Defining User Defined Functions in Python

  • Streaming Python with Apache Pig

  • Exploring Data with Apache Pig

  • Demonstration: Classification with Scikit-Learn

  • Computing K-Nearest Neighbor

  • Generating a K-Means Clustering

DAY 3: MACHINE LEARNING ALGORITHMS

OBJECTIVES

  • Machine Learning Algorithms Continued

  • Natural Language Processing

  • Apache Spark MLlib

  • Taking Data Science to Production

LABS

  • Demonstration: POS Tagging Using a Decision Tree

  • Using the Python Natural Language Toolkit

  • Classifying Text Using Naïve Bayes

  • Using Spark Transformations and Actions

  • Using Spark MLlib

  • Creating a Spam Classifier Using Spark MLlib
