Next class starting
December 5, 2016
Video Conference link
Big Data Analytics & Hadoop Training
Mon, Dec 5, 2016 8:00 AM - 10:00 AM Pacific Standard Time
Please join my meeting from your computer, tablet or smartphone.
You can also dial in using your phone.
United States +1 (669) 224-3412
Access Code: 721-907-101
First GoToMeeting? Try a test session: http://help.citrix.com/getready
Training Session Details
There will be 8 online sessions of 2-3 hours each. Every session will include a presentation on theory, concepts and technology, followed by hands-on lab exercises.
Timings: 8 AM - 10 AM US Pacific time
Each session will be recorded and the recordings will be shared after each session with students who have paid for the training.
Desired but not required: exposure to, or working proficiency in, BI, SQL, scripting, handling and managing data and databases, and Excel.
Every student will be provided with a Microsoft Azure cloud account, where they will install Hortonworks Hadoop on cloud virtual machines. Students will carry out the hands-on lab exercises with instructor guidance.
Session 1: Big Data Basics
• An introduction to Big Data
• Why Big Data? Why now?
• The Three Dimensions of Big Data (Three Vs)
• Evolution of Big Data
• Big Data versus Traditional RDBMS Databases
• Big Data versus Traditional BI and Analytics
• Big Data versus Traditional Storage
• Key Challenges in Big Data adoption
• Benefits of adoption of Big Data
• Introduction to Big Data Technology Stack
• Apache Hadoop Framework
• Introduction to Microsoft HDInsight – Microsoft’s Big Data Service
• Creating Azure Storage Account
• Creating HDInsight Cluster
• Using services on HDInsight Cluster
Session 2: The Big Data Technology Stack
• Basics of Hadoop Distributed File System (HDFS)
• Basics of Hadoop Distributed Processing (MapReduce jobs)
• Loading files to Azure storage account
• Moving files across HDInsight Cluster
• Remote Access to Azure Storage Account and HDInsight Cluster
Session 3: Deep dive into Hadoop Storage System (HDFS) (1 Hour)
• Reading files with HDFS
• Writing files with HDFS
• Error Handling
• Accessing Hadoop configuration files using HDInsight Cluster
Session 4: Processing Big Data – MapReduce and YARN
• How MapReduce works
• Handling Common Errors
• Bottlenecks with MapReduce
• How YARN (MapReduceV2) works
• Difference between MR1 and MR2
• Error Handling
• Running a simple MapReduce application (word count)
• Running a custom MapReduce application (census data)
• Running MapReduce via PowerShell
• Running a MapReduce application using PowerShell
• Monitoring application status
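As a preview of the word count exercise listed above, the map, shuffle and reduce steps can be sketched in plain Python. This is only an illustrative, single-process sketch and not the lab code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical, and on a real cluster Hadoop distributes these steps across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group emitted values by key (done by the framework on a cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big", "Data"]  # hypothetical sample input
    print(word_count(sample))
```

Keeping the three steps in separate functions mirrors how a MapReduce job is split: only the map and reduce logic is written by the developer, while grouping by key is handled by the framework.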
Session 5: Big Data Development Framework
• Introduction to Hive
• Introduction to Pig
• Loading data into Hive
• Submitting Pig jobs using HDInsight
• Submitting Pig jobs via PowerShell
Session 6: Big Data Integration and Management
• Big Data Integration using PolyBase
• Big Data Management using Ambari
• Fetching HDInsight data into SQL
• Using Ambari for managing HDInsight cluster
Session 7: Big Data – BI and Reporting using Power BI
• Introduction to Power BI
• Usual workflow of Power BI
• Power BI Ecosystem
• Getting Data into Power BI
• Reports vs Dashboards
• Additional elements of Power BI Reports
• Fetching HDInsight Data into Power BI Desktop
• Data Modelling using Power BI Desktop
• Creating reports using Power BI Desktop
Session 8: PowerBI.com services – Deep dive
• Power BI Dashboards
• Natural Language Query
• Power BI Workspaces – Personal and Group Workspaces
• Sharing using OneDrive for Business
• Publishing reports to PowerBI.com
• Sharing reports using OneDrive for Business
End-to-End Use Case Implementation – Lab Exercise
• Use case – Healthcare Analytics using the Hadoop framework through Microsoft HDInsight and Power BI
Class Size: Maximum 22
1. Pre-registration is FREE. You will be able to attend the first few sessions for FREE subject to availability.
2. However, once the sessions with hands-on lab exercises begin, you will need to purchase a training ticket.
3. Each paying student gets access to a login account on the Microsoft Azure cloud, where they install Hadoop on a cloud virtual machine and perform hands-on lab exercises with instructor guidance. Two experienced Big Data and Hadoop instructors will support the students throughout the class.
ADVANTAGES OF PURCHASING A TRAINING TICKET:
1. Class recordings will be made available.
2. Post-class support.
3. Course material will be made available.
4. Cloud account on Microsoft Azure, with hands-on lab exercises under the guidance of two experienced Big Data and Hadoop instructors.
5. Career advancement and job placement assistance.
1. A 100% refund will be provided only if we do not hold the class, or if we reschedule the class and the new dates and timings don't work for you.
2. If the class is held as scheduled and you don't show up, or if you register, purchase a training ticket and then change your mind, we will not issue a refund.