Hands-on introduction to the new Apache Spark 2.0 and Scala on Azure
€250 – €900
On Skype

Event description


Leapfrog your competition, gain Apache Spark 2.0 skills.

The newly released Spark 2.0 enables workshop participants to build unified big data applications, combining machine learning, batch, streaming, and interactive analytics on all their datasets. With Spark, developers can write sophisticated distributed and parallel applications that enable faster, better decisions and real-time actions across a wide variety of use cases, architectures, and industries.

Gain a Competitive Advantage from Ecosystem Mastery

Apache Spark 2.0 is the next-generation successor to Hadoop MapReduce. Spark is a powerful, open source processing engine for large datasets, optimized for speed, ease of use, and sophisticated analytics. The Spark framework supports streaming data processing and complex, iterative algorithms, enabling applications to run up to 100x faster than traditional Hadoop MapReduce programs.

Hands-On Practice

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Spark ecosystem, learning topics such as:

  • Using the Spark shell for interactive data analysis
  • How Spark parallelizes task execution
  • RDDs, DataFrames, and Datasets
  • Writing Spark applications in Scala
  • Running Spark with cluster managers such as Spark Standalone and Hadoop YARN
  • Applying machine learning to data at rest and in motion
  • Processing streaming data with Spark
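As a taste of the interactive analysis covered above, here is a minimal sketch of a spark-shell session (in the shell, the `spark` SparkSession and `sc` SparkContext are predefined; the input file and column names are hypothetical):

```scala
// RDD API: distribute a local collection across the cluster
val rdd = sc.parallelize(1 to 1000, numSlices = 8)
val sumOfSquares = rdd.map(n => n.toLong * n).reduce(_ + _)

// DataFrame API: the structured, optimized entry point in Spark 2.0
val people = spark.read.json("people.json")   // hypothetical input file
people.filter("age > 21").groupBy("city").count().show()
```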

Some of the hands-on labs will help you master the following Microsoft Azure architecture.

Audience & Prerequisites

This course is best suited for developers, data analysts, and engineers with prior knowledge of and experience in Scala, Java, R, or Python. Course examples and exercises are presented in Scala, so working knowledge of at least one of these programming languages is required.

The instructor's Spark class notes are available at Mastering Apache Spark.

Early-bird fee (until Nov 15th): $999. Regular fee: $2,500. For more information, email registration@valueamplify.com.


What you will gain from the workshop:

  • Understanding of the benefits of Spark in the big data ecosystem
  • Ability to provision and configure your own Spark production cluster
  • Mastery of Spark fundamentals: RDDs, partitions, jobs, stages, and tasks
  • Understanding of the roles of the DAGScheduler, TaskScheduler, and SchedulerBackends
  • Monitoring of Spark applications using Spark's web UI, SparkListeners, and log analysis
  • Data analytics using Spark SQL
  • Training and running machine learning models using Spark MLlib
  • Managing streaming data using Spark Streaming
  • Understanding of the security and RPC communication layers used to mix Python, R, and other applications



The Elements of Apache Spark’s Architecture – 4 hours

LEARNING / HANDS-ON (depending on the audience)

- RDD and DataFrame, Dataset (Spark 2.0), Structured Streaming (Spark 2.0)

- Jobs, Stages, Tasks, Shuffling

- DAGScheduler, TaskScheduler and SchedulerBackends

- Spark Modules (Spark SQL and Spark MLlib, less about Spark Streaming and Spark GraphX)

- Spark and cluster managers – Hadoop YARN, Apache Mesos and Spark Standalone

- Deployment Modes (client vs cluster)
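The jobs/stages/shuffling material above can be sketched in a few lines: narrow transformations stay in one stage, a shuffle starts a new one, and an action triggers the job. This assumes an existing SparkSession `spark`; the input file is hypothetical:

```scala
// Narrow transformations (flatMap, map): no data movement, same stage
val words = spark.sparkContext.textFile("input.txt")
  .flatMap(_.split("\\s+"))
  .map(w => (w, 1))

// reduceByKey requires a shuffle, so the DAGScheduler cuts a stage boundary here
val counts = words.reduceByKey(_ + _)

// The action triggers one job with two stages; watch it in the web UI at port 4040
counts.collect()
```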

Monitoring Spark Apps (using web UI, SparkListeners and log analysis) – 3 hours


- web UI

- SparkListeners (including developing custom SparkListeners)

- Log analysis
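For the custom-SparkListener exercise, a minimal sketch looks like this (the listener class name is our own; the callback types are from `org.apache.spark.scheduler`):

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

// A custom SparkListener that logs job lifecycle events to stdout
class JobLoggingListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    println(s"Job ${jobStart.jobId} started with ${jobStart.stageInfos.size} stage(s)")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"Job ${jobEnd.jobId} finished: ${jobEnd.jobResult}")
}

// Register it on an existing SparkContext:
// sc.addSparkListener(new JobLoggingListener)
```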

Spark Setup and Your First Spark Application – 1 hour


- Setting Up Deployment Environment

- Developing Spark SQL Applications using Scala, sbt, and IntelliJ IDEA
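A minimal `build.sbt` for the setup step might look like the following (version numbers are assumptions; Spark 2.0 is built against Scala 2.11):

```scala
// build.sbt — minimal sbt build for a Spark SQL application
name := "my-first-spark-app"
version := "0.1.0"
scalaVersion := "2.11.8"

// "provided" because the Spark runtime supplies these jars on the cluster
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided"
```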



Introduction to Scala (and sbt) – 2 hours


- Developing Scala Applications introducing Scala Standard API

- Working with Files

- Scala Collection API

- Customizing sbt projects (using plugins)
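The Scala standard API, files, and collections topics above can be previewed in a short sketch (the file name is hypothetical):

```scala
import scala.io.Source

// Scala collection API: group, count, and rank words in memory
val sample  = List("spark", "scala", "sbt", "spark")
val counts  = sample.groupBy(identity).mapValues(_.size)  // word frequencies
val topWord = counts.maxBy(_._2)._1                       // most frequent word

// Working with files: each line becomes an element of a collection
// val lines = Source.fromFile("notes.txt").getLines().toList
```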

Developing Spark SQL Applications using Datasets – 3 hours

HANDS-ON (Based on Pre-Assessment)

- Working with Structured Datasets in CSV and JSON files

- Using Dataset API

- Using User-Defined Functions (UDF)
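The three bullets above fit into one sketch: reading structured JSON into a typed Dataset, filtering with the Dataset API, and applying a UDF. File names, the case class, and columns are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

case class Person(name: String, age: Long)

val spark = SparkSession.builder.appName("datasets-demo").getOrCreate()
import spark.implicits._

// Read a structured JSON file into a typed Dataset[Person]
val people = spark.read.json("people.json").as[Person]
val adults = people.filter(_.age >= 18)

// A user-defined function usable from the DataFrame API
val initial = udf((name: String) => name.take(1).toUpperCase)
adults.select($"name", initial($"name").as("initial")).show()
```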



Developing Machine Learning Pipelines using Spark MLlib – 3 hours


- Create your first ML Pipeline

- Train a Logistic Regression model

- Using Random Forest and Classification algorithms
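A first ML Pipeline training a logistic regression model, as in the bullets above, can be sketched as follows; it assumes a SparkSession `spark` and a DataFrame `training` with hypothetical "text" and "label" columns:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

// Feature stages: split text into words, then hash words into feature vectors
val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")

// Estimator stage: logistic regression on the hashed features
val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.01)

// Chain all stages into a single Pipeline estimator
val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
// val model = pipeline.fit(training)   // fits all stages in order
// model.transform(testData).show()     // applies the whole pipeline to new data
```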

Spark Security

- Secured web UI

- Secured RPC

- Hadoop YARN

Early-bird fee (until Nov 15th): $999. Regular fee: $2,500.
Email registration@valueamplify.com or visit www.valueamplify.com.

If the class doesn't reach a minimum number of participants, you will be refunded.


Jacek Laskowski

Developer and trainer for Apache Spark, Scala, sbt, and Hadoop YARN, with experience in Apache Kafka, Apache Hive, Apache Mesos, Akka, and Docker.

See the Spark Class Notes at Mastering Apache Spark.

Spark & Scala Workshops taught

  • Toronto
  • Mississauga
  • Plymouth Meeting
  • Montreal
  • London

Feedback from the class

Feedback from IMS Health employees in Plymouth Meeting, PA, on the Spark/Scala workshop: "Jacek, you are a great teacher."

Instructional Designer

Adj. Prof. Giuseppe Mascarella

16 years at Microsoft as a manager and trainer at MTC and MCS.

Teaches Social Media Analytics at Florida Atlantic University (FAU).

MS in Industrial Engineering, Major: Statistical Quality Control

Taught Machine Learning Recommenders, Churn and Predictive Maintenance at SQL PASS Analytics events.

Recording of a session on Recommenders

Recording of a session on customer churn predictions with Azure ML
