Apache Spark Data Science 2 Days - Melbourne
A$1,955.24 – A$2,415.04

Event Information

Date and Time

Location

Servian

Level 11, 45 William St

Melbourne, VIC 3000

Australia

Description

Note: this course covers Day 2 and Day 3 of the Databricks Apache Spark for Machine Learning and Data Science course (Spark 301).

This hands-on, 2-day Apache Spark training targets experienced data scientists who want to perform data analysis at scale using Apache Spark. The course covers exploratory data analysis (EDA), building machine learning models, evaluating models, and performing cross-validation.

The course is written using Scala 2.11, Python 2.x and Spark 2.0. All hands-on labs are run on Databricks Community Edition, a free cloud-based Spark environment. This lets participants spend their time using open source Apache Spark to solve real problems, rather than dealing with the complex issues of setting up Spark cluster installations. Labs can easily be ported to run on open source Apache Spark after class.

Intended Audience

  • Data scientists
  • Software engineers with some machine learning background

Requirements

All participants need to bring a laptop with an up-to-date version of Chrome or Firefox (Internet Explorer and Safari are not supported). Participants should familiarize themselves with basic Scala syntax before the training and should have some understanding of machine learning.

Learning Objectives

  • Learn to apply various supervised and unsupervised models, including regression, classification, and clustering.
  • Train analytical models with Spark MLlib’s DataFrame-based estimators including: linear regression, decision trees, logistic regression, and k-means.
  • Use Spark MLlib transformers to perform pre-processing on a dataset prior to training, including: standardization, normalization, one-hot encoding, and binarization.
  • Build Spark MLlib pipelines that combine transformations, estimations, and evaluation of analytical models.
  • Evaluate model accuracy by dividing data into training and test datasets and computing metrics using Spark MLlib evaluators.
  • Tune training hyper-parameters by integrating cross-validation into Spark MLlib Pipelines.
  • Compute using RDD-based Spark MLlib functionality not present in the MLlib DataFrame API, by converting DataFrames to RDDs and applying RDD transformations and actions. (Optional Module)
  • Troubleshoot and tune machine learning algorithms in Spark.
  • Understand and build a general machine learning pipeline for Spark.

Modules

  • First Machine Learning Example
  • N-Grams
  • Regression
  • LDA Topic Modeling
  • Decision Trees
  • K-Means Clustering
  • Graphs, GraphX and GraphFrames

https://databricks.com/training/courses/apache-spark-for-machine-learning-and-data-science

About Servian


Servian is one of the most recognised consulting firms in Australia and New Zealand. Servian has over 170 consultants in Sydney, Melbourne, Auckland, Adelaide and Brisbane working across financial services, telecommunications, retail, the public sector and many other industry domains.

Servian is a Databricks Consulting Partner providing advisory, consulting and managed services in Apache Spark™ across Australia and New Zealand.

As the exclusive Databricks-certified Training Partner in the region, Servian offers both public and private corporate classes on Apache Spark™. Spark classes offered by Servian are delivered by Databricks-certified instructors using Databricks course material.


About Databricks


The creators of Apache Spark™ spun out of UC Berkeley to start Databricks in 2013. At Databricks we continue to grow the Spark project via software development, roadmap planning, and fostering the community. We have deeply integrated our Spark engineering efforts and our training program. The lead committers on Spark help design, create, and review our training curriculum and courseware. When you learn about Spark from Databricks you are learning from the Authority on Spark.


Duration


2 Days, Full Time (9AM to 5PM)


