This one-day course is for data engineers, analysts, and architects; software engineers; IT operations staff; and technical managers who want a brief, hands-on overview of Apache Spark.
The course covers Spark’s core APIs, the basic internals of the framework, SQL and other high-level data access tools, and Spark’s streaming capabilities and machine learning APIs. Each topic combines slides and lecture content with hands-on use of a Spark cluster through a web-based notebook environment.
After taking this class, you will be able to:
- Experiment with use cases for Spark and Databricks, including extract-transform-load operations, data analytics, data visualization, batch analysis, machine learning, graph processing, and stream processing.
- Identify Spark and Databricks capabilities appropriate to your business needs.
- Communicate with team members and engineers using appropriate terminology.
- Build data pipelines and query large data sets using Spark SQL and DataFrames.
- Execute and modify extract-transform-load (ETL) jobs to process big data using the Spark API, DataFrames, and Resilient Distributed Datasets (RDDs).
- Analyze Spark jobs using the administration UIs and logs inside Databricks.
- Find answers to common Spark and Databricks questions using the documentation and other resources.
Topics covered:
- Spark Overview
- RDD Fundamentals
- Spark SQL and DataFrames
- Spark Job Execution
- Intro to Spark Streaming
- Machine Learning Basics
Who should attend:
- Data Analysts, Engineers, Architects
- Software Developers
- Technical Managers
Servian is a Databricks Consulting Partner providing advisory, consulting and managed services in Apache Spark™ across Australia and New Zealand.
As the exclusive Databricks-certified training partner in the region, Servian offers both public and private corporate classes on Apache Spark™. These classes are delivered by Databricks-certified instructors using Databricks course material.