About this course
This course is the third in the Apache Spark series. It covers four Apache Spark libraries: Spark Streaming, Spark SQL, Spark MLlib, and Spark GraphX. It describes the benefits of the Apache Spark unified platform and shows how to build a data pipeline application using these libraries. The concepts are taught using scenarios in Scala that also form the basis of the hands-on labs.
Right for you?
- For application developers
Prerequisites for success in the course:
- DEV 361 Build and Monitor Apache Spark Applications
- Basic to intermediate Linux knowledge, including:
  - The ability to use a text editor, such as vi
  - Familiarity with basic commands such as mv, cp, ssh, grep, cd, and useradd
- Knowledge of application development principles
- A Linux, Windows, or Mac OS computer with the MapR Sandbox installed (for the on-demand course)
- Connection to a Hadoop cluster via SSH and web browser (for the ILT and vILT courses)
- Knowledge of functional programming
- Knowledge of Scala or Python
- Basic fluency with SQL
- ESS 100 – Introduction to Big Data
This course helps prepare you for the MCSD – MapR Certified Spark Developer certification exam.
Introduction to Apache Spark Data Pipelines
Create an Apache Spark Streaming Application
Use Apache Spark GraphX
Use Apache Spark MLlib