Apache Spark delivers in-memory processing for big data and enables faster application development

Apache Spark is a general-purpose engine for large-scale data processing. It supports rapid application development for big data and allows for code reuse across batch, interactive, and streaming applications. The most popular use cases for Apache Spark include building data pipelines and developing machine learning models. The MapR Converged Data Platform is the choice for production Spark applications.

New to Apache Spark? Get the ebook "Getting Started with Apache Spark: From Inception to Production."

Key Features

  • Analytics on Consistent Data: The MapR Converged Data Platform enables data scientists to perform analytics on consistent data in both development and production environments through features such as mirroring and consistent snapshots.
  • Secure Multi-Tenant Applications: The MapR Converged Data Platform enables development of reliable and secure multi-tenant applications leveraging Apache Spark.
  • Run Streaming & NoSQL Workloads Together: The MapR Converged Data Platform enables the development of streaming and NoSQL applications on a single cluster. By using Spark Streaming, MapR Streams, and MapR-DB together, you can develop real-time operational applications that ingest data at high speeds.

Use Cases

  • Faster Batch Applications: You can now develop and deploy batch applications that run 10-100x faster in production environments with in-memory processing of data. Quantium uses Spark on the MapR Platform to decrease processing time by 92%, which represents a 12.5X increase in performance.
  • Complex ETL Data Pipelines: You can leverage the Spark stack to build complex ETL pipelines that speed up data ingestion and deliver superior performance. Razorsight leverages Spark on the MapR Platform to build a more efficient and cost-effective data pipeline, which enables it to deliver cloud-based predictive analytics faster to mobile and telco operators.
  • Advanced Analytics: You can leverage MLlib and GraphX to develop applications that combine the power of machine learning with graph technology. This can speed up application development and let data scientists test new hypotheses faster. Novartis uses Spark on the MapR Platform to integrate and analyze a variety of data to accelerate drug research.
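
As a quick check on the Quantium figure above: cutting processing time by 92% leaves 8% of the original run time, and 100 / 8 = 12.5, which is where the 12.5X figure comes from. A minimal Python sketch of that arithmetic follows; the function name `speedup_from_time_reduction` is hypothetical and is not part of Spark or the MapR Platform.

```python
def speedup_from_time_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in run time into a speedup factor.

    Hypothetical helper for illustration only: a job that takes
    (100 - reduction_pct)% of its original time runs
    100 / (100 - reduction_pct) times faster.
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    remaining_pct = 100.0 - reduction_pct  # share of the original run time left
    return 100.0 / remaining_pct


# The 92% reduction quoted in the use case above
print(speedup_from_time_reduction(92))  # 12.5
```

The same relationship shows why percentage reductions and "x times faster" figures diverge quickly: a 50% reduction is only a 2x speedup, while a 99% reduction is a 100x speedup.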
Customer Testimonial

Hear the CEO of Terbium Labs describe the company's use case and how it uses Spark on the MapR Platform.


Apache Spark 2.0.1 now available with new MapR Ecosystem Pack


Apache Spark Training


Give Your Enterprise a Lift


Sparkling new Spark distribution spurs MapR to reduce MapReduce


Getting Started with Apache Spark


Razorsight Launches New Predictive Analytics Solutions for Telecoms Running on MapR Platform


Persistent Storage for Enterprise-Grade Spark Applications


MapR Platform including Spark


Getting Started with Spark on MapR Sandbox


MapR Documentation