MapR Ships Eighth Release of Complete Apache Spark Stack on MapR Converged Data Platform
San Jose, CA

Free Spark training uptake dominates coursework enrollment and certification

MapR Technologies, Inc., provider of the industry’s only Converged Data Platform, today announced the immediate availability of Apache Spark 1.6.1 on the MapR Converged Data Platform, making it the eighth release of the full Spark stack available to MapR customers. Additionally, the free, complete online Spark On Demand Training (ODT) courses offered through MapR Academy have achieved the highest course enrollment rate since the ODT program’s initial launch.

“We have seen significant customer adoption of Spark for building data pipelines and advanced analytics,” said Anoop Dawar, vice president of product management, Spark and Hadoop, MapR Technologies. “MapR has fully supported the Spark stack for two years – longer than any other vendor in this industry. Based on customer feedback, MapR provides early preview releases so data scientists and developers can try cutting-edge features, and then follows them up with a GA release for production deployments.”

Spark continues to attract significant interest from developers, and 30% of course registrants have already become MapR Certified Spark Developers. This industry credential validates a developer’s technical knowledge, skills, and ability to use Spark in an enterprise environment to process large datasets.

Apache Spark version 1.6.1 on the MapR Converged Data Platform features:

  • Performance gains in the core Spark engine

With automatic memory management in Spark 1.6.1, the split between execution memory and storage memory adjusts dynamically based on workload characteristics. Execution memory can now borrow available memory from the storage region and vice versa.
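
The borrowing behavior described here can be pictured with a small toy model. The sketch below is illustrative only and is not Spark code: the class name UnifiedMemoryPool, its methods, and the megabyte figures are all hypothetical, and it models nothing beyond one side using memory the other side has left free. It does not model eviction of cached data or any other policy the real engine applies.

```python
class UnifiedMemoryPool:
    """Toy model (not Spark code) of one pool shared by execution and storage."""

    def __init__(self, total_mb, storage_region_mb):
        if not 0 <= storage_region_mb <= total_mb:
            raise ValueError("storage region must fit inside the pool")
        self.total_mb = total_mb
        # Nominal share of the pool for each side; hypothetical numbers.
        self.region = {
            "storage": storage_region_mb,
            "execution": total_mb - storage_region_mb,
        }
        self.used = {"storage": 0, "execution": 0}

    def free_mb(self):
        """Memory not currently used by either side."""
        return self.total_mb - self.used["storage"] - self.used["execution"]

    def borrowed_mb(self, kind):
        """How far one side has grown past its nominal region."""
        return max(0, self.used[kind] - self.region[kind])

    def acquire(self, kind, mb):
        """Grant a request if enough free memory exists anywhere in the pool."""
        if mb <= self.free_mb():
            self.used[kind] += mb
            return True
        return False

    def release(self, kind, mb):
        """Return memory to the pool, never dropping below zero."""
        self.used[kind] -= min(mb, self.used[kind])


pool = UnifiedMemoryPool(total_mb=1000, storage_region_mb=500)
print(pool.acquire("execution", 700))  # True: execution grows past its 500 MB region
print(pool.borrowed_mb("execution"))   # 200
print(pool.acquire("storage", 400))    # False: only 300 MB is still free
print(pool.acquire("storage", 300))    # True
```

With a fixed split, the 700 MB execution request in this example would have been refused even though the storage side was idle. Under the shared pool it succeeds, and the same holds in reverse once execution releases its memory.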

  • Persistence of machine learning pipelines

Spark 1.6.1 extends machine learning persistence beyond individual models to the entire pipeline, including transformers and estimators. A complete workflow, pipeline and models together, can be saved and reloaded without writing custom export or import code.

  • Dataset API

Spark 1.6.1 introduces a new experimental interface, the Dataset API, which extends the DataFrames API. Datasets rely on encoders that can be used in both Scala and Java, with Python support to be added in future releases. The biggest benefit of the Dataset API is reduced memory usage, because it can create a more optimal in-memory layout when caching datasets.

About MapR Technologies

Headquartered in San Jose, Calif., MapR provides the industry’s only Converged Data Platform, which enables customers to harness the power of big data by combining real-time analytics with operational applications to improve business outcomes. With MapR, enterprises have an unparalleled data management platform for undertaking digital transformation initiatives to achieve a competitive edge. World-class companies have realized more than five times their return on investment using MapR. Amazon, Cisco, Google, Microsoft, SAP and other leading businesses are part of the global MapR partner ecosystem. For more information, visit www.mapr.com.

Media Contacts

Beth Winkowski
MapR Technologies, Inc.
(978) 649-7189
bwinkowski@maprtech.com

Kim Pegnato
MapR Technologies, Inc.
(781) 620-0016
kpegnato@maprtech.com

www.mapr.com/company/press-releases