In this week's Whiteboard Walkthrough, Jorge Geronimo, Solutions Architect at MapR, explains how, with a single line of code, you can create a replica of a MapR data stream within the same cluster or in another cluster, even one in another part of the world. Jorge also describes multi-master replication for streaming data and how MapR Streams' unique capability for geo-distributed replication with preserved offsets offers advantages for working with streaming data.
Many organizations have invested in big data technologies such as Hadoop and Spark. But these investments only address how to gain deeper insights from more diverse data. They do not address how to create action from those insights.
Forrester has identified an emerging class of software, insight platforms, that combines data, analytics, and insight execution to drive action using a big data fabric.
In this presentation, our guest, Forrester Research VP and Principal Analyst Brian Hopkins, will:
In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, explains in detail how to use streaming IoT sensor data from handsets and devices, as well as cell tower data, to detect anomalies. He takes us from best practices for data architecture, including the advantages of multi-master writes with MapR Streams, through analysis of the telecom data using clustering methods to discover normal and anomalous behaviors.
In this Whiteboard Walkthrough, Parth Chandra, Chair of the PMC for the Apache Drill project and member of the MapR engineering team, describes how the Apache Drill SQL query engine reads data in Parquet format and shares some best practices for getting maximum performance from Parquet.
Nick Amato, Director Technical Marketing at MapR, explains the advantages of a converged environment for streaming applications vs. running these services in separate clusters.
In this Whiteboard Walkthrough, MapR’s Chief Application Architect, Ted Dunning, explains the move from state to flow and shows how it works in a financial services example. Ted describes the revolution underway in moving from a traditional system, with multiple programs built around a shared database, to a new flow-based system that instead uses a shared state queue in the form of a message stream built with technology such as Apache Kafka or MapR Streams. This new architecture lets decisions be made locally and supports a microservices-style approach.
Apache Spark has become the de facto compute engine of choice for data engineers, developers, and data scientists because of its ability to run multiple analytic workloads with a single compute engine. Spark is speeding up data pipeline development, enabling richer predictive analytics, and bringing a new class of applications to market.
However, is Spark alone sufficient for developing converged applications? How can you speed up the development of applications that span Spark and other frameworks such as Kafka, NoSQL databases, and more?
Thank you for using the MapR Converged Community Edition. We hope you have enjoyed great success with your big data projects with the MapR Platform.
Want even more? We recently released version 5.2 of the MapR Converged Data Platform with even more new features. You are welcome to deploy the free Community Edition of the MapR Converged Data Platform in a production environment and take advantage of the free community support. (Paid commercial support is also available.)
Big data presents both enormous challenges and incredible opportunities for companies in today’s competitive environment. To deal with the rapid growth of global data, companies have turned to Hadoop to help them perform real-time search, obtain fast and efficient analytics, and predict behaviors and trends. In this session, we’ll demonstrate how we successfully leveraged Hadoop and its ecosystem components to build a converged data infrastructure to meet these needs.
During this session, we will discuss: