Converged Applications

Big data platforms are changing the way we manage data. Legacy systems often require throwing away older data, moving large data sets from one silo to another, or spending exorbitant amounts to handle growth. Those practices are fast becoming relics of the past. Scale, speed, and agility are front and center in modern data architectures designed for big data, while data integrity, security, and reliability remain critical goals. The notion of a “converged application” represents the next generation of business applications for today and the future.



Dr. Crystal Valentine, VP of technology strategy at MapR, talks about converged applications.

Converged applications are software applications that can simultaneously process both operational and analytical data, allowing real-time, interactive access to both current and historical data. This class of applications delivers real-time analytics, high-frequency decisioning, and other solution architectures that require immediate operations on large volumes of data.

Converged applications provide real-time access to large volumes of data in an efficient architecture to cost-effectively drive combined operational and analytical workloads on big data. They are often deployed in a modular architecture, especially as microservices that work together as a cohesive unit, not as monolithic processes in distinct data silos that require continual data movement. This architecture leads to greater responsiveness, better decisions, less complexity, lower cost, and lower risk.


Converging Analytical and Operational Workloads

[Figure: converged applications overview]

Converged Application Benefits

  • Higher value, immediate responses to events as they happen
  • Competitive advantage of delivering a new class of applications on historical data
  • No costly delays and overhead for moving data from operational to analytical systems, and vice versa
  • No added overhead of managing separate, unnecessary data silos
  • Reduced administrative overhead, lower risk of security gaps on data access
  • Use the right technology for the job, less coding and less complexity
  • Future-proof your deployment (including hardware) as data grows by simply adding more servers to the cluster
  • Gain value from all data, avoid throwing away valuable historical context from older data
  • Get faster results through large-scale parallelization


How to Use the Blueprint

The blueprint consists of a sample financial services application that serves as a tutorial and starting point for a converged application that includes high-speed streaming. Written for the MapR Converged Data Platform, the application (with included code examples written in Java, and data generation in Python) shows how to take advantage of the advanced streaming capabilities of MapR and how to create a service that scales predictably according to business needs.

To get started with the blueprint, review the overview below, then download the code from GitHub and follow the instructions in the README in the GitHub repo. You can install the MapR Converged Community Edition as a platform for running the blueprint; get started with the installation at mapr.com/download, or run the example on a single-node VM instance, available at mapr.com/sandbox.


Architecture of the Financial Services Example Application (Blueprint)


[Figure: converged applications overview]

The blueprint consists of a sample application and can serve as a useful architecture example for developing streaming applications.

The purpose of the application is to provide a service that moves ticks from the party making an offer (the sender) to those to whom the offer is extended (the recipients), and to enable interactive analytics on the resulting large stream of data. Both senders and recipients are customers of the service, and each will occasionally want to review their position: which offers they have made and which offers they have received.

Financial "tick" and trade data is ingested on the left side of the diagram. This consists of actual trades in the fixed-width New York Stock Exchange (NYSE) format, as well as simulated "Level 2" bid and ask data leading up to each trade. A microservice consumes the stream and provides fast indexing by sender.

Each entry of the input data has the format:

{time, sender, id, symbol, prices, ..., [recipient*]}

Each entry can have only one sender, but potentially many recipients.
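The entry format above can be sketched as a simple Java class. The field names follow the format shown, but the types (and any fields elided by "...") are assumptions for illustration, not the blueprint's actual data model.

```java
import java.util.List;

// Hypothetical sketch of one input entry. Field names follow the format
// above; the concrete types are assumptions, and fields elided by "..."
// in the format are omitted here.
public class TickEntry {
    public final long time;               // event timestamp
    public final String sender;           // exactly one sender per entry
    public final String id;               // offer/trade identifier
    public final String symbol;           // ticker symbol
    public final double[] prices;         // price levels
    public final List<String> recipients; // zero or more recipients

    public TickEntry(long time, String sender, String id, String symbol,
                     double[] prices, List<String> recipients) {
        this.time = time;
        this.sender = sender;
        this.id = id;
        this.symbol = symbol;
        this.prices = prices;
        this.recipients = recipients;
    }
}
```

The one-sender, many-recipients rule shows up directly in the types: a single `sender` field versus a `recipients` list.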

The application runs at a high rate of data processing (over 300,000 messages per second), and the provided Java source code shows how to develop an application that can sustain this level of throughput across an entire production environment, including how to partition topics and how to index data so that it is both persistent and able to support interactive queries.
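One common way to partition a topic is to hash a record key and take it modulo the partition count. The sketch below assumes the sender field is used as the partition key; the blueprint's actual partitioning scheme is in the GitHub repo, and this only illustrates the general hash-mod technique.

```java
// Minimal sketch of key-based partition assignment, assuming the sender
// field is the partition key. This is an illustration of the general
// technique, not the blueprint's actual partitioner.
public class PartitionBySender {
    // Deterministically map a sender key to one of numPartitions partitions.
    public static int partitionFor(String sender, int numPartitions) {
        // floorMod keeps the result non-negative even when hashCode() is negative
        return Math.floorMod(sender.hashCode(), numPartitions);
    }
}
```

Because the mapping is deterministic, all messages from a given sender land in the same partition, which keeps per-sender ordering and makes the sender-based indexing described above straightforward.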

To provide predictable scalability, multiple MapR Streams (Kafka API) consumers can be started within the application; they automatically load-balance across partitions, enabling scale as data rates increase. Using the provided indexing techniques, the stream itself can be queried directly as the "system of record."
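The idea of indexing consumed messages by sender so they can be queried interactively can be sketched as follows. This is a hypothetical in-memory illustration; the blueprint's real, persistent indexing lives in the GitHub repo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory sketch of indexing consumed messages by sender
// so they can be queried interactively. The blueprint's actual indexing
// is persistent; this only shows the shape of the idea.
public class SenderIndex {
    private final Map<String, List<String>> bySender = new HashMap<>();

    // Called once per consumed message: record the message under its sender.
    public void add(String sender, String message) {
        bySender.computeIfAbsent(sender, k -> new ArrayList<>()).add(message);
    }

    // Interactive query: all messages a given sender has produced.
    public List<String> offersFrom(String sender) {
        return bySender.getOrDefault(sender, List.of());
    }
}
```

In a consumer group, each consumer would maintain (or persist) such an index for the partitions it owns, so queries by sender go straight to the partition that holds all of that sender's messages.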

Ready to check it out? Take the next step: get a MapR cluster running for free, and visit the GitHub page to view the source code of the application.



Learn how the MapR Converged Data Platform can be used to run both operational and analytical workloads on the same cluster.