Carol has extensive experience as a developer and architect building complex, mission-critical applications in the Banking, Health Insurance and Telecom industries. As a Java Technology Evangelist at Sun Microsystems, Carol traveled all over the world speaking at Sun Tech Days, JUGs, companies, and conferences. She is a recognized speaker in Java communities.
This post will help you get started using Apache Spark Streaming for consuming and publishing messages with MapR Streams and the Kafka API. Spark Streaming is an extension of the core Spark API that enables continuous data stream processing. MapR Streams is a distributed messaging system for streaming event data at scale. MapR Streams enables producers and consumers to exchange events in real time via the Apache Kafka 0.9 API.
Building a robust, responsive, secure data service for healthcare is tricky. For starters, healthcare data lends itself to multiple models: document representation for patient profile views or updates; graph representation to query relationships between patients, providers, and medications; and search representation for advanced lookups. This post describes how stream-first architectures can solve these challenges and looks at how this approach has been implemented at Liaison Technologies.
This post is the first in a series where we will review examples of how Joe Blue, a Data Scientist in MapR Professional Services, assisted MapR customers in identifying new data sources and applying machine learning algorithms in order to better understand their customers. The first example in the series is an advertising customer 360°; the next examples will cover banking and healthcare customer 360°.
Random forests are among the most successful machine learning models for classification. In this blog post, I’ll help you get started using Apache Spark’s spark.ml random forests to classify bank loan credit risk.
This post will use Apache Spark SQL and DataFrames to query, compare and explore S&P 500, Exxon and Anadarko Petroleum Corporation stock prices.
Standards and incentives for digitizing and sharing healthcare data, along with improvements and decreasing costs in storage and parallel processing on commodity hardware, are causing a big data revolution in healthcare, with the goal of better care at lower cost.
This post discusses building a real-time solution for credit card fraud detection.
This post will show how to integrate Apache Spark Streaming, MapR-DB, and MapR Streams for fast, event-driven applications.
This post will help you get started using Apache Spark GraphX with Scala on the MapR Sandbox. GraphX is the Apache Spark component for graph-parallel computations, built upon a branch of mathematics called graph theory. It is a distributed graph processing framework that sits on top of the Spark core.
Decision trees are widely used for the machine learning tasks of classification and regression. In this blog post, I’ll help you get started using Apache Spark’s MLlib machine learning decision trees for classification.