A few months ago, I created the first XML plugin for Apache Drill. The idea behind the plugin is simple: Since Apache Drill already has great support for JSON, why not convert the XML documents to JSON, and feed the information into the JSON driver for further processing and presentation in Apache Drill?
In this post, we are going to discuss building a real-time solution for credit card fraud detection.
It’s hard to believe it’s been a year since Apache Drill first became generally available on the MapR Converged Data Platform—yes, a full 365 days! This is just the beginning of the impact Apache Drill will have on big data analytics. Explore the infographic to see how Drill has been leveraged over the last year:
Perhaps you’re old enough to remember when the library was the place we went to learn. We foraged through card catalogs, encyclopedias and the Reader's Guide to Periodical Literature in hopes that we’d be able to understand what was going on in other people’s minds when they decided what went where.
In this blog post, I would like to share another, much less talked-about advantage that emerges from this strategy: a MapR cluster can naturally take advantage of the well-regarded Elasticsearch and Kibana stack to give cluster admins a near real-time view of their cluster’s health and performance.
Streaming data is now a big focus for many big data projects, including real-time applications, so there’s a lot of interest in excellent messaging technologies such as Apache Kafka or MapR Streams, which uses the Kafka 0.9 API.
What capabilities should you look for in a messaging system when you design the architecture for a streaming data project? Let’s start with a hypothetical IoT data aggregation example to illustrate specific business goals and the requirements they place on messaging technology and data architecture needed to meet those goals...
Streaming data is a hot topic these days, and Apache Spark is an excellent framework for streaming. In this blog post, I'll show you how to integrate custom data sources into Spark.
For all the talk about Big Data, most organizations are barely out of the starting blocks when it comes to exploiting it for business benefit. Gartner estimates that 85% of Fortune 500 companies are still unable to exploit Big Data for competitive advantage.