MapR is pleased to announce support for event-driven microservices on the MapR Converged Data Platform. In this blog post, I’d like to explain what this means, and how it fits into our bigger idea of “convergence.” Microservices are simple, single-purpose applications that work in unison via lightweight communications, such as data streams. They allow you to more easily manage segmented efforts to build, integrate, and coordinate your applications in ways that have traditionally been impossible with monolithic applications.
This post will help you get started using Apache Spark Streaming for consuming and publishing messages with MapR Streams and the Kafka API. Spark Streaming is an extension of the core Spark API that enables continuous data stream processing. MapR Streams is a distributed messaging system for streaming event data at scale. MapR Streams enables producers and consumers to exchange events in real time via the Apache Kafka 0.9 API.
Get an introduction to streaming analytics, which gives you real-time insight from captured events and big data.
When I first started my internship, I wasn’t really sure what to expect. I knew the basics—MapR was a big data company, I was a technical marketing intern, and I would be doing quite a bit of competitive analysis. I originally found the position while searching for marketing internships in the Bay Area, and this one in particular popped out at me when I read the job description, since it intertwined my interest in marketing with my hope of getting some experience in the tech industry.
A very common use case for the MapR Converged Data Platform is collecting and analyzing data from a variety of sources, including traditional relational databases. Until recently, data engineers would build an ETL pipeline that periodically walks the relational database and loads the data into files on the MapR cluster, then perform batch analytics on that data.
In this week’s Whiteboard Walkthrough, Stephan Ewen, PMC member of Apache Flink and CTO of data Artisans, describes a valuable capability of Apache Flink stream processing: grouping events together that were ob
Apache Drill is an engine that can connect to many different data sources and provide a SQL interface to them. It's not just a wanna-be SQL interface that trips over anything complex - it's a hugely functional one, including support for many built-in functions as well as windowing functions. Whilst it can connect to standard data sources that you'd be able to query with SQL anyway, like Oracle or MySQL, it can also work with flat files such as CSV or JSON, as well as Avro and Parquet formats.
This blog post is the first in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer. In his book, Davi outlines the rationale and key challenges of securing big data systems and applications. He does so using some great anecdotes and with good humor, making the book a good read whether you’re a white/grey/black hat, cyber superhero, or even if you’re not a security expert at all.
Today we are excited to announce the availability of Drill 1.8 on the MapR Converged Data Platform. As part of the Apache Drill community, we continue to deliver iterative releases of Drill, providing significant feature enhancements along with enterprise readiness improvements based on feedback from a variety of customer deployments.
Six months ago, we launched the Converge Community in order to provide a seamless way for Hadoop and Spark developers, data analysts, and administrators to engage in technical discussions and share expertise that furthers the advancement of the big data community as a whole.