Latest

Posted on September 23, 2016 by Yvonne Chen

When I first started my internship, I wasn’t really sure what to expect. I knew the basics: MapR was a big data company, I was a technical marketing intern, and I would be doing quite a bit of competitive analysis. I originally found the position while searching for marketing internships in the Bay Area, and this one in particular jumped out at me when I read the job description, since it combined my interest in marketing with my hope of getting some experience in the tech industry.

Featured

Posted on September 6, 2016 by Carol McDonald

This post will help you get started using Apache Spark Streaming for consuming and publishing messages with MapR Streams and the Kafka API. Spark Streaming is an extension of the core Spark API that enables continuous data stream processing. MapR Streams is a distributed messaging system for streaming event data at scale. MapR Streams enables producers and consumers to exchange events in real time via the Apache Kafka 0.9 API.

Posted on September 22, 2016 by Raphaël Velfre

A very common use case for the MapR Converged Data Platform is collecting and analyzing data from a variety of sources, including traditional relational databases. Until recently, data engineers would build an ETL pipeline that periodically walks the relational database and loads the data into files on the MapR cluster, then perform batch analytics on that data.

Posted on September 21, 2016 by Ellen Friedman

In this week’s Whiteboard Walkthrough, Stephan Ewen, PMC member of Apache Flink and CTO of data Artisans, describes a valuable capability of Apache Flink stream processing: grouping events together that were ob

Posted on September 20, 2016 by Robin Moffatt

Apache Drill is an engine that can connect to many different data sources and provide a SQL interface to them. It's not just a wannabe SQL interface that trips over anything complex; it's a hugely functional one, with support for many built-in functions as well as windowing functions. Whilst it can connect to standard data sources that you'd be able to query with SQL anyway, like Oracle or MySQL, it can also work with flat files such as CSV or JSON, as well as Avro and Parquet formats.

Posted on September 15, 2016 by George Demarest

This blog post is the first in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer. In his book, Davi outlines the rationale and key challenges of securing big data systems and applications. He does so using some great anecdotes and with good humor, making the book a good read whether you’re a white/grey/black hat, cyber superhero, or even if you’re not a security expert at all.

Posted on September 14, 2016 by Neeraja Rentachintala

Today we are excited to announce the availability of Drill 1.8 on the MapR Converged Data Platform. As part of the Apache Drill community, we continue to deliver iterative releases of Drill, providing significant feature enhancements along with enterprise readiness improvements based on feedback from a variety of customer deployments.

Posted on September 13, 2016 by Karen Whipple

Six months ago, we launched the Converge Community in order to provide a seamless way for Hadoop and Spark developers, data analysts, and administrators to engage in technical discussions and share expertise that furthers the advancement of the big data community as a whole.

Posted on September 12, 2016 by Vikash Selvin

Elasticsearch and Kibana are widely used in the market today for data analytics; however, security is one aspect that was not initially built into the products. Since data is the lifeline of any organization today, it becomes essential that Elasticsearch and Kibana be “secured.” In this blog post, we will be looking at one of the ways in which authentication, authorization, and encryption can be implemented for them.

Posted on September 7, 2016 by Ellen Friedman

In this week's Whiteboard Walkthrough, Ellen Friedman, Solutions Consultant at MapR, describes what happens when certain fundamental big data capabilities are engineered together as a part of the same technology. This brief overview compares the converged data platform as a foundation for big data projects versus building solutions on a base of separate pieces.


Featured Author

Data Engineer, MapR
Mathieu is a Data Engineer on the MapR Professional Services team, and is based in the Asia-Pacific region.

Streaming Data Architecture: New Designs Using Apache Kafka and MapR Streams
