Technical Tips

Persistent Storage for Docker Containers | Whiteboard Walkthrough

In this week’s Whiteboard Walkthrough, Dale Kim, Sr. Director of Industry Solutions at MapR, describes how MapR addresses the challenge of providing a persistence tier to containers in big data settings. Dale describes a new technology to support Docker containers, the MapR Persistent Application Client Container, or PACC. The PACC lets you deploy containers anywhere, with security enabled, while also providing access to the MapR Converged Data Platform, which includes a NoSQL database and streaming message transport as well as files for the persistence layer.

Getting Started with MapR Client Container

The MapR Persistent Application Client Container (PACC) is a Docker-based container image that includes a container-optimized MapR client. The PACC provides secure access to MapR Converged Data Platform services, including MapR-FS, MapR-DB, and MapR Streams. The PACC makes it fast and easy to run containerized applications that access data in MapR.

Perfecting Lambda Architecture with Oracle Data Integrator (and Kafka / MapR Streams)

In this blog post, I'm going to show you how to configure MapR Streams (which is compatible with the Kafka API) on Oracle Data Integrator with Spark Streaming to create a true lambda architecture: a fast layer complementing the batch and serving layers. I've done this on MapR to kill two birds with one stone: the steps I show cover both MapR Streams and Kafka.

Seven Best Practices for Securing MapR

Recent reports suggest hackers are actively compromising insecure Hadoop deployments. These attacks appear to be targeting a service port (NameNode: 50070) that is not used by MapR, making instances of MapR not susceptible to this specific exploit. The underlying attack methodology, however, is simply to find and exploit internet-accessible ports that do not require authentication.

Getting Started with Kafka REST Proxy for MapR Streams

In this blog post, we describe how to use the Kafka REST Proxy to publish messages to and consume messages from MapR Streams. The REST Proxy is a great addition to the MapR Converged Data Platform, allowing any programming language to use MapR Streams. The Kafka REST Proxy, provided with the MapR Streams tools, can be used with MapR Streams (the default) as well as with Apache Kafka (in a hybrid mode). In this article, we will focus on MapR Streams.
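
As a rough illustration of what publishing through a REST proxy can look like, here is a minimal Python sketch that builds, but does not send, a publish request. It assumes the proxy follows the Kafka REST v1 conventions (a POST to `/topics/<topic>` with the `application/vnd.kafka.json.v1+json` content type) and that a MapR Streams topic is addressed as a stream path plus `:` plus the topic name, percent-encoded in the URL. The proxy address, the stream path, and the helper name `build_publish_request` are hypothetical and do not come from the article.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumed values for illustration only; none of them come from the article.
PROXY_URL = "http://localhost:8082"      # hypothetical REST Proxy address
TOPIC = "/apps/iot_stream:sensor-json"   # hypothetical topic: stream path + ":" + topic name


def build_publish_request(proxy_url, topic, values):
    """Build (but do not send) a POST that publishes JSON messages to a topic."""
    # The topic name contains "/" and ":", so percent-encode it for the URL path.
    url = f"{proxy_url}/topics/{quote(topic, safe='')}"
    # Each message is wrapped as one record with a JSON "value".
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    headers = {"Content-Type": "application/vnd.kafka.json.v1+json"}
    return Request(url, data=body, headers=headers, method="POST")


request = build_publish_request(PROXY_URL, TOPIC, [{"temp": 21.5}])
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually send the message; that step is left out here because it needs a running proxy.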

Deploying a Secure Mini MapR Cluster with Docker on a Single AWS Instance

If you want to try out the MapR Converged Data Platform to see its unique big data capabilities but don’t have a cluster of hardware immediately available, you still have a few other options. For example, you can spin up a MapR cluster in the cloud using multiple node instances on one of our IaaS partners (Amazon, Azure, etc.).

Connecting Pentaho Data Integration to MapR Using Apache Drill

Pentaho Data Integration (PDI) provides the ETL capabilities that facilitate the process of capturing, cleansing, and storing data in a uniform and consistent format that is accessible and relevant to end users and IoT technologies. Apache Drill is a schema-free SQL-on-Hadoop engine that lets you run SQL queries against data sets in a variety of formats and stores, e.g. JSON, CSV, Parquet, and HBase.

