Partners Blog Posts

Posted on January 4, 2017 by Sandra Wagner

Pentaho Data Integration (PDI) provides the ETL capabilities that facilitate the process of capturing, cleansing, and storing data. It stores that data in a uniform and consistent format, which makes the data accessible and relevant to end users and IoT technologies. Apache Drill is a schema-free SQL-on-Hadoop engine that lets you run SQL queries against data sets in a variety of formats, such as JSON, CSV, Parquet, and HBase.

Posted on December 19, 2016 by Bryan Smith

In my first post, I showed how you might quickly deploy a Drill-enabled cluster to the Azure cloud using the MapR template available in the Azure Marketplace. In my second post, I showed how you might get that Drill-enabled cluster to query an Azure Storage account as well as an Azure SQL Database. In this post, I want to focus on using this cluster as a data source with Power BI, a data discovery tool that’s popular with users of Microsoft technologies.

Posted on December 12, 2016 by Bryan Smith

In my last post, I deployed a MapR cluster to the Azure cloud using the template available through the Azure Marketplace. My goal in doing this was to get a Drill-enabled cluster up and going in Azure as quickly as possible. My emphasis on Azure indicates that I am probably making use of the Microsoft cloud for a broader range of activities than just running this one cluster.

Posted on October 27, 2016 by James Sun

MapR has worked closely with Azure to develop sandboxes that enable users to do a proof of concept with the MapR Converged Data Platform. These sandboxes, which are pre-loaded and preconfigured with the MapR software and the required supporting operating system, can be launched on the Azure Marketplace portal.

Posted on October 18, 2016 by James Sun

If you’ve been keeping tabs on all the great product enhancements that have been coming out of MapR, you will know that the 5.2 version of the MapR Converged Data Platform went GA this summer. It takes a few cycles to make the platform available on the AWS marketplace, largely due to the testing efforts required.

Posted on September 22, 2016 by Raphaël Velfre

A very common use case for the MapR Converged Data Platform is collecting and analyzing data from a variety of sources, including traditional relational databases. Until recently, data engineers would build an ETL pipeline that periodically walks the relational database and loads the data into files on the MapR cluster, then perform batch analytics on that data.

Posted on January 25, 2016 by Ranjit Lingaiah

In the wide column data model of MapR-DB, all rows are stored by a row key, column family, column qualifier, value, and timestamps. In the current version, the row key is the only field that is indexed, which fits the common pattern of queries based on the row key.

Posted on November 10, 2015 by Nick Amato

This blog describes how to get an instance of the MapR-DB Document Database Developer Preview image running on Amazon AWS using one of the pre-configured AMI images supplied by MapR. With this AMI, you can start writing JSON-based applications on MapR-DB using the open source Open JSON Application Interface, or OJAI.

Posted on September 17, 2015 by James Sun

With the advent of container technology like Docker and application resource management platforms such as Apache Mesos, enterprise customers are looking at these technologies very seriously as they promise much shorter development cycles and highly scalable product deployment.

Posted on July 22, 2015 by Abizer Adenwala

As a follow-up to my previous post on MapR-DB, I want to describe how to index MapR-DB table data in near real-time into Elasticsearch on Amazon Web Services (AWS) Elastic Compute Cloud (EC2).

Posted on July 8, 2015 by Andy Lerner

Teradata Connector for Hadoop (TDCH) is a key component of Teradata’s Unified Data Architecture for moving data between Teradata and Hadoop. TDCH invokes a MapReduce job on the Hadoop cluster to push data to or pull data from Teradata databases, with each mapper moving a portion of the data, in parallel across all nodes, for very fast transfers.

Posted on July 7, 2015 by David Tucker

I’m very pleased to announce the release of a custom EMR bootstrap action to deploy Apache Drill on a MapR cluster. MapR is the only commercial Hadoop distribution available for Amazon’s Elastic MapReduce service (EMR), and this addition allows EMR users to easily deploy and evaluate the powerful Drill query engine.

Posted on June 19, 2015 by Uli Bethke

Did you know you can run Apache Drill on your laptop? This is great news for business analysts who need to explore complex and semi-structured data. Let's look at a particular example.

Posted on January 9, 2015 by Nick Amato

The folks over at the Transaction Processing Performance Council (TPC) have been busy. The TPC benchmarks (such as TPC-C, TPC-D and TPC-H) are the industry standard for benchmarking transaction processing systems that touch upon a broad range of our daily lives, from tracking customer orders and optimizing inventory in warehouses, to supporting critical, real-time business decisions. These benchmarks are the standard by which these types of systems have been measured since their initial release in 1992, and they have been a key factor in research, innovation and performance improvements in relational database systems.

Posted on December 16, 2014 by Na Yang

Nearly one year ago, the Hadoop community began to embrace Apache Spark as a powerful batch processing engine. Today, many organizations and projects are augmenting their Hadoop capabilities with Spark. As part of this trend, the Apache Hive community is working to add Spark as an execution engine for Hive. The Hive-on-Spark work is being tracked by HIVE-7292, which is one of the most popular JIRAs in the Hadoop ecosystem. Furthermore, three weeks ago, the Hive-on-Spark team offered the first demo of Hive on Spark.

Posted on December 5, 2014 by Will Ochandarena

I often get asked, “What is the easiest way to get hands-on experience with MapR?” The best way is to try the MapR Sandbox, a single-node MapR cluster that you can run on your laptop. However, Hadoop clusters are never built with just one server, and some MapR features require multiple nodes, or even multiple clusters. To get hands-on with a MapR installation that more closely resembles what you might deploy on hardware, I suggest you deploy a MapR cluster in the Amazon cloud, using the MapR Installer. This blog post will walk you through that process.

Posted on July 22, 2014 by Jon Posnik

Fireworks from the July 4th holiday seem like a distant memory, but the virtual fireworks continue to spark (pun intended) within the MapR partner ecosystem. A new sandbox from Talend for the MapR Distribution, the successful launch of our App Gallery, and expanded support for Apache Spark with partners like Databricks show great momentum with our technology partners.

Posted on June 4, 2014 by Will Ochandarena

Last time you bought a smartphone, what factors did you consider? You probably first evaluated the phone itself, like how well the camera could capture your kid’s special moments, or if there is enough storage to hold the full Rolling Stones collection in lossless format. You then looked at whether the services you already use and trust are supported, like the Netflix app for binging on House of Cards, or your bank’s app for catching up on bills at the end of the month. In the end, you chose the phone that had both the features and compatibility you needed.

Posted on April 4, 2014 by Karen Whipple

Amazon Elastic MapReduce (Amazon EMR) makes it easy to provision and manage Hadoop in the AWS Cloud. The latest webinar from the Amazon Web Services Partner webinar series, titled “Hadoop in the Cloud: Unlocking the Potential of Big Data on AWS,” showed examples of how to use Amazon EMR with the MapR Distribution for Apache Hadoop, and outlined the advantages of using the cloud to increase flexibility and accelerate projects while lowering costs.
