A very common use case for the MapR Converged Data Platform is collecting and analyzing data from a variety of sources, including traditional relational databases. Until recently, data engineers would build an ETL pipeline that periodically walks the relational database and loads the data into files on the MapR cluster, then perform batch analytics on that data.
Partners Blog Posts
In the wide-column data model of MapR-DB, every value is stored with a row key, column family, column qualifier, and timestamp. In the current version, the row key is the only field that is indexed, which fits the common pattern of queries based on the row key.
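To make the data model concrete, here is a minimal in-memory sketch in Python. It is not the MapR-DB API: the class `WideColumnTable` and its methods are hypothetical names chosen for illustration. The sketch assumes a sorted list of row keys as the only index, so lookups and range scans by row key are cheap, while finding rows by value requires visiting every row.

```python
from bisect import bisect_left, insort


class WideColumnTable:
    """Toy wide-column table (hypothetical; not the MapR-DB API)."""

    def __init__(self):
        self._row_keys = []  # sorted row keys: the only "index"
        # row_key -> {(family, qualifier): [(timestamp, value), ...]}
        self._cells = {}

    def put(self, row_key, family, qualifier, value, timestamp):
        """Store one versioned value under (row key, family, qualifier)."""
        if row_key not in self._cells:
            insort(self._row_keys, row_key)  # keep row keys sorted
            self._cells[row_key] = {}
        versions = self._cells[row_key].setdefault((family, qualifier), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first

    def get(self, row_key, family, qualifier):
        """Return the newest value for a cell, or None if it is absent."""
        versions = self._cells.get(row_key, {}).get((family, qualifier), [])
        return versions[0][1] if versions else None

    def scan(self, start_key, stop_key):
        """Return row keys in [start_key, stop_key) using the sorted index."""
        lo = bisect_left(self._row_keys, start_key)
        hi = bisect_left(self._row_keys, stop_key)
        return self._row_keys[lo:hi]

    def find_by_value(self, family, qualifier, value):
        """No secondary index exists, so this checks every row."""
        return [k for k in self._row_keys
                if self.get(k, family, qualifier) == value]


table = WideColumnTable()
table.put("user#002", "info", "city", "Austin", timestamp=1)
table.put("user#001", "info", "city", "Boston", timestamp=1)
table.put("user#001", "info", "city", "Denver", timestamp=2)
```

With this toy table, `table.get("user#001", "info", "city")` returns the newest version of the cell, and `table.scan("user#001", "user#002")` uses the sorted keys directly. `find_by_value`, by contrast, has to walk every row, which is the cost of having no index beyond the row key.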
This blog describes how to get an instance of the MapR-DB Document Database Developer Preview running on Amazon Web Services (AWS) using one of the pre-configured AMIs supplied by MapR. With this AMI, you can start writing JSON-based applications on MapR-DB using the open source Open JSON Application Interface, or OJAI.
With the advent of container technology like Docker and application resource management platforms such as Apache Mesos, enterprise customers are looking at these technologies very seriously as they promise much shorter development cycles and highly scalable product deployment.
Teradata Connector for Hadoop (TDCH) is a key component of Teradata’s Unified Data Architecture for moving data between Teradata and Hadoop. TDCH invokes a MapReduce job on the Hadoop cluster to push/pull data to/from Teradata databases, with each mapper moving a portion of the data, in parallel across all nodes, for very fast transfers.
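The idea of each mapper moving its own portion of the data can be sketched in a few lines of Python. This is only an illustration of even range partitioning, not TDCH's actual split logic: the function `split_rows` and the assumption that rows are addressed by a contiguous numeric range are both hypothetical.

```python
def split_rows(total_rows, num_mappers):
    """Divide rows [0, total_rows) into near-equal half-open ranges.

    Hypothetical helper: one (start, end) range per mapper, with any
    remainder spread across the first few mappers.
    """
    if num_mappers <= 0:
        raise ValueError("num_mappers must be positive")
    base, extra = divmod(total_rows, num_mappers)
    ranges = []
    start = 0
    for i in range(num_mappers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Ten rows across three mappers: [(0, 4), (4, 7), (7, 10)]
print(split_rows(10, 3))
```

Because the ranges do not overlap and together cover every row, each mapper can transfer its slice independently of the others.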
I’m very pleased to announce the release of a custom EMR bootstrap action to deploy Apache Drill on a MapR cluster. MapR is the only commercial Hadoop distribution available for Amazon’s Elastic MapReduce service (EMR), and this addition allows EMR users to easily deploy and evaluate the powerful Drill query engine.
Did you know you can run Apache Drill on your laptop? This is great news for business analysts who need to explore complex and semi-structured data. Let's look at a particular example.
The folks over at the Transaction Processing Performance Council (TPC) have been busy. The TPC benchmarks (such as TPC-C, TPC-D and TPC-H) are the industry standard for benchmarking transaction processing systems that touch upon a broad range of our daily lives, from tracking customer orders and optimizing inventory in warehouses, to supporting critical, real-time business decisions. These benchmarks are the standard by which these types of systems have been measured since their initial release in 1992, and they have been a key factor in research, innovation and performance improvements in relational database systems.
Nearly one year ago, the Hadoop community began to embrace Apache Spark as a powerful batch processing engine. Today, many organizations and projects are augmenting their Hadoop capabilities with Spark. As part of this trend, the Apache Hive community is working to add Spark as an execution engine for Hive. The Hive-on-Spark work is being tracked by HIVE-7292, which is one of the most popular JIRAs in the Hadoop ecosystem. Furthermore, three weeks ago, the Hive-on-Spark team offered the first demo of Hive on Spark.
I often get asked, “What is the easiest way to get hands-on experience with MapR?” The best way is to try the MapR Sandbox, a single-node MapR cluster that you can run on your laptop. However, Hadoop clusters are never built with just one server, and some MapR features require multiple nodes, or even multiple clusters. To get hands-on with a MapR installation that more closely resembles what you might deploy on hardware, I suggest you deploy a MapR cluster in the Amazon cloud, using the MapR Installer. This blog post will walk you through that process.
Fireworks from the July 4th holiday seem like a distant memory, but the virtual fireworks continue to spark (pun intended) within the MapR partner ecosystem. A new sandbox from Talend for the MapR Distribution, the successful launch of our App Gallery, and expanded support for Apache Spark with partners like Databricks show great momentum with our technology partners.
The last time you bought a smartphone, what factors did you consider? You probably first evaluated the phone itself, like how well the camera could capture your kid’s special moments, or whether there was enough storage to hold the full Rolling Stones collection in lossless format. You then looked at whether the services you already use and trust were supported, like the Netflix app for binging on House of Cards, or your bank’s app for catching up on bills at the end of the month. In the end, you chose the phone that had both the features and compatibility you needed.