MapR Platform Blog Posts

Posted on October 20, 2016 by George Demarest

This year is the first where Gartner has not included big data in any of their hype cycles. "I would not consider big data to be an emerging technology," says Burton. While this news will not affect the NASDAQ or how many artisan bagel shops there are in the SF Bay Area, it is an interesting indicator.

Posted on October 11, 2016 by Kirk Borne

Much has been written about the power of big data collections to enable the 360 view of our customers, our business, our employees, and our processes. When our numerous disparate heterogeneous data collections are aggregated and joined in the data lake, with appropriate data tagging and data discovery tools in place (such as Apache Drill), then we can reach for that ideal: the 360 view of our domain!

Posted on October 7, 2016 by Emily Torres

To be honest with you, I had no idea what big data was about three months ago when I started as an intern at MapR. I remember spending my first week trying to understand what it all meant and explaining to my friends and family what a converged data platform was.

Posted on October 6, 2016 by Jack Norris

This week I attended my sixth Strata + Hadoop World. Actually, in the beginning they were two separate shows, but the evolution since has been more than the combining, or convergence if you will, of the two shows. The first shows were attended almost exclusively by technologists looking to learn and understand new big data technologies, especially Hadoop.

Posted on October 4, 2016 by Carol McDonald

With the rapid expansion of smart phones and other connected mobile devices, communications service providers (CSPs) need to rapidly process, store, and derive insights from the diverse volume of data travelling across their networks. Big data analytics can help CSPs improve profitability by optimizing network services/usage, enhancing customer experience, and improving security.

Posted on September 27, 2016 by Rachel Silver

MapR is pleased to announce support for event-driven microservices on the MapR Converged Data Platform. In this blog post, I’d like to explain what this means, and how it fits into our bigger idea of “convergence.” Microservices are simple, single-purpose applications that work in unison via lightweight communications, such as data streams. They allow you to more easily manage segmented efforts to build, integrate, and coordinate your applications in ways that have traditionally been impossible with monolithic applications.

Posted on September 23, 2016 by Yvonne Chen

When I first started my internship, I wasn’t really sure what to expect. I knew the basics—MapR was a big data company, I was a technical marketing intern, and I would be doing quite a bit of competitive analysis. I originally found the position while searching for marketing internships in the Bay Area, and this one in particular popped out at me when I read the job description, since it intertwined my interest in marketing with my hope of getting some experience in the tech industry.

Posted on September 15, 2016 by George Demarest

This blog post is the first in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer. In his book, Davi outlines the rationale and key challenges of securing big data systems and applications. He does so using some great anecdotes and with good humor, making the book a good read whether you’re a white/grey/black hat, cyber superhero, or even if you’re not a security expert at all.

Posted on September 13, 2016 by Karen Whipple

Six months ago, we launched the Converge Community in order to provide a seamless way for Hadoop and Spark developers, data analysts, and administrators to engage in technical discussions and share expertise that furthers the advancement of the big data community as a whole.

Posted on September 7, 2016 by Ellen Friedman

In this week's Whiteboard Walkthrough, Ellen Friedman, Solutions Consultant at MapR, describes what happens when certain fundamental big data capabilities are engineered together as a part of the same technology. This brief overview compares the converged data platform as a foundation for big data projects versus building solutions on a base of separate pieces.

Posted on August 16, 2016 by Ellen Friedman

It’s not just a concern when ordering coffee. Something similar can happen as we investigate new and innovative big data technologies and techniques. I used the cappuccino example in a talk I presented recently at the Strata + Hadoop World Conference in London. The talk, titled “Building Better Cross Team Communication,” highlighted the importance of identifying and addressing the difference in how each side thinks the world works when two groups that have different experience and skills come together.

Posted on August 11, 2016 by Dale Kim

With stories of the thefts of millions of credit card records and sensitive employee data at some of the world’s largest companies and government agencies dominating recent headlines, it’s not surprising that organizations are doubling down on security. Security is finally starting to get top management’s attention.

Posted on August 10, 2016 by Dale Kim

Dale Kim, Sr. Director of Industry Solutions at MapR, describes the monitoring capabilities of the MapR Converged Data Platform, which easily give you a single view of all cluster operations. Leveraging popular open source technologies, the monitoring system is customizable and extensible to address the challenges of your big data deployment requirements.

Posted on August 9, 2016 by Yvonne Chen

With the increasing amount of information that we use daily, technology is only becoming more and more important in everything we do. And businesses are seeing this at much greater scale than we do as consumers. There are many great examples of this in just about every industry.

Posted on August 1, 2016 by Sameer Nori

Apache Spark is becoming very popular and widely used in the big data community. There are several reasons for Spark getting such rapid traction. These include its in-memory processing capabilities, support for a wide range of engines for various use cases such as streaming, machine learning, and SQL, and the ability to develop in multiple languages such as Python and Scala.

Posted on July 27, 2016 by Ted Dunning

In this week’s Whiteboard Walkthrough Part I, Ted Dunning, Chief Application Architect at MapR, explains the key capabilities required of a streaming platform in the context of micro-services and the advantages they offer.

Posted on July 27, 2016 by Ted Dunning

In this week’s Whiteboard Walkthrough Part II, Ted Dunning, Chief Application Architect at MapR, talks about the design freedom gained by adopting a micro-services architecture based on streaming data. When you move, one step at a time, from an old-style architecture that suffers from too much dependence on a shared global state database to a stream-based flow architecture, the isolation between micro-services results in reduced strain on the original database, improved flexibility, and often improved speed.

Posted on July 26, 2016 by Manny Puentes

“Big Data” is no longer a buzzword. Businesses big and small that don’t invest now in big data technologies risk getting left behind as the marketplace becomes more and more data-driven. In fact, a recent McKinsey and Company report suggested that companies that invest in big data and analytics consistently outperform their peers in both productivity and revenue.

Posted on July 25, 2016 by Jim Scott

In this post you will see mention of message-driven architectures, which are, in short, a subset of service-oriented architecture (SOA). SOA has been around for many years and is a very popular model. What you will find going through this post is that the foundational message-driven architecture competes more directly with the concepts of the enterprise service bus (ESB).

Posted on July 8, 2016 by Ankur Desai

I was at the annual Hadoop Summit in San Jose last week. As usual, the MapR booth was buzzing with big data enthusiasts and experts alike. We showcased demos that spanned multiple topics including multi-cluster Hadoop monitoring using Grafana and Kibana (as part of our new Spyglass Initiative), IoT stream analysis using MapR Streams and Spark Streaming, and self-service big data analytics using Apache Drill.

Posted on June 30, 2016 by Prashant Rathi

Today we are proud to announce the Spyglass Initiative focused on easy management, deep visibility and full control. With this first release, MapR Monitoring empowers administrators with cluster monitoring capabilities, including metric and log collection from nodes, services and jobs, and dashboards.

Posted on June 23, 2016 by Sameer Nori

Customers are flocking to Spark as their primary compute engine for big data use cases, and we received further proof of this last week when we ran an “Ask Us Anything about Spark” forum in the Converge Community. There were some great discussions that took place, where our Spark experts answered questions from customers and partners.

Posted on June 21, 2016 by Ellen Friedman

Streaming data can be used as a long-term auditable history when you choose a messaging system with persistence, but is this approach practical in terms of the cost of storing years of data at scale? The answer is “yes”, particularly because of the way topic partitions are handled in MapR Streams. Here’s how it works.

Posted on June 20, 2016 by Dale Kim

Is there a case to be made for big data for security analytics? The answer is an unqualified “yes.” In fact CSO Magazine called cyber security “the killer app” for big data analytics.

Posted on June 16, 2016 by Sameer Nori

There’s been a lot of buzz and high expectations in the big data community around Apache Spark 2.0 and how it will impact the development of data pipelines, streaming applications, machine learning algorithms and all of the other use cases that Apache Spark is enabling.

Posted on June 14, 2016 by Kirk Borne

In the beginning was data. How do we know this? Because many (if not all) creation stories from all cultures were essentially developed as an explanation of the world as observed by humans.

Posted on June 8, 2016 by Ellen Friedman

In this week's Whiteboard Walkthrough, Ellen Friedman, a consultant at MapR, talks about how to design a system to handle real-time applications, and also how to take advantage of streaming data beyond those in-the-moment insights.

Posted on June 6, 2016 by Balaji Mohanam

Apache Spark, a powerful general purpose engine for processing large amounts of data, has seen a rapid increase in its adoption since its release. Recognizing its impact very early on, MapR has supported and invested in Spark as part of our Hadoop distribution to enable enterprises to build applications with Spark and deploy it in production in a reliable manner.

Posted on May 23, 2016 by Charu Madan

In today’s world of immense competition and customer churn, telecom providers are reinventing and transforming themselves to provide their customers with the best possible customer care and satisfaction.

Posted on May 16, 2016 by Ellen Friedman

Streaming data now is a big focus for many big data projects, including real time applications, so there’s a lot of interest in excellent messaging technologies such as Apache Kafka or MapR Streams, which uses the Kafka 0.9 API.

Posted on May 9, 2016 by Jim Scott

With all the talk about Big Data, most organizations are barely out of the starting blocks when it comes to exploiting it for business benefit. Gartner estimates that 85% of Fortune 500 companies are yet unable to exploit Big Data for competitive advantage.

Posted on May 6, 2016 by Karan Sachdeva

“Big 10 banks fined $43bn over seven years for failures in customer reporting,” reads yesterday’s headline in the Financial Times, and I wonder how the power of big data could have helped save these billions of dollars.

Posted on May 5, 2016 by Will Ochandarena

The first Kafka Summit was recently held in San Francisco. While the size of the conference was relatively small at 600 attendees, it was encouraging to see the variety of companies that are embracing real-time data pipelines.

Posted on April 28, 2016 by Crystal Valentine

Technological innovation is one of the great stories of the 21st century. Over the past 15 years, technology companies have generated unprecedented wealth at a blistering pace, fueled by smart and capable teams of brilliant scientists and engineers.

Posted on April 27, 2016 by Sean O’Dowd

We are honored to announce that MapR was named one of the Top 10 Banking Analytics Solution Providers for 2016 by Banking CIO Outlook magazine.

Posted on April 25, 2016 by Ellen Friedman

Organizations embracing big data are ready to put data to work, including looking for ways to effectively analyze data from a variety of sources in real time or near real time.

Posted on April 18, 2016 by Bill Peterson

MapR, Cisco, and SAP have been collaborating for years to help you gain insight from all of your data sources. Today, we’re excited to announce that Cisco has developed an appliance that includes the MapR Converged Data Platform for SAP HANA, making it much easier and faster for you to harness the power of big data.

Posted on April 13, 2016 by Dale Kim

Editor's note: In this week's Whiteboard Walkthrough, Dale Kim, Sr. Director of Industry Solutions at MapR, discusses three examples of how the auditing capabilities in the MapR Converged Data Platform are beneficial for your big data environment.

Posted on April 7, 2016 by Jack Norris

There are substantial advantages to being able to make decisions at the speed required to respond to events in the moment. In fact, real time is at the foundation of many transformational applications. Let’s take a closer look at what real time really means, and why real time is required across the entire process.

Posted on April 1, 2016 by Dale Kim

In this week's Whiteboard Walkthrough, Dale Kim, Director of Industry Solutions at MapR, describes the 540° Customer View.

Posted on March 31, 2016 by George Demarest

In my recent article for insideBIGDATA “Converged Data Platforms: Part of a Larger Trend”, I talked about the inevitable direction of technology architecture towards a limitless mainframe model, a converged data center that will be composed largely of open source technologies like Linux, KVM, Hadoop, Spark, Mesos, and OpenStack.

Posted on March 30, 2016 by David Cross

For almost seven years, MapR has been committed to advancing the understanding and application of open-source technology to solve big data challenges. Last year we delivered on the promise of Hadoop with the industry's only enterprise-grade Converged Data Platform that supports a broad set of mission-critical and real-time production uses.

Posted on March 29, 2016 by John Schroeder

What’s clear to me is that we are in the midst of the biggest change in enterprise computing in decades: a shift in how data is stored, analyzed and processed is changing the way businesses operate and compete in the marketplace.

Posted on March 15, 2016 by Steve Wooledge

In the world of data warehouses and data marts, OLAP analysis has existed for many years. Concepts like drill down, drill across and roll ups have allowed business analysts and users to easily access and analyze data across a variety of dimensions such as product, customers and regions.

Posted on March 7, 2016 by Michele Nemschoff

We are excited to share with you that Gartner has named MapR a Visionary in the Gartner 2016 Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics. Gartner evaluated 21 software vendors on 15 criteria for the quadrant.

Posted on March 4, 2016 by Carl Olofson

Hadoop is a key data technology for Big Data, as everyone knows. But the question becomes, how can Big Data help make me more competitive, more efficient, and better able to detect fraud, security breaches, and other abuses?

Posted on February 24, 2016 by M.C. Srivas

Big data is amazing. Put to use effectively, it delivers efficiencies on a scale that was previously impossible. We live and operate in a complex, highly interconnected world, where almost everything interacts with everything else.

Posted on January 27, 2016 by Tugdual Grall

In this week's whiteboard walkthrough, Tugdual Grall, technical evangelist at MapR, explains the advantages of a publish-subscribe model for real-time data streams.

Posted on January 20, 2016 by John Schroeder

MapR has been running a “built to last” business model since its founding. I discussed this in 2013, when I wrote “Built to Last: How MapR’s Business Model Supports That Goal”.

Posted on January 4, 2016 by Ellen Friedman

Banks are among the many businesses taking advantage of big data and IoT opportunities, including for mobile payments, online banking, and smart kiosks, but the huge quantities of personally sensitive data from these activities must be protected at all stages.

Posted on December 30, 2015 by Jim Scott

One of the many high points in Disney’s Star Wars Episode VII: The Force Awakens movie was the return of several classic ships and other vehicles from the original trilogy, as well as the introduction of innovative, new types of vehicles. With all the advanced technology on these ships, one can’t help but wonder what kind of big data software and analytics they would be using for threat assessment and prediction, mission planning, and enemy ship tracking and identification.

Posted on December 16, 2015 by Ellen Friedman

Big data and Hadoop-based approaches are now widely recognized but are still considered by many to be new technologies. The potential benefit of these approaches already is clear, but are they able to deliver practical value now?

Posted on December 15, 2015 by Jim Scott

In this week's Whiteboard Walkthrough, Jim Scott, Director of Enterprise Strategy and Architecture at MapR, discusses a business use case that leverages the power of MapR Streams.

Posted on December 14, 2015 by Michele Nemschoff

You may understand your work style at the office, but what if you were a developer reindeer at the North Pole?

Posted on December 8, 2015 by Will Ochandarena

Over the last 5 years of shipping product we’ve watched our customers get enormous value out of storing and processing big data. The use cases are far and wide, from performing predictive maintenance on oil rigs to building fraud and risk models on financial transactions.

Posted on December 8, 2015 by John Schroeder

Today is very significant for MapR, with the introduction of MapR Streams and the industry’s first, and only, Converged Data Platform.

Posted on October 30, 2015 by Ellen Friedman

There’s good news in the world of NoSQL databases that will put a smile on the face of developers – and that should also make business leaders happy because it means shorter time-to-value. You can now enjoy the ease and flexibility of a document-style database with the power of extreme scalability and performance.

Posted on October 29, 2015 by Neeraja Rentachintala

In this blog post, I will briefly summarize some of the key capabilities that customers are finding immensely valuable in Drill. I’ll also cover common use cases where Drill is deployed, as well as resources for getting started with Drill.

Posted on October 28, 2015 by Dale Kim

In this week's Whiteboard Walkthrough, Dale Kim, product marketing director, explains how document databases fit in your enterprise's use cases.

Posted on October 26, 2015 by Mike Emerick

When we read “data journalism” articles, it often appears that journalists are walking a perilous line. In many cases, they’re working with data that is provided by the creators.

Posted on October 19, 2015 by Matt Mills

What’s happening today in big data reminds me very much of my early years at Oracle when we fought and won the so-called database wars.

Posted on October 9, 2015 by Michele Nemschoff

At Strata+Hadoop World in New York last week, MapR CMO Jack Norris talked about the Big Data Dividend – the ongoing, significant profits that are derived from data-driven applications. In his keynote, Jack provided a look at the bigger picture.

Posted on October 2, 2015 by Bill Peterson

The MapR Distribution including Hadoop is now available on the Azure Fast Start. This solution enables push button deployment of MapR on the Azure cloud infrastructure, providing you with the solutions to turn your big data into big money.

Posted on September 29, 2015 by Steve Wooledge

Cloudera has announced a new open source project called Kudu, a technology described as a “complement to HDFS and Apache HBase... designed to fill gaps in Hadoop’s storage layer.” Apparently Cloudera’s development team “... eventually came to the conclusion that large architectural changes were necessary to achieve our goals”.

Posted on September 29, 2015 by Tugdual Grall

Today, MapR has announced the developer preview of MapR-DB with native support of JSON, and the new library OJAI (Open JSON Application Interface), pronounced "OH-hy."

Posted on September 22, 2015 by Michele Nemschoff

Australian shoppers are some of the most digitally influenced in the world; a majority of Australians go online to research a product before buying it, according to a 2015 report by Deloitte.

Posted on September 16, 2015 by Bill Zaharchuk

How times have changed—10-15 years ago, when you needed to store data for your application, it was likely structured data; the data fields were known ahead of time and didn’t change much.

Posted on September 15, 2015 by Michele Nemschoff

It's an exciting time for those in pharmaceutical research these days, given that research organizations can now leverage big data to improve their business.

Posted on September 10, 2015 by Bill Peterson

MapR is glad to partner with SAP, and we are excited to see them bring leading-edge innovations to the market. We are thrilled today to talk about a new offering from SAP that, along with the MapR data platform, helps you better serve your customers and simplify how your business works.

Posted on September 8, 2015 by Michele Nemschoff

The explosion of data from new devices and technologies has forced the telecommunications industry to completely change the way they handle big data. Their traditional storage and analytics solutions cannot adequately manage the expanding, diverse volume of data generated today.

Posted on August 14, 2015 by Jim Scott

As you probably know (unless you’ve been living under an ant hill), Ant-Man is a fictional superhero who first appeared in Marvel comic books, and he’s also a proud founding member of The Avengers. He made his debut on the big screen recently with the advent of this summer’s blockbuster movie, “Ant-Man,” which has, as of last week, already earned $116.8 million at the domestic box office, and $234 million worldwide.

Posted on August 13, 2015 by Sean Iannuzzi

There’s a reason the industry refers to Big Data as “Big” Data. According to IBM, we create 2.5 quintillion bytes of data every day. Here’s another eye-opening stat: 90 percent of the data in the world today has been created in the last two years alone.

Posted on June 18, 2015 by Dale Kim

Hadoop has been a phenomenon for big data and operational workloads. It has transformed from its batch-oriented roots into an interactive platform by incorporating a number of components, including technologies that provide SQL and distributed in-memory capabilities.

Posted on June 15, 2015 by Sameer Nori

We thought the Kickstart song by Mötley Crüe was appropriate, since everyone is excited about kickstarting their Spark-based applications these days. That’s our theme for the Quick Start Solutions we’re announcing today at Spark Summit West—you can kickstart your Spark efforts into high gear with our Spark Quick Starts. You’ll be able to develop at high speeds, use streaming data, and build applications faster.

Posted on March 6, 2015 by Sameer Nori

Most of us have experienced the power of data-driven recommendations. Maybe you found a former colleague through LinkedIn’s “People You May Know” feature or you watched a movie because Netflix suggested it to you. And it’s quite likely that you bought something that was recommended to you under the "Frequently Bought Together" section. It’s estimated that recommendation engines power approximately 30% of Amazon’s revenue. In all of these instances, recommendation engines help narrow your choices to those that best meet your particular needs, and the systems that these companies built incorporate algorithms that learn from past data. Customers benefit from a more tailored and personalized experience, and this positive experience increases the likelihood that they’ll buy more products and services and stay loyal to the particular service provider or retailer in question. For the merchant or service provider, recommendation engines increase up-sell and cross-sell rates, reduce churn, and improve customer loyalty.

Posted on February 18, 2015 by Sameer Nori

Today, MapR introduced Quick Start Solutions, a powerful package of services, software and training/certification to help you jump-start your deployments of enterprise data hub, security and marketing applications. These solutions address commonly implemented and high-value Hadoop use cases for Data Warehouse Optimization and Analytics, Security Log Analytics and Recommendation Engines.

Posted on December 31, 2014 by Karen Whipple

As we close out the year, here is a look back at our 10 most popular blogs of 2014. Our top posts include machine learning and time series data topics, new milestones for the Apache projects Drill and Spark, and hands-on technical explanations that save you time and headaches.

Posted on December 18, 2014 by Ted Dunning

I commonly hear lots of questions about how many drives to use per node in a cluster. For a long time, the norm was to have 4-6 drives per node, but lately, I have been hearing more people suggest 12 drives. At MapR, we have been recommending 12 or 24 drives for quite some time to exploit the inherent advantages of MapR-FS, but I still hear lots of people recommending smaller configurations. In fact, I think that the norm is moving much higher than 12 drives. It is not uncommon for us to see boxes with up to 60 large drives lately. These are not the majority of systems by any stretch, but they have some very distinct advantages in terms of money (capex $/TB, opex $/TB) and power (opex W/TB).

Posted on October 14, 2014 by Dale Kim

In this blog series, we’re showcasing the top 10 reasons customers are turning to MapR in order to create new insights and optimize their data-driven strategies. Here’s reason #2: MapR provides world record performance for Hadoop.

Posted on October 11, 2014 by Bruce Penn

In this blog series, we’re showcasing the top 10 reasons customers are turning to MapR in order to create new insights and optimize their data-driven strategies. Here’s reason #5: MapR provides complete data protection and disaster recovery with real snapshots and mirroring.

Posted on October 9, 2014 by Dale Kim

In this blog series, we’re showcasing the top 10 reasons customers are turning to MapR in order to create new insights and optimize their data-driven strategies. Here’s reason #7: MapR provides the top-ranked NoSQL key-value database for current offering.

Posted on September 18, 2014 by Ellen Friedman

One cat, a radio collar, and a night on the town – this little adventure turned into an entertaining article by Andy Greenberg in Wired magazine (August 8, 2014) about the creative use of a feline investigator to find weak points in the security of the neighborhood’s Wi-Fi networks.

Posted on January 28, 2014 by Anoop Dawar

We are excited to announce that Version 3.1 of the MapR Distribution for Hadoop is now ready for release. In addition to enhancements and bug fixes, we have included a few new security features. Highlights of this release include:

Posted on January 3, 2014 by Anoop Dawar

MapR is committed to providing the broadest SQL-in-Hadoop support to our customers. A critical part of this commitment is the ability to provide the latest enhancements from the open source community to our customers. However, it's not as simple as claiming that an open source project will work on MapR. We take pride in bringing Hadoop open source components into our distribution by putting them through a rigorous testing and hardening process. This process allows us to be confident in the stability of the components, and in their performance on the MapR Big Data Platform.

Posted on November 25, 2013 by Neeraja Rentachintala

MapR delivers on the promise of Hadoop with an enterprise-grade Big Data platform that supports a broad set of use cases across all industries. A key part of this promise is to help customers benefit from the new and exciting key projects coming up from the Hadoop ecosystem, along with MapR's unique innovations.

Posted on October 22, 2013 by Tomer Shiran

Apache Hadoop has inspired a rich ecosystem of projects and products that benefit everyone interested in Big Data. With so many options available, I frequently am asked whether MapR supports a specific project - examples include Impala, Knox, Storm and Falcon - so I thought it would make sense to provide an overview that will help explain how this works for MapR.

There are hundreds of open source and commercial projects that relate to Hadoop in one way or another. These projects can be divided into two categories:

Posted on March 28, 2013 by Jack Norris

MapR made two significant announcements today regarding our efforts to support the Hadoop ecosystem and provide an open enterprise-grade platform to Big Data users.

Posted on December 6, 2011 by Tomer Shiran

Today we announced version 1.2 of the MapR Distribution for Apache Hadoop. With this release, MapR continues to push the envelope by making Hadoop more accessible to more users, more languages, and more platforms. This release includes numerous features and capabilities including:
