We’re pleased to announce the general release of the MapR Ecosystem Pack (MEP) version 2.0. This represents the second major release of a MapR Ecosystem Pack since the beginning of this new process of delivering ecosystem upgrades.
Open Source Software Blog Posts
In my previous blog post, I explained the three major components of a streaming architecture. Most streaming architectures have three major components – producers, a streaming system, and consumers. Producers (such as Apache Flume) publish event data into a streaming system after collecting it from the data source, transforming it into the desired format, and optionally filtering, aggregating, and enriching it.
This blog post is the second in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer (Read Part 1). In his book, Davi outlines the rationale and key challenges of securing big data systems and applications, and he’s included some terrific anecdotes to make the entire book a quick and insightful read.
Business owners and executives today know the power of social media, mobile technology, cloud computing, and analytics. If you pay attention, however, you will notice that truly mature and successful digital businesses do not jump at every new technological tool or platform.
The last decade has ushered in a perfect storm of disruption for the financial services sector – arguably the most data-intensive sector of the global economy. As a result, companies in this sector are caught in a vice.
There is no denying it – we live in The Age of the Customer. Consumers all over the world are now digitally empowered, and they have the means to decide which businesses will succeed and grow, and which ones will fail. As a result, most savvy businesses now understand that they must be customer-obsessed to succeed.
The field of data science is one of the youngest and most exciting fields in the technology sector. In no other industry or field can you combine statistics, data analysis, research, and marketing to do jobs that help businesses make the digital transformation and come to full digital maturity.
As I discussed in my presentation at the Gartner Symposium/ITxpo in Florida, digital transformation is a key topic for business leaders today. While the impact of digital transformation is easily understood, what is less clear are the steps to effectively pursue a digital transformation -- and the three keys to ensure successful digital transformation.
Siri, Alexa, Cortana and Google Now are just the beginning. When it comes to getting things done, machines are increasingly edging humans out of the equation. Big data and analytics are at the core of what some people are calling the “bot revolution.”
Business intelligence (BI), which is one of the oldest concepts in data processing, is undergoing a radical reinvention. The concept has already evolved considerably since it first gained popularity in the early 1990s (and particularly since its first mention in the Cyclopædia of Commercial and Business Anecdotes in 1865!).
Get an introduction to streaming analytics, which gives you real-time insight from captured events and big data. There are applications across industries, from finance to wine making, though there are two primary challenges to be addressed.
This blog post is the first in a series based on the ebook The Six Elements of Securing Big Data by security expert and thought leader Davi Ottenheimer. In his book, Davi outlines the rationale and key challenges of securing big data systems and applications. He does so using some great anecdotes and with good humor, making the book a good read whether you’re a white/grey/black hat, cyber superhero, or even if you’re not a security expert at all.
In some circles today there is a sort of ‘Hadoop vs. RDBMS’ debate ongoing. Often the discussion casts Hadoop as the obvious heir apparent in the data processing world, with RDBMS cast as your father’s Oldsmobile.
For almost seven years, MapR has been committed to advancing the understanding and application of open-source technology to solve big data challenges. Last year we delivered on the promise of Hadoop with the industry's only enterprise-grade, Converged Data Platform that supports a broad set of mission-critical and real-time production uses.
Today we at MapR would like to congratulate Apache Arrow, a cross-system data layer to speed up big data analytics and a brand-new addition to the Apache Open Source Software community, on its announcement as a Top-Level Project.
Two blogs came out recently that share some very interesting perspectives on the blurring lines between architectures and implementation of different data services, ranging from file systems to databases to publish/subscribe streaming services.
In this week's Whiteboard Walkthrough, Balaji Mohanam, Product Manager at MapR, explains the difference between Apache Spark and Apache Flink and how to decide which to use.
It’s the start of a new year -- we’re on the threshold of something new -- so let’s look forward to what you’re likely to be doing in 2016.
In this week's Whiteboard Walkthrough, Jim Scott, Director of Enterprise Strategy and Architecture at MapR, discusses a business use case that leverages the power of MapR Streams.
In this blog post, I’ll share how we see Myriad delivering value to customers, and how it fits in with the MapR platform.
Google has set the standard for most of the world when it comes to running systems at scale. It has created a number of different technologies to benefit its business.
Apache Apex is the industry’s first-ever YARN-native engine that fulfills the disruptive promise of big data. In this post we go into more detail about what Apex is, and why it matters.
At the Strata + Hadoop World 2015 conference held in San Jose, Ted Dunning, Chief Application Architect for MapR, gave an exciting talk titled “YARN vs. MESOS: Can’t We All Just Get Along?” where he showcased how YARN and MESOS can work together to seamlessly share datacenter resources.
As you probably know, Apache Hadoop was inspired by Google’s MapReduce and Google File System papers and cultivated at Yahoo! It started as a large-scale distributed batch processing infrastructure, and was designed to meet the need for an affordable, scalable and flexible data structure that could be used for working with very large data sets.
The Global Data Competition 2015: Collaborate to Change Climate Change is an initiative that appeals to all walks of life through its “swarm offensive” approach to the global challenge of climate change. The “swarm offensive” approach, coined by the filmmakers of “The Coalition of The Willing” (released in 2010), refers to harnessing technologies, innovations and adaptation strategies from the collective genius of the world through open source infrastructures, thus promoting bottom-up “grassroots” efforts to tackle the climate change challenge as opposed to top-down “establishment” conventional approaches.
Today, we are announcing the availability of a course on HBase, the in-Hadoop NoSQL database. The course is titled “HBase Data Model and Architecture” and is geared toward data analysts, data architects, and application developers.
As we close out the year, here is a look back at our 10 most popular blogs of 2014. Our top posts include machine learning and time series data topics, new milestones for the Apache projects Drill and Spark, and hands-on technical explanations that save you time and headaches.
So, we did it again! Another rapidly growing open source project is now formally supported and packaged in the MapR Distribution including Apache Hadoop. This time the project is Apache Storm. I must say, the Storm project is special, given that we were the first ones to champion this project two years ago. Our own Ted Dunning has mentored the Storm community to get it to Apache Top Level Project status recently. Furthermore, Storm is associated with real-time processing—one of the core strengths of the MapR platform—with features such as a random read-write file-system and the option to use NFS-based spout. Not surprisingly, we already have customers using Storm on MapR in production.
In this blog series, we’re showcasing the top 10 reasons customers are turning to MapR in order to create new insights and optimize their data-driven strategies. Here’s reason #4: MapR provides true multi-tenancy with job isolation, volumes, quotas, data and job placement control, including for YARN.
In this blog series, we’re showcasing the top 10 reasons customers are turning to MapR in order to create new insights and optimize their data-driven strategies. Here’s reason #7: MapR provides the top-ranked NoSQL key-value database for current offering.
In this blog series, we’re showcasing the top 10 reasons customers choose the MapR Distribution for Hadoop to optimize their data-driven strategies. Reason #8: MapR provides Unbiased Open Source supporting 20+ OSS projects.
The first day of the 2014 Hadoop Summit was filled with announcements and interviews. MapR announced our first Apache Hadoop App Gallery, as well as our exciting partnership with Syncsort. Jack Norris, MapR CMO, had a chance to talk about this news on theCUBE with Wikibon’s Jeffrey Kelly and SiliconANGLE’s John Furrier.
A while back, I presented a Big Data Glossary: A to ZZ. In separate articles, I discussed some of the different entries in the glossary. Here, I focus on H (Hadoop), which is the evolving but increasingly standardized big data computing platform.