This is the first year in which Gartner has not included big data in any of its hype cycles. "I would not consider big data to be an emerging technology," says Burton. While this news will not affect the NASDAQ or how many artisan bagel shops there are in the SF Bay Area, it is an interesting indicator.
Use Cases Blog Posts
With the rapid expansion of smartphones and other connected mobile devices, communications service providers (CSPs) need to quickly process, store, and derive insights from the large and diverse volume of data travelling across their networks. Big data analytics can help CSPs improve profitability by optimizing network services and usage, enhancing customer experience, and improving security.
One of the most significant characteristics of the evolving digital age is the convergence of technologies. That includes information management (structured and unstructured databases: e.g., NoSQL), data collection (big data), data storage (cloud and distributed data: e.g., Hadoop), data applications (analytics), knowledge discovery (data science), algorithms (machine learning), transparency (open data), computation (distributed data processing: e.g., MapReduce and Spark), sensors (Internet of Things: IoT), and API services (microservices, containerization).
MapR is pleased to announce support for event-driven microservices on the MapR Converged Data Platform. In this blog post, I’d like to explain what this means, and how it fits into our bigger idea of “convergence.” Microservices are simple, single-purpose applications that work in unison via lightweight communications, such as data streams. They allow you to more easily manage segmented efforts to build, integrate, and coordinate your applications in ways that have traditionally been impossible with monolithic applications.
In the beginning was data. How do we know this? Because many (if not all) creation stories from all cultures were essentially developed as an explanation of the world as observed by humans.
In this week's Whiteboard Walkthrough, Ellen Friedman, a consultant at MapR, talks about how to design a system to handle real-time applications, and also how to take advantage of streaming data beyond those in-the-moment insights.
Standards and incentives for digitizing and sharing healthcare data, along with improvements and decreasing costs in storage and parallel processing on commodity hardware, are causing a big data revolution in healthcare with the goal of better care at lower cost.
Just a few years ago, using a fingerprint to sign on to your phone seemed futuristic. Today, it’s everywhere and just the beginning of how biometrics will be woven into our lives.
Perhaps you’re old enough to remember when the library was the place we went to learn. We foraged through card catalogs, encyclopedias and the Reader's Guide to Periodical Literature in hopes that we’d be able to understand what was going on in other people’s minds when they decided what went where.
What capabilities should you look for in a messaging system when you design the architecture for a streaming data project? Let’s start with a hypothetical IoT data aggregation example to illustrate specific business goals and the requirements they place on messaging technology and data architecture needed to meet those goals...
We are honored to announce that MapR was named one of the Top 10 Banking Analytics Solution Providers for 2016 by Banking CIO Outlook magazine.
Having participated in a number of fantasy sports leagues and being a Data Scientist at MapR gives me a unique perspective on choosing who I think will most likely “win” the tournament...my predictions for the six players most likely to finish in 10th place or better this year (and hopefully 1st), ranked in order and based on my statistical modeling, are:
There are substantial advantages to being able to make decisions at the speed required to respond to events in the moment. In fact, real time is at the foundation of many transformational applications. Let’s take a closer look at what real time really means, and why real time is required across the entire process.
In this week's Whiteboard Walkthrough, Dale Kim, Director of Industry Solutions at MapR, describes the 540° Customer View.
Can we agree at the outset that modern businesses rely heavily on data to make critical decisions, and the ability to make decisions in real time is very valuable? Good.
In the world of data warehouses and data marts, OLAP analysis has existed for many years. Concepts like drill down, drill across and roll ups have allowed business analysts and users to easily access and analyze data across a variety of dimensions such as product, customers and regions.
There are 150 quintillion (i.e. the one after quadrillion) permutations to consider when completing your NCAA bracket. Some of us don’t have time to review them all; if you are likewise short on time, you can let MapR do the heavy lifting for you and get your personalized bracket from the Crystal B-Ball!
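The teaser does not say how the 150 quintillion figure is derived. One plausible reading, offered here only as an assumption, is that it counts every possible set of winners in a 68-team field: 67 games (4 play-in games plus 63 main-bracket games), each with two possible outcomes, gives 2^67 brackets. A quick check of that arithmetic:

```python
# Assumption (not stated in the post): a 68-team field decided by 67 games,
# 4 play-in games plus 63 main-bracket games, each with two possible winners.
GAMES_FULL_FIELD = 67
GAMES_MAIN_BRACKET = 63


def bracket_count(games: int) -> int:
    """Number of distinct ways to pick a winner for every game."""
    return 2 ** games


full = bracket_count(GAMES_FULL_FIELD)
main = bracket_count(GAMES_MAIN_BRACKET)

print(f"{full:,} brackets with play-in games")
print(f"about {full / 1e18:.1f} quintillion")
print(f"about {main / 1e18:.1f} quintillion for the 63-game bracket alone")
```

Under that assumption the total is roughly 147.6 quintillion, which rounds to the 150 quintillion quoted above. Counting only the 63 main-bracket games gives about 9.2 quintillion.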
Dimensionality reduction is a critical component of any solution dealing with massive data collections. Being able to sift through a mountain of data efficiently in order to find the key descriptive, predictive and explanatory features of the collection is a fundamental required capability for coping with the Big Data avalanche.
We live in a world where the combination of Moore’s Law and Metcalfe’s Law heralds a data revolution. The billions of smartphone and broadband users today already generate massive quantities of data.
Most likely, you’ve seen quite a few “Internet of Things” headlines in the last year. But how will the IoT really transform the world as we know it? Here are just a few ways both organizations and consumers are benefiting from the IoT.
Actionable insights from real-time analytics – that’s a goal for many new projects being designed to make use of streaming data, and it’s no wonder so many organizations are aiming at this prize. If you can develop programs to process streaming data with near-real-time or true real-time analytics, you gain the ability to react to life as it happens.
For the past 25 years, applications have been built using an RDBMS with a predefined schema that forces data to conform with a schema on-write. Many people still think that they must use an RDBMS for applications, even though records in their datasets have no relation to one another.
Processing data from social media streams and sensor devices in real time is becoming increasingly prevalent, and there are plenty of open source solutions to choose from. Here is the presentation that I gave at Strata+Hadoop World, where I compared three popular Apache projects that allow you to do stream processing: Apache Storm, Apache Spark, and Apache Samza.
In this week's whiteboard walkthrough, Tugdual Grall, technical evangelist at MapR, explains the advantages of a publish-subscribe model for real-time data streams.
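For readers unfamiliar with the pattern, here is a minimal in-memory sketch of publish-subscribe, written in plain Python. It is not the MapR Streams or Kafka API; the `Broker` class, the topic name, and the handlers are all hypothetical and exist only to show that a publisher writes to a topic without knowing who consumes it.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Hypothetical in-memory broker: routes each message to all topic subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


# Two independent consumers of the same (hypothetical) topic.
dashboard: List[str] = []
archive: List[str] = []

broker = Broker()
broker.subscribe("sensor-readings", dashboard.append)
broker.subscribe("sensor-readings", archive.append)

delivered = broker.publish("sensor-readings", "temp=21.5")
unrouted = broker.publish("no-such-topic", "ignored")

print(delivered, dashboard, archive)
```

The publisher never references `dashboard` or `archive` directly, so consumers can be added or removed without changing the producing code. Production messaging systems typically also persist messages so that subscribers can read them later, which this sketch leaves out.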
Companies everywhere are looking for ways to improve customer service. For example, companies with call-in support centers might track how long agents take to answer calls, or how long customers stay on hold.
In this week's whiteboard walkthrough, Balaji Mohanam, Product Manager at MapR, explains the difference between Apache Spark and Apache Flink and how to decide which one to use.
Someone once said “if you can’t measure something, you can’t understand it.” Another version of this belief says: “If you can’t measure it, it doesn’t exist.” This is a false way of thinking, a fallacy. In fact, it is sometimes called the McNamara fallacy.
It’s the start of a new year -- we’re on the threshold of something new -- so let’s look forward to what you’re likely to be doing in 2016.
Banks are among the many businesses taking advantage of big data and IoT opportunities, including for mobile payments, online banking, and smart kiosks, but the huge quantities of personally sensitive data from these activities must be protected at all stages.
2015 was a groundbreaking year for banking and financial markets firms, as they continue to learn how big data can help transform their processes and organizations. Now, with an eye towards what lies ahead for 2016, we see that financial services organizations are still at various stages of their activity with big data in terms of how they’re changing their environments to leverage the benefits it can offer. Banks are continuing to make progress on drafting big data strategies, onboarding providers and executing against initial and subsequent use cases.
Big data and Hadoop-based approaches are now widely recognized but are still considered by many to be new technologies. The potential benefit of these approaches already is clear, but are they able to deliver practical value now?
Emotions should not be discarded as a distraction. Understanding a pattern in a user’s emotion is important in order for an intelligent device or system to respond appropriately. A system can exhibit “artificial emotion” to engage with the user.
Walmart is an industry leader in global e-commerce and brick-and-mortar retail, and they’re also a leader in the use of Hadoop-based technologies to implement their new data-driven approach to business.
The cost of waste, fraud and abuse in the healthcare industry is a key contributor to spiraling health care costs in the United States. In 2012, healthcare waste and abuse accounted for nearly $60 billion.
Australian shoppers are some of the most digitally influenced in the world; a majority of Australians go online to research a product before buying it, according to a 2015 report by Deloitte.
It's an exciting time for those in pharmaceutical research these days, given that research organizations can now leverage big data to improve their business.
In this week's Whiteboard Walkthrough, Steve Wooledge, VP of Industry Solutions at MapR, talks about an Apache Spark + Hadoop use case for drug discovery that one of our customers is currently running in production.
The explosion of data from new devices and technologies has forced the telecommunications industry to completely change the way they handle big data. Their traditional storage and analytics solutions cannot adequately manage the expanding, diverse volume of data generated today.
Apache Hadoop is revolutionizing big data in more than one way. While the Hadoop platform introduced reliable distributed storage and processing, various packages such as Spark on top of Hadoop make it possible to build applications and analyze data much faster. Here are some cool ways the Hadoop stack is being used right now.
There’s a reason the industry refers to Big Data as “Big” Data. According to IBM, we create 2.5 quintillion bytes of data every day. Here’s another eye-opening stat: 90 percent of the data in the world today has been created in the last two years alone.
Apache Spark on Hadoop is great for processing large amounts of data quickly. The story gets even better when you get into the realm of real time applications.
You are probably all somewhere on the Spark journey to production scale—you're either at Spark Summit to learn, to start doing something with Spark, or perhaps you have mission-critical applications already running in your enterprise. On this journey, there's a lot to think about—mostly about your application—but you also need to figure out how to actually get Spark into production scale as more and more groups will want the power of the results and the value of using Spark in mission-critical, operational deployments.
Advertising has come a long way since the days of Don Draper. While data has always played a part in ad campaigns, big data has enabled a new era of advertising.
Reducing operating costs and increasing efficiency are, and will always be, priorities for any business, but they become imperative when an industry is facing cyclical challenges. Given the current volatility in the oil market, the oil and gas industry is looking for solutions that can proactively address inefficiencies through better asset tracking and predictive maintenance.
“Dad, why do they call it ‘Three Musketeers’ when it’s all about d’Artagnan?” asked my son after we finished watching the movie. D’Artagnan was the true hero of the story, without whom there would have been no adventures.
Curious to know how American Express uses machine learning successfully, in production, at very large scale?
Dr. Pramod Varma, Chief Architect and Technology Advisor to the Unique Identification Authority of India (UIDAI), gave an informative talk titled “Architecting World's Largest Biometric Identity System - Aadhaar Experience”. He began by explaining why the Aadhaar project was created. In India, the inability to prove one’s identity is one of the biggest barriers that prevent the poor from accessing benefits and subsidies. India is a country with 1.2 billion residents in over 640,000 villages. The Indian government spends $50 billion on direct subsidies (food coupons for rice, cooking gas, etc.) every year. Both public and private agencies in India require proof of identity before providing services or benefits to those living in India.
Our daily commute may not feel like such a high tech experience, but whether you feel it or not – it is. Big data and Hadoop have revolutionized the transportation industry over the past several years. Whether in a car, a train, a plane or a delivery truck, we all use big data throughout our travels. Let’s go through a few specific use cases to spotlight transportation businesses that are using big data in a big way.
Five minutes is easily squandered without much thought; however, with Hadoop, five minutes can make a big impact. John Schroeder, MapR CEO and Founder, recently used a five-minute keynote address to illustrate this point.
Following is an edited transcript of John's message.
With Google Capital’s latest investment in MapR, it’s clear that big data and Hadoop are firmly established in the enterprise. Big data is helping enterprises across diverse industries improve their businesses through increased efficiencies and opportunities to serve their customers better.
I’m on a plane to London as I write this. As usual, the plane is filled to capacity and coveted snack items are scarce. The airlines must know something about passenger consumption behavior … or do they? Accessing multiple pieces of data and analyzing the information in every imaginable way for an actionable result is what is driving Apache Hadoop technology. MapR is helping companies take advantage of that.