Predictive Maintenance using Hadoop for the Oil and Gas Industry

Executive Summary

Oil and gas companies have a major opportunity to increase efficiency and reduce operational costs through better asset tracking and predictive maintenance. With oil prices falling, they face increasing pressure to reduce operating costs and manage the business more efficiently, yet many are not operating their assets at optimum production efficiency.

Oil and gas companies collect a vast amount of data through sensors in their digital oilfields around the world. McKinsey estimates that a typical offshore production platform can have more than 40,000 data tags, though many may not be connected or used. While many companies use oilfield sensors to monitor operations in real time, the data is often not stored or analyzed to help predict potential equipment problems.

As oil and gas operations become more complex and remote, it is often difficult to have visibility into the condition of equipment, especially in offshore or deep-water locations, and inspecting equipment at such sites is expensive. This lack of visibility can lead to costly unscheduled maintenance and non-productive time (NPT), or to oil spills and accidents caused by failing equipment. Even small improvements in efficiency can yield significant savings. McKinsey estimates that improving production efficiency by ten percentage points can yield as much as $220 million to $260 million in bottom-line impact on a single brownfield asset.

Next Generation Solution and Benefits

With current approaches, loading large data sets takes a great deal of time, forcing unacceptable delays before analysis can start. To predict potential equipment failures before they occur, a next-generation solution should provide the following capabilities:

  • Stores and processes real-time and historical sensor data.
  • Ingests and analyzes real-time and historical sensor data alongside maintenance data generated by industrial equipment on oil rigs, in chemical plants, or in mining operations.
  • Proactively learns patterns of normal and errant behavior across various types of equipment to provide early warnings of even minor degradation.
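
As a rough illustration of the last capability, the sketch below flags sensor readings that deviate sharply from a recent baseline using a rolling z-score. This is a deliberately simple stand-in, not the method used by any product named in this paper; the function name, the window size, the threshold, and the sample vibration values are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_alerts(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate strongly from the recent baseline.

    The baseline is the mean and sample standard deviation of the previous
    `window` readings. Both parameters are illustrative defaults.
    """
    history = deque(maxlen=window)
    alerts = []
    for i, value in enumerate(readings):
        # Only score a reading once a full window of history is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                alerts.append(i)
        # Every reading, flagged or not, becomes part of the next baseline.
        history.append(value)
    return alerts


# Hypothetical vibration readings with one abrupt spike at index 7.
vibration = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 1.60, 1.01, 0.99]
print(rolling_zscore_alerts(vibration))
```

A production system would typically learn separate baselines per equipment type and would exclude flagged readings from the baseline; this sketch keeps them in for brevity.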

A next-generation solution makes it possible for subject matter experts to perform remote monitoring and analysis on data in a central repository. They can act on volumes of data retrieved from many assets at many locations to enable new levels of predictive maintenance. By detecting and mitigating equipment problems early, oil and gas companies can prevent equipment failure and increase net product output. The platform also enables entirely new insights into the efficiency, quality, and utilization of machine and process operations.

The solution from MapR and Mtell provides oil and gas companies with real-time predictive analytics on large amounts of data. It dramatically reduces loading time, supports the ingestion and analysis of high-speed, real-time sensor data streams, and can handle over 100 million data points per second.


The solution combines the MapR Distribution including Apache Hadoop, Mtell Previse Software, and OpenTSDB (time-series database) software. The MapR Enterprise Database Edition is well suited to time-series analysis of massive amounts of data. It does so using OpenTSDB, an open-source, distributed, scalable time-series database. OpenTSDB can collect thousands of metrics from tens of thousands of hosts and applications at a high rate (every few seconds).
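
To make the time-series model concrete, the sketch below shapes a single sensor reading as an OpenTSDB data point: a metric name, a Unix timestamp, a value, and at least one tag. It builds both the JSON form accepted by OpenTSDB's HTTP `/api/put` endpoint and the equivalent line-protocol `put` command, without sending anything over the network. The helper names, the metric name, and the tag keys are hypothetical and chosen only for illustration.

```python
import json


def to_opentsdb_point(metric, timestamp, value, tags):
    """Build one data point in the shape OpenTSDB expects (metric, timestamp, value, tags)."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    return {
        "metric": metric,
        "timestamp": int(timestamp),  # Unix epoch seconds
        "value": value,
        "tags": dict(tags),
    }


def to_put_line(point):
    """Render the same point as a line-protocol 'put' command."""
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(point["tags"].items()))
    return f"put {point['metric']} {point['timestamp']} {point['value']} {tag_str}"


# Hypothetical metric and tags for a pump vibration sensor.
point = to_opentsdb_point(
    "pump.vibration.rms", 1420070400, 1.6, {"asset": "platform-a", "pump": "p-101"}
)
print(json.dumps(point))
print(to_put_line(point))
```

Tags are what allow the same metric to be sliced by asset, location, or equipment type at query time, so choosing them carefully matters more than the metric name itself.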

In a typical reference architecture using MapR-DB, data from input sources is placed in a message queue, where a collector picks it up and stores it in MapR-DB. A web service then queries and retrieves the time-series data from MapR-DB, and the data is displayed to users via a web application or a data visualization tool.
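
The data flow described above (input sources, message queue, collector, store, query service) can be sketched in a few lines. The example below is a toy, in-memory model only: `InMemoryStore` is a hypothetical stand-in for MapR-DB, Python's standard `queue.Queue` stands in for the message queue, and the message fields are assumed for illustration. It does not use any MapR or OpenTSDB API.

```python
import queue


class InMemoryStore:
    """Hypothetical stand-in for the time-series store (MapR-DB in the architecture)."""

    def __init__(self):
        self._series = {}

    def write(self, series_key, timestamp, value):
        self._series.setdefault(series_key, []).append((timestamp, value))

    def query(self, series_key, start, end):
        """Return (timestamp, value) pairs in [start, end], ordered by time."""
        points = self._series.get(series_key, [])
        return sorted((t, v) for t, v in points if start <= t <= end)


def collect(message_queue, store):
    """Collector: drain the queue and persist each message; return the count stored."""
    count = 0
    while True:
        try:
            msg = message_queue.get_nowait()
        except queue.Empty:
            return count
        store.write(msg["series"], msg["timestamp"], msg["value"])
        count += 1


# Input sources publish readings to the queue (possibly out of order).
mq = queue.Queue()
for t, v in [(30, 1.01), (10, 1.00), (20, 1.02)]:
    mq.put({"series": "pump-101.vibration", "timestamp": t, "value": v})

store = InMemoryStore()
stored = collect(mq, store)

# The web service layer would issue a range query like this one.
window = store.query("pump-101.vibration", 10, 20)
print(stored, window)
```

The queue decouples the rate at which sensors produce data from the rate at which the store can absorb it, which is the main reason it sits between the two in the architecture.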

Customer Example

An oil and gas customer chose the MapR and Mtell solution to improve predictive maintenance and cut operational costs and non-productive time (NPT). Reducing operating costs is always a priority, and low oil prices make it even more of an imperative. When equipment failure shuts an asset down, operations stop and costly resources have to be brought in to fix the problem.

The cost of NPT per asset averages $500K-$1M per day during drill to completion, and $40K-$300K per day post-completion. The number of days of downtime this customer experiences per year varies, but the industry average is 1-3 days per year per asset pre-completion and 2-5 days per year per asset post-completion. With several hundred assets, this customer is experiencing a productivity boost and benefiting from reduced operational costs. Their existing solution could not scale to accomplish this in a cost-effective way.
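
A quick calculation shows what these figures imply per asset per year. The sketch below multiplies the daily cost range by the downtime range quoted above; pairing the low cost with the low day count (and high with high) to form the bounds is an assumption of this example, as is the function name.

```python
def annual_npt_cost_range(cost_per_day, days_per_year):
    """Return (low, high) annual NPT cost per asset, in dollars.

    Both arguments are (low, high) tuples. The bounds pair the lowest daily
    cost with the fewest days and the highest daily cost with the most days.
    """
    low = cost_per_day[0] * days_per_year[0]
    high = cost_per_day[1] * days_per_year[1]
    return low, high


# Figures quoted in the text: daily cost range and industry-average downtime.
pre_completion = annual_npt_cost_range((500_000, 1_000_000), (1, 3))
post_completion = annual_npt_cost_range((40_000, 300_000), (2, 5))

print(f"Pre-completion:  ${pre_completion[0]:,} to ${pre_completion[1]:,} per asset per year")
print(f"Post-completion: ${post_completion[0]:,} to ${post_completion[1]:,} per asset per year")
```

On these assumptions, a single asset loses roughly $0.5M to $3M per year to NPT before completion and $80K to $1.5M per year afterward, which is why even a modest reduction in downtime adds up across several hundred assets.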

The customer is using MapR to capture massive amounts of data in a cost-effective and scalable way. They rely on the MapR Enterprise Database Edition to easily ingest structured and unstructured data. MapR provides a centralized data center for supporting applications with a predictable scaling model. MapR and Mtell are equipping the customer with a powerful mechanism for rapidly analyzing all variable inputs against existing failure rate models and alerting them before equipment is likely to fail.


There is an enormous opportunity to utilize predictive analytics and identify when equipment and assets are likely to fail or need service, and to perform preventive maintenance to minimize costly unscheduled downtime. The customer example shown in this white paper is evidence that oil and gas companies can realize significant benefits by deploying a predictive maintenance solution built on top of the MapR Distribution including Hadoop.


Digitizing Oil and Gas Production, McKinsey, August 2014.

Oil firms are swimming in data they don’t use, CNBC, February 2015.