Using Hadoop to Detect Health Care Fraud, Waste and Abuse


UnitedHealthcare provides health benefits and services to nearly 51 million people. The company contracts with more than 850,000 physicians and care professionals and approximately 6,100 hospitals nationwide. Their Payment Integrity group has the tough job of ensuring that claims are paid correctly and on time. The better they do at paying what they’re supposed to, and not paying for fraudulent services, the more they can save on overall health care costs.

Their previous approach to managing more than one million claims every day (10 TB of data daily) was ad hoc, heavily rule-based and limited by data silos and a fragmented data environment. UnitedHealthcare knew they had to transition to a predictive modeling environment based on a Hadoop big data platform, so they drew up a list of key requirements:

  • A flexible platform that can integrate any new tool or technology seamlessly
  • Enterprise-grade features such as high availability and disaster recovery
  • A cost-effective platform to collocate data
  • Multi-tenancy capabilities to support multiple business groups and applications in one cluster
  • Direct Access NFS for direct data ingestion and the ability to integrate with familiar tools and the existing environment

UnitedHealthcare came up with a dual-model strategy: operationalizing savings while, at the same time, pursuing innovation to keep leveraging the latest technologies.

Here’s how they are doing it: in terms of operationalizing savings, the group is building a predictive analytics “factory” where they can identify inaccurate claims in a systematic, repeatable way. Hadoop is now the data framework for a single platform that’s equipped with tools to analyze a slew of information from claims, prescriptions, plan participants, contracted care providers and associated claim review outcomes.

They integrated all this data from multiple data silos across the business, including over 36 data assets. And they now have multiple predictive models (PCR, True Fraud, Ayasdi, etc.) at their fingertips that provide a rank-ordered list of potentially fraudulent providers they can pursue in a targeted, systematic way.
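To make the idea of a rank-ordered list concrete, here is a minimal sketch in Python. It is not UnitedHealthcare’s implementation: the provider IDs, the model names and the choice to average scores are all assumptions made for illustration, and the source does not describe how the real models are combined.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProviderScore:
    """Risk scores for one provider (hypothetical structure)."""
    provider_id: str
    model_scores: dict  # model name -> risk score between 0 and 1


def combined_risk(provider):
    # Assumption: a plain average of the model scores. A real system
    # could weight or calibrate each model differently.
    return mean(provider.model_scores.values())


def rank_providers(providers, top_n=None):
    """Return providers ordered from highest to lowest combined risk."""
    ranked = sorted(providers, key=combined_risk, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ProviderScore("P1", {"model_a": 0.9, "model_b": 0.7}),
        ProviderScore("P2", {"model_a": 0.2, "model_b": 0.4}),
        ProviderScore("P3", {"model_a": 0.6, "model_b": 0.5}),
    ]
    for p in rank_providers(sample, top_n=2):
        print(p.provider_id, round(combined_risk(p), 2))
```

Sorting by a single combined score keeps the review queue simple: investigators start at the top of the list and work down as far as capacity allows.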

In terms of innovation, the Payment Integrity group is wisely investing in R&D to discover the next game-changing technologies that will help them with:

  • Knowledge and data discovery to get more insights from the data
  • Data enhancement, including better structure, metadata layers, graph analytics and new data sources
  • Machine learning and new applications of AI tools and methods
  • Data mining and reporting

The end result is phenomenal: UnitedHealthcare has generated a whopping 2200% return on their investment in big data and advanced technology. The predictive analytics “factory” drives millions in annual savings through initiatives such as high-risk provider flagging. They’ve also been able to extract insights from all these internal data sources faster and in greater depth. UnitedHealthcare is now able to use Hadoop to leverage their massive amounts of data and reduce costs, raise quality and tighten operational efficiencies. And that’s what we call a healthy outcome.
