Hadoop in Action: Razorsight Offers Telecom Clients Predictive Analytics Solutions based on Hadoop and Apache Spark

The explosion of data from new devices and technologies has forced the telecommunications industry to completely change the way it handles big data. Traditional storage and analytics solutions cannot adequately manage the growing volume and variety of data generated today. To address this problem, telecom companies are turning to Hadoop to create data lakes that can store information in any format, structured or unstructured, online or archived. One company that has evolved its technology stack with Hadoop is Razorsight, a provider of cloud-based predictive analytics software used by the world's best-known communications and media brands.

Razorsight got its start a decade ago, offering financial assurance analytics solutions to telecom companies. As the company grew and big data evolved, many more data sets became available. Today’s telecom data arrives in higher volumes, at higher frequency and with more complex structures, so Razorsight had to find a way to turn this newly available data into predictive insights using data science and predictive analytics. To accomplish this, Razorsight needed to evolve its technology stack to scale at a reasonable cost so it could deliver new predictive solutions to its clients. The company decided to move to an Apache Hadoop-based infrastructure to take advantage of emerging trends in big data architecture and parallel computing. Having the flexibility of the full Spark stack as part of the Hadoop distribution was also very important to the company.

Editor’s Note: Download our free E-Book Getting Started with Apache Spark: From Inception to Production here.

Hadoop used to build data lake as primary data store

Razorsight used Hadoop to build a central data lake as the primary data store for both online and archived data. Since the launch of this new stack in late 2014, the production cluster has received, processed, and analyzed more than 40 terabytes of data. Because Razorsight’s customers send data in all shapes and formats from multiple sources, Razorsight uses an NFS gateway to move these data sets in and out of the cluster, which makes it easy to integrate Hadoop into the overall data flow. Razorsight then uses Spark as an in-memory processing engine to enrich and transform the source data and prepare the analytical records for advanced modeling. Spark provides the high performance this step requires, and Razorsight also uses Elasticsearch for search-based analysis. Meanwhile, end users and business analysts continue to use their existing business intelligence and visualization tools on the downstream data warehouse.

Improved performance eliminates bottlenecks

With its previous architecture, Razorsight ran into multiple bottlenecks because data ingestion, processing, analytics, querying and visualization all competed for the same processing power. With the new Hadoop platform, these workloads can be completely separated, which has had a major impact on performance and scalability.

Technology platform is bursting with new efficiencies and enhancements

As Razorsight built its technology platform, it incorporated new efficiencies and enhancements into its predictive analytics solutions. “We saw that we could leverage big data and add to the data sets we already gathered to be able to offer additional solutions to our customers. Since all data (raw, enriched, transformed and aggregated) is present in the data lake, it is easier for us to insert additional use cases to deliver newer insights for our customers quickly,” explains CTO Suren Nathan. “One example of this is the operational use case, where we predict the repeat callers into the call center or predict network node failures before they occur.”

New Hadoop-based architecture significantly lowers storage and computing costs

Razorsight’s new Hadoop-based architecture has given the company horizontal scalability on commodity hardware and reduced its storage and computing costs. The total cost of storage and processing on a traditional enterprise data warehouse (EDW) platform is about $15,000-20,000 per terabyte. With the Hadoop ecosystem, this has dropped to about $2,500-3,000 per terabyte for Razorsight.
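To put those per-terabyte figures in perspective, the short Python sketch below works out the implied savings. The per-terabyte ranges come from the article. Applying them to the 40 terabytes the production cluster has handled is purely an illustrative assumption, since that figure describes data processed rather than a billed storage volume, and the function names are hypothetical.

```python
# Per-terabyte cost ranges quoted in the article (US dollars, low and high end).
EDW_PER_TB = (15_000, 20_000)
HADOOP_PER_TB = (2_500, 3_000)

# Assumption for illustration only: price the 40 TB the cluster has processed.
VOLUME_TB = 40


def cost_range(terabytes, per_tb):
    """Return the (low, high) total cost for a volume at a per-TB price range."""
    low, high = per_tb
    return terabytes * low, terabytes * high


def savings_range(old_per_tb, new_per_tb):
    """Return the (smallest, largest) fractional saving between two price ranges."""
    smallest = 1 - new_per_tb[1] / old_per_tb[0]  # priciest new vs. cheapest old
    largest = 1 - new_per_tb[0] / old_per_tb[1]   # cheapest new vs. priciest old
    return smallest, largest


edw_total = cost_range(VOLUME_TB, EDW_PER_TB)
hadoop_total = cost_range(VOLUME_TB, HADOOP_PER_TB)
low_saving, high_saving = savings_range(EDW_PER_TB, HADOOP_PER_TB)

print(f"EDW:    ${edw_total[0]:,} to ${edw_total[1]:,}")
print(f"Hadoop: ${hadoop_total[0]:,} to ${hadoop_total[1]:,}")
print(f"Saving: {low_saving:.1%} to {high_saving:.1%} per terabyte")
```

Under these assumptions, the quoted ranges imply a per-terabyte saving of roughly 80 to 87.5 percent, or about $100,000-120,000 instead of $600,000-800,000 for a 40-terabyte volume.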

New platform allows Razorsight to expand into new solution areas

The new platform has enabled Razorsight to expand into new solution areas for its telecommunications service provider customers. For example, Razorsight’s sales and marketing solution is designed to improve the customer experience, reduce churn and identify the next best offer. The marketing team at Virgin Mobile Latin America has deployed the solution in multiple countries to support its expansion across the region. The predictive analytics will help the team tailor targeted marketing campaigns based on each customer’s propensity to churn.

Razorsight's technology stack is one of its major differentiators

This new technology stack allows Razorsight to continuously innovate and deliver predictive insights to its telecommunications customers so they can improve customer experiences, reduce churn and increase margins. “Our technology stack is one of our major differentiators. Our customers trust us with their data, and our ability to generate accurate and reliable predictive insights in the fastest possible timeframe,” says Nathan.

Want to learn how other companies are using Hadoop to transform their business? Check out our Solutions area, which features details on over 50 organizations that are using Hadoop to capitalize on the opportunity that big data presents.
