Hadoop in Action: Quantium Delivers Lightning-Fast Customer Analytics Using Hadoop and Apache Spark

Australian shoppers are some of the most digitally influenced in the world; a majority of Australians go online to research a product before buying it, according to a 2015 report by Deloitte. They're looking for a personalized experience that includes messages and recommendations specifically tailored for them.

Enter Quantium, a Sydney-based data analytics firm that develops insights into consumer needs, behaviors, shopping habits, and media consumption by applying what's called "actuarial science" to consumer transaction data. The company works with market leaders such as Woolworths, National Australia Bank, Westfield, Coca-Cola, and Qantas.

Opportunity to expand data assets and the business
During its early years, Quantium performed analytical work using data provided by its clients. While that business model had been highly successful, the firm took a giant step forward by acquiring and integrating its own data assets, which enabled a whole new range of innovative, value-added services. For example, Quantium could correlate the client’s internal data with external information about shopping behavior, so the client gained a much broader understanding of consumer needs and behaviors. Armed with this new insight, Quantium's clients were able to offer highly personalized recommendations and promotions that would increase revenues, enhance customer retention, and create a competitive advantage.

Legacy technology platform could not scale
However, moving from concept to deployment required overcoming one major obstacle: the legacy analytics platform. As the data analysis expanded in scope and complexity, the firm’s Microsoft SQL Server platform and aging server hardware simply couldn't handle the load. Quantium needed a new analytics platform with scalable, reliable, high-performance servers and an enterprise-grade software framework.

Quantium realized that a big data solution was needed, not only because of the data volume, but also because of the heavy analytical requirements. One of the big four Australian banks, for example, has more than two million customers who generate 14 million transactions a week. That's more than five billion transactions a year! Quantium selected Hadoop to improve its ability to analyze large volumes of data without having to spend time restructuring the data. The firm wanted to gain faster time to market at a reduced cost, improve analytic performance, enable real-time ad hoc queries, and support security management and audits. Quantium is now in the process of building the largest Hadoop cluster in Australia.

New Hadoop platform allowed Quantium to exceed performance targets
Performance is vital for Quantium to deliver rapid results to customers like Woolworths, who continually demand expanded feedback and analytics on consumer activity and behavior. During the requirements phase, Quantium set a target of a ten-fold increase in performance. That goal has been exceeded: before-and-after testing shows that the new Hadoop platform decreases query processing time by 92 percent, which represents a 12.5X increase in performance.
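For readers who want to check how a 92 percent reduction in processing time becomes a 12.5X figure, here is a minimal sketch of the arithmetic. The function name is hypothetical and chosen only for illustration. It assumes "performance" means the ratio of old run time to new run time.

```python
def speedup_from_time_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in run time into a speedup factor.

    Assumes speedup = old_time / new_time, where
    new_time = old_time * (1 - reduction_pct / 100).
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    remaining_fraction = 1 - reduction_pct / 100
    return 1 / remaining_fraction


# The 92 percent reduction quoted in the article.
speedup = speedup_from_time_reduction(92)
print(f"{speedup:.1f}X")

# The original ten-fold target corresponds to a 90 percent reduction.
target = speedup_from_time_reduction(90)
print(f"{target:.1f}X")
```

Put another way, a query that used to take 100 minutes would finish in about 8 minutes after a 92 percent reduction, and 100 divided by 8 is 12.5.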

Integrating external data sets enhances data quality
Quantium gains a real competitive advantage by integrating these external data sets. “Having access to external data sets to combine with our clients’ data distances us from everybody else in this space,” says Alex Shaw, head of Technical Operations at Quantium. “For example, a retail chain may have a fairly good picture of customer behavior in their own stores, but little knowledge about how often their customers shop with competitors. We can provide those kinds of insights because we have the ability to leverage the full value of our data assets by marrying data sets containing in-store and online purchasing information with media consumption behavior to make advertising more effective.”

Hadoop means greater innovation, shorter time to market
On the legacy platform, as the data sets grew and analytical complexity increased, Quantium had to rely on complex and time-consuming sampling methods. Designing and implementing sampling techniques takes time and specialized skills. Now, with the new Hadoop and Spark solution, data scientists can design complex queries that run against entire multi-terabyte data sets. The result? They get more accurate results in just minutes rather than hours or days.
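The trade-off between sampling and querying the full data set can be illustrated with a small sketch. This is plain Python on made-up numbers, not Quantium's code and not a Spark job. The data, sample size, and function names are all assumptions for illustration. It shows that a total extrapolated from a sample carries estimation error, while a full scan gives the exact figure.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for a transaction table:
# 100,000 purchase amounts in whole dollars.
transactions = [rng.randint(1, 500) for _ in range(100_000)]


def total_from_sample(amounts, sample_size, rng):
    """Estimate total spend by scaling up the mean of a random sample."""
    sample = rng.sample(amounts, sample_size)
    sample_mean = sum(sample) / sample_size
    return sample_mean * len(amounts)


def total_from_full_scan(amounts):
    """Compute total spend exactly by reading every record."""
    return sum(amounts)


exact_total = total_from_full_scan(transactions)
estimated_total = total_from_sample(transactions, 1_000, rng)
relative_error = abs(estimated_total - exact_total) / exact_total

print(f"exact total:      {exact_total}")
print(f"sampled estimate: {estimated_total:.0f}")
print(f"relative error:   {relative_error:.2%}")
```

A simple total is the easy case. For more complex analyses, designing a sample that stays representative is harder, which is the specialized effort the full-data approach avoids.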

In addition, the more powerful platform drives innovation, because data scientists can test alternative scenarios quickly and accurately, shortening development time and improving time to market. “We have a lot of smart people who have been hamstrung by technology and its ability to implement their ideas. Now they have improved ways of executing analytics, which opens up the ability to create new and innovative solutions for our clients,” says Shaw.

Scaling to accommodate business growth
Quantium’s client base is expanding rapidly, which requires the firm to increase its compute capacity accordingly. The new Hadoop platform features a clustered architecture, which scales easily with high reliability and a lower total cost of ownership. The clustered approach allows Quantium to fine-tune performance in a cost-effective way, adding servers to the cluster as necessary to meet service-level agreements instead of replacing servers with more powerful models. Unlike with the legacy system, all users benefit when new servers are added to the cluster.

Expanding to new markets
Going forward, Quantium will work to take full advantage of the efficiencies of the Hadoop platform to maintain a strong competitive edge. Development cycles will continue to shorten, reducing both costs and the time it takes to bring new products to market. Most importantly, Quantium is poised to enter new market segments where its expertise in complex analytical methods and big data analysis can provide tangible business value. “We’ve expanded the range of problems that we can solve, enabling our clients to grow their business by interacting with each of their customers as individuals with specific wants and needs,” says Shaw.

How are other companies using Hadoop to transform their business? Find out by visiting our Solutions area, which features details on over 50 organizations that are ensuring production success with Hadoop.
