At the Big Data Everywhere conference held in Israel, Atzmon Hen-Tov, Vice President of R&D of Pontis, and Lior Schachter, Director of Cloud Technology and Platform Group Manager of Pontis, gave an informative talk titled “Data on the Move: Transitioning from a Legacy Architecture to a Big Data Platform.” The five-phase, two-year migration of their operational and analytical functions to MapR resulted in a true, real-time operational analytics environment on Hadoop.
Pontis, established in 2004, is a developer of online marketing automation solutions for the telecommunications industry. Their clients include some of the biggest names in telecommunications, including Vodafone, Telefónica, and VimpelCom. Pontis engages with 400 million customers on a daily basis, and their mission-critical systems support billions of transactions every day.
In this talk, Atzmon provided an overview of their legacy system, as well as their process for migrating to Hadoop. He also summarized their Hadoop platform selection process, which involved a thorough RFP and a detailed analysis of the top three Hadoop vendors. Lior Schachter then discussed how Pontis deployed MapR into their architecture to get higher scalability and lower TCO. This gave them the ability to analyze billions of mobile subscriber events every day.
This blog captures the Pontis story told at the conference.
Pontis Data Flow
Pontis typically connects to 15-20 systems to collect all historical data into their analytics subsystem, which creates an analytical profile for each of their millions of customers. The analysis is transferred back to their online transaction processing (OLTP) subsystem which can then target subscribers with a relevant promotion.
(Figure: Pontis data flow)
For many years, they were happy with a classic architecture of relational database management systems (RDBMS): one for OLTP and one for analytics. In 2011, they realized this architecture was becoming outdated. Daily transaction volumes grew rapidly, approaching the architecture’s full capacity of about 100 million transactions a day. The architecture could technically grow, but at steep cost. At the same time, Pontis began to win deals with even bigger clients whose initial demands reached billions of daily transactions. At this point, Pontis decided to look into big data technologies as a possible solution.
Getting Started: Defining the Technical Requirements
Pontis started with big data by reading all the popular Hadoop books. That was the only big data training they ended up needing. With that self-training, they were able to develop a blueprint of their target architecture using big data.
The target architecture graphic above shows that their RDBMSs would be all but gone. Their data would be transferred to a big Hadoop cluster that uses the popular NoSQL database Apache HBase. The biggest challenge was migrating their analytics code. Pontis had tens of thousands of lines of complex T-SQL code that couldn’t be converted to HiveQL, the SQL-like query language used by Apache Hive in Hadoop. In addition, their target architecture had to: 1) support the scale, 2) support clients who would upgrade to big data, and 3) support existing clients who didn’t need to upgrade to a big data architecture. Their system had to continue to support the legacy architecture so clients who didn’t upgrade could still receive new versions with new features.
Pontis reverse-engineered their SQL code using a platform-independent model that could then be executed either in SQL or Hadoop (for reference, see: http://adaptiveobjectmodel.com/2013/09/smooth-transition-to-big-data-with-a-aom/). This would help with their hybrid legacy and big data architecture.
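The idea behind a platform-independent model can be sketched in a few lines. This is a hypothetical illustration, not Pontis’ actual framework: a business rule is described once as data, and per-platform executors either render it as SQL or evaluate it Hadoop-side (an in-memory evaluation stands in for a MapReduce job here).

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of a platform-independent model (PIM): the rule is
// declared once, and each back end interprets it for its own platform.
public class PimSketch {

    // A single declarative rule: "sum SUMFIELD grouped by KEYFIELD".
    public record AggregateRule(String keyField, String sumField) {}

    // SQL back end: render the rule as a SQL statement for the legacy path.
    public static String toSql(AggregateRule r, String table) {
        return "SELECT " + r.keyField() + ", SUM(" + r.sumField() + ") FROM "
                + table + " GROUP BY " + r.keyField();
    }

    // Hadoop back end: evaluate the same rule over records in memory,
    // standing in for a MapReduce group-and-sum in this sketch.
    public static Map<String, Long> evaluate(AggregateRule r,
                                             List<Map<String, String>> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                row -> row.get(r.keyField()),
                Collectors.summingLong(row -> Long.parseLong(row.get(r.sumField())))));
    }
}
```

The payoff of this shape is exactly the hybrid story above: the same rule object drives both the legacy SQL deployment and the big data deployment.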
They knew they needed an enterprise-grade Hadoop distribution that could run a mission-critical telecommunications system. They knew exactly what they needed – features such as high availability (no single point of failure), backups and snapshots, consistent response time for sub-second OLTP responses, and support for heterogeneous hardware. They conducted a selection process with an RFP that focused on non-functional capabilities such as sustainable performance, high availability, and manageability. After a thorough evaluation process, they chose MapR.
Overall Approach

Pontis next identified several key objectives in their overall approach to the project:

1) Each step should increase scalability and reduce TCO
2) The project should minimize changes to the proven business logic layer in the OLTP subsystem
3) Each feature in the OLTP subsystem could be turned on or off on both the legacy and big data architectures, to adapt to the different client needs
4) Transfer their analytics T-SQL code to Java and MapReduce to solve scalability, as well as to leverage their large team of Java programmers
5) Wrap the analytics business logic layer in a domain-specific language (DSL), which lets non-programmers join the development process and help customize the product
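To make the DSL objective concrete, here is a minimal sketch of what such a layer can look like. The one-line rule syntax, field names, and offer names below are invented for illustration and are not Pontis’ actual DSL; the point is that a non-programmer edits the text line, and the engine parses and executes it.

```java
// Hypothetical mini-DSL sketch: a one-line, human-editable rule parsed into
// an executable check. Syntax and names are illustrative only.
public class RuleDsl {
    public record Rule(String field, String op, long value, String offer) {}

    // Parse e.g. "IF weekly_usage < 100 THEN offer low_usage_bonus"
    public static Rule parse(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length != 7 || !t[0].equals("IF") || !t[4].equals("THEN") || !t[5].equals("offer"))
            throw new IllegalArgumentException("bad rule: " + line);
        return new Rule(t[1], t[2], Long.parseLong(t[3]), t[6]);
    }

    // Apply the rule to one observed subscriber value; returns the offer
    // to send, or null if the rule does not fire.
    public static String apply(Rule r, long observed) {
        boolean hit = switch (r.op()) {
            case "<"  -> observed < r.value();
            case ">"  -> observed > r.value();
            case "==" -> observed == r.value();
            default   -> throw new IllegalArgumentException("bad op " + r.op());
        };
        return hit ? r.offer() : null;
    }
}
```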
Moving Forward: Implementation in 5 Phases
Phase 1: Data Transfer
Pontis’ legacy architecture used Oracle for OLTP and SQL Server for analytics. The first step of the migration was a small one: they transferred Oracle queues to the MapR file system, MapR-FS. This fortunately required no change to business logic code, and allowed Pontis to nearly double their OLTP system scale.
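Because MapR-FS is NFS-mountable, a database-backed queue can become plain sequential files in a spool directory, which is why this step required no business logic changes. The sketch below illustrates that idea with standard Java file I/O; the directory layout and file naming are assumptions for illustration, not Pontis’ implementation (in production the spool directory would be an NFS mount of the cluster).

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Stream;

// Sketch: a queue as append-only batch files in a directory. On MapR the
// directory would be an NFS mount such as a path under /mapr (illustrative).
public class FileQueue {
    private final Path spoolDir;
    private long seq = 0;

    public FileQueue(Path spoolDir) throws IOException {
        Files.createDirectories(spoolDir);
        this.spoolDir = spoolDir;
    }

    // Producer side: each batch of events becomes one new file, written once.
    public Path enqueue(List<String> events) throws IOException {
        Path f = spoolDir.resolve(String.format("batch-%08d.log", seq++));
        return Files.write(f, events, StandardOpenOption.CREATE_NEW);
    }

    // Consumer side: read batches back in filename (i.e. arrival) order.
    public List<String> drain() throws IOException {
        try (Stream<Path> files = Files.list(spoolDir)) {
            return files.sorted()
                        .flatMap(p -> readLines(p).stream())
                        .toList();
        }
    }

    private static List<String> readLines(Path p) {
        try { return Files.readAllLines(p); }
        catch (IOException e) { throw new RuntimeException(e); }
    }
}
```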
Phase 2: Introducing the MapR Hadoop Cluster
Phase 2 involved more Hadoop-specific work. This stage was essentially about offloading the analytics activity from SQL Server. It required two things: 1) transferring the data warehouse to Hive, and 2) moving the complex analytics calculations to MapReduce. They also deployed Hue and Pig, and built up the system and its associated components: monitoring, alerts, operations, etc.
They were able to load new imports and logs via NFS, which avoided the extra work of implementing other loading technologies like Kafka and Flume while keeping all the data available to MapReduce. The logs and additional client files were terabyte-sized Avro files with a complex structure of per-subscriber history: thousands of objects covering all of a subscriber’s activities and aggregations over the past six months. After each run, they would generate a new profile to feed the next day’s run, which kept overall performance good.
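The daily roll-forward described above can be sketched as follows: yesterday’s per-subscriber profile plus today’s events yields the new profile that seeds tomorrow’s run, so each run touches only one day of raw events rather than six months of history. The profile fields here are invented for illustration (the real profiles hold thousands of objects per subscriber).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of an incremental daily profile update. Field names are illustrative.
public class ProfileRollForward {
    public record Event(String subscriberId, long amount) {}
    public record Profile(long eventCount, long totalAmount) {}

    // previous = yesterday's output; todaysEvents = one day of raw events.
    // Returns the profile map that seeds the next day's run.
    public static Map<String, Profile> nextDay(Map<String, Profile> previous,
                                               List<Event> todaysEvents) {
        Map<String, Profile> next = new HashMap<>(previous);
        for (Event e : todaysEvents) {
            Profile p = next.getOrDefault(e.subscriberId(), new Profile(0, 0));
            next.put(e.subscriberId(),
                     new Profile(p.eventCount() + 1, p.totalAmount() + e.amount()));
        }
        return next;
    }
}
```

In the real pipeline this merge runs as MapReduce over Avro files rather than in memory, but the data dependency between runs is the same.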
They used Sqoop to move individual subscriber information data back into SQL Server as the profile store to avoid having to change working client code. This was very fast, and they were able to insert tens of millions of records in minutes.
The result of this phase was a tremendous increase in their analytics capabilities, and the architecture could theoretically grow to support an unlimited number of subscribers and customers. At this point, they could handle 100 million clients. They gained higher capacity, simplified the architecture, and maintained high availability with no data loss. They could now collect two or three months of history data, which they couldn’t do before. And by using MapReduce, the subscriber profile calculation only took two or three hours, and they could have improved that simply by adding servers to the cluster.
Pontis also benefited from MapR’s ability to store many small files. At last count they had 130 million files, which lets them avoid the continual file concatenation techniques required with other distributions.
Phase 3: Introducing MapR-DB
During Phase 3, Pontis deployed MapR-DB, the MapR in-Hadoop NoSQL database. Every interaction Pontis has with a subscriber is documented in an activity log stored in tables, and in a relational structure those tables can reach billions of rows. Subscribers who haven’t received their special offer, or who don’t understand a message they received, contact the call center, which accesses these activity logs directly.
The workload is almost entirely writes (99.9%). This put such a strain on Oracle that it affected the availability of their other servers, which handled far more reads. With the move to MapR-DB, Pontis now stores one row per customer, with thousands of columns describing that customer’s transactions. This allowed some of their OLTP activity to run through an additional channel, which further increased system capacity, now at 300 million events per day.
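The wide-row layout described above can be sketched with an ordered map standing in for the table: one row per subscriber, one column per interaction, with the column qualifier carrying the timestamp so a single in-row range scan returns the activity log in time order (HBase and MapR-DB keep column qualifiers sorted within a row). The qualifier format below is an assumption for illustration.

```java
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of a single subscriber's wide row: columns sorted by qualifier,
// as in an HBase/MapR-DB column family. TreeMap stands in for the store.
public class WideRow {
    private final NavigableMap<String, String> columns = new TreeMap<>();

    // Write-heavy path: each interaction adds a new column, never an update.
    // Zero-padding keeps lexicographic order equal to time order.
    public void recordInteraction(long epochMillis, String description) {
        columns.put(String.format("activity:%013d", epochMillis), description);
    }

    // Call-center path: fetch a time window of interactions with one in-row
    // range scan instead of scanning billions of relational rows.
    public SortedMap<String, String> interactionsBetween(long fromMillis, long toMillis) {
        return columns.subMap(String.format("activity:%013d", fromMillis), true,
                              String.format("activity:%013d", toMillis), true);
    }
}
```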
Pontis found MapR-DB tables easy to create and manage. They ran extensive YCSB tests to validate everything before production. The loss of the multi-row transactionality they had in Oracle was a downside, but they developed their own mechanism to support it.
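The talk doesn’t detail that mechanism, but one common way to layer multi-row atomicity on a store that only guarantees single-row atomicity is to stage all writes invisibly and make a single atomic flip the commit point. The sketch below illustrates that pattern (it is our illustration, not Pontis’ implementation, and it omits concurrent-reader handling during the copy and crash recovery, which a real mechanism must address).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: multi-row all-or-nothing visibility on top of single-row writes.
public class MiniTxn {
    private final Map<String, String> committed = new ConcurrentHashMap<>();
    private final Map<String, String> staged = new ConcurrentHashMap<>();
    private final AtomicBoolean commitFlag = new AtomicBoolean(false);

    // Writes go to a staging area readers never see.
    public void stage(String rowKey, String value) {
        staged.put(rowKey, value);
    }

    // A single atomic flag flip decides the transaction's fate: if we never
    // reach it, none of the staged rows ever become visible.
    public boolean commit() {
        if (!commitFlag.compareAndSet(false, true)) return false;  // one-shot
        committed.putAll(staged);
        staged.clear();
        return true;
    }

    // Readers only ever see committed values.
    public String read(String rowKey) {
        return committed.get(rowKey);
    }
}
```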
Phase 4: Migrating OLTP Features to MapR-DB
Phase 4 involved migrating the remaining SQL Server functionality, which meant moving the profile data to MapR-DB for all the relevant clients. This was a plumbing change, not a business logic one, so it was straightforward. All Oracle functionality was also moved to MapR-DB, so the OLTP subsystem could reach effectively unlimited capacity in terms of customers and events per day, contingent only on the number of servers. Pontis can now handle hundreds of millions of subscribers, and billions of transactions per day. By deploying all functionality on MapR, they consolidated disparate systems to create a real-time operational analytics deployment.
Phase 5: Decommission Legacy RDBMS
In Phase 5, Pontis decommissioned their legacy RDBMSs. By removing Oracle and SQL Server from the architecture and standardizing on MapR, the company realized a significant positive impact in system cost, deployment, and monitoring.
By choosing MapR as their target architecture, Pontis was able to successfully integrate Hadoop into their organization. By gradually transitioning from their legacy architecture to MapR over the course of several phases, Pontis achieved higher scalability and lower TCO in incremental steps. With MapR, Pontis is now able to engage with 400 million customers on a daily basis.
You can see all of the slides from Atzmon’s presentation here: