Running at Google Scale With the Zeta Architecture

Google has set the standard for most of the world when it comes to running systems at scale. It has created a number of different technologies to benefit its business. It built those technologies in a way that makes sense for its business, but it has also written many white papers to share these technologies with the rest of the world. This is, after all, how the entire Hadoop ecosystem came to fruition. Google white papers have also inspired many other great open source projects such as Apache Drill, Apache Mesos, and Apache Spark.

All of these white papers have a common theme in that they solve problems that Google has faced. What has been missing until now, however, is a way of bringing these technologies together so that any data-centric organization can benefit from the capabilities of each technology across its entire data center, and in new ways not documented by any single white paper. This combined approach is called the “Zeta Architecture.”

Enterprise architecture should focus on a holistic approach to deliver effectiveness, efficiency, agility, and durability. These points should underlie nearly every enterprise application, but all too often those applications lack holistic enterprise architecture. This may lead to missed opportunities, complex business processes, and even business continuity issues. The Zeta Architecture is an enterprise architecture built with big data and real time in mind.

The Zeta Architecture lays out the foundational premise for a data-centric enterprise and comprises seven components:

  • Distributed File System
  • Solution Architecture
  • Real-Time Data Storage
  • Enterprise Applications
  • Pluggable Compute Model/Execution Engine
  • Dynamic and Global Resource Management
  • Deployment/Container Management System

Google has never formally documented its enterprise architecture for public consumption, but when looking at the seven components of the Zeta Architecture, it becomes quite clear that this is its foundational approach. The following technologies are ones that Google has created, uses, or contributes to, listed in the same order as the components of the Zeta Architecture above:

  • GoogleFS, Colossus
  • Recommenders, Machine Learning
  • Spanner, Megastore, Bigtable, F1
  • HTTP Servers, Gmail
  • BigQuery, Dataflow, Dremel, MillWheel
  • Borg, Omega
  • cgroups, Kubernetes
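
One way to make the pairing between components and technologies explicit is a small lookup table. The following is only an illustrative sketch: the pairing is inferred from the use cases described in this post rather than from any official Google document, and the names `ZETA_TO_GOOGLE` and `technologies_for` are hypothetical.

```python
from typing import Dict, List

# Illustrative mapping of Zeta Architecture components to the Google
# technologies named in this post. The pairing is an assumption inferred
# from the post's use cases, not an official Google reference.
ZETA_TO_GOOGLE: Dict[str, List[str]] = {
    "Distributed File System": ["GoogleFS", "Colossus"],
    "Solution Architecture": ["Recommenders", "Machine Learning"],
    "Real-Time Data Storage": ["Spanner", "Megastore", "Bigtable", "F1"],
    "Enterprise Applications": ["HTTP Servers", "Gmail"],
    "Pluggable Compute Model/Execution Engine": [
        "BigQuery", "Dataflow", "Dremel", "MillWheel",
    ],
    "Dynamic and Global Resource Management": ["Borg", "Omega"],
    "Deployment/Container Management System": ["cgroups", "Kubernetes"],
}


def technologies_for(component: str) -> List[str]:
    """Return the technologies listed for one component (hypothetical helper)."""
    return ZETA_TO_GOOGLE[component]
```

For example, `technologies_for("Dynamic and Global Resource Management")` returns the two resource managers mentioned in the Gmail walkthrough.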

Let’s look at this through two use cases. The first is Gmail. Gmail is delivered via a web server, with recommendation engines running behind it to deliver advertisements and other machine learning libraries to handle tasks such as spam filtering. Google as a company deploys more than 2 billion containers per week, some of which contain the software that supports Gmail. Google uses Spanner as the real-time store for Gmail and compute engines similar to Dremel for analytics. All of these tools store their data in Colossus or GoogleFS. To round out the use case, Google uses Borg and Omega to manage all of these resources globally. When it needs more instances of any of those pieces of software, it can spin them up dynamically.

The second use case is Google BigQuery, a service for querying big data that scales dynamically. The same technologies mentioned above form the foundation that supports BigQuery. Depending on the size of the dataset, Google automatically gives BigQuery more horsepower via the scheduling system Omega. Query performance can be optimized based on the size of the data and the number of computers needed to process it in a reasonable time. This same functionality exists in Apache Drill. The piece that Drill doesn’t currently handle is auto-scaling. In the Zeta Architecture, it is realistic that a Mesos framework could support Drill in this endeavor. Just imagine running queries across massive quantities of data that always complete within a service level window.
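
The sizing idea behind that kind of auto-scaling can be sketched in a few lines. This is not how Omega, BigQuery, Drill, or Mesos actually schedule work; it is a minimal sketch that assumes query throughput scales linearly with the number of workers, and the function name and all parameters are hypothetical.

```python
import math
from typing import Optional


def workers_needed(
    dataset_gb: float,
    gb_per_worker_per_sec: float,
    sla_seconds: float,
    max_workers: Optional[int] = None,
) -> int:
    """Estimate how many workers are needed to scan a dataset within an SLA.

    Assumes perfectly linear scaling: each worker scans
    gb_per_worker_per_sec independently. Real systems pay coordination
    and shuffle costs, so treat the result as a lower bound.
    """
    if dataset_gb < 0:
        raise ValueError("dataset_gb must be non-negative")
    if gb_per_worker_per_sec <= 0 or sla_seconds <= 0:
        raise ValueError("throughput and SLA must be positive")

    # Data one worker can scan inside the service level window.
    gb_per_worker = gb_per_worker_per_sec * sla_seconds
    # Always keep at least one worker, even for tiny datasets.
    required = max(1, math.ceil(dataset_gb / gb_per_worker))

    if max_workers is not None and required > max_workers:
        raise RuntimeError(
            f"SLA needs {required} workers but only {max_workers} are available"
        )
    return required


# A 1,000 GB scan at 0.5 GB/s per worker with a 30-second window.
print(workers_needed(1000, 0.5, 30))  # 67
```

A resource manager could run a calculation like this before each query and request that many containers, which is the role the post envisions for a Mesos framework supporting Drill.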

This architecture is the real secret to running at Google scale. With the proper technologies, not only can you dynamically scale your data-centric applications to handle anything in real time, but business processes and your overall system design can be simplified, dramatically reducing costs. Complex operational processes and procedures for things like security, disaster recovery, deployment management, and even contingency planning are pulled together in a holistic and seamless way.

There are two freely available published white papers on the Zeta Architecture that will give you more ideas on how to take your enterprise architecture to the next level. The first is technical, and the second is a high-level executive summary.

This post was originally published here


