Streaming Architecture:

New Designs Using Apache Kafka and MapR Streams

by Ted Dunning & Ellen Friedman

Preface

The ability to handle and process continuous streams of data provides a considerable competitive edge. As a result, being able to take advantage of streaming data is beginning to be seen as an essential part of building a data-driven organization.

The expanding use of streaming data raises the question of how best to design systems to handle it effectively, from ingestion from multiple sources, through a variety of uses such as streaming analytics, to the question of persistence.

Emerging best practices for the design of streaming architectures may surprise you: the scope of powerful design for streaming systems extends far beyond specific real-time or near-real-time applications. New approaches to streaming designs can greatly improve the efficiency of your overall organization.

Who Should Use This Book

If you already use streaming data and want to design an architecture for best performance, or if you are just starting to explore the value of streaming data, this book should be helpful. You’ll also find real-world use cases that help you see how to put these approaches to work in several different settings. Developers will find links to sample programs as well.

This book is designed for both nontechnical and technical audiences, including business analysts, architects, team leaders, data scientists, and developers.

What Is Covered

In this book, we:

  • Explain how to recognize opportunities where streaming data may be useful
  • Show how to design streaming architecture for best results in a multiuser system
  • Describe why particular capabilities should be present in the message-passing layer to take advantage of this type of design
  • Explain why stream-based architectures are helpful to support microservices
  • Describe particular tools for messaging and streaming analytics that best fit the requirements of a strong stream-based design

Chapters 1–3 explain the basic aspects of strong architecture for streaming and microservices. If you are already familiar with many business goals for streaming data, you may want to start with Chapter 2: Stream-based Architecture, in which we describe the type of architecture that we recommend for streaming systems.

In addition to explaining the capabilities needed to support this emerging best practice, we also describe some of the currently available technologies that meet these requirements well. Chapter 4: Kafka as Streaming Transport goes into some detail on Apache Kafka, including links to sample programs provided by the authors. Chapter 5: MapR Streams describes another preferred technology for effective message passing known as MapR Streams, which uses the Apache Kafka API but with some additional capabilities.

Later chapters provide a deeper dive into real-world use cases that employ streaming data as well as a look forward to how this exciting field is likely to evolve.

Conventions Used in This Book

This icon indicates a general note.

This icon signifies a tip or suggestion.

This icon indicates a warning or caution.