Course Curriculum

  1. Free Preview
  2. Chapter 1: Introduction to Apache Beam and Data Processing (included in full purchase)
  3. Chapter 2: Stateful and Stateless Processing with Apache Beam (included in full purchase)
  4. Chapter 3: Handling Event Time, Windows, and Triggers (included in full purchase)
  5. Chapter 4: Building Pipelines with Apache Beam (included in full purchase)
  6. Chapter 5: Transformations and Coders in Apache Beam (included in full purchase)
  7. Chapter 6: Advanced Pipeline Optimization Techniques (included in full purchase)
  8. Chapter 7: Deploying Apache Beam Pipelines on Different Runners (included in full purchase)
  9. Chapter 8: Monitoring, Debugging, and Tuning Apache Beam Pipelines (included in full purchase)
  10. Chapter 9: Case Studies: Apache Beam in the Real World (included in full purchase)
  11. Index (included in full purchase)

About the Course

Building Data Pipelines Using Apache Beam provides a practical, production-focused guide to using Beam’s unified programming model to write processing logic once and run it across multiple runners without rewriting core code. The book begins with the fundamentals of distributed data processing and Beam’s core abstractions—PCollections, transforms, and pipeline design. You will then progress into stateful and stateless processing and event-time semantics—windows, triggers, watermarks, state, and timers—building the mental models required to reason about correctness at scale. From there, the book moves into advanced transformations, coders, and optimization techniques to help you improve performance, control costs, and ensure reliability. In the later chapters, you will learn how to deploy pipelines across runners such as Dataflow, Flink, and Spark, monitor and debug production workloads, and apply best practices drawn from real-world case studies. By the end of the book, you will be able to design, deploy, and operate robust, portable, production-grade data pipelines with confidence.
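To give a flavor of the event-time windowing the book covers, here is a minimal plain-Python sketch of how a fixed (tumbling) window is assigned from an event timestamp. This is a conceptual illustration only—the function name and parameters are hypothetical, not Beam's actual API:

```python
# Conceptual sketch of fixed (tumbling) window assignment by event time.
# This illustrates the idea behind Beam's fixed windows; it is NOT the
# apache_beam API -- the function name and parameters are hypothetical.

def fixed_window(event_time_s: int, size_s: int) -> tuple[int, int]:
    """Return the [start, end) window that an event timestamp falls into."""
    start = event_time_s - (event_time_s % size_s)
    return (start, start + size_s)

# An event at t=65s with 60-second windows lands in the [60, 120) window.
print(fixed_window(65, 60))  # -> (60, 120)
```

In Beam itself, the equivalent is applying a windowing transform to a PCollection, e.g. `beam.WindowInto(beam.window.FixedWindows(60))` in the Python SDK, after which grouping and aggregation operate per window rather than over the whole unbounded stream.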

About the Author

Nuzhi Meyen is a fintech entrepreneur, data scientist, AI practitioner, and the Co-Founder and CEO of Helios P2P. He builds production-grade AI, analytics, and blockchain systems for lending and credit risk. With advanced degrees and strong community contributions, he bridges theory and practice to deliver scalable, real-world financial technology solutions.