Real-Time Big Data Systems with Spark Streaming and Kafka

by Jesse Anderson


This class takes participants through the benefits and challenges of real-time Big Data systems. We cover real-time Big Data services, both open source projects and managed services from cloud providers. The class focuses on Apache Kafka and Apache Spark Streaming. It shows how to create producers and consumers in Kafka. Then we see how to use Apache Spark Streaming to process the data in Kafka and write the results back to Kafka. Finally, the data is visualized in real time on a web page using Kafka REST.

What you will learn

  • How to create large-scale real-time systems using both Apache Kafka and Apache Spark Streaming
  • How real-time distributed systems are different from batch systems
  • How to create Kafka producers and consumers
  • How to process data in Kafka with Spark Streaming and place the results back into Kafka
  • How to visualize data in real time on a web page

Main Topics

  • Real-time Data Pipelines
  • Using the Cloud
  • Ingesting Data
  • Kafka
  • Processing Data
  • Spark Streaming
  • Data Products