About Professional Spark Development
Takes a participant from no knowledge of Apache Spark to being able to develop with Spark professionally. It begins with the core Hadoop technologies Spark builds on, HDFS and MapReduce, then covers Spark's Java API, Spark SQL, and Spark Streaming in depth. The class ends with how to integrate Spark with the rest of your Big Data systems.
Duration: 3 days
Intended Audience: Technical, Software Engineers, QA, Analysts
Prerequisites: Intermediate-Level Java
You Will Learn
- What exists in the Big Data ecosystem, so you can use the right tool for the job.
- How HDFS works and how to interact with it.
- How MapReduce works, including what happens in each phase.
- How Spark works, including what happens in each phase of a job.
- What Java 8 lambdas are and how they make your Spark code more readable.
- The basics of coding a Spark job in Java to build your Big Data foundation (see the first sketch after this list).
- The various API methods in Spark and what each one does.
- How SQL can be used within a Spark job and when it vastly improves your productivity and code.
- How to write Java code that runs as a function inside a Spark SQL query, so you can reuse existing Java code or express use-case-specific queries (see the second sketch after this list).
- How to process data in real time with Spark (see the third sketch after this list).
- How to integrate Spark with the rest of your Big Data systems.
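As a taste of what the coding portion covers, here is a minimal sketch of a word-count Spark job written with Java 8 lambdas. It assumes Spark 2.x; the `input.txt` and `counts` paths are hypothetical placeholders, not course materials.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public final class WordCount {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCount").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read the input file as an RDD of lines.
            JavaRDD<String> lines = sc.textFile("input.txt");

            // Java 8 lambdas keep the transformations readable:
            // split lines into words, pair each word with 1, then sum the counts.
            JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey((a, b) -> a + b);

            // Write one output file per partition.
            counts.saveAsTextFile("counts");
        }
    }
}
```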
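Along the same lines, here is a minimal sketch of registering existing Java logic as a Spark SQL UDF, again assuming Spark 2.x. The `normalize` function, `users.json` file, and `users` table are hypothetical examples.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

public final class UdfExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("UdfExample")
            .master("local[*]")
            .getOrCreate();

        // Register existing Java logic as a SQL function named "normalize".
        spark.udf().register("normalize",
            (UDF1<String, String>) s -> s == null ? null : s.trim().toLowerCase(),
            DataTypes.StringType);

        // Hypothetical input: a JSON file of user records with a "name" column.
        Dataset<Row> users = spark.read().json("users.json");
        users.createOrReplaceTempView("users");

        // The UDF is now callable from plain SQL.
        Dataset<Row> result = spark.sql("SELECT normalize(name) AS name FROM users");
        result.show();

        spark.stop();
    }
}
```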
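Finally, a minimal sketch of consuming a Kafka topic with Spark Streaming, assuming Spark 2.x and the spark-streaming-kafka-0-10 integration. The broker address, consumer group, and `events` topic are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public final class KafkaStreamExample {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaStreamExample").setMaster("local[*]");
        // Process incoming data in 5-second micro-batches.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Hypothetical Kafka settings: broker address, deserializers, consumer group.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "spark-course");
        Collection<String> topics = Arrays.asList("events");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));

        // Count the records in each batch and print the result.
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```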
Course Outline
Professional Spark Development
Thinking in Big Data
Introducing Big Data
What is Hadoop?
The Ecosystem
Introduction to HDFS
Introduction to MapReduce
Coding With Spark
About Spark
Using Eclipse
Using Apache Maven
Functional Programming
Java API
Built-In Transformations and Actions
Advanced Spark
Advanced API
Shuffles
Caching
Avro
Spark and Avro
Unit Testing
Spark SQL
Introduction to Spark SQL
Spark SQL API
Spark SQL UDFs
Spark Streaming
Introduction to Spark Streaming
Streaming API
Advanced Streaming
Integrating Spark
Real-time Systems
Using Spark With Hadoop MapReduce
Replacing Other Systems
Conclusion
Technologies Covered
- Apache Spark
- Apache Hadoop
- Apache Kafka