HDP Developer: Apache Spark 2.3 Training

This training covers the Apache Spark distributed computing engine and is suited to developers, technical managers, architects, data analysts, and anyone else who wants to use Spark.

📘 Hortonworks 👥 1423 Enrolled ⏱️ 4 Days 💼 Level ⭐ 4.5 | 113 Reviews

Why Microtek Learning?

  • 500+ Courses
  • 10+ Years of Experience
  • 95K+ Global Learners

Virtual Instructor-Led Training

$2800

Course Overview

This training covers the Apache Spark distributed computing engine and is suited to developers, technical managers, architects, data analysts, and anyone else who wants to use Spark.

The course also provides technical knowledge of the Spark architecture and its functionality.

It covers Spark's basic building blocks along with the high-level constructs that provide a simpler and more capable interface.

The training also helps you gain in-depth knowledge of DataSets, Spark SQL, and DataFrames.

Mode of Training

🏫 Classroom 💻 Live Online 🧪 Blended 👨‍👩‍👧‍👦 Private Group

What you will learn

  • Acquiring and installing Spark.
  • Identifying supported data formats.
  • Utilizing accumulators and broadcast variables.
  • Creating and configuring a SparkSession.

Who Should Attend This Course?

  • This training is intended for software developers who want to build in-memory applications within an HDP environment.

Prerequisites

Familiarity with programming principles and solid experience developing software in Scala are recommended.

Previous experience with SQL, HDP, and data streaming is also beneficial.

📘 HDP Developer: Apache Spark 2.3 Outline

  • Scala Introduction
  • Working with: Variables, Data Types, and Control Flow
  • The Scala Interpreter
  • Collections and their Standard Methods (e.g. map())
  • Working with: Functions, Methods, and Function Literals
  • Define the Following as They Relate to Scala: Class, Object, and Case Class
  • Overview, Motivations, Spark Systems
  • Spark Ecosystem
  • Spark vs. Hadoop
  • Acquiring and Installing Spark
  • The Spark Shell, SparkContext
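
As a rough illustration of the Scala topics listed above (case classes, collections and their standard methods, function literals, control flow), a minimal sketch follows. The `Course` case class and the `ScalaBasicsSketch` object are hypothetical names chosen for this example and are not taken from the course materials.

```scala
// Hypothetical case class used only for illustration.
case class Course(title: String, days: Int)

object ScalaBasicsSketch {
  // A method with explicit parameter and return types.
  def totalDays(courses: List[Course]): Int = courses.map(_.days).sum

  // A function literal assigned to a value.
  val isMultiDay: Course => Boolean = c => c.days >= 2

  // Control flow: if/else is an expression and yields a value.
  def label(c: Course): String = if (isMultiDay(c)) "multi-day" else "single-day"

  def main(args: Array[String]): Unit = {
    val courses = List(Course("Spark", 4), Course("Scala Intro", 1))

    // map() applies a function to every element and returns a new collection.
    val titles = courses.map(_.title.toUpperCase)

    println(titles)
    println(courses.filter(isMultiDay))
    println(courses.map(label))
    println(totalDays(courses))
  }
}
```

One way to try it is to paste the definitions into the Scala interpreter and call `ScalaBasicsSketch.main(Array())`.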

LABS

  • Setting Up the Lab Environment
  • Starting the Scala Interpreter
  • A First Look at Spark
  • A First Look at the Spark Shell
  • RDD Concepts, Lifecycle, Lazy Evaluation
  • RDD Partitioning and Transformations
  • Working with RDDs Including: Creating and Transforming
  • An Overview of RDDs
  • SparkSession, Loading/Saving Data, Data Formats
  • Introducing DataFrames and DataSets
  • Identify Supported Data Formats
  • Working with the DataFrame (untyped) Query DSL
  • SQL-based Queries
  • Working with the DataSet (typed) API
  • Mapping and Splitting
  • DataSets vs. DataFrames vs. RDDs
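
To make the comparison between RDDs, DataFrames, SQL queries, and typed DataSets more concrete, the sketch below expresses one filter in each of the four styles. It assumes the `spark-sql` library (Spark 2.x) is on the classpath and runs against a local master. The `Person` case class, the sample data, and the `ApiComparisonSketch` object are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical record type used only for illustration.
case class Person(name: String, age: Int)

object ApiComparisonSketch {
  val people: Seq[Person] = Seq(Person("Ana", 34), Person("Ben", 19), Person("Cy", 52))

  // RDD API: plain Scala functions over distributed objects.
  def adultsRdd(spark: SparkSession): Array[String] =
    spark.sparkContext.parallelize(people).filter(_.age >= 21).map(_.name).collect()

  // DataFrame (untyped) query DSL: columns are referenced by name.
  def adultsDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    people.toDF().filter($"age" >= 21).select("name")
  }

  // SQL-based query against a temporary view.
  def adultsSql(spark: SparkSession): DataFrame = {
    import spark.implicits._
    people.toDF().createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age >= 21")
  }

  // DataSet (typed) API: lambdas are checked against Person at compile time.
  def adultsDs(spark: SparkSession): Dataset[String] = {
    import spark.implicits._
    people.toDS().filter(_.age >= 21).map(_.name)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("api-comparison-sketch").master("local[2]").getOrCreate()
    println(adultsRdd(spark).mkString(", "))
    adultsDf(spark).show()
    adultsSql(spark).show()
    adultsDs(spark).show()
    spark.stop()
  }
}
```

All four variants select the same names. The DataFrame and SQL forms are resolved at runtime by column name, while the DataSet form keeps the `Person` type visible to the compiler.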

LABS

  • RDD Basics
  • Operations on Multiple RDDs
  • Data Formats
  • Spark SQL Basics
  • DataFrame Transformations
  • The DataSet Typed API
  • Splitting Up Data
  • Working with: Grouping, Reducing, Joining
  • Shuffling, Narrow vs. Wide Dependencies, and Performance Implications
  • Exploring the Catalyst Query Optimizer
  • The Tungsten Optimizer
  • Discuss Caching, Including: Concepts, Storage Type, Guidelines
  • Minimizing Shuffling for Increased Performance
  • Using Broadcast Variables and Accumulators
  • General Performance Guidelines
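
For the broadcast variable and accumulator topic above, a minimal sketch might look like the following. It assumes Spark 2.x on the classpath and a local master. The lookup table, the accumulator name, and the `SharedVariablesSketch` object are hypothetical. A broadcast variable ships a read-only value to each executor once, and an accumulator collects a count back on the driver.

```scala
import org.apache.spark.sql.SparkSession

object SharedVariablesSketch {
  // Resolves country codes to names and counts the codes that were not found.
  def resolve(spark: SparkSession, codes: Seq[String]): (Seq[String], Long) = {
    val sc = spark.sparkContext

    // Hypothetical lookup table, broadcast once instead of shipped with every task.
    val countryNames = sc.broadcast(Map("US" -> "United States", "DE" -> "Germany"))

    // Accumulator that tasks add to and only the driver reads.
    val unknownCodes = sc.longAccumulator("unknownCodes")

    val resolved = sc.parallelize(codes).map { code =>
      countryNames.value.get(code) match {
        case Some(name) => name
        case None =>
          unknownCodes.add(1)
          "unknown"
      }
    }.collect().toSeq

    // Read the accumulator only after an action (collect) has run.
    (resolved, unknownCodes.value.longValue())
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shared-variables-sketch").master("local[2]").getOrCreate()
    val (names, missing) = resolve(spark, Seq("US", "DE", "XX"))
    println(names.mkString(", "))
    println(s"unknown codes: $missing")
    spark.stop()
  }
}
```

Accumulator updates made inside a transformation such as `map` can be applied more than once if a task is re-executed, so counts gathered this way are best treated as diagnostics rather than exact results.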

LABS

  • Exploring Group Shuffling
  • Seeing Catalyst at Work
  • Seeing Tungsten at Work
  • Working with Caching, Joins, Shuffles, Broadcasts, Accumulators
  • Broadcast General Guidelines
  • Core API, SparkSession.Builder
  • Configuring and Creating a SparkSession
  • Building and Running Applications
  • Application Lifecycle (Driver, Executors, and Tasks)
  • Cluster Managers (Standalone, YARN, Mesos)
  • Logging and Debugging
  • Introduction and Streaming Basics
  • Spark Streaming (Spark 1.0+)
  • Structured Streaming (Spark 2+)
  • Consuming Kafka Data
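
As a sketch of the "Configuring and Creating a SparkSession" topic above, the example below builds a session through `SparkSession.builder()` and reads a setting back. It assumes Spark 2.x on the classpath. The application name, the `local[2]` master, the shuffle partition value, and the `SessionSketch` object are illustrative choices, not values prescribed by the course.

```scala
import org.apache.spark.sql.SparkSession

object SessionSketch {
  // Builds (or reuses) a session with one explicit SQL setting.
  def build(appName: String): SparkSession =
    SparkSession.builder()
      .appName(appName)
      .master("local[2]") // local run with two threads, for illustration only
      .config("spark.sql.shuffle.partitions", "4")
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = build("session-sketch")

    // Runtime configuration can be read back from the session.
    println(spark.conf.get("spark.sql.shuffle.partitions"))

    spark.stop()
  }
}
```

When an application is submitted to a cluster manager such as YARN, the master is usually left out of the code and supplied to `spark-submit` with `--master` instead.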

LABS

  • Spark Job Submission
  • Additional Spark Capabilities
  • Spark Streaming
  • Spark Structured Streaming
  • Spark Structured Streaming with Kafka

Still have questions?

Reach out to our learning advisors for personalized guidance on choosing the right course, group training, or enterprise packages.

What You Get with Microtek Learning

Instructor-Led Excellence

  • Certified Instructor-led Training
  • Top Industry Trainers
  • Official Student Handbooks

Measurable Learning Outcomes

  • Pre- & Post-Training Assessments
  • Practice Tests
  • Exam-Oriented Curriculum

Real-World Skill Building

  • Hands-on Activities & Scenarios
  • Interactive Online Courses
  • Peer Collaboration (Not in self-paced)

Full Support & Perks

  • Exam Scheduling Support *
  • Learn & Earn Program *
  • Support from Certified Experts
  • Gov. & Private Pricing *

Our Clients

For over 10 years, Microtek Learning has helped organizations, leaders, students, and professionals reach their full potential by addressing their challenges and improving their performance.

Actemium
US Dept of Defense
Education Advisory Board
GE Digital
Department of Homeland Security
Pacific Life
MetLife
AIG
Chase
DC Gov
Johnson & Johnson
William Osler Health System
Google

Our Awards

  • Microsoft Learning Partner of the Year
  • Inc. 5000 List of the Fastest-Growing Private Companies in America
  • Top IT Training Companies (Multiple Years)

Why Choose Us?

Professional Team Support

Our expert counseling team provides round-the-clock assistance with the best value offers.

Experienced Trainers

Certified trainers with 5–15 years of real-world industry experience guide your learning.

100% Satisfaction Guarantee

We guarantee satisfaction with top-quality content and instructor delivery.

Real-World Experience

Train with industry projects and curricula aligned to current standards.

Best Price Guarantee

We promise the lowest pricing and best offers in the market.

Guaranteed to Run

All courses are assured to run on scheduled dates via all delivery methods.

Hortonworks Learning Resources

Explore our collection of free resources to boost your Hortonworks learning journey

Hortonworks Expert Blogs

Explore insights from industry experts to stay ahead in tech. Dive into our Expert Blogs now!
