This training covers the Apache Spark distributed computing engine and is suited to developers, technical managers, architects, data analysts, and any learner who wants to use Spark.

📘 Hortonworks 👥 1423 Enrolled ⏱️ 4 Days ⭐ 4.5 | 113 Reviews

Why Microtek Learning?

500+ Courses

10+ Years Experience

95K+ Global Learners

Virtual Instructor-Led Training

$2800

HDP Developer: Apache Spark 2.3

Course Overview

The course provides technical knowledge of Spark's architecture and functionality.

It covers the basic building blocks along with the higher-level constructs that provide a simpler, more capable interface.

The training also helps you gain in-depth knowledge of Spark SQL, DataFrames, and DataSets.

Mode of Training

🏫 Classroom 💻 Live Online 🧪 Blended 👨‍👩‍👧‍👦 Private Group

What you will learn

Acquiring and installing Spark.
Identifying supported data formats.
Utilizing accumulators and broadcast variables.
Creating and configuring a SparkSession.

Who Should Attend This Course?

This training is intended for software developers who want to build in-memory and highly iterative applications within an HDP environment.

Prerequisites

Familiarity with programming principles and solid experience developing software in Scala are recommended.

Previous experience with SQL, HDP, and data streaming is also beneficial.

📘 HDP Developer: Apache Spark 2.3 Outline

Scala Introduction
Working with: Variables, Data Types, and Control Flow
The Scala Interpreter
Collections and their Standard Methods (e.g. map())
Working with: Functions, Methods, and Function Literals
Define the Following as They Relate to Scala: Class, Object, and Case Class
Overview, Motivations, Spark Systems
Spark Ecosystem
Spark vs. Hadoop
Acquiring and Installing Spark
The Spark Shell, SparkContext

LABS

Setting Up the Lab Environment
Starting the Scala Interpreter
A First Look at Spark
A First Look at the Spark Shell

RDD Concepts, Lifecycle, Lazy Evaluation
RDD Partitioning and Transformations
Working with RDDs Including: Creating and Transforming
An Overview of RDDs
SparkSession, Loading/Saving Data, Data Formats
Introducing DataFrames and DataSets
Identify Supported Data Formats
Working with the DataFrame (untyped) Query DSL
SQL-based Queries
Working with the DataSet (typed) API
Mapping and Splitting
DataSets vs. DataFrames vs. RDDs

LABS

RDD Basics
Operations on Multiple RDDs
Data Formats
Spark SQL Basics
DataFrame Transformations
The DataSet Typed API
Splitting Up Data

Working with: Grouping, Reducing, Joining
Shuffling, Narrow vs. Wide Dependencies, and Performance Implications
Exploring the Catalyst Query Optimizer
The Tungsten Optimizer
Discuss Caching, Including: Concepts, Storage Type, Guidelines
Minimizing Shuffling for Increased Performance
Using Broadcast Variables and Accumulators
General Performance Guidelines

LABS

Exploring Group Shuffling
Seeing Catalyst at Work
Seeing Tungsten at Work
Working with Caching, Joins, Shuffles, Broadcasts, Accumulators
Broadcast General Guidelines

Core API, SparkSession.Builder
Configuring and Creating a SparkSession
Building and Running Applications
Application Lifecycle (Driver, Executors, and Tasks)
Cluster Managers (Standalone, YARN, Mesos)
Logging and Debugging
Introduction and Streaming Basics
Spark Streaming (Spark 1.0+)
Structured Streaming (Spark 2+)
Consuming Kafka Data

LABS

Spark Job Submission
Additional Spark Capabilities
Spark Streaming
Spark Structured Streaming
Spark Structured Streaming with Kafka

Still have questions?

Reach out to our learning advisors for personalized guidance on choosing the right course, group training, or enterprise packages.

What You Get with Microtek Learning

Instructor-Led Excellence

✓ Certified Instructor-led Training
✓ Top Industry Trainers
✓ Official Student Handbooks

Measurable Learning Outcomes

✓ Pre- & Post-Training Assessments
✓ Practice Tests
✓ Exam-Oriented Curriculum

Real-World Skill Building

✓ Hands-on Activities & Scenarios
✓ Interactive Online Courses
✓ Peer Collaboration (not available in self-paced courses)

Full Support & Perks

✓ Exam Scheduling Support*
✓ Learn & Earn Program*
✓ Support from Certified Experts
✓ Gov. & Private Pricing*

Our Clients

For over 10 years, Microtek Learning has helped organizations, leaders, students, and professionals reach their full potential. We have led the way by addressing their challenges and improving their performance.