This training covers the Apache Spark distributed computing engine and is appropriate for developers, technical managers, architects, data analysts, and any learner who wants to use Spark. The course explains Spark's architecture and functionality, from the basic building blocks to the high-level constructs that provide a more capable and simpler interface. It also gives you in-depth knowledge of DataSets, Spark SQL, and DataFrames.
Prerequisites for this training
Familiarity with programming principles and solid experience in software development using Scala are recommended. Previous experience with SQL, HDP, and data streaming is also beneficial.
Who should attend this course?
This training is intended for software developers who want to develop in-memory applications within an HDP environment.
For dates, times, and location customization of this course, get in touch with us.
You can also speak with a learning consultant by calling 800-961-0337.
What you will learn
Acquiring and installing Spark
Identifying supported data formats
Using accumulators and broadcast variables
Configuring and creating a SparkSession
With Microtek Learning, you’ll receive:
Certified Instructor-led training
Industry Best Trainers
Official Training Course Student Handbook
Pre- and post-course assessments/evaluations
Collaboration with classmates (not available for a self-paced course)
Real-world knowledge activities and scenarios
Exam scheduling support*
Learn and earn program*
Knowledge-acquisition and exam-oriented training
Interactive online course
Support from an approved expert
Government and private pricing*
* For more details call: +1-800-961-0337 or Email: firstname.lastname@example.org
For many years, Microtek Learning has helped organizations, leaders, and professionals reach their maximum performance by addressing the challenges they face.
- 300+ enterprise clients
- 100,000+ professionals trained
- Serving 70 of the Fortune 100
- 96% of our clients would recommend us
Working with: Variables, Data Types, and Control Flow
The Scala Interpreter
Collections and their Standard Methods (e.g. map())
Working with: Functions, Methods, and Function Literals
Define the Following as They Relate to Scala: Class, Object, and Case Class
Overview, Motivations, Spark Systems
Spark vs. Hadoop
Acquiring and Installing Spark
The Spark Shell, SparkContext
Setting Up the Lab Environment
Starting the Scala Interpreter
A First Look at Spark
A First Look at the Spark Shell
RDD Concepts, Lifecycle, Lazy Evaluation
RDD Partitioning and Transformations
Working with RDDs Including: Creating and Transforming
An Overview of RDDs
SparkSession, Loading/Saving Data, Data Formats
Introducing DataFrames and DataSets
Identify Supported Data Formats
Working with the DataFrame (untyped) Query DSL
Working with the DataSet (typed) API
Mapping and Splitting
DataSets vs. DataFrames vs. RDDs
Operations on Multiple RDDs
Spark SQL Basics
The DataSet Typed API
Splitting Up Data
Working with: Grouping, Reducing, Joining
Shuffling, Narrow vs. Wide Dependencies, and Performance Implications
Exploring the Catalyst Query Optimizer
The Tungsten Optimizer
Discuss Caching, Including: Concepts, Storage Type, Guidelines
Minimizing Shuffling for Increased Performance
Using Broadcast Variables and Accumulators
General Performance Guidelines
Exploring Group Shuffling
Seeing Catalyst at Work
Seeing Tungsten at Work
Working with Caching, Joins, Shuffles, Broadcasts, Accumulators
Broadcast General Guidelines
Core API, SparkSession.Builder
Configuring and Creating a SparkSession
Building and Running Applications
Application Lifecycle (Driver, Executors, and Tasks)
Cluster Managers (Standalone, YARN, Mesos)
Logging and Debugging
Introduction and Streaming Basics
Spark Streaming (Spark 1.0+)
Structured Streaming (Spark 2+)
Consuming Kafka Data
Spark Job Submission
Additional Spark Capabilities
Spark Structured Streaming
Spark Structured Streaming with Kafka