This training is recommended for developers who need to create applications that analyze Big Data stored in Apache Hadoop using Hive and Pig. Topics include HDFS, data ingestion and workflow definition, and using Pig and Hive to perform data analytics on Big Data. The course also includes an introduction to Spark Core and Spark SQL.
Prerequisites for this training
The recommended prerequisite for this course is familiarity with software development and programming principles. No prior knowledge of Hadoop is required.
Who should attend this course?
This training is intended for software developers who want to learn how to develop applications for Hadoop.
What you will learn
Describing YARN and use cases for Hadoop.
Describing Hadoop frameworks and ecosystem tools.
Describing the HDFS architecture.
Using the Hadoop client to input data into HDFS.
Transferring data between Hadoop and a relational database.
Explaining the MapReduce and YARN architectures.
Running a MapReduce job on YARN.
Using Pig to explore and transform data in HDFS.
Understanding how Hive tables are defined and implemented.
Using Hive windowing functions.
Using Hive to explore and analyze datasets.
Explaining and using the various Hive file formats.
Creating and populating a Hive table that uses the ORC file format.
Using Hive to run SQL-like queries for data analysis.
Using Hive to join datasets with a variety of techniques.
Writing efficient Hive queries.
Performing data analytics using the DataFu Pig library.
Explaining the purpose of HCatalog and using it.
Defining and scheduling an Oozie workflow.
Presenting the Spark ecosystem and its high-level architecture.
Exploring Spark SQL and the DataFrame API.
With Microtek Learning, you’ll receive:
Certified Instructor-led training
Industry Best Trainers
Official Training Course Student Handbook
Pre and Post assessments/evaluations
Collaboration with classmates (not available for a self-paced course)
Real-world knowledge activities and scenarios
Exam scheduling support*
Learn and earn program*
Knowledge-acquisition and exam-oriented training
Interactive online course.
Support from an approved expert
For Government and Private pricing*
For many years, Microtek Learning has been helping organizations, leaders, and professionals reach their maximum performance by addressing the challenges they face.
- 300+ enterprise clients
- 100,000+ professionals trained
- Serving 70 of the Fortune 100
- 96% of our clients would recommend us
The Hadoop Distributed File System
Ingesting Data into HDFS
The MapReduce Framework
Starting an HDP Cluster
Demonstration: Understanding Block Storage
Using HDFS Commands
Importing RDBMS Data into HDFS
Exporting HDFS Data to an RDBMS
Importing Log Data into HDFS Using Flume
Demonstration: Understanding MapReduce
Running a MapReduce Job
Introduction to Apache Pig
Advanced Apache Pig Programming
Demonstration: Understanding Apache Pig
Getting Started with Apache Pig
Exploring Data with Apache Pig
Splitting a Dataset
Joining Datasets with Apache Pig
Preparing Data for Apache Hive
Demonstration: Computing Page Rank
Analyzing Clickstream Data
Analyzing Stock Market Data Using Quantiles
Apache Hive Programming
Advanced Apache Hive Programming
Understanding Hive Tables
Understanding Partition and Skew
Analyzing Big Data with Apache Hive
Demonstration: Computing NGrams
Joining Datasets in Apache Hive
Computing NGrams of Emails in Avro Format
Using HCatalog with Apache Pig
Advanced Apache Hive Programming (Continued)
Hadoop 2 and YARN
Introduction to Spark Core and Spark SQL
Defining Workflow with Oozie