• 100 Queen St W, Brampton, ON L6X 1A4, Canada
  • +1-800-961-0337
For more details, please call us at +1-800-961-0337 or email us at info@microteklearning.com

Course Description

This course is designed for professionals and system administrators who want to learn, or are responsible for, designing, installing, configuring, and managing Hortonworks Data Platform (HDP). It provides in-depth knowledge of, and hands-on experience with, Apache Ambari as the platform for operational management of HDP. No prior knowledge of or experience with Hadoop is required.

Audience Profile

The course is aimed at administrators and system operators who work with Linux and are responsible for the installation, configuration, and management of an HDP cluster.

Course Objectives

  • Introducing Big Data, Hadoop and Hortonworks Data Platform
  • Management of HDFS Storage, Rack Awareness, Snapshots and Centralized Cache
  • Introducing YARN
  • Achieving High Availability with HDP, Deployment of HDP with Blueprints, and Upgrade Process of HDP

Module 1

  1. Description of Apache Hadoop
  2. Summarizing the Purpose of Hortonworks Data Platform Software
  3. Hadoop Cluster Management
  4. Description of Apache Ambari
  5. Identification of Hadoop Cluster Deployment Options
  6. Planning Deployment of a Hadoop Cluster
  7. Performing an Interactive HDP Installation Using Apache Ambari
  8. Apache Ambari Installation
  9. Differences Between Hadoop Users, Service Owners, and Apache Ambari Users
  10. Management of Users, Groups, and Associated Permissions
  11. Identification of Hadoop Configuration Files
  12. Summarizing the Operations of the Web UI Tool
  13. Management of Hadoop Service Configuration Properties Using the Apache Ambari Web UI
  14. Hadoop Distributed File System (HDFS)
  15. Performing HDFS Shell Operations
  16. Use of WebHDFS
  17. Data Protection Using HDFS Access Control Lists (ACLs)

LABS

  1. Setting Up the Environment
  2. HDP Installation
  3. Ambari Users and Groups Management
  4. Hadoop Services Management
  5. Use of HDFS Storage
  6. Use of WebHDFS
  7. Use of HDFS Access Control Lists

Module 2

  1. Architecture and Operation of HDFS
  2. Management of HDFS Using Ambari Web, NameNode, and DataNode UIs
  3. HDFS Management Using Command-Line Tools
  4. Understanding the Purpose and Benefits of Rack Awareness
  5. Rack Awareness Configuration
  6. Understanding Hadoop Backup Considerations
  7. Enabling and Managing HDFS Snapshots
  8. Use of DistCp for Copying Data
  9. Using Snapshots and DistCp Together
  10. Identification of the Purpose and Operation of Heterogeneous HDFS Storage
  11. Purpose and Operation of HDFS Centralized Caching
  12. HDFS Centralized Cache Configuration
  13. Defining and Managing Cache Pools and Cache Directives
  14. Identification of HDFS NFS Gateway Use Cases
  15. Recalling the Architecture and Operation of the HDFS NFS Gateway
  16. Installation and Configuration of the HDFS NFS Gateway
  17. Configuration of a Client for an HDFS NFS Gateway

LABS

  1. HDFS Storage Management
  2. HDFS Quota Management
  3. Rack Awareness Configuration
  4. HDFS Snapshot Management
  5. DistCp Usage
  6. Configuration of HDFS Storage Policies
  7. Configuration of HDFS Centralized Cache
  8. Configuration of NFS Gateway

Module 3

  1. Description of YARN Resource Management
  2. YARN Architecture and Operation
  3. Identification and Use of YARN Management Options
  4. Understanding the YARN Response to Component Failure
  5. Basics of Running Simple YARN Applications
  6. Understanding the Purpose and Operation of the YARN Capacity Scheduler
  7. Configuration and Management of YARN Queues
  8. Controlling Access to YARN Queues
  9. Purpose and Operation of YARN Node Labels
  10. Process of Creating Node Labels
  11. Process of Adding, Modifying, and Removing Node Labels
  12. Configuration of Queues to Access Node Label Resources
  13. Running Test Jobs to Confirm Node Label Behavior

LABS

  1. YARN Management Using Ambari
  2. YARN Management Using the CLI
  3. Running Sample YARN Applications
  4. Setting Up the Capacity Scheduler
  5. YARN Containers and Queues Management
  6. YARN ACLs and User Limits Management
  7. Working with YARN Node Labels

Module 4

  1. Understanding the Purpose of NameNode HA
  2. Configuration of NameNode HA Using Ambari
  3. Purpose of ResourceManager HA
  4. Configuration of ResourceManager HA Using Apache Ambari
  5. Identifying Reasons for Adding, Replacing, and Deleting Worker Nodes
  6. Demonstration of Adding a Worker Node
  7. Configuring and Running the HDFS Balancer
  8. Decommissioning and Re-commissioning a Worker Node
  9. Process of Moving a Master Component
  10. Purpose and Operation of Apache Ambari Metrics
  11. Basics and Benefits of the Apache Ambari Dashboard
  12. Purpose and Benefits of Apache Ambari Blueprints
  13. Deployment of a Cluster Using Ambari Blueprints
  14. Defining an HDP Stack and Interpreting Its Version Number
  15. Viewing the Current Stack and Identifying Compatible Apache Ambari Software Versions
  16. Upgrade Types and Methods Available in HDP
  17. Description of the Upgrade Process, Restrictions, and Pre-Upgrade Checklist
  18. Using the Apache Ambari Web UI for an Upgrade

LABS

  1. Configuration of NameNode HA
  2. Configuration of ResourceManager HA
  3. Adding, Decommissioning, and Re-commissioning a Worker Node
  4. Configuration of Ambari Alerts
  5. Deployment of an HDP Cluster Using Ambari Blueprints
  6. Performing an HDP Upgrade - Express



Prerequisites

  • Basic knowledge of SQL statements is an advantage but is not required.
  • Students must have basic experience working in a Linux environment and be familiar with standard Linux system commands. They should be able to read and execute basic Linux shell scripts.
  • Students should have some operational experience with data center practices such as change management, release management, and incident or problem management.