18756 Stone Oak Park Way, Suite 200, San Antonio, TX 78258, USA
100 Queen St W, Brampton, ON L6X 1A4, Canada

HDP Analyst Data Science Training





Course Description

This training provides hands-on practice in data science, covering machine learning and natural language processing. The tools and languages covered include Mahout, Pig, NumPy, the Natural Language Toolkit (NLTK), pandas, Spark MLlib, and SciPy.

  • Identifying use cases for data science.
  • Describing the Hadoop and YARN architecture.
  • Describing the differences between supervised and unsupervised learning.
  • Using Mahout to run a machine learning algorithm on Hadoop.
  • Describing the data science lifecycle.
  • Using Pig to prepare and transform data on Hadoop.
  • Writing a Python script.
  • Describing the options for running Python code.
  • Writing Pig user-defined functions (UDFs) in Python.
  • Using Pig streaming with a Python script on Hadoop.
  • Using machine learning algorithms.
  • Describing use cases for NLP.
  • Using the NLTK.
  • Writing a Spark application in Python.
  • Running machine learning algorithms with Spark MLlib.
Who Should Attend?

This training is suitable for software developers and architects, but its primary audience is data scientists and analysts who want to apply machine learning and data science on Hadoop.


Basic knowledge of at least one programming language is recommended. Knowledge of statistics and mathematics, along with a basic understanding of big data and Hadoop principles, is required.

Course Details
  • Duration: 3 Days
  • Certification: No
  • Enrolled: 1246