This training provides practice in data science, covering machine learning and natural language processing. The tools and languages covered include Mahout, Pig, NumPy, the Natural Language Toolkit (NLTK), pandas, Spark MLlib, and SciPy.
Identifying use cases for data science.
Describing the architecture of Hadoop and YARN.
Describing the differences between supervised and unsupervised learning.
Using Mahout to run a machine learning algorithm on Hadoop.
Describing the data science life cycle.
Using Pig to transform and prepare data on Hadoop.
Writing a Python script.
Describing the options for running Python code.
Writing a Pig User-Defined Function (UDF) in Python.
Using Pig streaming with a Python script on Hadoop.
Using machine learning algorithms.
Describing use cases for natural language processing (NLP).
Using the Natural Language Toolkit (NLTK).
Writing a Spark application in Python.
Running machine learning algorithms using Spark MLlib.
Who Should Attend?
This training is intended for software developers and architects. Its primary audience, however, is data scientists and analysts who want to apply data science and machine learning on Hadoop.
Basic knowledge of at least one programming language is recommended. Knowledge of statistics/mathematics and a basic understanding of big data and Hadoop principles are required.
Related Training and Certification