This four-day Spark Developer course is for data engineers, analysts, architects, software engineers, IT operations staff, and technical managers interested in a thorough, hands-on overview of Apache Spark.
The course covers the core APIs for using Spark, fundamental mechanisms and basic internals of the framework, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs.
Objectives
After taking this class you will be able to:
Prerequisites
◦ Basic to intermediate Linux knowledge, including the ability to use a text editor such as vi and familiarity with basic commands such as mv, cp, ssh, grep, cd, and useradd
◦ Knowledge of application development principles
◦ Knowledge of functional programming
◦ Knowledge of Scala or Python
◦ Beginner fluency with SQL
Course Overview
Lesson 1 – Introduction to Apache Spark
◦ From existing RDD
◦ From data sources
◦ Use DataFrame operations
◦ Use SQL
◦ Explore data in DataFrames
◦ UDF used with Scala DSL
◦ UDF used with SQL
◦ Create and use user-defined functions