Get introduced to big data and the challenges associated with handling it. Understand the different ways to manage the big data problem and where Hadoop fits among them.
Introduction to Hadoop Framework
Master the Hadoop framework, Hadoop federation, and the features of Hadoop that make it an unparalleled framework for processing big data.
Understand MapReduce through detailed discussions of the MapReduce phases and of data processing for various file formats, along with real-world examples.
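As a rough illustration of the phases mentioned above, the following is a minimal in-memory sketch of a word count split into map, shuffle, and reduce steps. It does not use the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from a file.
    sample = ["big data needs Hadoop", "Hadoop processes big data"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps run on different nodes and the framework performs the shuffle; here everything runs in one process purely to make the data flow visible.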
Understand Apache Pig by contrasting it with MapReduce. Sift through the various data types and explore data processing techniques using Pig. Learn to deal with exceptional scenarios by writing UDFs and by optimizing Pig queries.
Get introduced to Hive and its similarity to SQL. Understand the architecture of Hive, create databases and tables, and perform various operations using Hive.
Learn about NoSQL databases and the differences between HBase and relational databases. Explore the features of NoSQL databases, the CAP theorem, and the HBase architecture. Understand the HBase data model and perform various operations on it.
Sqoop and Flume
Use Sqoop to import data from traditional SQL databases, such as Oracle, into Hadoop and to export it back. Master the ingestion of streaming data into Hadoop using Apache Flume.
Learn about Oozie and use it to build workflows that schedule Hadoop jobs.