Course Objective Summary

During this course, you will learn:

• Introduction to Big Data and Hadoop
• Hadoop ecosystem concepts
• Hadoop MapReduce concepts and features
• Developing MapReduce applications
• Pig concepts
• Hive concepts
• Oozie workflow concepts
• Flume concepts
• Hue concepts
• HBase concepts
• Real-life use cases

VirtualBox/VMware

• Basics
• Installations
• Backups
• Snapshots

• Basics
• Installations
• Commands


• Why Hadoop?
• Scaling
• Distributed Framework
• Hadoop vs. RDBMS
• Brief history of Hadoop

Hadoop Setup

• Pseudo mode
• Cluster mode
• IPv6
• SSH
• Installation of Java and Hadoop
• Configuration of Hadoop
• Hadoop processes (NN, SNN, JT, DN, TT)
• Temporary directory
• UI
• Common errors when running a Hadoop cluster, and their solutions

HDFS: Hadoop Distributed File System

• HDFS Design and Architecture
• HDFS Concepts
• Interacting with HDFS using the command line
• Interacting with HDFS using Java APIs
• Dataflow
• Blocks
• Replica

Hadoop Processes

• NameNode
• Secondary NameNode
• JobTracker
• TaskTracker
• DataNode

MapReduce

• Developing a MapReduce application
• Phases in the MapReduce framework
• MapReduce input and output formats
• Advanced Concepts
• Sample Applications
• Combiner

Joining Datasets in MapReduce Jobs

• Map-side join
• Reduce-side join

MapReduce – Customization

• Custom InputFormat class
• Hash partitioner
• Custom partitioner
• Sorting techniques
• Custom OutputFormat class

Hadoop Programming Languages


• Introduction
• Installation and Configuration
• Interacting with HDFS using Hive
• MapReduce programs through Hive
• Hive commands
• Loading, filtering, grouping, …
• Data types, operators, …
• Joins, groups, …
• Sample programs in Hive


• Basics
• Installation and Configuration
• Commands…


The Motivation for Hadoop

• Problems with traditional large-scale systems
• Requirements for a new approach

Hadoop: Basic Concepts

• An Overview of Hadoop
• The Hadoop Distributed File System
• Hands-On Exercise
• How MapReduce Works
• Hands-On Exercise
• Anatomy of a Hadoop Cluster
• Other Hadoop Ecosystem Components

Writing a MapReduce Program

• The MapReduce Flow
• Examining a Sample MapReduce Program
• Basic MapReduce API Concepts
• The Driver Code
• The Mapper
• The Reducer
• Hadoop’s Streaming API
• Using Eclipse for Rapid Development
• Hands-On Exercise
• The New MapReduce API

Common MapReduce Algorithms

• Sorting and Searching
• Indexing
• Machine Learning With Mahout
• Term Frequency – Inverse Document Frequency
• Word Co-Occurrence
• Hands-On Exercise

Pig Concepts

• Data loading in Pig
• Data extraction in Pig
• Data transformation in Pig
• Hands-On Exercise on Pig

Hive Concepts

• Hive Query Language
• Alter and delete in Hive
• Partitions in Hive
• Indexing
• Joins in Hive
• Unions in Hive
• Industry-specific configuration of Hive parameters
• Authentication and authorization
• Statistics with Hive
• Archiving in Hive
• Hands-On Exercise

Working with Sqoop

• Introduction
• Import data
• Export data
• Sqoop syntax
• Database connections
• Hands-On Exercise

Working with Flume

Duration: 02 hours

• Introduction
• Configuration and setup
• Flume sink with example
• Channel
• Flume source with example
• Complex Flume architecture
• Oozie concepts
• Impala concepts
• Hue concepts

Reporting Tool:

Tableau Software
1. Tableau Fundamentals
2. Tableau Analytics
3. Visual Analytics
4. Hands-On Exercise

