Hadoop Administration: Best Online Training

Goals & Objectives:

By the end of the course, you will:

  • Get a clear understanding of Apache Hadoop, HDFS, Hadoop Clusters, and Hadoop Administration
  • Gain insight into Hadoop 2.0, NameNode High Availability, HDFS Federation, YARN, and MapReduce v2
  • Plan and deploy a Hadoop Cluster
  • Load data and run applications
  • Configure and performance-tune a Hadoop Cluster
  • Manage, maintain, monitor, and troubleshoot a Hadoop Cluster
  • Secure a deployment and understand backup and recovery
  • Understand Oozie, HCatalog/Hive, and HBase Administration

Curriculum:

Hadoop Cluster Administration

Learning Objectives – In this module, you will understand what Big Data and Apache Hadoop are, how Hadoop solves Big Data problems, the Hadoop Cluster architecture, the MapReduce framework, Hadoop data loading techniques, and the role of a Hadoop Cluster Administrator.

Topics – Introduction to Big Data, Hadoop Architecture, MapReduce Framework, A typical Hadoop Cluster, Data Loading into HDFS, Hadoop Cluster Administrator: Roles and Responsibilities
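
For a first feel of loading data into HDFS, the commands below sketch how files are typically copied in from a shell, assuming Hadoop 2.x command syntax; the directory and file names are examples only.

    # Create a target directory in HDFS (path is an example)
    hdfs dfs -mkdir -p /user/hadoop/input

    # Copy a local file from the Linux file system into HDFS
    hdfs dfs -put /tmp/sales.csv /user/hadoop/input/

    # List the directory and inspect how the file was split into blocks and replicated
    hdfs dfs -ls /user/hadoop/input
    hdfs fsck /user/hadoop/input/sales.csv -files -blocks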

Hadoop Architecture and Cluster setup

Learning Objectives – After this module, you will understand the multiple Hadoop server roles such as NameNode and DataNode, and MapReduce data processing. You will also understand Hadoop 1.0 cluster setup and configuration, setting up Hadoop clients on Hadoop 1.0, and the important Hadoop configuration files and parameters.

Topics – Hadoop server roles and their usage, Rack Awareness, Anatomy of Write and Read, Replication Pipeline, Data Processing, Hadoop Installation and Initial Configuration, Deploying Hadoop in pseudo-distributed mode, deploying a multi-node Hadoop cluster, Installing Hadoop Clients
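
As an illustration of the configuration files and deployment modes covered in this module, the sketch below outlines a minimal Hadoop 1.x pseudo-distributed setup; the property values, paths, and port are examples only.

    # $HADOOP_HOME/conf/core-site.xml (example value)
    #   fs.default.name = hdfs://localhost:9000

    # $HADOOP_HOME/conf/hdfs-site.xml (example value)
    #   dfs.replication = 1

    # Format the NameNode metadata directory, start HDFS, and verify the daemons
    hadoop namenode -format
    start-dfs.sh
    jps    # should list NameNode, SecondaryNameNode and DataNode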

Hadoop Cluster: Planning and Managing

Learning Objectives – In this module, you will understand how to plan and manage a Hadoop Cluster, monitor and troubleshoot the cluster, analyze logs, and perform auditing. You will also understand how to schedule and execute MapReduce jobs and the different schedulers available.

Topics – Planning the Hadoop Cluster, Cluster Size, Hardware and Software considerations, Managing and Scheduling Jobs, types of schedulers in Hadoop, Configuring the schedulers and running MapReduce jobs, Cluster Monitoring and Troubleshooting.
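
To make "configuring the schedulers and running MapReduce jobs" concrete, the sketch below shows a Hadoop 1.x style job submission and inspection; the examples jar name and HDFS paths are placeholders that vary by installation.

    # Example Fair Scheduler setting for the JobTracker (goes in mapred-site.xml):
    #   mapred.jobtracker.taskScheduler = org.apache.hadoop.mapred.FairScheduler

    # Submit the bundled WordCount example job
    hadoop jar $HADOOP_HOME/hadoop-examples-*.jar wordcount /user/hadoop/input /user/hadoop/output

    # List running jobs and check the status of a specific one
    hadoop job -list
    hadoop job -status <job_id>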

Backup, Recovery and Maintenance

Learning Objectives – In this module, you will understand day-to-day cluster administration tasks such as adding and removing DataNodes, NameNode recovery, configuring backup and recovery in Hadoop, diagnosing node failures in the cluster, and Hadoop upgrades.

Topics – Configuring rack awareness, Setting up Hadoop backups, Whitelisting and blacklisting DataNodes in a cluster, Setting up quotas, Upgrading a Hadoop cluster, Copying data across clusters using distcp, Diagnostics and Recovery, Cluster Maintenance.
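
A few of these maintenance tasks map directly to administration commands; the sketch below assumes Hadoop 2.x command syntax, and the host names, paths, and quota sizes are examples.

    # Tell the NameNode to re-read its include/exclude files after
    # whitelisting or blacklisting DataNodes
    hdfs dfsadmin -refreshNodes

    # Set a name quota (max files/directories) and a space quota on a directory
    hdfs dfsadmin -setQuota 100000 /user/projectA
    hdfs dfsadmin -setSpaceQuota 1t /user/projectA

    # Copy data between two clusters with distcp (cluster URIs are examples)
    hadoop distcp hdfs://nn1:8020/data hdfs://nn2:8020/backup/data

    # Check file system health before and after maintenance
    hdfs fsck / -files -blocks -locations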

Hadoop 2.0 and High Availability

Learning Objectives – In this module, you will understand Secondary NameNode setup and checkpointing, Hadoop 2.0 new features, HDFS High Availability, the YARN framework, MRv2, and Hadoop 2.0 cluster setup in pseudo-distributed and distributed modes.

Topics – Configuring Secondary NameNode, Hadoop 2.0, YARN framework, MRv2, Hadoop 2.0 Cluster setup, Deploying Hadoop 2.0 in pseudo-distributed mode, deploying a multi-node Hadoop 2.0 cluster.
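
As an orientation for this module, the sketch below shows the checkpoint-related settings the Secondary NameNode uses and a quick way to verify the daemons on a Hadoop 2.x cluster; all values are examples.

    # hdfs-site.xml (example values): how often the Secondary NameNode
    # merges the fsimage with the edit log
    #   dfs.namenode.checkpoint.period = 3600      (seconds between checkpoints)
    #   dfs.namenode.checkpoint.txns   = 1000000   (or after this many transactions)

    # Start HDFS and YARN daemons on Hadoop 2.x
    start-dfs.sh
    start-yarn.sh

    # Verify the daemons and the overall cluster state
    jps
    hdfs dfsadmin -report
    yarn node -list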

Advanced Topics: QJM, HDFS Federation and Security

Learning Objectives – In this module, you will understand the basics of Hadoop security, managing security with Kerberos, HDFS Federation setup, and log management. You will also understand HDFS High Availability using the Quorum Journal Manager (QJM).

Topics – Configuring HDFS Federation, Basics of Hadoop Platform Security, Securing the Platform, Configuring Kerberos.
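
For orientation, the fragment below sketches the QJM shared-edits setting used for NameNode High Availability and the basic Kerberos check an administrator runs on a secured cluster; the nameservice name, JournalNode hosts, and principal are examples.

    # hdfs-site.xml (example values) for NameNode HA with the Quorum Journal Manager
    #   dfs.nameservices              = mycluster
    #   dfs.ha.namenodes.mycluster    = nn1,nn2
    #   dfs.namenode.shared.edits.dir = qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster

    # Check which NameNode is active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # On a Kerberos-secured cluster, obtain and inspect a ticket before using HDFS
    kinit hdfs-admin@EXAMPLE.COM
    klist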

Oozie, HCatalog/Hive and HBase Administration

Learning Objectives – In this module, you will understand setting up the Apache Oozie workflow scheduler for Hadoop jobs, HCatalog/Hive administration, deploying HBase alongside other Hadoop components, loading data into HBase effectively, and writing to and reading from HBase.

Topics – Oozie, HCatalog/Hive Administration, HBase Architecture, HBase setup, HBase and Hive Integration, HBase performance optimization.
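
As a taste of the tools administered in this module, the sketch below submits an Oozie workflow and exercises HBase from its shell; the Oozie server URL, properties file, and table/column names are examples.

    # Submit and monitor an Oozie workflow job
    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
    oozie job -oozie http://oozie-host:11000/oozie -info <workflow_job_id>

    # Basic HBase shell session (commands to run inside 'hbase shell'):
    #   create 'web_logs', 'cf'                          -- create a table with one column family
    #   put 'web_logs', 'row1', 'cf:url', '/index.html'  -- write a cell
    #   get 'web_logs', 'row1'                           -- read it back
    hbase shell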

Project: Hadoop Implementation

Learning Objectives – In this module, you will understand how multiple Hadoop ecosystem components work together in a Hadoop implementation to solve Big Data problems. You will also learn how to plan, design, and deploy a Hadoop Cluster using a typical Real-World Use Case.

Topics – Understanding the Problem, Planning, Designing, and Creating a Hadoop Cluster for a Real-World Use Case, Setting up and Configuring commonly used Hadoop ecosystem components such as Pig and Hive, Configuring Ganglia on the Hadoop cluster and troubleshooting common cluster problems.
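
As an illustration of exercising ecosystem components once the cluster is up, the commands below run a simple Hive query and a Pig script from the shell; the table name, script path, and query are examples.

    # Run an ad-hoc Hive query (table name is an example)
    hive -e "SELECT COUNT(*) FROM web_logs;"

    # Run a Pig script against data already loaded into HDFS
    pig -f /home/hadoop/scripts/clean_logs.pig

    # Quick cluster health checks to correlate with what Ganglia graphs show
    hdfs dfsadmin -report
    yarn application -list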

The practical exercises were useful in offering ‘hands on’ experience. The interactive atmosphere and live examples used for illustration were refreshing.

VENKATESH

Very well organized and conceived. By following the course, I was able to learn and build on the concepts with minimal questions or frustration. It taught me what I was looking to learn, was well organized, and well-paced. I’m already applying what I learned at work.

Priyanka


PRIVATE COURSE
  • Duration: 45 Days
  • 70 students enrolled
  • Live Instructor Led Course
  • Batch Flexibility
  • Customized Course
  • Live Projects
  • Resume Preparation
Drop us a query, register for a free demo, or call us @ 9502434001.