Apache Hadoop Masterclass

by Tim Seears

Description

This one-day course is designed to help both IT professionals and decision-makers understand the concepts and benefits of Apache Hadoop and how it can help them meet business goals.
You will gain a solid understanding of the Hadoop technology stack, including MapReduce, HDFS, Hive, Pig, and HBase, along with an initial introduction to Mahout and other common utilities.

What you will learn

  • The essential components of a Hadoop-based Data Management solution
  • Pros and cons of implementing Hadoop
  • How Hadoop fits into your existing environment and architecture
  • The differences between various Hadoop distributions

Main Topics

  • Why Hadoop?
    o History & background
    o Real-world use cases and case studies
  • The Hadoop Platform
    o Introduction to MapReduce and the Hadoop Distributed File System (HDFS)
    o Data warehousing with Hive
    o Parallel processing with Pig
    o Data mining with Mahout
    o Data storage with HBase
    o Common utilities - Sqoop, Flume, Hue, Scribe, ZooKeeper, HCatalog
    o Hadoop distributions - Apache Software Foundation, Cloudera, Hortonworks, MapR, IBM
  • The future of Hadoop
    o YARN - Next-generation MapReduce
    o Other programming paradigms on Hadoop