Developing Solutions Using Apache Hadoop


by Tim Seears

Description

This four-day course gives Java programmers the training they need to build enterprise solutions with Apache Hadoop. It combines interactive lectures with extensive hands-on lab exercises. After successfully completing the course, each student receives one free voucher for the Hadoop Certified Developer exam.

Students will work through the following lab exercises using the Hortonworks Data Platform:

  • MapReduce Programming
  • HDFS
  • MapReduce in Operation
  • MapReduce with Combiner
  • MapReduce with Partitioner
  • MapReduce with a Secondary Sort and a Custom Comparator
  • MapReduce with Distributed Cache
  • MapReduce with Data Handling
  • MapReduce with Streaming
  • Pig Introduction
  • Pig Data Operations
  • Pig ETL Features
  • Pig Clustering Solution
  • Hive Introduction
  • Hive Features
  • Combined Hive and Pig Solution
  • HBase
  • HCatalog
  • MapReduce, Pig, Hive and HCatalog in a Combined Solution

What you will learn

On completing the course, students will be able to:

  • Write a MapReduce program using the Hadoop API
  • Use HDFS to load and process data through both the CLI and the API
  • Understand best practices for building, debugging and optimizing Hadoop solutions
  • Use Pig, Hive, HBase and HCatalog effectively

Main Topics

  • MapReduce Code
  • HDFS
  • MapReduce: JobTracker, TaskTracker and Running Jobs
  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • MapReduce Streaming
  • MapReduce Data Handling
  • Pig Introduction
  • Pig Data Model
  • Pig Scripting Language
  • Hive
  • HCatalog
  • HBase
  • Enterprise Integration
  • Future of Hadoop