ETL for the Data Warehouse:
a Template-Driven Approach

by Michael Schmitz

Description

Extract, transform, and load (ETL) process development typically accounts for more than half of the work on a Data Warehouse project. Although complex and challenging, a rigorous ETL process ensures data quality and currency, and with them the credibility and usefulness of the Data Warehouse.
The good news is that a standardized approach, combined with proven techniques and templates, can substantially reduce the effort required and help ensure data quality, scalability, and performance.
This class gives a broad overview of ETL processing for the Data Warehouse and examines in depth the issues and considerations involved. The class looks at the increasing need for Real-Time data feeds to the Warehouse and discusses the various methods for meeting that need. It specifically presents and teaches a template-driven approach that speeds development and promotes completeness.
These templates are demonstrated with working Informatica/Oracle code, but they can be, and have been, adapted for other ETL tools and database platforms. They are also applicable to hand-coded efforts.

What you will learn

  • Gain a thorough understanding of the critical ETL development issues
  • Understand current mainstream ETL architectural approaches
  • Learn in-depth techniques for addressing common development issues, including how to develop near Real-Time data feeds
  • Be introduced to using standardized maintenance templates and learn how to apply them to your particular environment
  • Take back working code to jump-start your ETL development efforts

Main Topics

  • The ETL Process Overview
  • ETL Process Detail
  • ETL Architectures
  • ETL Development Frameworks
  • ETL Tool-Based
  • Processing Options
  • ETL Tool Performance Issues
  • A Near Real-Time Case Study
  • The BK-PRO Template-Driven Approach
  • Template Case Study