Why Attend
This class provides an overview of all the factors necessary for effectively designing the back room ETL system of your DW/BI environment. It tries to guarantee that critical processes within the ETL system are not overlooked. Even if you don’t have an immediate qualified need for every ETL subsystem on our list, it is likely that you will over time. By the end of this course, you will understand how your data warehouse ETL system can be built to anticipate these potential requirements.
This is not a microscopic code-oriented implementation class; it is a vendor-agnostic architecture class for the designer who must keep a broad perspective. The course is organized around the 34 necessary ETL subsystems which are discussed in detail throughout the course.
Bringing Kimball University onsite enables everyone on the team to get on the same page with a common vocabulary and understanding of core techniques; the result is more efficient education with lower travels cost and lost productivity, plus less downstream “tire spinning” within the team.
Who Should Attend
This course is designed for those responsible for building the back room ETL system of a data warehouse environment, including ETL architects, ETL designers and developers, and data warehouse operational staff.
Since dimensional models are the ultimate ETL deliverables, some familiarity with the basic principles of dimensional modeling is necessary. Students can gain this knowledge by reading the following Data Management Review articles found at www.kimballgroup.com.
- Resist the Urge to Start Coding (Nov 2007)
- Set Your Boundaries (Dec 2007)
- Data Wrangling (Jan 2008)
- Dimensional Perspectives (Feb 2008)
Instructors
Bob Becker, co-instructor of Kimball University’s public ETL Architecture in Depth course
Course Overview
Day 1
- Surrounding the ETL Requirements
- 34 ETL Subsystems: Data Profiling through Delivering Dimension Tables
Day 2
- Continuation of 34 ETL Subsystems: Delivering Fact Tables through Operations
- Modifying your ETL Architecture for Real Time Data Warehousing
Course Details
Day 1
Surrounding the Requirements
- Business needs
- Data profiling
- ETL tool versus coding
Introduction to 34 ETL Subsystems
- Subsystem 1 – Data profiling
- Subsystem 2 – Change data capture
- Subsystem 3 – Extract
- Extract window
- Immediate transformations
- Extract staging table designs
- Retention
- Subsystems 4, 5 and 6 – Cleaning
- Data quality architecture
- Data quality screens
- Error event fact table
- Audit dimension
- Compliance tracking
- Subsystem 28 – Sorting
- Subsystem 7 – De-duplication and survivorship
- Subsystem 8 – Conforming
- Conformed dimensions
- Mapping incompatible structures
- Subsystems 9, 10, 11, 12 and 15 – Delivering dimension tables
- Time variance designs
- Surrogate key generator
- Bridge tables
- Hierarchies
- Special dimensions
Day 2
Continuation of the 34 ETL Subsystems
- Subsystems 13, 14 and 16 – Delivering fact tables
- Fact table builders for transaction, periodic, and accumulating grains
- Surrogate key pipeline
- Graceful extensibility
- Late arriving data
- Subsystems 17 and 18 – Dimension manager and fact provider
- Responsibilities and procedures
- Real time complexities
- Distributed, federated warehouses
- Delivering remote dimensions, attributes, and facts
- Subsystems 19 and 20 – Delivering aggregates
- Aggregations
- Aggregate navigation
- OLAP cubes
- Subsystem 21 – Data propagation manager
- Feeding data mining
- Presentation layer extracts
- 3rd party flat files
- Subsystem 22 – Job scheduler
- Subsystems 23 and 24 – Backup and recovery
- Subsystems 25 and 26 – Version control
- Version control
- Migration and testing
- Subsystems 27, 29 and 30 – Managing ETL workflow
- Workflow monitor
- Job scheduler
- Lineage and dependency tracking
- Problem escalation system
- Subsystems 31, 32, 33 and 34 – Development and operations
- Parallel processing and pipelining
- Security
- Compliance
- Metadata
Modifying the ETL Architecture for Real Time Data Warehousing
- Hot partition
- Streaming versus batch ETL
Why Attend
Excellence in dimensional modeling remains the keystone of a well-designed data warehouse presentation area, regardless of whether you’ve adopted a Kimball, Corporate Information Factory (CIF), or hybrid architecture.
The world “Kimball” is synonymous with dimensional modeling. Ralph Kimball established an extensive portfolio of dimensional techniques and vocabulary, including conformed dimensions, slowly changing dimensions, junk dimensions, mini-dimensions, bridge tables, periodic and accumulating snapshot fact tables, and the list goes on.
In this course, you will learn practical dimensional modeling techniques covering fundamental to advanced patterns and best practices. Concepts are illustrated through real-world industry scenarios, conveyed through a combination of lectures, class exercises, small group workshops, and individual problem solving.
Bringing Kimball University onsite enables everyone on the team to get on the same page with a common vocabulary and understanding of core techniques. The result is more efficient education with lower travels cost and lost productivity, plus less downstream “tire spinning” within the team.
Who Should Attend
This course is primarily intended for team members who’ve had prior exposure to dimensional modeling. The first day is appropriate for anyone on the team, including project managers, data warehouse architects, data modelers, database administrators, business analysts, and ETL or BI application developers and system designers. The pace picks up on the second day when more advanced concepts are discussed.
Instructors
Margy Ross, co-author of The Data Warehouse Toolkit, 2nd Edition
Course Overview
Day 1
- Introductions
- Dimensional Modeling Fundamentals
- Retail Sales “Basics” Case Study
- Inventory “Beyond the 1st Business Process” Case Study
- Slowly Changing Dimensions
- Order Management Design Workshop
Day 2
- Billing Design Review Exercise
- More on Dimension Tables
- Transportation “Design Enhancement” Case Study
- More on Fact Tables
- Dimensional Modeling Process
- Client-Specific Workshop
Course Details
Day 1
Introductions
- Course agenda and assumptions
Dimensional Modeling Fundamentals
- Role of dimensional modeling in the Kimball, Corporate Information Factory (CIF), and hybrid architectures
- Fact and dimension table characteristics
- Benefits of dimensional modeling
Retail Sales “Basics” Case Study
- 4-step process for designing dimensional models
- Fact table granularity
- Transaction fact tables
- Degenerate dimensions
- Denormalized dimension table hierarchies
- Dealing with nulls
- Surrogate keys for dimensions
- Centipede fact tables with too many dimensions
- Star versus snowflake schemas
- Factless fact tables
Inventory “Beyond the 1st Business Process” Case Study
- Implications of business processes on data architecture
- Periodic snapshot fact tables
- Semi-additive facts
- Conformed dimensions – identical and shrunken roll-ups
- Enterprise Data Warehouse Bus Architecture and matrix for master data and integration
- Exercise: Translate business requirements into DW bus matrix
Slowly Changing Dimensions
- Basic Type 1, 2 and 3 techniques
Order Management Design Workshop
- Drilling across fact tables
- Consolidated cross-process fact tables
- Dimension table role-playing
- Complications with operational header/line data
- Simultaneous facts and dimensions
- Junk dimensions for miscellaneous transaction indicators
- Accumulating snapshot fact tables
Day 2
Billing Design Review Exercise
- Common design flaws and mistakes to avoid
- Checklist for conducting design reviews
More on Dimension Tables
- Series of predictable dimension attribute changes
- Advanced techniques to deliver current and point-in-time attribute values
- Mini-dimensions for large, rapidly changing dimensions
- Multiple mini-dimensions and outriggers
- Modeling many-valued dimension attributes with bridge tables – correctly weighted versus “impact” reports
- Alternatives to bridge tables for handling many-valued dimension attributes
- Slightly ragged dimension hierarchies
- Bridge tables to drill up/down variable-depth ragged hierarchies
- Date and time-of-day dimension considerations
- Freeform text comments
- Generic abstract dimensions
- Outrigger dimensions
- Time series of dimension tags
- Super-types and sub-types for heterogeneous products
- Audit dimensions
Transportation “Design Enhancement” Case Study
- Exercise: Schema enhancements for changing requirements
- Multiple time zones
- Design trade-offs
More on Fact Tables
- Fact table surrogate keys
- Allocated facts at different levels of detail
- Multiple currencies or units of measure
- Fact table effective and expiration dates
- Complex, unpredictable accumulating snapshots
- Value banding facts
- Fact table normalization
- Detailed implementation bus matrix
Dimensional Modeling Process
- Process flow, tasks and deliverables
Client-Specific Workshop
- Development of client-centric preliminary data warehouse bus matrix or discussion of client-specific design questions and issues
Why Attend
Excellence in dimensional modeling remains the keystone of a well-designed data warehouse presentation area, regardless of whether you’ve adopted a Kimball, Corporate Information Factory (CIF), or hybrid architecture.
The world “Kimball” is synonymous with dimensional modeling. Ralph Kimball established an extensive portfolio of dimensional techniques and vocabulary, including conformed dimensions, slowly changing dimensions, junk dimensions, mini-dimensions, bridge tables, periodic and accumulating snapshot fact tables, and the list goes on.
In this course, you will learn practical dimensional modeling techniques covering fundamental patterns and best practices. Concepts are illustrated through real-world industry scenarios, conveyed through a combination of lectures, class exercises, small group workshops, and individual problem solving. While this course doesn’t cover every dimensional modeling pattern, those most commonly observed across industries are thoroughly discussed. In addition, you will receive an overview of the Kimball Lifecycle Approach for designing and building a DW/BI system.
Bringing Kimball University onsite enables everyone on the team to get on the same page with a common vocabulary and understanding of core techniques. The result is more efficient education with lower travels cost and lost productivity, plus less downstream “tire spinning” within the team.
Who Should Attend
This course is appropriate for anyone interested in dimensional modeling, whether they’re project managers, data warehouse architects, data modelers, database administrators, business analysts, and ETL or BI application developers and system designers.
Instructors
Either Margy Ross, co-author of The Data Warehouse Toolkit, 2nd Edition, or Bob Becker. Both Bob and Margy have focused on dimensional modeling for over two decades.
Course Overview
Day 1
- Introductions
- Dimensional Modeling Fundamentals
- Retail Sales “Basics” Case Study
- Inventory “Beyond the 1st Business Process” Case Study
- Slowly Changing Dimensions
- Order Management Design Workshop
Day 2
- Billing Design Review Exercise
- Kimball Lifecycle Approach for DW/BI System Development
- Transportation “Design Enhancement” Case Study
- Retail Brokerage Design Workshop
- Insurance “Pulling it all Together” Case Study
- Client-Specific Workshop
Course Details
Day 1
Introductions
- Course agenda and assumptions
Dimensional Modeling Fundamentals
- Role of dimensional modeling in the Kimball, Corporate Information Factory (CIF), and hybrid architectures
- Fact and dimension table characteristics
- Benefits of dimensional modeling
Retail Sales “Basics” Case Study
- 4-step process for designing dimensional models
- Fact table granularity
- Transaction fact tables
- Degenerate dimensions
- Date and time-of-day dimension considerations
- Denormalized dimension table hierarchies
- Dealing with nulls
- Surrogate keys for dimensions
- Centipede fact tables with too many dimensions
- Star versus snowflake schemas
- Factless fact tables
Inventory “Beyond the 1st Business Process” Case Study
- Implications of business processes on data architecture
- Periodic snapshot fact tables
- Semi-additive facts
- Conformed dimensions – identical and shrunken roll-ups
- Enterprise Data Warehouse Bus Architecture and matrix for master data and integration
- Exercise: Translate business requirements into DW bus matrix
Slowly Changing Dimensions
- Basic Type 1, 2 and 3 techniques
- Advanced techniques to deliver current and point-in-time attribute values
- Mini-dimensions for large, rapidly changing dimensions
Order Management Design Workshop
- Drilling across fact tables
- Consolidated cross-process fact tables
- Dimension table role-playing
- Allocated facts at different levels of detail
- Simultaneous facts and dimensions
- Complications with operational header/line data
- Multiple currencies
- Junk dimensions for miscellaneous transaction indicators
- Accumulating snapshot fact tables
Day 2
Billing Design Review Exercise
- Common design flaws and mistakes to avoid
- Checklist for conducting design reviews
Kimball Lifecycle Approach for DW/BI System Development
- Best practices from project launch through deployment
- Common challenges and pitfalls
- Exercise: Surrogate key administration for slowly changing dimensions
Transportation “Design Enhancement” Case Study
- Exercise: Schema enhancements for changing requirements
- Multiple time zones
- Design trade-offs
Retail Brokerage Design Workshop
- Complementary transaction and periodic snapshot schemas
- Modeling many-valued dimension attributes with bridge tables
- Generic abstract dimensions
- Freeform text comments
Insurance “Pulling it all Together” Case Study
- Review of design patterns and techniques
- Development of bus matrix from extended case study
- Detailed implementation bus matrix
Client-Specific Workshop
- Development of client-centric preliminary data warehouse bus matrix
