In Design Tip #140, I discussed the challenges of designing dimensional schemas for processes of indeterminate length such as a sales pipeline or insurance claims processing. We concluded they are best represented as accumulating snapshot fact tables characterized by one row per pipeline occurrence where each row is updated multiple times over its lifetime. However, […]

This Design Tip continues our series on how to implement common dimensional design patterns in your ETL system. The relationship between a fact table and its dimensions is usually many-to-one. That is, one row in a dimension, such as customer, can have many rows in the fact table, but one row in the fact table should belong to […]

For most subject areas, it’s pretty easy to identify the major dimensions: Product, Customer Account, Student, Employee, and Organization are all easily understood as descriptive dimensions. A store’s sales, a telecommunication company’s phone calls, and a college’s course registrations are all clearly facts. However, for some subject areas, it can be challenging – especially for […]

This Design Tip continues my series on implementing common ETL design patterns. These techniques should prove valuable to all ETL system developers, and, we hope, provide some product feature guidance for ETL software companies as well. Recall that a shrunken dimension is a subset of a dimension’s attributes that apply to a higher level of […]

Some clients and students lament that while they want to deliver and share consistently-defined master conformed dimensions in their data warehouse/business intelligence (DW/BI) environments, it’s “just not feasible.” They explain that they would if they could, but with senior management focused on using agile development techniques to deliver DW/BI solutions, it’s “impossible” to take the […]

A data warehouse / business intelligence system is challenging to test. Standard testing methodology tests one little thing at a time, but a DW/BI system is all about integration and complexity, not to mention large data volumes. Here are my top five recommendations for building and executing a testing environment for your DW/BI project. 1. […]

Factless fact table are“fact tables that have no facts but captures the many-to-many relationship between dimension keys.” We’ve previously discussed factless fact tables to represent events or coverage information. An event-based factless fact table is student attendance information; the grain of the fact table is one row per student each day. A typical coverage factless fact […]

Accumulating snapshots are one of the three fundamental types of fact tables. We often state that accumulating snapshot fact tables are appropriate for predictable workflows with well-established milestones. They typically have five to ten key milestone dates representing the workflow/pipeline start, completion, and the key event dates in between. Our students and clients sometimes ask […]

Design Tip #43 Dealing with Nulls in the Dimensional Model describes two cases where null values should be avoided in a dimensional model; in these situations, we recommend using default values rather than nulls. This Design Tip provides guidance for selecting meaningful, verbose defaults. Handling Null Foreign Keys in Fact Tables The first scenario where […]

This Design Tip describes how to create and manage mini-dimensions. Recall that a mini-dimension is a subset of attributes from a large dimension that tend to change rapidly, causing the dimension to grow excessively if changes were tracked using the Type 2 technique. By extracting unique combinations of these attribute values into a separate dimension, […]