Most ETL tools provide some functionality for handling slowly changing dimensions. Every so often, when the tool isn’t performing as needed, the ETL developer will use the database to identify new and changed rows, and apply the appropriate inserts and updates. I’ve shown examples of this code in the Data Warehouse Lifecycle in Depth class using standard INSERT […]

Fact tables are the foundation of the data warehouse. They contain the fundamental measurements of the enterprise, and they are the ultimate target of most data warehouse queries. There is no point in hoisting fact tables up the flagpole unless they have been chosen to reflect urgent business priorities, have been carefully quality assured and […]

The owner of the data warehouse must decide how to respond to the changes in the descriptions of dimensional entities like Employee, Customer, Product, Supplier, Location and others. In 30 years of studying this issue, I have found that only three different kinds of responses are needed. I call these slowly changing dimension (SCD) Types […]

Students often blur the concepts of snowflakes, outriggers, and bridges. In this Design Tip, I’ll try to reduce the confusion surrounding these embellishments to the standard dimensional model. When a dimension table is snowflaked, the redundant many-to-one attributes are removed into separate dimension tables. For example, instead of collapsing hierarchical rollups such as brand and category into columns […]

The notion of time pervades every corner of the data warehouse. Most of the fundamental measurements we store in our fact tables are time series, which we carefully annotate with time stamps and foreign keys connecting to calendar date dimensions. But the effects of time are not isolated just to these activity-based time stamps. All […]