The notion of time pervades every corner of the data warehouse. Most of the fundamental measurements we store in our fact tables are time series, which we carefully annotate with time stamps and foreign keys connecting to calendar date dimensions. But the effects of time are not isolated just to these activity-based time stamps. All […]

The owner of the data warehouse must decide how to respond to the changes in the descriptions of dimensional entities like Employee, Customer, Product, Supplier, Location and others. In 30 years of studying this issue, I have found that only three different kinds of responses are needed. I call these slowly changing dimension (SCD) Types […]

How do you deal with changing dimensions? Hybrid approaches fill gaps left by the three fundamental techniques. Unlike most OLTP systems, a major objective of a data warehouse is to track history. So, accounting for change is one of the analyst’s most important responsibilities. A sales force region reassignment is a good example of a […]

Most ETL tools provide some functionality for handling slowly changing dimensions. Every so often, when the tool isn’t performing as needed, the ETL developer will use the database to identify new and changed rows, and apply the appropriate inserts and updates. I’ve shown examples of this code in the Data Warehouse Lifecycle in Depth class using standard INSERT […]

Ralph introduced the concept of slowly changing dimension (SCD) attributes in 1996. Dimensional modelers, in conjunction with the business’s data governance representatives, must specify the data warehouse’s response to operational attribute value changes. Most Kimball readers are familiar with the core SCD approaches: type 1 (overwrite), type 2 (add a row), and type 3 (add […]
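As a rough illustration of the first two responses named above, the following Python sketch contrasts a type 1 overwrite with a type 2 added row on an in-memory customer dimension. The `CustomerRow` fields, the `row_effective`/`row_expiry`/`is_current` housekeeping columns, and both function names are assumptions made for this example rather than anything the article prescribes. A real warehouse would normally do this in the ETL tool or the database.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    customer_key: int                  # surrogate key (hypothetical column)
    customer_id: str                   # natural key from the source system
    city: str                          # the attribute whose changes we handle
    row_effective: date                # assumed housekeeping column
    row_expiry: Optional[date] = None  # None means the row is still open
    is_current: bool = True


def apply_type1(rows: List[CustomerRow], customer_id: str, new_city: str) -> None:
    """Type 1 (overwrite): replace the value on every row for this entity.
    No history of the old value is kept."""
    for row in rows:
        if row.customer_id == customer_id:
            row.city = new_city


def apply_type2(rows: List[CustomerRow], customer_id: str,
                new_city: str, change_date: date) -> CustomerRow:
    """Type 2 (add a row): expire the current row and append a new one
    with a fresh surrogate key, so the old value stays queryable."""
    current = next(r for r in rows
                   if r.customer_id == customer_id and r.is_current)
    if current.city == new_city:
        return current  # nothing changed, so no new row is needed
    current.row_expiry = change_date
    current.is_current = False
    new_row = CustomerRow(
        customer_key=max(r.customer_key for r in rows) + 1,
        customer_id=customer_id,
        city=new_city,
        row_effective=change_date,
    )
    rows.append(new_row)
    return new_row
```

After `apply_type1` the dimension still holds one row per customer. After `apply_type2` it holds one row per version of the customer, and only the newest is flagged as current.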

Drawing the Line Between Dimensional Modeling and ER Modeling Techniques
Dimensional modeling (DM) is the name of a logical design technique often used for data warehouses. It is different from, and contrasts with, entity-relation modeling (ER). This article points out the many differences between the two techniques and draws a line in the sand. DM […]

This Design Tip describes how to create and manage mini-dimensions. Recall that a mini-dimension is a subset of attributes from a large dimension that tend to change rapidly, causing the dimension to grow excessively if changes were tracked using the Type 2 technique. By extracting unique combinations of these attribute values into a separate dimension, […]
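The extraction step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the customer records, the banded attribute names (`age_band`, `income_band`), and the function `build_mini_dimension` are all hypothetical, and the sketch only shows how unique attribute combinations become rows of a separate dimension. It does not cover how the resulting key is carried in the rest of the schema.

```python
from typing import Dict, List, Tuple


def build_mini_dimension(
    customers: List[dict], attrs: List[str]
) -> Tuple[Dict[tuple, int], Dict[str, int]]:
    """Collect each unique combination of the rapidly changing attributes
    into a mini-dimension and record which combination each customer has.

    Returns (mini_dim, assignments):
      mini_dim    maps an attribute-value tuple to its mini-dimension key
      assignments maps customer_id to that customer's current mini key
    """
    mini_dim: Dict[tuple, int] = {}
    assignments: Dict[str, int] = {}
    for customer in customers:
        combo = tuple(customer[a] for a in attrs)
        if combo not in mini_dim:
            # First time this combination appears: give it the next key.
            mini_dim[combo] = len(mini_dim) + 1
        assignments[customer["customer_id"]] = mini_dim[combo]
    return mini_dim, assignments


# Hypothetical sample data: three customers, two distinct combinations.
customers = [
    {"customer_id": "C1", "age_band": "30-39", "income_band": "50-75k"},
    {"customer_id": "C2", "age_band": "30-39", "income_band": "50-75k"},
    {"customer_id": "C3", "age_band": "40-49", "income_band": "75-100k"},
]
mini_dim, assignments = build_mini_dimension(
    customers, ["age_band", "income_band"]
)
```

Because many customers share the same combination, the mini-dimension has one row per combination, not one row per customer per change. A change in a customer's band then means pointing at a different mini-dimension row, with no new row added to the large dimension.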

People often engage us to conduct dimensional model design reviews. In this column, I’ll provide a laundry list of common design flaws to scout for when performing a review. I encourage you to use this list to critically review your own draft schemas in search of potential improvements. What’s the Grain? When a data warehouse team […]

The Kimball Group has been exposed to hundreds of successful data warehouses. Careful study of these successes has revealed a set of extract, transformation, and load (ETL) best practices. We first described these best practices in an Intelligent Enterprise column three years ago. Since then we have continued to refine the practices based on client […]

Meaningless integer keys, otherwise known as surrogate keys, are commonly used as primary keys for dimension tables in data warehouse designs. Our students frequently ask us – what about fact tables? Should a unique surrogate key be assigned for every row in a fact table? Although for the logical design of a fact table, the answer is no, […]