It seems like a small thing, but names are important. Good names for tables and columns are particularly important for ad hoc users of the DW/BI system who need to find the objects they’re looking for. Object names should be oriented to the business users, not the technical staff. As much as possible, strive to […]

Although dimension tables are typically much smaller than fact tables, the dimension tables are the true drivers of the data warehouse. Dimension tables provide the descriptive context for all the measurements recorded in the fact tables. Although it’s something of an obvious point, without the dimensions the data warehouse would be a meaningless ocean of […]

The third edition of The Data Warehouse Toolkit, our flagship book, was released in July. Although the second edition published in 2002 continued to sell amazingly well, this new edition gave Margy and me the opportunity to “re-master” the science and logic of dimensional modeling, update and extend many dimensional modeling use cases, and publish […]

Drilling across separate business processes is one of the most powerful applications in a data warehouse. We often describe drilling across as magic: separately open connections to the dimensional models for each business process, fetch answer sets from each process labeled identically with row headers drawn from specially conformed dimensions, then deliver the result by […]

Design Tip #43 Dealing with Nulls in the Dimensional Model describes two cases where null values should be avoided in a dimensional model; in these situations, we recommend using default values rather than nulls. This Design Tip provides guidance for selecting meaningful, verbose defaults. Handling Null Foreign Keys in Fact Tables The first scenario where […]

Industry-standard data models are an appealing concept at first blush, but they aren’t the time savers they are cracked up to be. What’s more, these prebuilt models may inhibit data warehouse project success. Vendors and proponents argue that standard, prebuilt models allow for more rapid, less risky implementations by reducing the scope of the data […]

Most readers are familiar with the basic date dimension. At the grain of a calendar day, we include various labels for calendar and fiscal years, quarters, months, and days. We include both short and long labels for various reporting requirements. Even though the labels in the basic date dimension could be constructed at the time we design a […]

A student attending one of Kimball Group’s recent onsite dimensional modeling classes asked me for a list of “Kimball’s Commandments” for dimensional modeling. We’ll refrain from using religious terminology, but let’s just say the following are not-to-be-broken rules together with less stringent rule-of-thumb recommendations. Rule #1: Load detailed atomic data into dimensional structures. Dimensional models […]

Most ETL tools provide some functionality for handling slowly changing dimensions. Every so often, when the tool isn’t performing as needed, the ETL developer will use the database to identify new and changed rows, and apply the appropriate inserts and updates. I’ve shown examples of this code in the Data Warehouse Lifecycle in Depth class using standard INSERT […]

In the first half of this article, published in the April issue, we saw that drilling across was the defining act of the integrated enterprise data warehouse (EDW). Remember that drilling across requires the integration of data from two more business process subject areas. We then defined two personality archetypes: the dimension manager and the fact […]