The importance of the time dimension in data marts and data warehouses. The time dimension is a unique and powerful dimension in every data mart and enterprise data warehouse. Although one of the tenets of dimensional modeling is that all dimensions are created equal, the truth is that the time dimension is very special and […]

One of the tasks of the ETL system’s customer dimension manager is to “assign a unique durable key to each customer.” By durable key, we mean a single key value that uniquely and reliably identifies a given customer over time. In most cases, this unique durable key is the natural business key from the operational […]

Dimensional modeling is a design discipline that straddles the formal relational model and the engineering realities of text and number data. Compared to entity/relation modeling, it’s less rigorous (allowing the designer more discretion in organizing the tables) but more practical because it accommodates database complexity and improves performance. Contrasted with other modeling disciplines, dimensional modeling […]

Wiley, 2013. Tools and Utilities. NOTE: You may need to “Save Link As” to download the files. Chapter 3: Sample date dimension spreadsheet (download). Correction to Figure 3-13: The first heading in the lower report shown in Figure 3-13 should read “Calendar Week Ending Date,” just like the top report in that figure. Correction to […]

Data warehousing has never been more valuable and interesting than it is now. Making decisions based on data is so fundamental and obvious that the current generation of business users and data warehouse designers/implementers can’t imagine a world without access to data. I’ll resist the urge to tell stories about what it was like before […]

Do you know the difference between dimensional modeling truth and fiction? According to Merriam-Webster, fables are fictitious statements. Unfortunately, fables about dimensional modeling circulate throughout our industry. These false claims and assertions are a distraction, especially if you’re trying to align a team. In this column, we’ll describe the root misunderstandings that perpetuate these myths so […]

According to the Webster’s Unabridged Dictionary, a surrogate is an “artificial or synthetic product that is used as a substitute for a natural product.” That’s a great definition for the surrogate keys we use in data warehouses. A surrogate key is an artificial or synthetic key that is used as a substitute for a natural […]
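
As a rough illustration of the surrogate key idea in the excerpt above, the sketch below hands out meaningless sequential integers in place of natural keys. The class name `SurrogateKeyGenerator`, the method `key_for`, and the sample customer codes are hypothetical, chosen only for this example; a real ETL system would typically persist the mapping in a database rather than in memory.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hypothetical helper: maps each natural key to an artificial integer key."""

    def __init__(self, start=1):
        self._next_key = count(start)   # source of new surrogate values
        self._by_natural_key = {}       # natural key -> surrogate key

    def key_for(self, natural_key):
        # Reuse the surrogate already assigned to this natural key, if any;
        # otherwise assign the next integer in sequence.
        if natural_key not in self._by_natural_key:
            self._by_natural_key[natural_key] = next(self._next_key)
        return self._by_natural_key[natural_key]


keys = SurrogateKeyGenerator()
first = keys.key_for("CUST-00017")    # new natural key, gets a new surrogate
second = keys.key_for("CUST-00042")   # another new natural key
repeat = keys.key_for("CUST-00017")   # seen before, same surrogate as `first`
```

The surrogate values carry no business meaning; they only stand in for the natural key, which is the substitution the dictionary definition describes.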

In the dimensional modeling world, we try very hard to separate data into two contrasting camps: numerical measurements that we put into fact tables, and textual descriptors that we put into dimension tables as “attributes.” If only life were that easy… Remember that numerical facts usually have an implicit time series of observations, and usually participate in numerical […]

This Design Tip continues my series on implementing common ETL design patterns. These techniques should prove valuable to all ETL system developers, and, we hope, provide some product feature guidance for ETL software companies as well. Recall that a shrunken dimension is a subset of a dimension’s attributes that apply to a higher level of […]