There are three fundamental types of fact tables in the data warehouse presentation area: transaction fact tables, periodic snapshot fact tables, and accumulating snapshot fact tables. Most DW/BI design teams are very familiar with transaction fact tables. They are the most common fact table type and are often the primary workhorse schema for many organizations. […]
Fact Table Core Concepts
In Design Tip #140, I discussed the challenges of designing dimensional schemas for processes of indeterminate length such as a sales pipeline or insurance claims processing. We concluded they are best represented as accumulating snapshot fact tables characterized by one row per pipeline occurrence where each row is updated multiple times over its lifetime. However, […]
For most subject areas, it’s pretty easy to identify the major dimensions: Product, Customer Account, Student, Employee, and Organization are all easily understood as descriptive dimensions. A store’s sales, a telecommunication company’s phone calls, and a college’s course registrations are all clearly facts. However, for some subject areas, it can be challenging – especially for […]
Factless fact table are“fact tables that have no facts but captures the many-to-many relationship between dimension keys.” We’ve previously discussed factless fact tables to represent events or coverage information. An event-based factless fact table is student attendance information; the grain of the fact table is one row per student each day. A typical coverage factless fact […]
Fact tables are the foundation of the data warehouse. They contain the fundamental measurements of the enterprise, and they are the ultimate target of most data warehouse queries. There is no point in hoisting fact tables up the flagpole unless they have been chosen to reflect urgent business priorities, have been carefully quality assured and […]
In the dimensional modeling world, we try very hard to separate data into two contrasting camps: numerical measurements that we put into fact tables, and textual descriptors that we put into dimension tables as “attributes”. If only life were that easy… Remember that numerical facts usually have an implicit time series of observations, and usually participate in numerical […]
Many transaction processing systems consist of a transaction header “parent” with multiple line item “children.” Regardless of your industry, you can probably identify source systems in your organization with this basic structure. When it’s time to model this data for DW/BI, many designers merely reproduce these familiar operational header and line constructs in the dimensional world. In this Design […]
When developing fact tables, aggregated data is NOT the place to start. To avoid “mixed granularity” woes including bad and overlapping data, stick to rich, expressive, atomic-level data that’s closely connected to the original source and collection process. The power of a dimensional model comes from a careful adherence to “the grain.” A clear definition […]
Meaningless integer keys, otherwise known as surrogate keys, are commonly used as primary keys for dimension tables in data warehouse designs. Our students frequently ask us – what about fact tables? Should a unique surrogate key be assigned for every row in a fact table? Although for the logical design of a fact table, the answer is no, […]
We are often asked about degenerate dimensions in our modeling workshops. Degenerate dimensions cause confusion since they don’t look or feel like normal dimensions. It’s helpful to remember that according to Webster, “degenerate” refers to something that’s 1) declined from the standard norm, or 2) is mathematically simpler. A degenerate dimension (DD) acts as a dimension key in […]