In the dimensional modeling world, we try very hard to separate data into two contrasting camps: numerical measurements that we put into fact tables, and textual descriptors that we put into dimension tables as “attributes”. If only life were that easy… Remember that numerical facts usually have an implicit time series of observations, and usually participate in numerical […]
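As a rough illustration of that split, here is a minimal Python sketch using an invented retail example (the names product_dim, sales_fact, and the field names are hypothetical, not from the article): textual attributes describe and group the data, while numeric facts are observed repeatedly over time and participate in arithmetic.

```python
# Minimal sketch of the fact/dimension split, with hypothetical
# retail names (product_dim, sales_fact) used purely for illustration.

# Dimension rows: textual descriptors used for filtering and grouping.
product_dim = {
    101: {"product_name": "Espresso Beans", "brand": "Acme", "category": "Coffee"},
    102: {"product_name": "Green Tea", "brand": "Acme", "category": "Tea"},
}

# Fact rows: numeric measurements forming a time series of observations,
# each tied to its dimensional context by a key.
sales_fact = [
    {"date": "2007-06-01", "product_key": 101, "quantity": 3, "sales_amount": 29.85},
    {"date": "2007-06-01", "product_key": 102, "quantity": 1, "sales_amount": 4.50},
    {"date": "2007-06-02", "product_key": 101, "quantity": 2, "sales_amount": 19.90},
]

# Facts participate in calculations (here, summed by category);
# attributes supply the row headers of the report.
totals = {}
for row in sales_fact:
    category = product_dim[row["product_key"]]["category"]
    totals[category] = totals.get(category, 0.0) + row["sales_amount"]

print(totals)  # e.g. {'Coffee': 49.75, 'Tea': 4.5}
```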
The Kimball Group has been exposed to hundreds of successful data warehouses. Careful study of these successes has revealed a set of extract, transform, and load (ETL) best practices. We first described these best practices in an Intelligent Enterprise column three years ago. Since then we have continued to refine the practices based on client […]
In this white paper, Ralph proposes a comprehensive architecture for capturing data quality events, as well as measuring and ultimately controlling data quality in the data warehouse. This scalable architecture can be added to existing data warehouse and data integration environments with minimal impact and relatively little upfront investment. Using this architecture, it is even […]
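As a loose sketch of the kind of data quality event capture the excerpt describes, the Python fragment below shows a simple quality "screen" that logs one error event per failed row. The names (null_check_screen, error_event_log) and the severity scheme are assumptions for illustration, not taken from the white paper.

```python
# Hedged sketch: a data quality screen that records error events
# as it inspects an incoming batch. All names here are illustrative.
from datetime import datetime

error_event_log = []  # stand-in for an error event table

def null_check_screen(batch, column, severity="warn"):
    """Flag rows whose `column` is missing, logging one event per failure."""
    for row_id, row in enumerate(batch):
        if row.get(column) is None:
            error_event_log.append({
                "screen": f"null_check:{column}",
                "row_id": row_id,
                "severity": severity,
                "logged_at": datetime.utcnow().isoformat(),
            })

incoming = [
    {"customer_id": 1, "state": "CA"},
    {"customer_id": 2, "state": None},   # fails the screen
]
null_check_screen(incoming, "state")
print(error_event_log)  # one event for row_id 1
```

Because each failure is captured as its own event row, data quality can later be measured and trended rather than silently corrected.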
Many transaction processing systems consist of a transaction header “parent” with multiple line item “children.” Regardless of your industry, you can probably identify source systems in your organization with this basic structure. When it’s time to model this data for DW/BI, many designers merely reproduce these familiar operational header and line constructs in the dimensional world. In this Design […]
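One common dimensional alternative to reproducing separate header and line tables is to flatten everything to the line grain, carrying header attributes as dimensional context and allocating header-level facts down to the lines. The sketch below illustrates that idea under invented assumptions: a hypothetical sales order with a header freight charge allocated proportionally by each line's extended amount.

```python
# Hedged sketch: collapsing a header/line source into one line-grain
# fact, allocating a header-level charge down to the lines.
# order_header, order_lines, and the allocation rule are illustrative.

order_header = {"order_id": "SO-1001", "order_date": "2007-03-15", "freight": 10.00}
order_lines = [
    {"line_no": 1, "product": "Widget", "quantity": 4, "extended_amount": 80.00},
    {"line_no": 2, "product": "Gadget", "quantity": 1, "extended_amount": 20.00},
]

total_amount = sum(line["extended_amount"] for line in order_lines)

# One fact row per line: header attributes repeat as dimensional context,
# and the header freight fact is allocated proportionally so the
# line-grain rows still sum to the header total.
fact_rows = [
    {
        "order_id": order_header["order_id"],
        "order_date": order_header["order_date"],
        **line,
        "allocated_freight": round(
            order_header["freight"] * line["extended_amount"] / total_amount, 2
        ),
    }
    for line in order_lines
]

for row in fact_rows:
    print(row)  # line 1 carries 8.0 of the freight, line 2 carries 2.0
```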
When developing fact tables, aggregated data is NOT the place to start. To avoid "mixed granularity" woes, including bad and overlapping data, stick to rich, expressive, atomic-level data that's closely connected to the original source and collection process. The power of a dimensional model comes from a careful adherence to "the grain." A clear definition […]
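A small sketch of why atomic grain matters, using invented sales rows: atomic detail can be rolled up to any summary a user asks for, but a table that stores only the summary can never be drilled back down.

```python
# Hedged sketch: atomic grain rolls up; aggregates do not drill down.
# The row layout (one row per product per store per day) is illustrative.

atomic_sales = [
    {"date": "2007-01-01", "store": "S1", "product": "P1", "units": 5},
    {"date": "2007-01-01", "store": "S1", "product": "P2", "units": 2},
    {"date": "2007-01-01", "store": "S2", "product": "P1", "units": 7},
]

# Rolling up to store-level totals is easy from atomic data...
by_store = {}
for row in atomic_sales:
    by_store[row["store"]] = by_store.get(row["store"], 0) + row["units"]
print(by_store)  # {'S1': 7, 'S2': 7}

# ...but had we stored only by_store, the product-level question
# "how many units of P1 sold at S1?" would be unanswerable.
p1_at_s1 = sum(r["units"] for r in atomic_sales
               if r["store"] == "S1" and r["product"] == "P1")
print(p1_at_s1)  # 5
```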