Data warehousing has never been more valuable and interesting than it is now. Making decisions based on data is so fundamental and obvious that the current generation of business users and data warehouse designers/implementers can’t imagine a world without access to data. I’ll resist the urge to tell stories about what it was like before […]
Dimensional data models have been around for a very long time, almost certainly tracing their lineage back to the original Data Cube project between Dartmouth and General Mills in the late 1960s. The appeal of dimensional modeling stems from the obvious simplicity of the models and the natural way in which both business people and […]
Predictive analytics is the name for a broad range of analysis techniques used for making predictions about future behavior. Credit scoring, risk analysis, and promotion selection are among the many applications that have proven to drive revenue and profit. It is worth taking a look at the “predictive analytics” section of Wikipedia to appreciate the […]
One of the most effective tools for managing data quality and data governance, as well as giving business users confidence in the data warehouse results, is the audit dimension. We often attach an audit dimension to every fact table so that business users can choose to illuminate the provenance and confidence in their queries and […]
Drilling across separate business processes is one of the most powerful applications in a data warehouse. We often describe drilling across as magic: separately open connections to the dimensional models for each business process, fetch answer sets from each process labeled identically with row headers drawn from specially conformed dimensions, then deliver the result by […]
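The drill-across steps described in the excerpt above can be illustrated with a minimal sketch. The excerpt is cut off before it names the final delivery step, so the merge used here (lining up rows on their shared row header and leaving gaps as `None`) is an assumption rather than the author's stated method. The function name `drill_across`, the product names, and the column names are all hypothetical.

```python
def drill_across(answer_sets):
    """Combine per-process answer sets on their shared row headers.

    answer_sets maps a column name to a dict of {row_header: value}.
    Each dict stands in for the result of a separate query against one
    business process. The row headers are assumed to come from a
    conformed dimension, so identical labels mean the same thing in
    every answer set.
    """
    # Collect every row header that appears in any answer set.
    headers = sorted(set().union(*(rows.keys() for rows in answer_sets.values())))
    columns = list(answer_sets)
    # Emit one row per header; a process with no data for it yields None.
    return [
        {"row_header": h, **{c: answer_sets[c].get(h) for c in columns}}
        for h in headers
    ]


# Hypothetical answer sets from two separately queried business processes.
shipments = {"Framis": 120, "Toggle": 80}
returns = {"Framis": 5, "Widget": 2}

merged = drill_across({"shipped_qty": shipments, "returned_qty": returns})
for row in merged:
    print(row)
```

The sketch only works because both answer sets label their rows with the same values; if the two processes spelled a product name differently, the rows would not line up, which is why the excerpt stresses conformed dimensions.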
Fact tables are the foundation of the data warehouse. They contain the fundamental measurements of the enterprise, and they are the ultimate target of most data warehouse queries. There is no point in hoisting fact tables up the flagpole unless they have been chosen to reflect urgent business priorities, have been carefully quality assured and […]
The owner of the data warehouse must decide how to respond to the changes in the descriptions of dimensional entities like Employee, Customer, Product, Supplier, Location and others. In 30 years of studying this issue, I have found that only three different kinds of responses are needed. I call these slowly changing dimension (SCD) Types […]
The notion of time pervades every corner of the data warehouse. Most of the fundamental measurements we store in our fact tables are time series, which we carefully annotate with time stamps and foreign keys connecting to calendar date dimensions. But the effects of time are not isolated just to these activity-based time stamps. All […]
In the dimensional modeling world, we try very hard to separate data into two contrasting camps: numerical measurements that we put into fact tables, and textual descriptors that we put into dimension tables as “attributes”. If only life were that easy… Remember that numerical facts usually have an implicit time series of observations, and usually participate in numerical […]
In this white paper, Ralph proposes a comprehensive architecture for capturing data quality events, as well as measuring and ultimately controlling data quality in the data warehouse. This scalable architecture can be added to existing data warehouse and data integration environments with minimal impact and relatively little upfront investment. Using this architecture, it is even […]