Most of the guidance in the Kimball method for designing, developing, and deploying a DW/BI system is just that: guidance. There are hundreds or thousands of rules in the Kimball Group’s many books, and I confess to having bent many of those rules over the decades, when faced with conflicting goals or unpleasant political realities.
There are several tenets to the Kimball method that I am passionate about. This Design Tip lists the things I say over and over, to audiences experienced and new. Any reader who has attended a class that I’ve taught, or hired me as a consultant, has heard me discourse on most or all of the items on this list.
- The dimensional model is the key asset.
The Kimball method, as outlined in The Data Warehouse Lifecycle Toolkit, Second Edition, is centered on the dimensional model. Dimensional modeling principles are Ralph Kimball’s and the Kimball Group’s most widely known contribution to the world of business intelligence. It’s our focus because a good dimensional model is absolutely vital to the success of your DW/BI endeavor. If you get the model right, and populate it with integrity, the rest is straightforward.
- Dimensional modeling is a group activity.
Even the best dimensional modeler will create a poor dimensional model if she works solo. Not only is dimensional modeling a group activity, it is a group activity that must involve the business user community. Over the years, we have turned down countless consulting requests to design a model without business input. Or, worse, struggled through painful projects where the promised business user participation did not materialize.
It is undoubtedly a huge request of the user community. Our design process usually requires 50-60 hours of sessions over a 4 to 6 week period (or more, depending on project complexity). The people whom we want to participate in the design sessions are always valuable and in demand. But if they can’t be convinced to put in the time and energy, the resulting system will fail.
- The dimensional model is the best specification for the DW/BI system.
The majority of clients that I work with don’t have a written specification for the DW/BI system, certainly no document that reflects reality in a meaningful way. The most common specification format includes mind-numbing lists of what users want to filter and drill on, as well as the demand that all 2000 of the existing reports be supported by the new system. If all we’ve accomplished with our new DW/BI system is re-platforming the existing canned reports, we have failed.
We ask the business users at the end of the design process to think about the analyses they’ve recently done, tried to do, or would have liked to do, from the information in the current design scope. We want them to say “Yes, this model meets our needs.” At the same time, the IT people on the team have been watching the discussion of data sources, transformations, and other technical details. We ask them to affirm “Yes, we can populate this data model.” The dimensional model design write-up is a meaningful and actionable specification of the requirements.
- The dimensional model should add value beyond restructuring.
Some of the most valuable improvements that you can deliver in your DW/BI system are to add improved descriptors and groups to frequently used data. Yet these opportunities are often missed by the design team. I’ve even encountered teams with an explicit policy to add nothing beyond what is in the source system.
Examples of valuable data model additions include:
- Making it easy for users to filter out the seldom used transaction codes.
- Providing attractive sort columns for pick lists and reports.
- Precompute banding, such as age ranges or transaction quality measures.
- Supporting different or deeper hierarchies than are managed in the source systems, such as marketing versus manufacturing views of the product dimension.
- Master data management systems are a great source.
De-duplication is one of the most difficult tasks facing a data warehouse team. Since the earliest days of data warehousing, the back room team has struggled to design ETL processes that de-duplicate entities such as customer. The increasing popularity and functionality of master data management (MDM) technology and programs provides a much better solution than within the ETL flow. And not just because it’s hard! The rhythm of the ETL process, which on an ongoing basis we want to be bullet proof and hands free, is fundamentally at odds with the de-duplication process. No matter how great our tools, how clever our code, how complete our business rules, the automated de-duplication process cannot achieve 100% accuracy. A person is required to make a judgment on the questionable cases. This works much better if it is a job responsibility during the business day, rather than waiting for the ETL load.
A simple MDM system can be used to build the value-added data elements discussed previously in this Design Tip. Dimension hierarchies are notorious for being imperfectly structured if sourced from users’ desktops; the MDM tools deliver a simple platform to manage this information.
- Don’t skip the relational data warehouse.
Designing and populating an enterprise conformed data warehouse is hard. Everyone would like to skip this step. Throughout the 23+ years of my career in data warehousing, I have observed any number of technological attempts to simplify the process, from building the BI layer directly on the transaction system, to so-called virtual data warehouses, back full circle to the current flavor of visualization tool build scripts.
As I discussed in Design Tip #162 Leverage Data Visualization Tools, But Avoid Anarchy, just say no! Unless you completely control all of your source data, you should leave ETL to the ETL tool, leave data storage and management to a relational database engine, and let the BI tools shine at what they do best: great visualizations and user experience.
- It’s all about the business.
I say this many times during classes and consulting. It’s the most important characteristic of the Kimball Lifecycle method: practical, uncompromising focus on the business. It infuses everything we do, and it’s the single most important message to carry forward.
I feel incredibly lucky to have been part of Kimball Group for the past twelve years. It’s been an honor to work not only with Ralph, Margy, Bob, and Warren, but also with the hundreds of students, clients, and colleagues from whom I’ve learned so much. I am humbled by what we’ve achieved together, and blessed to have had what I consider to be the best job in the world.
I feel particularly blessed in comparison to Warren, whose death last year from brain cancer profoundly affected me and all of Kimball Group. In 2016, I will be celebrating Warren and his passion for fitness by cycling across the United States to raise funds for brain cancer research. I’d be honored if you’d follow our blog that will chronicle this journey (https://itinerantphilosopher.wordpress.com/) and thrilled if you would consider donating to our efforts to raise research funds for brain cancer.