Steve Hoberman’s 10 Questions to use in your Data Model Scorecard
- How do the characteristics of the model support the type of model e.g., OLTP or OLAP
- How well does the model capture the requirements?
- How complete is the model?
- How structurally sound is the model? Have you checked primary and Alternate Keys, no null values, referential integrity, check constraints, Indexes, Partitions, etc
- How well does the model leverage generic structures?
- How well does the model follow naming standards?
- How well is the model arranged for readability?
- How good are the definitions? Document the entities and each attribute, and give examples of values.
- How consistent is the model with the enterprise standards? Did you refer to the enterprise glossary for physical and logical definitions reuse?
- How well does the metadata match the data?
Implementing Dimensional Models
Implementing dimensional models effectively requires both strategic thinking and attention to detail. Here are key best practices to follow when designing and implementing dimensional models:
1. Understand Business Processes First
- Model the business, not just the data.
- Begin by identifying key business processes (e.g., sales, inventory, shipping).
- Focus on the metrics (facts) business users care about (e.g., revenue, quantity sold).
2. Use Clear and Consistent Naming Conventions
- Use business-friendly, readable names for tables and columns.
- Avoid technical jargon or abbreviations that aren’t widely understood.
3. Design Star Schemas, Not Snowflakes
- Favor denormalized dimension tables to simplify querying.
- Only use snowflaking (normalizing dimensions) if it significantly reduces redundancy or simplifies updates.
4. Create Conformed Dimensions
-
Reuse dimensions like
Date
,Customer
,Product
across different fact tables. - Ensures consistency in reporting and enables cross-domain analysis.
5. Handle Slowly Changing Dimensions (SCDs) Thoughtfully
-
Decide on the appropriate method:
- Type 1: Overwrite with new values (simple, loses history)
- Type 2: Create new records for changes (retains history)
- Type 3: Keep limited history in additional columns
- Apply SCD types based on reporting requirements.
6. Include Surrogate Keys
- Use surrogate keys (e.g., integer IDs) in dimensions instead of natural keys.
- Improves performance, handles data changes gracefully, and avoids key collisions.
7. Design Fact Tables Carefully
- Use additive or semi-additive measures where possible.
- Ensure grain (level of detail) is well defined and consistent.
- Include foreign keys to all related dimensions.
8. Add a Date Dimension (Always)
-
Even if timestamps exist, use a well-structured
Date
dimension. - Support fiscal periods, holidays, and relative time filters (e.g., YTD, QTD).
9. Build Coverage and Bridge Tables Where Needed
- Use coverage tables to model what should be there versus what is there (useful in sales quotas or expected delivery).
- Use bridge tables to manage many-to-many relationships (e.g., students enrolled in multiple classes).
10. Test with Real-World Queries
- Validate model design with actual reporting scenarios.
- Make sure performance and usability meet business needs before scaling.
11. Document the Model
-
Maintain documentation for:
- Dimensions and their attributes
- Fact tables and grain
- Business definitions of metrics
- Enables data literacy and reduces onboarding time for analysts.
Adhering to these practices will make your dimensional models more scalable, user-friendly, and aligned with business goals, making them a powerful foundation for analytics and decision-making.