Dimensional Data Modeling in the Modern Era: A Timeless Blueprint for Data Architecture
In a world of ever-evolving data tools and technologies, some approaches stand the test of time. That’s the case Dustin Dorsey — Principal Data Architect at Onyx — makes for dimensional data modeling, a practice born in the 1990s that continues to provide clarity, performance, and scalability in modern data architecture.
With nearly two decades of experience helping major enterprises build robust data platforms, Dorsey walks us through the enduring value of dimensional modeling and why it’s more relevant than ever in today’s fragmented, fast-moving data landscape.
What Is a Data Model — and Why It Still Matters
At its core, a data model is a blueprint. It defines how data is structured, stored, and related across systems. Think of it like architectural plans for a house. Without it, you might be able to build something — but not without risk, rework, and wasted resources.
With today’s tools offering low-friction ways to ingest and analyze data, it’s tempting to skip this step. But the absence of intentional modeling often leads to brittle systems, duplicated data, and spiraling technical debt.
For Dorsey, a solid data model is foundational — not optional.
From Normalization to Denormalization: A Spectrum of Models
Different modeling techniques fall along a spectrum of normalization:
- Highly Normalized: Think Third Normal Form or Data Vault. Ideal for transactional systems, where data integrity and write performance are critical.
- Highly Denormalized: The infamous “one big table” approach. Reads are fast, but the repeated data drives up storage and maintenance costs.
- Dimensional Modeling (the focus here): A balanced approach tailored for analytics and business intelligence.
It’s essential to align the model to the use case: normalized models support auditing and operational consistency; denormalized ones enable rapid data consumption. But when it comes to making data usable and understandable, dimensional modeling strikes the ideal middle ground.
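To make the spectrum concrete, here is a minimal SQL sketch of the same hypothetical order data at both ends: a normalized set of tables and a single denormalized “one big table.” The table and column names are illustrative assumptions, not examples from Dorsey.

```sql
-- Highly normalized (3NF-style): each entity lives in its own table,
-- so a write touches one place and integrity is easy to enforce.
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(100),
    region        VARCHAR(50)
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers (customer_id),
    order_date  DATE
);

CREATE TABLE order_lines (
    order_id   INT REFERENCES orders (order_id),
    product_id INT,
    quantity   INT,
    unit_price DECIMAL(10, 2)
);

-- Highly denormalized ("one big table"): every attribute repeated on
-- every row. Fast to query, but redundant and harder to keep consistent.
CREATE TABLE orders_obt (
    order_id      INT,
    order_date    DATE,
    customer_name VARCHAR(100),
    product_name  VARCHAR(100),
    quantity      INT,
    unit_price    DECIMAL(10, 2)
);
```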
Dimensional Modeling 101: Facts, Dimensions, and Business Clarity
Originally developed by the Kimball Group, dimensional modeling is purpose-built for analytics. It organizes data into two core structures:
- Fact Tables: Contain the measurements of business processes (sales amounts, transaction counts, revenue).
- Dimension Tables: Provide context (customer, product, time, geography).
These components form a star schema, where fact tables sit at the center, connected to dimensions via foreign keys. This architecture improves query performance and aligns directly with how business users think, making it easy for analysts to write SQL and create reports.
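As an illustration of the star schema idea, the sketch below uses hypothetical dim_date, dim_customer, dim_product, and fact_sales tables, followed by the kind of query an analyst might write against them. Names, columns, and the sample filter are assumptions for demonstration, not a specific model Dorsey describes.

```sql
-- Dimension tables: descriptive context business users filter and group by.
CREATE TABLE dim_date (
    date_key      INT PRIMARY KEY,
    calendar_date DATE,
    month_name    VARCHAR(20),
    year_number   INT
);

CREATE TABLE dim_customer (
    customer_key  INT PRIMARY KEY,
    customer_name VARCHAR(100),
    region        VARCHAR(50)
);

CREATE TABLE dim_product (
    product_key  INT PRIMARY KEY,
    product_name VARCHAR(100),
    category     VARCHAR(50)
);

-- Fact table: one row per sale, with foreign keys out to each dimension.
CREATE TABLE fact_sales (
    date_key     INT REFERENCES dim_date (date_key),
    customer_key INT REFERENCES dim_customer (customer_key),
    product_key  INT REFERENCES dim_product (product_key),
    quantity     INT,
    sales_amount DECIMAL(12, 2)
);

-- A typical analyst question reads almost like the business question:
-- "What were sales by region and product category in 2024?"
SELECT d.year_number,
       c.region,
       p.category,
       SUM(f.sales_amount) AS total_sales
FROM fact_sales f
JOIN dim_date     d ON f.date_key     = d.date_key
JOIN dim_customer c ON f.customer_key = c.customer_key
JOIN dim_product  p ON f.product_key  = p.product_key
WHERE d.year_number = 2024
GROUP BY d.year_number, c.region, p.category;
```

The join keys here are simple surrogate keys; a production model would also handle slowly changing dimensions, which the sketch leaves out.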
Importantly, Dorsey warns against misusing the terms “fact” and “dimension” — not everything labeled as such by a vendor fits the dimensional mold. A dimensional model isn’t just about structure — it’s about intent and usability.
Why Many Teams Get It Wrong: Short-Term Thinking, Long-Term Pain
One of the biggest mistakes Dorsey sees in the field? Teams responding only to immediate business needs — without stepping back to architect for the bigger picture. The result? Systems that are hard to scale, hard to manage, and riddled with redundancies.
Taking the time to define your data model up front — even if it slows initial progress — pays off in reduced complexity, faster iteration, and cleaner data products in the long run.
Layered Architecture: Where Dimensional Modeling Fits
Modern data platforms are often built in layers:
- Bronze/Raw Layer: Ingested data in its original form, typically highly relational.
- Silver/Intermediate Layer: Cleansed and staged data, possibly using Data Vault for flexibility.
- Gold Layer: A dimensional model optimized for analytics and business use.
- Consumption Layer (Optional): Denormalized one-big-table views for ease of use in BI tools.
This layered approach lets teams leverage different models for different needs — while keeping dimensional modeling as the anchor for BI and reporting.
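A rough sketch of how the gold layer might sit on top of silver, using the hypothetical schema names silver, gold, and consumption. It assumes the silver tables and target schemas already exist, and omits surrogate key handling; the point is only to show where the dimensional model lives in the stack.

```sql
-- Gold layer: dimensional structures built for analytics on top of
-- cleansed silver tables (assumed to exist).
CREATE VIEW gold.dim_customer AS
SELECT
    customer_id   AS customer_key,   -- surrogate keys omitted for brevity
    customer_name,
    region
FROM silver.customers;

CREATE VIEW gold.fact_sales AS
SELECT
    o.order_date,
    o.customer_id             AS customer_key,
    l.product_id              AS product_key,
    l.quantity,
    l.quantity * l.unit_price AS sales_amount
FROM silver.orders      AS o
JOIN silver.order_lines AS l ON l.order_id = o.order_id;

-- Optional consumption layer: a flattened "one big table" view for BI tools.
CREATE VIEW consumption.sales_obt AS
SELECT f.order_date,
       f.quantity,
       f.sales_amount,
       c.customer_name,
       c.region
FROM gold.fact_sales   AS f
JOIN gold.dim_customer AS c ON c.customer_key = f.customer_key;
```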
Busting Myths: Dimensional Modeling in a Modern Context
Critics argue that dimensional modeling is outdated. Dorsey disagrees — and he makes a compelling case:
- “Infrastructure makes it irrelevant.” Not true. While computing has improved, dimensional modeling still provides unmatched business clarity.
- “It doesn’t work with unstructured data.” It can — once that data is transformed upstream.
- “It’s not real-time.” Fair point. Dimensional models favor batch processing. But for many analytics use cases, that’s not a deal-breaker.
- “It’s too slow to implement.” Only at first. Once the framework is in place, future iterations are dramatically faster.
Real-World Proof: From Spaghetti Code to Streamlined Insights
In one real-world example, a Fortune 500 company using dbt and Databricks was drowning in complexity: one massive dbt model with over 1,000 lines of code and minimal testing.
Dorsey proposed a dimensional model instead. The results?
- 21 modular dbt models
- 179 tests (vs. 2)
- 6 measurable columns (vs. 1)
- 250+ accessible attributes
The performance was comparable — but with significantly better governance, scalability, and business usability. Analysts went from reading 1,000 lines of code to writing 20.
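To give a feel for what “modular dbt models” can look like, here is a sketch of one hypothetical fact model built from staging models via ref(). The model and file names (fct_sales, stg_orders, stg_order_lines) are illustrative and not taken from the actual engagement.

```sql
-- models/marts/fct_sales.sql  (hypothetical file in a dbt project)
-- One small, testable model per fact or dimension, instead of a single
-- 1,000-line model; ref() lets dbt build the dependency graph between them.
SELECT
    o.order_id,
    o.order_date,
    o.customer_id,
    l.product_id,
    l.quantity,
    l.quantity * l.unit_price AS sales_amount
FROM {{ ref('stg_orders') }}      AS o
JOIN {{ ref('stg_order_lines') }} AS l
    ON l.order_id = o.order_id

-- Tests such as not_null, unique, and relationships are declared in the
-- model's YAML properties file rather than in SQL, which is how a model
-- like this can carry many tests with very little extra code.
```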
Why It Still Matters
Dimensional modeling remains technology-agnostic, analyst-friendly, and BI-optimized. It simplifies governance, reduces redundancy, and enables faster, more accurate insights.
And while no model fits every scenario, dimensional modeling continues to deliver value — especially when paired with a layered, modern architecture.
Be Intentional
Dorsey leaves us with this: don’t default to what’s easy. Think critically. Build with purpose. And remember: good data models don’t just support the business. They reflect it. For those interested in applying these concepts using dbt, Dustin recommends his book Unlocking dbt, which explores dimensional modeling through a practical data engineering lens.