TL;DR
A star schema is a dimensional-modeling pattern where a central <a href="/glossary/fact-table" class="text-brand-600 underline decoration-brand-200 underline-offset-2 hover:text-brand-700">fact table</a> joins to multiple <a href="/glossary/dimension-table" class="text-brand-600 underline decoration-brand-200 underline-offset-2 hover:text-brand-700">dimension tables</a> in a star-shaped layout, with dimensions kept denormalised for query simplicity. Star schemas are the dominant dimensional pattern in BI and remain the default schema choice for analytical-warehouse design — easier to query, faster than normalised alternatives, and better-supported by BI tools.
What is a star schema?
A star schema is a relational database schema design where a central fact table sits at the centre, with dimension tables radiating outward like points of a star. Each dimension table joins directly to the fact table on a foreign key relationship, with no intermediate joins between dimensions.
It is the dimensional-modeling pattern popularised by Ralph Kimball in the 1990s and remains the dominant analytical-schema pattern today. Star schemas optimise for query performance and analytical clarity at the cost of some normalisation purity.
Anatomy of a star schema
A simple sales-data star schema:
[ Customer Dim ]
│
│
[ Product Dim ] ── [ Sales Fact ] ── [ Date Dim ]
│
│
[ Store Dim ]
Sales Fact:
- sale_id (PK)
- product_id (FK -> Product Dim)
- customer_id (FK -> Customer Dim)
- store_id (FK -> Store Dim)
- date_id (FK -> Date Dim)
- quantity, unit_price, discount, total_amount (measures)
Each Dim table:
- has its own primary key
- holds descriptive attributes
- is denormalised (no further joins from dim to other tables) Star vs snowflake
The contrast is to the snowflake schema, where dimension tables are normalised into multiple sub-tables. Snowflakes save storage at the cost of query complexity (more joins) and performance.
| Property | Star schema | Snowflake schema |
|---|---|---|
| Dimension structure | Denormalised (one table per dimension) | Normalised (sub-tables) |
| Storage | Slightly higher | Slightly lower |
| Query simplicity | High (one join per dimension) | Lower (multiple joins) |
| Query performance | Faster | Slower (more joins) |
| BI-tool support | Universal, optimised | Supported, less optimal |
| Maintenance | Easier (fewer tables) | Harder (more tables) |
Why star schemas dominate
Star schemas dominate analytical design because they optimise for the property that matters most in BI: query simplicity and predictable performance. Denormalised dimensions let analysts write straightforward joins; query optimisers handle them efficiently; BI tools generate clean SQL from them.
Modern columnar warehouses (Snowflake, BigQuery, Redshift) further reduce the storage cost of denormalisation through columnar compression, making the storage trade-off of star vs snowflake essentially irrelevant in modern environments.
Common pitfalls
- 1. Over-normalising dimensions. Splitting customer-dimension into customer + customer_address + customer_segment subtables produces a snowflake instead of a star — and complicates every downstream query for marginal storage savings.
- 2. Skipping conformed dimensions. When multiple fact tables don't share dimension definitions (sales fact uses one customer_dim, marketing fact uses another), cross-fact analysis becomes impossible. Use shared 'conformed' dimensions across facts.
- 3. Underspecified date dimensions. A rich date dimension (with fiscal calendar, holiday flags, period-end markers) saves analysts hundreds of hours of date-arithmetic work. Skimping on the date dim is a false economy.
Related concepts
Snowflake schema is the normalised alternative. Dimensional modeling is the broader discipline. Fact tables and dimension tables are the components. Data marts typically use star-schema design.
At a glance
- Category
- Business Intelligence
- Related
- 5 terms
Frequently asked questions
Star or snowflake schema?
Default to star. Modern columnar warehouses (Snowflake, BigQuery, Redshift) have reduced the storage cost of denormalisation to near-zero, while query simplicity and BI-tool optimisation remain firmly in star's favour. Snowflake schemas only justify the complexity in specific cases (very large hierarchical dimensions, strict normalisation requirements).
How many dimension tables in a star schema?
Typically 5–15 dimensions per fact table. More than 20 suggests the fact table is mixing two analytical subjects that should be split. Fewer than 3 suggests under-modeled dimensions (dimensions hidden as fact-table columns).
Should I use surrogate keys?
Yes — use integer surrogate keys for fact-to-dim joins rather than natural keys (customer_id strings, product_skus). Surrogate keys produce smaller, faster joins and handle slowly-changing dimensions cleanly. Keep the natural key as a column for human readability.
Sources
- Kimball Group dimensional-modeling resources
- The Data Warehouse Toolkit (Kimball, 3rd ed)
- dbt Labs analytics engineering documentation
Fairview is an operating intelligence platform that reads star-schema-modelled data directly — preserving the dimensional structure analysts have built rather than forcing data into a Fairview-specific schema. Start your free trial →
Siddharth Gangal is the founder of Fairview. He built the star-schema-native ingestion layer after watching analytics teams maintain two parallel dimensional models — their own well-designed warehouse star schema for BI, and a tool-imposed schema for operating dashboards. The duplication produced exactly the metric-fragmentation problem the original star schema was designed to prevent.
See it in Fairview
Track Star Schema automatically.
14-day free trial. No credit card. First data source connected in 5 minutes.