When choosing a database schema for a data warehouse, snowflake and star schemas tend to be popular choices. This comparison discusses suitability of star vs. snowflake schemas in different scenarios and their characteristics.
Contents: Snowflake Schema vs Star Schema
Consider a database for a retailer that has many stores, with each store selling many products in many product categories and of various brands. A data warehouse or data mart for such a retailer would need to provide analysts the ability to run sales reports grouped by store, date (or month, quarter or year), or product category or brand.
edit Star Schema Example
If this data mart was using a star schema, it would look as follows:
The fact table would be a record of sales transactions, while there are dimension tables for date, store and product. Dimension tables are each connected to the fact table via their primary key, which is a foreign key for the fact table. For example, instead of storing the actual transaction date in a row of the fact table, the date_id is stored. This date_id corresponds to a unique row in the Dim_Date table, and that row also stores other attributes of the date that are required for grouping in reports. e.g., day of the week, month, quarter of the year and so on. The data is denormalized for easier reporting.
Here is how one would get a report of number of televisions sold by brand and by country.
edit Snowflake Schema Example
The same scenario can also use a snowflake schema, in which case it would be structured as follows:
The main difference, when compared with the star schema, is that data in dimension tables is more normalized. For example, instead of storing month, quarter and day of the week in each row of the Dim_Date table, these are further broken out into their own dimension tables. Similarly for the Dim_Store table, the state and country are geographical attributes that are one step removed — instead of being stored in the Dim_Store table, they are now stored in a separate Dim_Geography table.
The same report — the number of televisions sold by country and by brand — is now a little more complicated than in a star schema: