Data Warehouse Snowflake Schema Model and Design

  • Post author:
  • Post last modified:March 28, 2019
  • Post category:Data Warehouse
  • Reading time:3 mins read

Data warehouse Snowflake schema is extension of star schema data warehouse design methodology, a centralized fact table references to number of dimension tables, however, one or more dimension tables are normalized i.e. dimension tables are connected with other dimension tables. Primary Keys from the dimensions flows into fact table as foreign key.

Snowflake schema increases the level of normalization in data, the dimension table is normalized into multiple tables. This schema has a disadvantage in terms of data retrieval, we need to travel through multiple tables to get same information. SQL queries would have more joins in order to fetch the records.

How Data Warehouse snowflake schema is different than star schema?

In snowflake schema, dimension table are normalized, where as in star schema these are denormalized.

Let us understand this model by using the same example that we discussed in star schema designing.

Consider medicine sales table that holds the information about the sales quantity for each product, patient, hospital and on certain day. To implement snowflake schema, let’s create once more dimension table called product family. Primary key from product family table flow into product table as foreign key (product table is normalized). In this example, sales quantity is the measure and primary keys from Product, Date, hospital and Patient dimension flows into fact table. Fact table always holds measures and keys from dimension tables.

Data Warehouse Snowflake Schema

You can see the product table refers the product family table. In order to get the product information, you need to join product table with product family table. This type of schema is called snowflake schema.

Benefits of Data Warehouse snowflake schema

Snowflake model is in same family as the star schema. In fact, it is a special case of star schema. Some of the benifits includes:

  • Some OLAP multidimensional model tools are optimized to use snowflake schema
  • Normalizing table saves the storage
  • Improvement in query performance due to minimized disk storage and joining smaller lookup tables

Disadvantages of Data Warehouse Snowflake schema

  • Additional maintenance efforts needed due to the increase number of lookup tables
  • SQL queries would have more joins in order to fetch the records
  • Data loads into the snowflake model must be highly controlled and managed to avoid update and insert anomalies