In this post, you will learn about the step by step guide to dimensional data modeling. You will see how to use dimensional modeling technique in real life scenarios.
What is Dimensional data Modeling?
Dimensional data modeling is one of the data modeling techniques used in data warehouse design. The main goal of this modeling is to improve the data retrieval, it is optimized for the SELECT operation. Dimensional data modelling is best suited for the data warehouse star and snow flake schema.
Dimensional data modeling in data warehouse is different than the ER modeling where main goal is to normalize the data by reducing redundancy. This model gives us the advantage of storing data in such a way that it is easier to store and retrieve the data once stored in the data warehouse. Dimensional model is the underlying data model used by many of the OLAP systems.
Dimensional model is developed by the legendary Ralph Kimball. You should read his book “The data warehouse Toolkit: The complete Guide to Dimensional modeling”
Steps to Create Dimensional data Modeling
Let’s understand the steps to create dimensional model with the help of example, say, you want to store the information of how many paracetamol and diclofenac tablets sold from single MedPlus store every day. Everything we model fall into two tables , a fact, that hold measures and dimension, hold information that qualifies the measure. Below are the steps for data warehouse dimensional modeling example:
Step 1: Chose Business Objective
The first step in data modeling is, identify the business objective. In our example, the business objective is to store the information of how many paracetamol and diclofenac tablets sold from single MedPlus store every day.
Step2: Identify Granularity
Granularity is the lowest level of information stored in the table. E.g. if the table contains daily sales data then granularity is “Daily”.
In our example, say, a specific MedPlus shop sells 1,000 paracetamol tablets on a specific day then granularity is daily and 10,000 on specific month then granularity would be monthly.
It is very important to set the granularity of the information required. In our case, it is daily.
Step 3: Identify Dimension and its attributes
Dimensions are objects or things. In our example, we are dealing with 3 things, a “Shop”, “Medicine”, and “Day”. We have 3 dimension tables here “Shop”, “Medicine” – paracetamol and diclofenac, and “Day”. Below are the dimension table structures for our simple dimensional model. Complex, systems may have very complex table structure.
Medicine | |
SK | NAME |
1 | Paracetamol |
2 | Diclofenac |
Shop | |
SK | NAME |
1 | Shop1 |
2 | Shop2 |
3 | Shop3 |
Day | |
SK | NAME |
1 | 2016-01-01 |
2 | 2016-01-02 |
3 | 2016-01-03 |
Step 3: Identify Fact
Fact table holds something that is measurable. In our example, number of tablets sold is a measure. We create separate table called Fact to store the measures.
In our example, granularity is medicine getting sold per day; we will add the SK column from Shop, Medicine, and Day to fact table as shown below
Finally, to summarise the process, we have created 3 dimension tables (Shop, Medicine and Day) and 1 fact table. Primary key from the dimension table flows into Fact table as a foreign key. This model looks like star and is also called “Star Schema”
Sometimes, you may want to keep the history of the data that is changing in dimension tables. There are lots of “slowly changing dimensions” methods available to preserve history.
Read:
Benefits of Dimensional Modeling
- Improved data retrieval:Dimensional models are optimized for SELECT operations.
- Simplified business reporting logic
- Fast aggregations:The simpler queries dimension model can result in improved performance for aggregation operations
- Better understanding:Everything in dimensional model falls into two tables, Fact and dimensions
- Extensibility:Dimensional models are scalable and can easily accommodate unexpected new data
Dimensional Data modeling Tools
The erwin data modeling tool is widely used dimensional data modeling tool.
Read:
- Data Warehouse Project Life Cycle and Design
- Star Schema model in Data Warehouse
- Snowflake Schem Model in Data Warehouse
- Slowly Changing Dimensions (SCD) in Data Warehouse
- Rapidly Changing Dimension in Data Warehouse
- Data Warehouse Fact Constellation Schema and Design
I think it should be “Star” in this sentence – This model looks like ‘start’
Thank you. Updated it 🙂
Nice and simple example.
Thank you 🙂