Design Slowly Changing Dimension Type 2 in SQL Cont

  • Post author:
  • Post last modified:February 28, 2018
  • Post category:Data Warehouse
  • Reading time:6 mins read

Slowly Changing Dimension Type 2

This is a continuation of the Design Slowly Changing Dimension (SCD) Type2 in SQL

Step 4: Copy only updated records from LOAD table

These are records which are updated in this load cycle. Since updated records are already in the LOAD table, you have  to compare the LOAD and TGT records on Patient ID or SK column and copy the record which is updated. You should also keep in mind to set EFF_FROM_DT and flag to indicate active record. The updated records will be active from the moment we load them so EFF_FR_DT will be current date and EFF_TO_DT will be either NULL or future date.

Query: Below query can be used to perform task

INSERT INTO PAT_DTLS_LOAD_INT

SELECT SRC.SK ,

SRC.PHONE_NO,

SRC.NAME,

now(),

null,

1

FROM PAT_DTLS_LOAD src

WHERE EXISTS (SELECT 1

                       FROM   PAT_DTLS_LOAD_INT INT1

                       WHERE  src.sk = INT1.sk

                       AND    flag = 0

                       AND NOT EXISTS

                       (SELECT 1

                       FROM   PAT_DTLS_LOAD_INT INT2

                       WHERE  INT1.sk = INT2.sk

                       AND   flag = 1 )

                  );

Results: Highlighted records are updated records. You can also see the expired records in below intermediate results

SK PHONE_NO NAME EFF_FR_DT EFF_TO_DT FLAG
123 123456789 Jhoney 2016-06-01 2016-06-12 0
123 987654321 Jhoney 2016-06-13 NULL 1
234 234567890 Stuart 2016-06-01 2016-06-12 0
234 876543210 Stuart 2016-06-13 NULL 1
345 345678901 Jhoney Stuart 2016-06-01 NULL 1
567 345678901 Max Stuart 2016-06-01 NULL 1

Step 5: Copy fresh records from LOAD to INT

At  any given load cycle , you may receive either updated or fresh records. In this step you are going to load the fresh records. This is the simple step to compare the LOAD and TGT table on patient ID or SK and copy the record which is new.

Query:Below query is useful to perform mentioned task

INSERT INTO PAT_DTLS_LOAD_INT

SELECT SRC.SK ,

SRC.PHONE_NO,

SRC.NAME,

NOW(),

NULL,

1

FROM PAT_DTLS_LOAD SRC

WHERE NOT EXISTS     (SELECT 1

                       FROM   PAT_DTLS_LOAD_INT INT

                       WHERE  SRC.SK = INT.SK);

Results: The fresh record is highlighted. In SCD Type2 fresh records are received during current loading cycle 

SK PHONE_NO NAME EFF_FR_DT EFF_TO_DT FLAG
123 123456789 Jhoney 2016-06-01 2016-06-12 0
123 987654321 Jhoney 2016-06-13 NULL 1
234 234567890 Stuart 2016-06-01 2016-06-12 0
234 876543210 Stuart 2016-06-13 NULL 1
345 345678901 Jhoney Stuart 2016-06-01 NULL 1
456 765432101 Jeff 2016-06-13 NULL 1
567 345678901 Max Stuart 2016-06-01 NULL 1

Step 6: Truncate and load TGT table.

Finally, the Slowly Changing Dimension Type 2 is designed on table and you just need to truncate the TGT and copy records from INT table.

Query: Below query can be used for this task

TRUNCATE TABLE PAT_DTLS_TGT;

INSERT INTO PAT_DTLS_TGT

SELECT * FROM PAT_DTLS_LOAD_INT;

Results: Since TGT is replica of INT table. It hold all records which are available in INT table

SK PHONE_NO NAME EFF_FR_DT EFF_TO_DT FLAG
123 123456789 Jhoney 2016-06-01 2016-06-12 0
123 987654321 Jhoney 2016-06-13 NULL 1
234 234567890 Stuart 2016-06-01 2016-06-12 0
234 876543210 Stuart 2016-06-13 NULL 1
345 345678901 Jhoney Stuart 2016-06-01 NULL 1
456 765432101 Jeff 2016-06-13 NULL 1
567 345678901 Max Stuart 2016-06-01 NULL 1

Hope you have enjoyed reading this post on Slowly Changing Dimension Type 2 implementation. Please drop comment for feedback and any queries or suggestions.

This Post Has 2 Comments

  1. sadiq basha

    very nice, clear and simple explanation. Thanks for sharing

    1. Vithal S

      Thank you 🙂

Comments are closed.