Slowly changing dimensions or SCD are the dimensions that change over time. It called “slowly” because the change is expected to come rarely or rather I shall say not frequently. Let’s understand this with one real life example.
You went to a coffee shop and ordered some coffee and snacks. The cashier asked for your mobile number and name. Assuming that you are the first time customer an entry will be made in CUSTOMER table in the data warehouse. It may look like below with default value for some columns:
|Interaction – 1|
Next time , when you visit the coffee shop , cashier may ask you for your mobile number. Once you tell him, he may identify you as “Mark”. He may ask you about your Date-of-Birth (DOB) to share any special offer on your birthday. And you may decide to share this info or may be not with him. If you share that another info is added to the table in warehouse.
|Interaction – 2|
Now there can be multiple ways to store this info into data warehouse Table. And the manner in which you will store this info will determine the SCD type you are using. We will talk about popular SCD types used in any data warehouse.
SCD-1: In this type, you will simply overwrite existing information in the table. The existing record will be replaced with the new one.
SCD-2: In this type, you will make new entry into the table with existing record. Some identifier is used to easily identify which is the current record.
SCD-3: In this type, record is over-written in the extra ‘technical’ column which holds the most recent entry.
SCD-4: In this type, history i.e. all the records all maintained in the separate history table and the most current record is only kept in the main table.
There are few more hybrid SCD types available but above mentioned ones are the most common ones. In the subsequent posts, we will discuss each one of the SCD types in detail – how to implement with Teradata SQL, advantages and disadvantage of each.