SCD-1 in Teradata

In SCD-1, we overwrite the existing information in the table with the most recent info received from Source. This is very common SCD type used in the data warehouse and is good only for situations when you don't want to keep historical data for that entity. Let's quickly jump to technical example to understand it:

Create Target Table and load it with sample data:

drop table customer;
create table customer
(
c_mobile bigint,
c_name varchar(30),
c_dob date,
c_email varchar(100),
c_insert_dt date,
c_update_dt date
)
primary index(c_mobile);

insert into customer values (111111111,'Jon Groff','1980-03-10',,date,date);
insert into customer values (222222222,'Kenneth Jarrett','1986-08-24',,date,date);
insert into customer values (333333333,'Daniel Jensen','1972-08-22',,date,date);
insert into customer values (444444444,'Jeffrey Kramer','1987-04-15',,date,date);
insert into customer values (555555555,'Pete  Bartels','1989-11-16',,date,date);


The table data will look like

C_MOBILEC_NAMEC_DOBC_EMAILC_INSERT_DTC_UPDATE_DT
111111111Jon Groff1980-03-10NULL2019-05-242019-05-24
222222222Kenneth Jarrett1986-08-24NULL2019-05-242019-05-24
333333333Daniel Jensen1972-08-22NULL2019-05-242019-05-24
444444444Jeffrey Kramer1987-04-15NULL2019-05-242019-05-24
555555555Pete Bartels1989-11-16NULL2019-05-242019-05-24

Now let's create a staging table which holds most recent info shared from Source.

create table stg_customer
(
c_mobile bigint,
c_name varchar(30),
c_dob date,
c_email varchar(100)
)
primary index(c_mobile);

insert into stg_customer values (111111111,'Jon Groff','1980-03-10','jon.groff@gmail.com');
insert into stg_customer values (222222222,'Kenneth Jarrett','1986-10-24',NULL);
insert into stg_customer values (333333333,'Daniel Craig','1972-08-22',NULL);
insert into stg_customer values (666666666,'John Snow','1982-02-11',NULL);
insert into stg_customer values (777777777,'Arya Stark','1998-05-16',NULL);

The data in staging table looks like

C_MOBILEC_NAMEC_DOBC_EMAIL
111111111Jon Groff1980-03-10jon.groff@gmail.com
222222222Kenneth Jarrett1986-10-24NULL
333333333Daniel Craig1972-08-22NULL
666666666John Snow1982-02-11NULL
777777777Arya Stark1998-05-16NULL

Now to implement SCD-1 in Teradata , the easiest method is to run a MERGE statement on Target table "customer" using Staging table "stg_customer"

MERGE customer using stg_customer on 
customer.c_mobile = stg_customer.c_mobile
WHEN MATCHED THEN UPDATE 
SET
c_name = stg_customer.c_name,
c_dob = stg_customer.c_dob,
c_email = stg_customer.c_email,
c_update_dt = current_date
WHEN NOT MATCHED THEN
INSERT values (
stg_customer.c_mobile,
stg_customer.c_name,
stg_customer.c_dob,
stg_customer.c_email,
current_date,
current_date
);

Now the data in Target table will look like

C_MOBILEC_NAMEC_DOBC_EMAILC_INSERT_DTC_UPDATE_DTCOMMENT
111111111Jon Groff1980-03-10jon.groff@gmail.com2019-05-242019-05-25Email Update
222222222Kenneth Jarrett1986-10-24NULL2019-05-242019-05-25DOB Update
333333333Daniel Craig1972-08-22NULL2019-05-242019-05-25Name Update
444444444Jeffrey Kramer1987-04-15NULL2019-05-242019-05-24No Change
555555555Pete Bartels1989-11-16NULL2019-05-242019-05-24No Change
666666666John Snow1982-02-11NULL2019-05-252019-05-25New Record
777777777Arya Stark1998-05-16NULL2019-05-252019-05-25New Record

*Comment is not the actual column of the table. It is just for understanding purpose I have added above.

You can also run separate UPDATE+INSERT to replicate same functionality.Now let's talk about advantage and disadvantage of SCD-1.

Advantage: Easy to implement, Table Size stays in control, Unique records in the table, Table can be accessed directly without worrying about recent or old record.

Disadvantage: No history is maintained. May not be adequate for data analysis.

Leave a Reply

Your email address will not be published. Required fields are marked *