ETL SQL

Learn Extract Transform Load using SQL & PySpark

Teradata SQL – C3 (cast , concatenate , coalesce) with examples

Teradata COALESCE with examples

Teradata COALESCE function is used for NULL Handling. It returns the first not null value which is passed to it. If all the values passed to COALESCE function are NULL then output would be NULL else output would be first NOT NULL value encountered.

Teradata Coalesce Date

You can use coalesce to convert NULL values in Date columns to some default values too. Make sure the value passed to coalesce is of DATE FORMAT else you may get the error. In the example below we have “dob” (date of birth) column as date in the table. We can pass a default value to convert all the NULL values in this column.

SELECT
	EMPID,
	NAME,
	COALESCE(YEARS_OF_EXP,-1) AS YEARS_OF_EXP,
	SALARY,
	coalesce(dob,date '1900-01-01') as dob
FROM
	TST_CASE
ORDER BY
	EMPID;

Teradata Coalesce Multiple Values

You can also pass multiple values to coalesce function and it will return the first NOT NULL value. Mostly it is used in the business use case when we have multiple columns and we can have value in anyone of them.

SELECT
	EMPID,
	coalesce(name,first_name,last_name,'Not Available') as emp_name
	YEARS_OF_EXP,
	SALARY
FROM
	TST_CASE
ORDER BY
	EMPID;

In the example above, for emp_name the logic checks for “name” column first. If it is NULL then it will check for “first_name” column. If it is NULL too then it will check for “last_name” column. And if that is NULL too then coalesce function will return ‘Not Available’ as default value for emp_name.

Teradata Coalesce in JOIN

You can also use coalesce in join conditions. The idea is when you have NULL values in joining column then it does not participate in JOINs. If business requirement is to get such rows in output as well then you can apply coalesce on joining column to some default value.

SELECT
	CUST.CUST_ID,
	CUST.CUST_NAME,
	CUST.CUST_CITY,
	OZON.STATE,
	CUST.CUST_OPERATOR
FROM
	CUSTOMERS CUST
INNER JOIN OPRT_ZONE OZON ON
	coalesce(CUST.CUST_CITY,'NotAvailable') = coalesce(OZON.CITY,'NotAvailable');

In the above example, we have used coalesce on joining column. If the value is NULL then it will be defaulted to ‘NotAvailable’ as default value and thus will participate in JOIN. Hence will get in the output too.

We will create volatile table in this example and run queries to test TERADATA SQL COALESCE expression

CREATE VOLATILE TABLE TST_CASE
(
EMPID INTEGER,
NAME VARCHAR(30),
YEARS_OF_EXP SMALLINT,
SALARY INTEGER
)
PRIMARY INDEX (EMPID)
ON COMMIT PRESERVE ROWS;

INSERT INTO TST_CASE VALUES(1,'WILLIAM',5,50000);
INSERT INTO TST_CASE VALUES(2,'SARA',3,30000);
INSERT INTO TST_CASE VALUES(3,'ROBIN',NULL,40000);
INSERT INTO TST_CASE VALUES(4,'TED',7,70000);
INSERT INTO TST_CASE VALUES(5,'LILY',NULL,20000);

Preview the data in the table.

SELECT * FROM TST_CASE;

EMPID

NAME

YEARS_OF_EXP

SALARY

1

WILLIAM

5

50000

2

SARA

3

30000

3

ROBIN

NULL

40000

4

TED

7

70000

5

LILY

NULL

20000

Let’s apply coalesce to the column YEARS_OF_EXP and if it is NULL then return “-1” else keep the value as-is in the output.

SELECT
	EMPID,
	NAME,
	COALESCE(YEARS_OF_EXP,-1) AS YEARS_OF_EXP,
	SALARY
FROM
	TST_CASE
ORDER BY
	EMPID;

EMPID

NAME

YEARS_OF_EXP

SALARY

1

WILLIAM

5

50000

2

SARA

3

30000

3

ROBIN

-1

40000

4

TED

7

70000

5

LILY

-1

20000

COALESCE can be replaced by CASE statement in which you check the value is NULL then default value else keep the column as the output.

CAST in teradata is used to convert value from one datatype into another datatype. The value must be compatible with new datatype else it will result in error. CAST can also apply different formats on output datatype. This is really helpful for date & timestamp columns.

I have used casting a lot during DATE & TIMESTAMP handling when working in Teradata.

You can also use cast to restrict number of characters in the output for string columns. This is similar to SUBSTRING function in teradata where you want first few characters only.
Example : cast(EMPLOYEE_NAME as varchar(10)).

Also CAST function is widely used while working with DECIMAL data type. It is used to restrict values to number of decimal places required in the output.

Teradata CAST Syntax

Casting in Teradata is done using “CAST” function with the new data type.
You can also specify output format if required. This is optional.

cast(column_name as new_datatype)

cast(column_name as new_datatype format 'some_format')

Teradata CAST as Decimal

You can CAST numeric values into Decimal datatype in teradata.

Decimal datatype has 2 components
  1. Precision – This represents total number of digits
  2. Scale – This represents total number of digits after decimal point.

So a DECIMAL value can be represented as DECIMAL(precision , scale). For example, DECIMAL(10,2) means precision of 10 and scale of 2.

Default DECIMAL value in Teradata is DECIMAL(5,0)

So if you don’t specify explicitly the precision & scale Teradata will try to convert the value into decimal(5,0). This may result into numeric overflow error if the value is more than 5 digits.
CAST numeric values to decimal while handling numeric overflow errors in teradata. Read this post to know more about casting in teradata.

CAST integer as decimal

Cast integer as decimal by specifying DECIMAL(p,s) as new datatype in cast function.

select cast(10000 as decimal(10,2)) as col1;

         col1
 ------------
     10000.00

CAST numeric expression as decimal

Cast mathematical expressions output to decimal datatype as per requirement.

select cast(((19.34*27.20)+20)/9 as decimal(8,3)) as col1;

       col1
 ----------
     60.672

CAST string as decimal

You can cast varchar or char values to decimal as per requirement.

select cast('23.4672864' as decimal(10,4)) as col1;

         col1
 ------------
      23.4673

CASTING values to decimal in Teradata may result in rounding or truncation as per scale defined in the cast function.

Teradata CAST as Date

Convert string to DATE in Teradata using CAST function.

select cast('2022-04-20' as date) as col1;

     col1
 --------
 22/04/20

If you don’t specify any format then the string will be converted to default DATE format. Also the string must be in standard format of ‘YYYY-MM-DD’ or ‘YYYY/MM/DD’ else you may get INVALID DATE error.

Convert string to date by specifying custom format.

select cast('2022-04-20' as date format 'yyyy-mm-dd') as col1;

      col1
----------
2022-04-20

It is absolutely must to specify proper format which is in accordance with string value of date to avoid any INVALID DATE error.

select cast('July-31-2021' as date format 'yyyy-mm-dd') as col1;

 *** Failure 2665 Invalid date.
     Statement# 1, Info =0
 *** Total elapsed time was 1 second.
 
 select cast('July-31-2021' as date format 'mmmm-dd-yyyy') as col1;

            col1
 ---------------
 July-31-2021

Teradata Concatenate with examples

Teradata concatenate operator links multiple string or numeric or expression values into single value. Input for concatenate can be two or more but output of concat is always one. In Teradata, concatenate operator is represented by “||” i.e two vertical lines. You can also use ‘!!’ i.e. two exclamation mark however it is not recommended to use it. In this post we will see syntax and examples of various scenarios where concatenate can be used.

Let’s create a dummy table and load it with some sample records. We will use this table in subsequent examples to highlight usage of concatenate in Teradata.

CREATE volatile TABLE emp (
  emp_no integer ,
  birth_date date ,
  first_name varchar(14) NOT NULL,
  last_name varchar(16) ,
  gender char(1)
)
UNIQUE PRIMARY INDEX (emp_no)
ON COMMIT PRESERVE ROWS;


INSERT INTO emp VALUES(10001,'1978-09-02','Georgi','Facello','M');
INSERT INTO emp VALUES(10002,'1989-06-02','Bezalel','Simmel','F');
INSERT INTO emp VALUES(10003,'1984-12-03','Parto','Bamford','M');
INSERT INTO emp VALUES(10004,'1979-05-01','Chirstian',NULL,'M');
INSERT INTO emp VALUES(10005,'1980-01-21','Kyoichi','Maliniak','M');
INSERT INTO emp VALUES(10006,'1978-04-20','Anneke','Preusig','F');
INSERT INTO emp VALUES(10007,NULL,'Tzvetan','Zielinski','F');
INSERT INTO emp VALUES(10008,'1983-02-19','Saniya',NULL,'M');
INSERT INTO emp VALUES(10009,'1977-04-19','Sumant','Peac','F');
INSERT INTO emp VALUES(10010,'1988-06-01','Duangkaew','Piveteau',NULL);

Case 1: Teradata Concatenate 2 columns

We can concatenate 2 columns by specifying the column names and adding the concatenate operator (||) in between.

select
	emp_no,
	first_name,
	last_name,
	first_name || last_name as full_name
from
	emp
where
	emp_no<10004;

In the above example we have concatenated first_name & last_name to create a new derived column which is full_name.

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloGeorgiFacello
10,002BezalelSimmelBezalelSimmel
10,003PartoBamfordPartoBamford

Now in the output we can see concatenate output without spaces which may not be desired output in most of the cases.

Case 2: Concatenate 2 columns with space

If you want to concatenate multiple columns and have space in between in output then add a space using extra concatenate operator in the query.

select
	emp_no,
	first_name,
	last_name,
	first_name || ' ' || last_name as full_name
from
	emp
where
	emp_no<10004;

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloGeorgi Facello
10,002BezalelSimmelBezalel Simmel
10,003PartoBamfordParto Bamford

Case 3: Concatenate 2 columns with delimiter

In the above output you can see we have used space (‘ ‘) as separator or delimiter. Similarly you can use any other character as delimiter too.

select
	emp_no,
	first_name,
	last_name,
	first_name || '-' || last_name as full_name
from
	emp
where
	emp_no<10004;

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloGeorgi-Facello
10,002BezalelSimmelBezalel-Simmel
10,003PartoBamfordParto-Bamford

In the above output you can see we have used “-” as separator in the concatenate output. You can use any other character or even string as delimiter.

Case 4: Concatenate columns with NULL

So we have seen in earlier examples how to concatenate 2 columns , add a space in between and also how to add a separator. In this example we will see how to concatenate columns with NULL values in it.

select
	emp_no,
	first_name,
	last_name,
	first_name || ' ' || last_name as full_name
from
	emp;

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloGeorgi Facello
10,002BezalelSimmelBezalel Simmel
10,003PartoBamfordParto Bamford
10,004Chirstian[NULL][NULL]
10,005KyoichiMaliniakKyoichi Maliniak
10,006AnnekePreusigAnneke Preusig
10,007TzvetanZielinskiTzvetan Zielinski
10,008Saniya[NULL][NULL]
10,009SumantPeacSumant Peac
10,010DuangkaewPiveteauDuangkaew Piveteau

In the above output, I want you to check output for emp_no 10004 & 10008. We can see output is NULL for full_name column. This is slightly different output than expected. In Teradata, if you concatenate NULL with anything then unlike other RDBMS , output in TERADATA is NULL. So please be very careful while using CONCATENATE in Teradata.

You can fix this with below 2 steps:

  1. Apply concatenate operator on NOT NULL columns
  2. If applied on columns which can have NULL values then apply coalesce function.
select
	emp_no,
	first_name,
	last_name,
	first_name || ' ' || coalesce(last_name,'') as full_name
from
	emp;

In the above SQL , we have added coalesce(last_name,”) to handle NULL value and hence avoided output becoming NULL too.

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloGeorgi Facello
10,002BezalelSimmelBezalel Simmel
10,003PartoBamfordParto Bamford
10,004Chirstian[NULL]Chirstian
10,005KyoichiMaliniakKyoichi Maliniak
10,006AnnekePreusigAnneke Preusig
10,007TzvetanZielinskiTzvetan Zielinski
10,008Saniya[NULL]Saniya
10,009SumantPeacSumant Peac
10,010DuangkaewPiveteauDuangkaew Piveteau

Case 5: Concatenate columns with different data type

You can also concatenate NUMERIC columns like integer with STRING columns like varchar. In this example , we will concatenate emp_no(integer) to first_name & last_name (varchar)

select
	emp_no,
	first_name,
	last_name,
	emp_no || '-' || first_name || ' ' || coalesce(last_name,'') as full_name
from
	emp;

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacello 10001-Georgi Facello
10,002BezalelSimmel 10002-Bezalel Simmel
10,003PartoBamford 10003-Parto Bamford
10,004Chirstian[NULL] 10004-Chirstian
10,005KyoichiMaliniak 10005-Kyoichi Maliniak
10,006AnnekePreusig 10006-Anneke Preusig
10,007TzvetanZielinski 10007-Tzvetan Zielinski
10,008Saniya[NULL] 10008-Saniya
10,009SumantPeac 10009-Sumant Peac
10,010DuangkaewPiveteau 10010-Duangkaew Piveteau

Case 6: Concatenate columns with expression

You can concatenate expression with column values too using concatenate operator in Teradata. Let’s look at the example below in which we have created a derived variable using CASE expression and also we have extracted sub-string of last_name column in concatenation.

select
	emp_no,
	first_name,
	last_name,
	case when gender = 'F' then 'Ms.' else 'Mr.' end || ' ' 
	|| first_name 
	|| ' ' 
	|| coalesce(substr(last_name,1,1),'') as full_name
	from
		emp;

Output:

emp_nofirst_namelast_namefull_name
10,001GeorgiFacelloMr. Georgi F
10,002BezalelSimmelMs. Bezalel S
10,003PartoBamfordMr. Parto B
10,004Chirstian[NULL]Mr. Chirstian
10,005KyoichiMaliniakMr. Kyoichi M
10,006AnnekePreusigMs. Anneke P
10,007TzvetanZielinskiMs. Tzvetan Z
10,008Saniya[NULL]Mr. Saniya
10,009SumantPeacMs. Sumant P
10,010DuangkaewPiveteauMr. Duangkaew P

In some case, while combining expressions or columns you may see extra spaces in the output. In such cases, feel free to add TRIM function to remove unnecessary spaces if any.

If you prefer to watch videos then check the video below:


Leave a Reply

Your email address will not be published. Required fields are marked *