REGULAR EXPRESSION in Teradata

With the use of regular expression many complex tasks actually become a one-line code in SQL. Below are few cases in which using REGULAR EXPRESSION can be really useful in SQL.

1) Regex to check if string consists of repeated character

There may be a case when source is sending you some data with some filler values. If the complete String value consists of same character repeated throughout the string then it is a filler.Say you want to replace multiple blank spaces with just one space. Now traditionally you may do it by writing a stored procedure which may traverse through string and identify if it consists of different characters or just one. With REGULAR EXPRESSION you can do the check very easily. Let us look at below example:

SELECT col1,CASE WHEN col2='' THEN 'Same' ELSE 'Different' END AS Messg
FROM (
SELECT
'11211' AS COL1,
CASE WHEN REGEXP_REPLACE(COL1,SUBSTR(COL1,1,1)||'{1,100}','',1,0,'i') ='' 
THEN '' ELSE COL1 END AS COL2
)tb1;

The logic used here is we are selecting the first character and replacing it with blank in the string. If the complete string is blank then it is a filler value.

2) Regex to remove all spaces from string

We often get the requirement that in a string remove all spaces or remove all blanks present in the string. TRIM function helps in removing spaces which are either leading or trailing. However to remove spaces present with in the string, we can use below regular expression:

SELECT
'1 2 3 4 5 6' AS COL1,
REGEXP_REPLACE(COL1,'[[:blank:]]+','',1,0,'i') AS COL2;

The logic used here is we are identifying the space/blank and replacing it with nothing. So all the spaces are gone.

3) Regex to replace uneven number of spaces with single space

We may get the requirement that a string consists of multiple words. However there are uneven number of spaces separating each word. Now the requirement is we want to have only single space between words inside string. For this we can use below regular expression:

SELECT
'1 2 3 4 5 6' AS COL1,
REGEXP_REPLACE(COL1,'[[:blank:]]{1,100}',' ',1,0,'i') AS COL2;

The logic we have used here is we are identifying blanks/spaces which may extend repeatedly upto 100 characters and replace it by single space.

4) Regex to keep only alphabets and remove other characters

A string may contain alphabets, numeric or symbols. We can use regular expression to keep only alphabets in String and remove rest.

SELECT
'ABC123DEF!@#' AS COL1,
REGEXP_REPLACE(COL1,'[^a-zA-Z]','',1,0,'i') AS COL2;

The logic used here is we are identifying anything but alphabets and replacing it with nothing.

5) Regex to keep only numbers and remove other characters

A string may contain alphabets, numeric or symbols. We can use regular expression to keep only numeric part in String and remove rest.

SELECT
'ABC123DEF!@#' AS COL1,
REGEXP_REPLACE(COL1,'[^0-9]+','',1,0,'i') AS COL2;

The logic used here is we are identifying anything but numbers and replacing it with nothing.

6) Regex to keep only symbols and remove other characters

A string may contain alphabets, numeric or symbols. We can use regular expression to keep only symbols in String and remove rest.

SELECT
'ABC123DEF!@#' AS COL1,
REGEXP_REPLACE(COL1,'[a-zA-Z0-9]+','',1,0,'i') AS COL2;

The logic used here is we are identifying alphabets and numbers and replacing it with nothing.

We will keep updating this post with more regular expression examples in teradata. However if you have any String manipulation requirement, feel free to leave a comment here and we will try to help you.

*Check for apostrophe as it might give some error while copying and running the query.

8 thoughts on “REGULAR EXPRESSION in Teradata

    1. Hi Jayanthi
      Thanks for pointing it out It is actually A-Za-z however the CSS of webpage converted everything to UPPERCASE.
      I have fixed that.
      Thanks again
      Raj

  1. Hi,

    If I want to identify the occurence of a curly braces, and pick up all the characters inside them, how can we achieve it.
    e.g Data contains {Name = X}{Age = Y}{Gender = M}, I should get output as
    Name = X, Age = Y, Gender = M. All of them in separate columns each.

    Can you help please

  2. Hi

    What is use of + in below query?
    SELECT
    ‘ABC123DEF!@#’ AS COL1,
    REGEXP_REPLACE(COL1,'[A-ZA-Z0-9]+’,”,1,0,’I’) AS COL2;

    1. Hi Sk
      You have to define junk characters. Is it symbols or whitespaces or both or something else ?
      Alternate way could be that you can keep alpha numeric and remove everything else.

      Thanks
      Nitin

Leave a Reply

Your email address will not be published. Required fields are marked *