It has been a few years since I started working on Hive, Impala, Spark SQL, PySpark, and Redshift, and in the journey so far I have migrated many applications from RDBMS platforms like Teradata, Netezza, and Oracle to cloud or on-premise Hadoop environments for various clients. I have seen client preferences shift from Hive to ACID Hive to Spark SQL to PySpark. In this post I will share my observations so far when it comes to migrating an existing Teradata application to Hadoop. The first question that should come to your mind is whether you really want to move to Hadoop. A few common reasons are: 1) Read More →

While calculating measures you may want to convert seconds to TIME, i.e. you want to know how many hours, minutes, and seconds a given number represents. For example, you may want to convert 5000 seconds to hh:mi:ss format. This can be done easily by using INTERVALs. Let’s quickly see how to do it. The result shows that 5000 seconds is 1 hour, 23 minutes, and 20 seconds. Another way of writing the same command: let’s increase the value of the number (i.e. seconds). This shows that 500000 seconds is 5 days, 18 hours, 53 minutes, and 20 seconds. You can change the INTERVAL range to get output in the desired format. Like I just want output Read More →
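A minimal sketch of the conversion, assuming a Teradata session (the exact casts in the post may differ; the results below match the figures stated above):

```sql
-- Convert 5000 seconds to hh:mi:ss by casting through INTERVAL types.
SELECT CAST(CAST(5000 AS INTERVAL SECOND(4)) AS INTERVAL HOUR TO SECOND(0));
-- 1:23:20

-- For larger values, widen the range to DAY TO SECOND.
SELECT CAST(500000 * INTERVAL '00 00:00:01' DAY TO SECOND
            AS INTERVAL DAY TO SECOND(0));
-- 5 18:53:20
```

Note that INTERVAL SECOND(4) only holds up to 9999 seconds, which is why the second query multiplies an interval literal instead of casting the number directly.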

If writing a typical TPT script is a tedious task for you, then you may try using the existing TPT templates to write one quickly. Teradata provides a list of templates for all the TPT operators which you can use to quickly create a TPT script. In this post, we will see how to export data from a Teradata table using the default templates available with tbuild. Below is the TPT script content [exp_Customer.tpt]. I am using just 2 OPERATORs in the TPT script and will use the variable file below to pass parameters to the corresponding operators. Below is the variable file content [var_Customer.txt]. You can check out Read More →
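A sketch of how such a job is typically launched, assuming the script and variable file names from the post (the job name is made up for illustration):

```shell
# Run the TPT script, passing the job-variables file with -v;
# -j names the job instance in the TPT logs.
tbuild -f exp_Customer.tpt -v var_Customer.txt -j exp_customer_job
```

The variable file keeps credentials and object names out of the script itself, so the same template-based script can be reused across tables.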

Recently someone in my team asked me how to fetch the column list along with datatypes from a VIEW definition via a JDBC connection. I proposed a couple of solutions to him; however, due to the existing code base he was looking for a solution using the HELP command only. There are multiple HELP commands which can give insight into object details; for more details you can read this post: Teradata HELP commands. However, for views you may get the column names but the datatype may be missing. Since the expected solution was via the HELP command only, I proposed creating a volatile table from the view and using HELP TABLE Read More →
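The idea above can be sketched as follows, assuming a Teradata session; DB_Name.View_Name is a placeholder for your view:

```sql
-- Materialize the view's shape (no data) into a volatile table;
-- HELP TABLE on it then returns column names with datatypes.
CREATE VOLATILE TABLE vt_view_cols AS
  (SELECT * FROM DB_Name.View_Name)
WITH NO DATA
ON COMMIT PRESERVE ROWS;

HELP TABLE vt_view_cols;
```

The volatile table vanishes when the session ends, so nothing permanent is created just to read metadata.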

“An owner referenced by user does not have SELECT WITH GRANT OPTION access to DB_Name.Table_Name.Col_Name”. This is a very common error related to privileges granted on objects in Teradata, so the first thing you must understand is that this error is about insufficient privileges. In this post, we will see how this error can come up while working with views in Teradata. Whenever we create VIEWs in Teradata on top of base tables, we must grant permission to the VIEW database on the TABLE database. This is required because the VIEW has to access the underlying TABLE, and if it does not have the proper permissions it won’t be able to access it. Read More →
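The typical fix is a grant like the following sketch, where Table_DB and View_DB are placeholder database names:

```sql
-- Give the view database SELECT on the table database, with the right to
-- pass that access on to users who query the view.
GRANT SELECT ON Table_DB TO View_DB WITH GRANT OPTION;
```

Without WITH GRANT OPTION the view database can read the tables itself, but it cannot extend that access to end users, which is exactly what the error message complains about.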

In most of the Teradata environments I have seen, the preferred medium to call BTEQ is a shell script. It means that the Teradata developer writes a shell script which invokes the BTEQ utility to run SQL commands on Teradata. In this post, we will see how to write a sample BTEQ script using shell and run it on Teradata. To start with, I would like to discuss why we should pick a shell script to call BTEQ rather than running the BTEQ script directly. BTEQ supports plain ANSI SQL with a few TD extensions to SQL. This means that PL/SQL constructs like CURSORs, conditional statements, logical Read More →
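A minimal sketch of the wrapper pattern, assuming a reachable Teradata host (the IP and dbc/dbc credentials are placeholders):

```shell
#!/bin/sh
# Feed an inline BTEQ script to the utility via a heredoc.
bteq <<'EOF'
.logon 192.168.100.100/dbc,dbc
SELECT InfoKey, InfoData FROM dbc.dbcinfo;
.quit
EOF
```

Wrapping BTEQ in shell is what makes loops, conditionals, and parameterization possible around otherwise plain SQL.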

A few years ago, I shared a post on how to “Convert Multiple Rows into Single Row” using recursive queries. Though it solves the problem and gives the correct desired result, the solution was not straightforward. Ideally, combining multiple rows into a single value should not need a lengthy solution, but at that time it seemed like the only one. With the newer versions of Teradata, we have a couple of functions available which can help in combining rows into a single line. In this post, we will discuss 2 such functions which can give a single value for multiple rows. First, XMLAGG: with Teradata supporting various operations on the XML datatype Read More →
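A sketch of the XMLAGG approach; the employee table and column names are assumed for illustration:

```sql
-- Concatenate one row per employee into a single comma-separated string
-- per department; the trailing comma left by the aggregation is trimmed.
SELECT dept_id,
       TRIM(TRAILING ',' FROM
            (XMLAGG(emp_name || ',' ORDER BY emp_name) (VARCHAR(1000))))
         AS emp_list
FROM   employee
GROUP  BY dept_id;
```

Unlike the recursive-query approach, this is a single GROUP BY pass and the ORDER BY inside XMLAGG controls the order of values in the result string.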

ROWS BETWEEN is used in Teradata to determine the group of rows used for calculating an aggregation. Whenever you use window (OLAP) functions, you specify the rows which shall be considered for calculating measures; if you don’t specify anything, then by default all the rows in the partition participate. Some of the common window functions used are MIN, MAX, SUM, AVG, and COUNT. Just as we specify a range in the usual BETWEEN clause, we specify a range in ROWS BETWEEN in OLAP functions. The range must always run from start to end, i.e. the start must come before the end. Some common specifications are: UNBOUNDED PRECEDING: all rows before the current row Read More →
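A sketch of the most common case, a running total; the transactions table and its columns are assumed for illustration:

```sql
-- Running total per account: the window covers every row from the start of
-- the partition (UNBOUNDED PRECEDING) up to and including the current row.
SELECT acct_id, txn_date, txn_amt,
       SUM(txn_amt) OVER (PARTITION BY acct_id
                          ORDER BY txn_date
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
         AS running_total
FROM   transactions;
```

Swapping the frame for, say, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW would turn the same query into a 3-row moving sum.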

In SCD type 2, we have to store all the historical information about a dimension. So unlike SCD-1, we cannot simply overwrite it with the updated data we are getting from the source. There are multiple ways in which one can implement SCD-2 in Teradata, and we will discuss a couple of the most popular approaches in this post. Approach 1: create 2 technical DATE/TIMESTAMP columns. If your job runs intra-day I suggest keeping TIMESTAMP; otherwise you can keep DATE for daily jobs. Let’s call them INSERT_DT & UPDATE_DT. Firstly, you have to run an UPDATE on the existing TARGET table with records from SOURCE. This is required to identify and Read More →
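The two steps of approach 1 can be sketched as below; the table names, key, and tracked attribute are assumptions for illustration:

```sql
-- Step 1: close out the current version of any row whose attribute changed,
-- by stamping UPDATE_DT (NULL UPDATE_DT marks the active version).
UPDATE tgt
FROM target_dim AS tgt, source_stg AS src
SET UPDATE_DT = CURRENT_DATE
WHERE tgt.cust_id = src.cust_id
  AND tgt.UPDATE_DT IS NULL
  AND tgt.cust_addr <> src.cust_addr;

-- Step 2: insert the new version for changed or brand-new keys.
INSERT INTO target_dim (cust_id, cust_addr, INSERT_DT, UPDATE_DT)
SELECT src.cust_id, src.cust_addr, CURRENT_DATE, NULL
FROM source_stg AS src
WHERE NOT EXISTS (SELECT 1 FROM target_dim t
                  WHERE t.cust_id = src.cust_id
                    AND t.UPDATE_DT IS NULL
                    AND t.cust_addr = src.cust_addr);
```

History is preserved because the old row is never deleted, only closed out; the active row for any key is the one with a NULL UPDATE_DT.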

So often we come across the situation of loading a data file into a Teradata table. Very frequently we also extract data from a Teradata table to a flat file, and occasionally we may have to load data from one Teradata table to another on a different server. The first thing which comes to a developer’s mind is Teradata PT. However, traditionally developers have been writing lengthy TPT scripts for these activities. To make life simple, TPT Easy Loader is available, which internally calls TPT, creates the TPT script dynamically, and runs it. To check if you have the tdload client installed on your machine, just type Read More →
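A sketch of a tdload invocation; the host, credentials, file path, delimiter, table, and job name are all placeholders, and option spellings may vary by TTU version:

```shell
# Load a '~'-delimited flat file into a Teradata table with TPT Easy Loader.
tdload -f /data/customer.txt \
       -h 192.168.100.100 -u dbc -p dbc \
       -t Customer_Stg -d '~' \
       my_load_job
```

Behind the scenes this single command generates and runs the same kind of multi-operator TPT script you would otherwise write by hand.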

Very often when working with a BTEQ script you will face the requirement to use dynamic variables or parameters. This is not possible in plain SQL without using PL/SQL constructs in Teradata. However, if you are calling the BTEQ code from a shell script, you can easily use shell variables in the BTEQ SQL queries. Let’s quickly see an example to understand this. I have the below BTEQ script:

Filename: abc.bteq
.logon 192.168.100.100/dbc,dbc
select InfoKey,InfoData from dbc.dbcinfo;
.quit

Command to run: bteq < abc.bteq

Now you can see I have hardcoded the SQL query in the BTEQ. What if I want to populate the column name, database name, or table name as the Read More →
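A sketch of the substitution, using the same query as above: the shell builds the BTEQ script through a heredoc, so `${...}` variables are expanded before BTEQ ever sees the file. The variable names are assumptions for illustration.

```shell
#!/bin/sh
# Shell variables to substitute into the BTEQ SQL.
DB_NAME="dbc"
TBL_NAME="dbcinfo"

# An unquoted heredoc delimiter (EOF) lets the shell expand ${DB_NAME} etc.
cat > abc.bteq <<EOF
.logon 192.168.100.100/dbc,dbc
select InfoKey, InfoData from ${DB_NAME}.${TBL_NAME};
.quit
EOF

# Show the generated script; run it as before with: bteq < abc.bteq
cat abc.bteq
```

The same technique works for column names, WHERE-clause values, or anything else in the script, since to the shell the whole file is just text.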

In SCD-1, we overwrite the existing information in the table with the most recent info received from the source. This is a very common SCD type used in data warehouses and is good only for situations when you don’t want to keep historical data for that entity. Let’s quickly jump to a technical example to understand it. Create the target table and load it with sample data:

drop table customer;
create table customer
(
 c_mobile bigint,
 c_name varchar(30),
 c_dob date,
 c_email varchar(100),
 c_insert_dt date,
 c_update_dt date
) primary index(c_mobile);

insert into customer values (111111111,'Jon Groff','1980-03-10',,date,date);
insert into customer values (222222222,'Kenneth Jarrett','1986-08-24',,date,date);
insert into customer values Read More →
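The overwrite itself can be sketched as below against the customer table above; the staging table customer_stg (assumed to hold c_mobile, c_name, c_dob, c_email from the source) is an assumption for illustration:

```sql
-- SCD-1: overwrite attributes of existing customers with the source values.
UPDATE customer
FROM customer_stg AS src
SET c_name      = src.c_name,
    c_email     = src.c_email,
    c_update_dt = CURRENT_DATE
WHERE customer.c_mobile = src.c_mobile;

-- Insert customers seen for the first time.
INSERT INTO customer
SELECT src.c_mobile, src.c_name, src.c_dob, src.c_email,
       CURRENT_DATE, CURRENT_DATE
FROM customer_stg AS src
WHERE NOT EXISTS (SELECT 1 FROM customer c
                  WHERE c.c_mobile = src.c_mobile);
```

After this runs, no trace of the previous attribute values remains, which is precisely the SCD-1 trade-off.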

Slowly changing dimensions, or SCDs, are dimensions that change over time. They are called “slowly” changing because the change is expected to come rarely, or rather, not frequently. Let’s understand this with a real-life example. You went to a coffee shop and ordered some coffee and snacks. The cashier asked for your mobile number and name. Assuming that you are a first-time customer, an entry will be made in the CUSTOMER table in the data warehouse. It may look like the below, with default values for some columns:

Interaction – 1
Customer_ID  Customer_Name  Customer_Mobile  Customer_DOB  State  Entry_Date
1            Mark           999-999-9999     01-01-1900    NY     24-05-2019 Read More →

Loading data files into a Teradata table is one of the most common actions required in a TD data warehouse environment. In this post, we will discuss 6 ways of loading a data file into a Teradata table, along with the pros and cons of each method. We will use the below table for all the examples:

create multiset table usa_states
(
 state_id byteint,
 state_name varchar(100),
 state_abbr char(2),
 state_capital varchar(100),
 state_larg_city varchar(100),
 state_population int
) primary index (state_id);

We will load the following data from file into this table:

stateid~state_name~state_abbr~state_capital~largest_city~population
1~Alabama~AL~Montgomery~Birmingham~4874747
2~Alaska~AK~Juneau~Anchorage~739795
3~Arizona~AZ~Phoenix~Phoenix~7016270
4~Arkansas~AR~Little Rock~Little Rock~3004279
5~California~CA~Sacramento~Los Angeles~39536653
6~Colorado~CO~Denver~Denver~5607154
7~Connecticut~CT~Hartford~Bridgeport~3588184
8~Delaware~DE~Dover~Wilmington~961939
9~Florida~FL~Tallahassee~Jacksonville~20984400
10~Georgia~GA~Atlanta~Atlanta~10429379
11~Hawaii~HI~Honolulu~Honolulu~1427538
12~Idaho~ID~Boise~Boise~1716943
13~Illinois~IL~Springfield~Chicago~12802023
14~Indiana~IN~Indianapolis~Indianapolis~6666818
15~Iowa~IA~Des Moines~Des Read More →
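One of the simpler load options can be sketched as a BTEQ import against the table above; the logon details and file name are placeholders, and VARTEXT requires every USING field to be declared as VARCHAR:

```sql
-- BTEQ import of the '~'-delimited file; SKIP = 1 ignores the header row.
.logon 192.168.100.100/dbc,dbc
.import vartext '~' file = usa_states.txt, skip = 1
.repeat *
USING (v_id      VARCHAR(5),   v_name VARCHAR(100), v_abbr VARCHAR(2),
       v_capital VARCHAR(100), v_city VARCHAR(100), v_pop  VARCHAR(10))
INSERT INTO usa_states VALUES (:v_id, :v_name, :v_abbr,
                               :v_capital, :v_city, :v_pop);
.quit
```

BTEQ inserts row by row, so it suits small files like this one; for large volumes the bulk utilities (FastLoad, MultiLoad, TPT) are the better fit.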