Interval is the datatype in Teradata which represents duration between 2 time components. We can add INTERVAL to DATE/TIMESTAMP column values to get to new DATE/TIMESTAMP. Also whenever we subtract two timestamp in Teradata, output is of INTERVAL type. Just like any other data type INTERVAL too has permissible range and if the value exceeds it we get “Interval Field Overflow” error. Now you know the real reason of Interval Field Overflow error , let us see few examples which explains it. Create a dummy table with some data: create volatile table dummy_ts ( start_ts timestamp , end_ts timestamp ) on commit preserve rows; insertRead More →

I was working on a project recently which involved data migration from Teradata to Hadoop. Most of the data migration was done using sqoop. However there was a requirement to do a quick lookup into Teradata table and do some processing in Spark. For this, I wanted to use Spark as it involves comparing data in Teradata table with HIVE table. So I connected Teradata via JDBC and created a dataframe from Teradata table. Next I created a dataframe from Hive table and did comparison. Let’s see an example below for connecting Teradata to Spark directly via JDBC connection. Step 1: We will need 3Read More →

In Data-warehouse, Grouping means aggregating values to calculate some meaningful measures. In EDW, we have fact tables which holds measures and calculate some aggregates like SUM, MAX, MIN, AVG etc. In most of the cases, we will use GROUP BY clause however in some cases you may want to calculate measures over different axis. It can be single column or multiple columns combined. This is very useful when you are creating data cubes where each dimension will show you different information and you can drill down to get more info. In Teradata, GROUP BY GROUPING SETS allows you to do exactly same. In a singleRead More →

Apache Zeppelin is a multi-purpose software that can be used for data visualization, data ingestion, data discovery and data analytics. You can check the below mentioned URL to know more about it and to download the software. https://zeppelin.apache.org/ In download section, I’ll suggest to download the “Binary package with all interpreters” which provides number of interpreters that can be used to connect to various sources. In today’s tutorial, we will see: How to create interpreter to be used for Teradata in Zeppelin? How to connect to Teradata using interpreter and run queries? How to create simple pie-chart for data visualization report? Note: We assume youRead More →

As per SQL standards we cannot use ORDER BY clause in subqueries. It also does not make sense to use ORDER BY inside a derived table.Let’s see an SQL example to identify the problem and how to overcome it. This is a perfectly fine query: SELECT E_EMPNO,E_EMPNAME,E_MOB FROM EMPLOYEE ORDER BY 1,2; Now if we create a derived table using this query it will fail. SELECT COUNT(1) FROM ( SELECT E_EMPNO,E_EMPNAME,E_MOB FROM EMPLOYEE ORDER BY 1,2 )TBL1; Error–> 3706: Syntax error: ORDER BY is not allowed in subqueries. Possible Solutions: 1) Now the obvious solution could be to remove ORDER BY clause from the query.Read More →

Recently Raghav reached out to us on our Facebook page asking for a SQL query in Teradata to identify the highest AMPCPU, I/O and SPOOL usage for each hour during a given time period. He was actually working on some activity to find out the consumption of resources on Teradata box on hourly basis. The purpose was to find out time range when Teradata is most heavily loaded and when it is relatively free. He was looking for information in hourly manner for any given day or multiple days. Also he was referring to our post to check SQL Query Performance in Teradata however theRead More →

What is encryption ? Why we need encryption ? How do we implement encryption ? One must clearly know answer to these 3 questions before actually implementing encryption in any datawarehouse environment. Encryption is the process by which we change the actual meaningful value to some other meaningless value so that any person who reads it cannot understand it. For example if I say “VGTCFCVC” it won’t make any sense to you however if I share actual value “TERADATA” then you’ll immediately understand it. We may use any algorithm to convert one value to other ranging from very simple to extreme complex. In this caseRead More →

We recently received a comment on our previous TPT post,(Click here to read it) requesting for a sample TPT script to load multiple data files into one single Teradata Table using Load operator.The TPT script is simple and is very much similar to our TPT script we shared in earlier posts. The main component which will change is the PRODUCER operator. Here we’ll use 2 new variables : VARCHAR FileList = ‘YES’, VARCHAR FileName = @File1 FileList : represent here that we are going to process list of files. FileName: the Filename which has list of DATA files in it. One in each line. SoRead More →

ROWS-DISTRIBUTION-IN-TERADATA

How Teradata distribute rows ? It is very important to understand how Teradata actually distribute rows among AMPs. We all know that PRIMARY INDEX is used for data distribution. All the AMPs in the Teradata system maintains a portion of each table and this forms the basis of ‘parallel architecture and share nothing‘ architecture of Teradata. We will see how exactly it happens and what all hash functions are used to determine the respective AMP which will hold the table. There are three main functions: HASHROW: The function returns 4 BYTE output for each input value. So the primary index value is passed through HASHROWRead More →

Do you want to learn Teradata however have no clue how to start and what all to read then this post is for you. We receive so many requests from our blog readers asking if we can help them in learning Teradata. And when we ask what exactly they want to learn or any specific topic in Teradata they are looking for then they go mum. Understanding all the aspects of TERADATA RDBMS is not simple and definitely requires good time. You cannot become expert in a day or two. So I thought of writing a blog which should help in figuring out what allRead More →

Teradata SQL Test-5-Input

Next Question is easy and the answer is easier however some of you may find it tricky. We have two tables both have just one column in it. INPUT: Tablename: TABLE1 & TABLE2 | Columnname: RDBMS1 & RDBMS2| Value: Teradata OUTPUT:   Write a generic SQL code which should give required Result. Leave your answer as comment below.Read More →

Create Table in Teradata

One of the most basic query one must know is to CREATE table in Teradata. In RDBMS, Tables are databases objects which are used to store data in form of rows and columns. Let us see a very basic example below on how to create table in Teradata and what all options are available while creating a table. Click on the Following options to see how it affects the CREATE Table in Teradata. Option 1 :SET MULTISET Option 2 :VOLATILE GLOBAL TEMPORARY Option 3 :NO Yes Option 4 :NO Yes Option 5 :NO Yes Option 6 :NO Yes Option 7 :NOT NULL NULL Option 8Read More →

We receive so many requests for covering Teradata Utilities tutorials in our blog. The reason primarily I was not doing it because there are other blogs which already cover it. However one utility in specific I find the resources available online are not easy to understand or not very clear. I am talking about Teradata Parallel Transporter or in short TPT. This is my attempt to make it simple and easy to understand. Automatically generate Teradata Import Scripts – BTEQ Import, Fastload, Multiload or TPTLoad with this web utility. Before we proceed with writing TPT script, let me give you some background about this utilityRead More →

Teradata Invalid Date Error

Invalid Date is one of the most common date related error we face while loading data from Source to Target. The error is because either the source is not sending date values in format which is same as Target or the date value is not correct. Eg: If Target is expecting date in format ‘YYYY-MM-DD’ then ‘2015-02-31’ is date in format ‘YYYY-MM-DD’ however the value is not correct as February cannot have 31 as date. Other example could be ‘2015/02/25’ is date with not same format as Target however value is correct. Both the error are pretty common. Now let us see how to identifyRead More →