Tables are two-dimensional representation of data in rows & columns. Tables are the core of any RDBMS environment. In TERADATA, table creation is not as straightforward as it looks as it comes with so many options.
teradata create table
You can create Tables with minimal options as we have in any ANSI standard. However creating a table in Teradata with proper options can help in saving resources and tuning the queries running on the table .
The generic syntax for CREATING TABLE in TERADATA is :
CREATE [SET/MULTISET] TABLE DATABASENAME.TABLENAME, [NO] FALLBACK, [NO] BEFORE JOURNAL, [NO] AFTER JOURNAL, ( COLUMN1 DATATYPE, COLUMN2 DATATYPE, . . . COLUMNZ DATATYPE ) [UNIQUE] PRIMARY INDEX [INDEXNAME] (COLUMN1,COLUMN2...COLUMNZ) [UNIQUE] INDEX [INDEXNAME] (COLUMN1.COLUMN2...COLUMNZ)
Now let us see what few of the common options actually mean:
SET and MULTISET tables in teradata
SET tables discard complete duplicate records however MULTISET allows multiple instances of same row in the table.
If you know that you will be getting distinct rows , it is advisable to go with MULTISET table as it will not check for duplicate rows. Hence, saving time and enhancing performance.
If you know that you would be getting duplicate records however you need to process only one of the duplicate rows then create SET tables. TERADATA default is SET however ANSI default is MULTISET.
teradata Fallback and no Fallback
FALLBACK is TERADATA mechanism to save table data in case of AMP Failures. If you define any table with FALLBACK option , a duplicate copy of table data is maintained in some other AMP. So in case of AMP Failure, the FALLBACK AMP is used for reading the data.
Fallback is used to create highly available tables in Teradata
For critical tables, it is recommended to use FALLBACK option. This will require double the Storage Space so should be used only for critical tables.
In TERADATA, default is NO FALLBACK. This means table data is not copied in back-up AMP.
journal in teradata
There are four types of Permanent Journals in teradata.
BEFORE Journal – The BEFORE JOURNAL holds the image of impacted rows before any changes are made.
AFTER Journal – AFTER JOURNAL holds the image of affected rows after changes are done.
DUAL BEFORE Journal – In DUAL BEFORE Journal, two images are taken and are stored in two different AMP’s before any changes are made to data in table.
DUAL AFTER Journal – In DUAL AFTER Journal, two images are taken and are stored in two different AMP’s after changes to rows are done in the table.
As it was also discussed earlier that PERMANENT JOURNALS use PERMANENT SPACE so if these JOURNALS are not required any more, it is advisable to drop JOURNALS to save some space.
Also, there are several methods of creating tables in TERADATA. The most basic method is shown above. However if you want to CREATE TABLE using some other table then you can use below command:
CREATE TABLE TABLE_1 AS ( SELECT * FROM TABLE_2 ) WITH [DATA/NO DATA] AND [STATS/NO STATS];
teradata create table example
Now we understand the theory part so let us quickly look into few examples to further strengthen our understanding.
- Check/create/SET current database
- CREATE table
- SHOW table
- SET/MULTISET example
- INSERT RECORDS
- CTAS with/without data
Check Current Database in Teradata
select database; *** Query completed. One row found. One column returned. *** Total elapsed time was 1 second. Database --------------------------------------------------------------------------- DBC
create a database with 5GB of space. DBC is the default database in the Teradata and is considered as “root” in the Database tree.
Create new database in Teradata
create database db_emp FROM DBC AS PERM = 5000000000; *** Database has been created. *** Total elapsed time was 1 second.
Set new database as the default database for this session.
database db_emp; *** New default database accepted. *** Total elapsed time was 1 second.
Create a table with minimal options in Teradata
CREATE TABLE emp ( emp_no integer, birth_date date, first_name varchar(14), last_name varchar(16), gender char(1), hire_date date, phone_no varchar(10), alt_phone_no varchar(10) ); *** Table has been created. *** Total elapsed time was 1 second.
To check Table DDL in Teradata we use SHOW command
SHOW table in Teradata
show table emp; *** Text of DDL statement returned. *** Total elapsed time was 1 second. --------------------------------------------------------------------------- CREATE SET TABLE DB_EMP.emp ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( emp_no INTEGER, birth_date DATE FORMAT 'YY/MM/DD', first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC, last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC, gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC, hire_date DATE FORMAT 'YY/MM/DD', phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC, alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC) PRIMARY INDEX ( emp_no );
Few points to notice here:
- SET table is the default table type.
- Table is defined as FALLBACK so copy of the data is stored in another AMP.
- DATE columns have default format i.e. ‘YY/MM/DD’.
- String columns VARCHAR & CHAR are defined as NOT CASESPECIFIC.
- First column is picked as PRIMARY INDEX if not explicitly mentioned in table definition and unique, primary key is also absent.
Let’s create few more tables with some of the params for table/columns explicitly defined.
create set table teradata
CREATE SET TABLE emp_set ( emp_no integer NOT NULL, birth_date date format 'yyyy-mm-dd', first_name varchar(14), last_name varchar(16), gender char(1) NOT NULL default 'M', hire_date date format 'y4-mm-dd', phone_no varchar(10), alt_phone_no varchar(10) ) PRIMARY INDEX (emp_no); *** Table has been created. *** Total elapsed time was 1 second.
create multiset table teradata
CREATE MULTISET TABLE emp_multiset ( emp_no integer NOT NULL, birth_date date format 'yyyy-mm-dd', first_name varchar(14), last_name varchar(16), gender char(1) NOT NULL default 'M', hire_date date format 'y4-mm-dd', phone_no varchar(10), alt_phone_no varchar(10) ) PRIMARY INDEX (emp_no); *** Table has been created. *** Total elapsed time was 1 second.
Now we have tables defined in Teradata, let us insert some records into the table too.
insert into set table teradata
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10002,'1989-06-02','Bezalel','Simmel','F','2005-11-21','9564989814','9564989814'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10003,'1984-12-03','Parto','Bamford','M','2006-08-28','9622254431','9520937883'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10004,'1979-05-01','Chirstian','Koblick','M','2006-12-01','9737926154','9126817150'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10005,'1980-01-21','Kyoichi','Maliniak','M','2009-09-12','9420307319','9721085175'); *** Insert completed. One row added. *** Total elapsed time was 1 second.
insert into multiset table teradata
INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10002,'1989-06-02','Bezalel','Simmel','F','2005-11-21','9564989814','9564989814'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10003,'1984-12-03','Parto','Bamford','M','2006-08-28','9622254431','9520937883'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10004,'1979-05-01','Chirstian','Koblick','M','2006-12-01','9737926154','9126817150'); *** Insert completed. One row added. *** Total elapsed time was 1 second. BTEQ -- Enter your SQL request or BTEQ command: INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10005,'1980-01-21','Kyoichi','Maliniak','M','2009-09-12','9420307319','9721085175'); *** Insert completed. One row added. *** Total elapsed time was 1 second.
SET Table load same record
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034'); *** Failure 2802 Duplicate row error in db_emp.emp_set. Statement# 1, Info =0 *** Total elapsed time was 1 second.
SET table gave the error when we tried to insert same record again. If it would have been in same transaction then SET table would have quietly ignored the record. But in this case, it is different transaction hence it gave the error. Now let’s check for the MULTISET table.
MULTISET Table load same record
INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034'); *** Insert completed. One row added. *** Total elapsed time was 1 second.
So multiset table allow duplicate records where as SET table throws error if same record comes in different transaction and discard if in same transaction.
check multiset table records
select * from emp_multiset; *** Query completed. 6 rows found. 8 columns returned. *** Total elapsed time was 1 second. emp_no birth_date first_name last_name gender hire_date phone_no alt_phone_no ----------- ---------- -------------- ---------------- ------ ---------- ---------- ------------ 10003 1984-12-03 Parto Bamford M 2006-08-28 9622254431 9520937883 10002 1989-06-02 Bezalel Simmel F 2005-11-21 9564989814 9564989814 10005 1980-01-21 Kyoichi Maliniak M 2009-09-12 9420307319 9721085175 10001 1978-09-02 Georgi Facello M 2006-06-26 9234728034 9234728034 10001 1978-09-02 Georgi Facello M 2006-06-26 9234728034 9234728034 10004 1979-05-01 Chirstian Koblick M 2006-12-01 9737926154 9126817150
create table as select teradata
In teradata you can create table as select query output. This is also called as CTAS in short form. You can create table as select with data or without data in teradata.
Create table as select from another table with data
create table emp_ctas1 as (select * from emp_multiset) with data; *** Table has been created. *** Total elapsed time was 1 second.
Check rows of CTAS table just created
select * from emp_ctas1; *** Query completed. 5 rows found. 8 columns returned. *** Total elapsed time was 1 second. emp_no birth_date first_name last_name gender hire_date phone_no alt_phone_no ----------- ---------- -------------- ---------------- ------ --------- ---------- ------------ 10003 84/12/03 Parto Bamford M 06/08/28 9622254431 9520937883 10002 89/06/02 Bezalel Simmel F 05/11/21 9564989814 9564989814 10005 80/01/21 Kyoichi Maliniak M 09/09/12 9420307319 9721085175 10001 78/09/02 Georgi Facello M 06/06/26 9234728034 9234728034 10004 79/05/01 Chirstian Koblick M 06/12/01 9737926154 9126817150
Any guesses why there are just 5 rows where as source table has 6 rows ? Yes, you guessed it right. Since default table type is SET , the new table discards the duplicate records. Also let’s check the table structure.
show table emp_ctas1; *** Text of DDL statement returned. *** Total elapsed time was 1 second. CREATE SET TABLE DB_EMP.emp_ctas1 ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( emp_no INTEGER, birth_date DATE FORMAT 'YY/MM/DD', first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC, last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC, gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC, hire_date DATE FORMAT 'YY/MM/DD', phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC, alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC) PRIMARY INDEX ( emp_no );
So you can see all the table/column options we have set earlier are gone. Like table is converted to SET date formats are converted to ‘YY/MM/DD’ in place of ‘YYYY-MM-DD’. Also default value for column gender is lost too. So be careful while creating a table from another table.
create table from another table with no data
create table emp_ctas2 as (select * from emp_multiset) with no data; *** Table has been created. *** Total elapsed time was 1 second.
teradata create table as select with data primary index
create table emp_ctas3 as (select * from emp_multiset) with data primary index (phone_no); *** Table has been created. *** Total elapsed time was 1 second.
check DDL for recently created table
show table emp_ctas3; *** Text of DDL statement returned. *** Total elapsed time was 1 second. CREATE SET TABLE DB_EMP.emp_ctas3 ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( emp_no INTEGER, birth_date DATE FORMAT 'YY/MM/DD', first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC, last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC, gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC, hire_date DATE FORMAT 'YY/MM/DD', phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC, alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC) PRIMARY INDEX ( phone_no );
There is one way of creating table from another table and keeping most of the table/column properties.
create table emp_ctas4 as emp_multiset with data primary index (phone_no); *** Table has been created. *** Total elapsed time was 1 second.
Let’s see the table structure for this new table
show table emp_ctas4; *** Text of DDL statement returned. *** Total elapsed time was 1 second. CREATE MULTISET TABLE DB_EMP.emp_ctas4 ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( emp_no INTEGER NOT NULL, birth_date DATE FORMAT 'yyyy-mm-dd', first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC, last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC, gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL DEFAULT 'M', hire_date DATE FORMAT 'y4-mm-dd', phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC, alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC) PRIMARY INDEX ( phone_no );
So this manner of creating table retains most of the table/column options like Multiset table, DATE FORMAT, GENDER column default value. However it may not be best way to create table if you just want subset of columns in the new table.
That’s it. I think this is sufficient info to get you started on CREATING tables in Teradata. Check out the video below if you prefer watching than reading
Teradata Volatile Table Syntax with Example
volatile table in teradata
Volatile tables are session-specific tables which stores intermediate data and once the session is over, all the data and table definition is removed. Volatile Tables uses SPOOL SPACE. In Teradata, use keyword “VOLATILE” in CREATE statement to create a volatile table.
Teradata VOLATILE syntax
CREATE VOLATILE TABLE TABLE_NAME ( COLUMN_1 DATATYPE, COLUMN_2 DATATYPE ) PRIMARY INDEX (COLUMN_1) ON COMMIT [PRESERVE/DELETE] ROWS;
Options/Settings for Volatile Tables:
LOG/NO LOG:
If Volatile table is created with LOG option, then it will use Transient Journal to save “BEFORE IMAGES” of all the transactions. Since the data is intermediate and can be generated again in case of failures I strongly recommend to go ahead with NO LOG option. This is one simple thing you can do to improve performance of volatile tables. Default is LOG option.
ON COMMIT [PRESERVE/DELETE] ROWS:
This is very important option and most of the time you will be using PRESERVE option. If PRESERVE option is kept then Volatile Table will store the data after each transaction however if the DELETE option is kept then it will lose the data once the transaction is completed. Default is “ON COMMIT DELETE ROWS” hence by default volatile table retains data at transaction level and not session level. So you must make it “ON COMMIT PRESERVE ROWS”
Options NOT available for VOLATILE Tables:
- Permanent Journals
- Foreign Key
- Compression at Column Level
- Default Values
- Name of Indexes
Volatile Table Example
CREATE VOLATILE TABLE SALARY_STATS , NO LOG ( EMP_ID INTEGER, HIGHEST_SAL INTEGER, AVG_SAL INTEGER, LOWEST_SAL INTEGER ) PRIMARY INDEX (HIGHEST_SAL) ON COMMIT PRESERVE ROWS;
Once the session is over, all the data and table definition is lost and is never saved in Data Dictionary. Volatile Tables used Spool Space and once session is over , the occupied Spool space is again available for some other transactions.
VOLATILE table with DATA
In most of the ETL use-cases you will create a volatile table from another existing table. If you are creating a table with another table and you want to get records then you have to use “WITH DATA” clause in volatile table CREATE statement. Also if you want to specify different primary index then you can mention it after “with data” clause.
teradata create volatile table as select
CREATE VOLATILE TABLE SALARY_STATS AS ( SELECT EMP_ID , max(salary) AS HIGHEST_SAL , avg(salary) AS AVG_SAL , min(salary) AS LOWEST_SAL FROM employee GROUP BY emp_id ) WITH DATA PRIMARY INDEX (HIGHEST_SAL) ON COMMIT PRESERVE ROWS;
To check all the volatile table created in current session use below command:
HELP VOLATILE TABLE;
Teradata Volatile Table vs Global Temporary Table
- Both the Volatile & Global Temporary table retains session-specific data only. Once session completes data is removed from table.
- Global Temporary table retains table definition permanently. However for volatile DDL is removed too. So you can use Temporary table in another session without the need to create it all the time.
- Volatile table use SPOOL space where as Temporary table use TEMPORARY space (basically un-used Permanent Space).
- In earlier version , collect stats on Volatile was not supported but now it is for both kind of tables. For Temporary table, use keyword “TEMPORARY” before tablename in collect stats.
Leave a Reply