Teradata Create Table Example- SET/Multiset/CTAS

Tables are two-dimensional representation of data in rows & columns. Tables are the core of any RDBMS environment. In TERADATA, Table creation is not as straightforward as it looks as it comes with so many options. You can create Tables with minimal options as we have in any ANSI standard. However creating a table in Teradata with proper options can help in saving resources and tuning the queries running on the table . The generic syntax for CREATING TABLE in TERADATA can be :

CREATE [SET/MULTISET] TABLE DATABASENAME.TABLENAME,
[NO] FALLBACK,
[NO] BEFORE JOURNAL,
[NO] AFTER JOURNAL,
(
COLUMN1 DATATYPE,
COLUMN2 DATATYPE,
.
.
.
COLUMNZ DATATYPE
)
[UNIQUE] PRIMARY INDEX [INDEXNAME] (COLUMN1,COLUMN2...COLUMNZ)
[UNIQUE] INDEX [INDEXNAME] (COLUMN1.COLUMN2...COLUMNZ)

Now let us see what few of the common options actually mean:

[SET/MULTISET]: SET tables discard completely duplicate records however MULTISET allows multiple instance of rows. So it depends on the situation. If you know that you will be getting distinct rows , it is advisable to go with MULTISET table as it will not check for duplicate rows. Hence, saving time and enhancing performance. However if you know that you would be getting duplicate records however you need to process only one of the duplicates rows, Go with SET tables. TERADATA default is SET however ANSI default is MULTISET.

FALLBACK: FALLBACK is TERADATA mechanism to save table data in case of AMP Failure. If you define any table with FALLBACK option , a duplicate copy of table data is maintained in some other AMP. SO in case of AMP Failure, the FALLBACK AMP is used. For critical tables, it is recommended to use FALLBACK option. However it also comes with more Storage Space utilisation disadvantage. In TERADATA, default is NO FALLBACK.

PERMANENT JOURNALS:There are four types of Permanent Journals you can use: BEFORE, AFTER, DUAL BEFORE, DUAL AFTER. The BEFORE JOURNAL holds the image of impacted rows before any changes are made. AFTER JOURNAL holds the image of affected rows after changes are done. In DUAL BEFORE/AFTER Journal, two images are taken and are stored in two different AMP’s.

Tips

As it was also discussed earlier that PERMANENT JOURNALS use PERMANENT SPACE so if these JOURNALS are not required any more, it is advisable to drop JOURNALS to save some space. Also, there are several methods of creating tables in TERADATA. The most basic method is shown above. However if you want to CREATE TABLE using some other table then you can use below command:

CREATE TABLE TABLE_1 AS (
SELECT * FROM TABLE_2
) 
WITH [DATA/NO DATA] 
AND [STATS/NO STATS];

Now we understand the theory part so let us quickly look into few examples to further strengthen our understanding.

  1. Check/create/SET current database
  2. CREATE table
  3. SHOW table
  4. SET/MULTISET example
  5. INSERT RECORDS
  6. CTAS with/without data

Check Current Database in Teradata

select database;

 *** Query completed. One row found. One column returned.
 *** Total elapsed time was 1 second.

Database
---------------------------------------------------------------------------
DBC

create a database with 5GB of space. DBC is the default database in the Teradata and is considered as “root” in the Database tree.

Create new database in Teradata

create database db_emp FROM DBC
AS
PERM = 5000000000;

 *** Database has been created.
 *** Total elapsed time was 1 second.

Set new database as the default database for this session.

database db_emp;

 *** New default database accepted.
 *** Total elapsed time was 1 second.

Create a table with minimal options in Teradata

CREATE TABLE emp (
  emp_no integer,
  birth_date date,
  first_name varchar(14),
  last_name varchar(16),
  gender char(1),
  hire_date date,
  phone_no varchar(10),
  alt_phone_no varchar(10)
);

 *** Table has been created.
 *** Total elapsed time was 1 second.

To check Table DDL in Teradata we use SHOW command

SHOW table in Teradata

show table emp;

 *** Text of DDL statement returned.
 *** Total elapsed time was 1 second.

---------------------------------------------------------------------------
CREATE SET TABLE DB_EMP.emp ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      emp_no INTEGER,
      birth_date DATE FORMAT 'YY/MM/DD',
      first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC,
      gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      hire_date DATE FORMAT 'YY/MM/DD',
      phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
      alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( emp_no );

Few points to notice here:

  1. SET table is the default table type.
  2. Table is defined as FALLBACK so copy of the data is stored in another AMP.
  3. DATE columns have default format i.e. ‘YY/MM/DD’.
  4. String columns VARCHAR & CHAR are defined as NOT CASESPECIFIC.
  5. First column is picked as PRIMARY INDEX if not explicitly mentioned in table definition and unique, primary key is also absent.

Let’s create few more tables with some of the params for table/columns explicitly defined.

create set table teradata

CREATE SET TABLE emp_set (
  emp_no integer NOT NULL,
  birth_date date format 'yyyy-mm-dd',
  first_name varchar(14),
  last_name varchar(16),
  gender char(1) NOT NULL default 'M',
  hire_date date format 'y4-mm-dd',
  phone_no varchar(10),
  alt_phone_no varchar(10)
)
PRIMARY INDEX (emp_no);

 *** Table has been created.
 *** Total elapsed time was 1 second.

create multiset table teradata

CREATE MULTISET TABLE emp_multiset (
  emp_no integer NOT NULL,
  birth_date date format 'yyyy-mm-dd',
  first_name varchar(14),
  last_name varchar(16),
  gender char(1) NOT NULL default 'M',
  hire_date date format 'y4-mm-dd',
  phone_no varchar(10),
  alt_phone_no varchar(10)
)
PRIMARY INDEX (emp_no);

 *** Table has been created.
 *** Total elapsed time was 1 second.

Now we have tables defined in Teradata, let us insert some records into the table too.

insert into set table teradata

INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10002,'1989-06-02','Bezalel','Simmel','F','2005-11-21','9564989814','9564989814');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10003,'1984-12-03','Parto','Bamford','M','2006-08-28','9622254431','9520937883');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10004,'1979-05-01','Chirstian','Koblick','M','2006-12-01','9737926154','9126817150');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10005,'1980-01-21','Kyoichi','Maliniak','M','2009-09-12','9420307319','9721085175');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

insert into multiset table teradata

INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10002,'1989-06-02','Bezalel','Simmel','F','2005-11-21','9564989814','9564989814');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
 INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10003,'1984-12-03','Parto','Bamford','M','2006-08-28','9622254431','9520937883');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10004,'1979-05-01','Chirstian','Koblick','M','2006-12-01','9737926154','9126817150');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your SQL request or BTEQ command:
INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10005,'1980-01-21','Kyoichi','Maliniak','M','2009-09-12','9420307319','9721085175');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

SET Table load same record

INSERT INTO emp_set (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034');

 *** Failure 2802 Duplicate row error in db_emp.emp_set.
                Statement# 1, Info =0
 *** Total elapsed time was 1 second.

SET table gave the error when we tried to insert same record again. If it would have been in same transaction then SET table would have quietly ignored the record. But in this case, it is different transaction hence it gave the error. Now let’s check for the MULTISET table.

MULTISET Table load same record

INSERT INTO emp_multiset (emp_no,birth_date,first_name,last_name,gender,hire_date,phone_no,alt_phone_no) VALUES (10001,'1978-09-02','Georgi','Facello','M','2006-06-26','9234728034','9234728034');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

So multiset table allow duplicate records where as SET table throws error if same record comes in different transaction and discard if in same transaction.

check multiset table records

select * from emp_multiset;

 *** Query completed. 6 rows found. 8 columns returned.
 *** Total elapsed time was 1 second.

     emp_no  birth_date  first_name      last_name         gender   hire_date  phone_no    alt_phone_no
-----------  ----------  --------------  ----------------  ------  ----------  ----------  ------------
      10003  1984-12-03  Parto           Bamford           M       2006-08-28  9622254431  9520937883
      10002  1989-06-02  Bezalel         Simmel            F       2005-11-21  9564989814  9564989814
      10005  1980-01-21  Kyoichi         Maliniak          M       2009-09-12  9420307319  9721085175
      10001  1978-09-02  Georgi          Facello           M       2006-06-26  9234728034  9234728034
      10001  1978-09-02  Georgi          Facello           M       2006-06-26  9234728034  9234728034
      10004  1979-05-01  Chirstian       Koblick           M       2006-12-01  9737926154  9126817150

Create table as select from another table with data

create table emp_ctas1 as (select * from emp_multiset) with data;

 *** Table has been created.
 *** Total elapsed time was 1 second.

Check rows of CTAS table just created

select * from emp_ctas1;

 *** Query completed. 5 rows found. 8 columns returned.
 *** Total elapsed time was 1 second.

     emp_no  birth_date  first_name      last_name         gender  hire_date  phone_no    alt_phone_no
-----------  ----------  --------------  ----------------  ------  ---------  ----------  ------------
      10003    84/12/03  Parto           Bamford           M        06/08/28  9622254431  9520937883
      10002    89/06/02  Bezalel         Simmel            F        05/11/21  9564989814  9564989814
      10005    80/01/21  Kyoichi         Maliniak          M        09/09/12  9420307319  9721085175
      10001    78/09/02  Georgi          Facello           M        06/06/26  9234728034  9234728034
      10004    79/05/01  Chirstian       Koblick           M        06/12/01  9737926154  9126817150

Any guesses why there are just 5 rows where as source table has 6 rows ? Yes, you guessed it right. Since default table type is SET , the new table discards the duplicate records. Also let’s check the table structure.

show table emp_ctas1;

 *** Text of DDL statement returned.
 *** Total elapsed time was 1 second.


CREATE SET TABLE DB_EMP.emp_ctas1 ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      emp_no INTEGER,
      birth_date DATE FORMAT 'YY/MM/DD',
      first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC,
      gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      hire_date DATE FORMAT 'YY/MM/DD',
      phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
      alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( emp_no );

So you can see all the table/column options we have set earlier are gone. Like table is converted to SET date formats are converted to ‘YY/MM/DD’ in place of ‘YYYY-MM-DD’. Also default value for column gender is lost too. So be careful while creating a table from another table.

create table from another table with no data

create table emp_ctas2 as (select * from emp_multiset) with no data;

 *** Table has been created.
 *** Total elapsed time was 1 second.

teradata create table as select with data primary index

create table emp_ctas3 as (select * from emp_multiset) with data primary index (phone_no);

 *** Table has been created.
 *** Total elapsed time was 1 second.

check DDL for recently created table

show table emp_ctas3;

 *** Text of DDL statement returned.
 *** Total elapsed time was 1 second.


CREATE SET TABLE DB_EMP.emp_ctas3 ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      emp_no INTEGER,
      birth_date DATE FORMAT 'YY/MM/DD',
      first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC,
      gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
      hire_date DATE FORMAT 'YY/MM/DD',
      phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
      alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( phone_no );

There is one way of creating table from another table and keeping most of the table/column properties.

create table emp_ctas4 as emp_multiset with data primary index (phone_no);

 *** Table has been created.
 *** Total elapsed time was 1 second.

Let’s see the table structure for this new table

show table emp_ctas4;

 *** Text of DDL statement returned.
 *** Total elapsed time was 1 second.

CREATE MULTISET TABLE DB_EMP.emp_ctas4 ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      emp_no INTEGER NOT NULL,
      birth_date DATE FORMAT 'yyyy-mm-dd',
      first_name VARCHAR(14) CHARACTER SET LATIN NOT CASESPECIFIC,
      last_name VARCHAR(16) CHARACTER SET LATIN NOT CASESPECIFIC,
      gender CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL DEFAULT 'M',
      hire_date DATE FORMAT 'y4-mm-dd',
      phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC,
      alt_phone_no VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( phone_no );

So this manner of creating table retains most of the table/column options like Multiset table, DATE FORMAT, GENDER column default value. However it may not be best way to create table if you just want subset of columns in the new table.

That’s it. I think this is sufficient info to get you started on CREATING tables in Teradata. Check out the video below if you prefer watching than reading

Leave a Reply

Your email address will not be published. Required fields are marked *