Category: Teradata SQL Tutorial – BASIC


6 ways to load data file into Teradata table


Loading data files into Teradata tables is one of the most common tasks in a TD data warehouse environment. In this post, we will discuss six ways of loading a data file into a Teradata table, along with the pros and cons of each method. We will use the table below for all the examples:

create multiset table usa_states (
state_id byteint,
state_name varchar(100),
state_abbr char(2),
state_capital varchar(100),
state_larg_city varchar(100),
state_population int
)
primary index (state_id);

We will load the following data from a file into this table:

stateid~state_name~state_abbr~state_capital~largest_city~population
1~Alabama~AL~Montgomery~Birmingham~4874747
2~Alaska~AK~Juneau~Anchorage~739795
3~Arizona~AZ~Phoenix~Phoenix~7016270
4~Arkansas~AR~Little Rock~Little Rock~3004279
5~California~CA~Sacramento~Los Angeles~39536653
6~Colorado~CO~Denver~Denver~5607154
7~Connecticut~CT~Hartford~Bridgeport~3588184
8~Delaware~DE~Dover~Wilmington~961939
9~Florida~FL~Tallahassee~Jacksonville~20984400
10~Georgia~GA~Atlanta~Atlanta~10429379
11~Hawaii~HI~Honolulu~Honolulu~1427538
12~Idaho~ID~Boise~Boise~1716943
13~Illinois~IL~Springfield~Chicago~12802023
14~Indiana~IN~Indianapolis~Indianapolis~6666818
15~Iowa~IA~Des Moines~Des Moines~3145711
16~Kansas~KS~Topeka~Wichita~2913123
17~Kentucky~KY~Frankfort~Louisville~4454189
18~Louisiana~LA~Baton Rouge~New Orleans~4684333
19~Maine~ME~Augusta~Portland~1335907
20~Maryland~MD~Annapolis~Baltimore~6052177
21~Massachusetts~MA~Boston~Boston~6859819
22~Michigan~MI~Lansing~Detroit~9962311
23~Minnesota~MN~St. Paul~Minneapolis~5576606
24~Mississippi~MS~Jackson~Jackson~2984100
25~Missouri~MO~Jefferson City~Kansas City~6113532
26~Montana~MT~Helena~Billings~1050493
27~Nebraska~NE~Lincoln~Omaha~1920076
28~Nevada~NV~Carson City~Las Vegas~2998039
29~New Hampshire~NH~Concord~Manchester~1342795
30~New Jersey~NJ~Trenton~Newark~9005644
31~New Mexico~NM~Santa Fe~Albuquerque~2088070
32~New York~NY~Albany~New York~19849399
33~North Carolina~NC~Raleigh~Charlotte~10273419
34~North Dakota~ND~Bismarck~Fargo~755393
35~Ohio~OH~Columbus~Columbus~11658609
36~Oklahoma~OK~Oklahoma City~Oklahoma City~3930864
37~Oregon~OR~Salem~Portland~4142776
38~Pennsylvania~PA~Harrisburg~Philadelphia~12805537
39~Rhode Island~RI~Providence~Providence~1059639
40~South Carolina~SC~Columbia~Charleston~5024369
41~South Dakota~SD~Pierre~Sioux Falls~869666
42~Tennessee~TN~Nashville~Nashville~6715984
43~Texas~TX~Austin~Houston~28304596
44~Utah~UT~Salt Lake City~Salt Lake City~3101833
45~Vermont~VT~Montpelier~Burlington~623657
46~Virginia~VA~Richmond~Virginia Beach~8470020
47~Washington~WA~Olympia~Seattle~7405743
48~West Virginia~WV~Charleston~Charleston~1815857
49~Wisconsin~WI~Madison~Milwaukee~5795483
50~Wyoming~WY~Cheyenne~Cheyenne~579315

Points to note before loading this file:
1) The file comes with a header row.
2) The file is delimited with “~” as the separator.
3) There are 50 records in the file.
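Before picking a loader, it helps to confirm the file really matches these assumptions. As a quick sanity check of the header and delimiter, here is a small Python sketch (it parses an inline sample of the file; the file name and full contents are as shown above):

```python
import csv
import io

# A small inline sample of the '~'-delimited file, including the header row
raw = """stateid~state_name~state_abbr~state_capital~largest_city~population
1~Alabama~AL~Montgomery~Birmingham~4874747
2~Alaska~AK~Juneau~Anchorage~739795"""

reader = csv.reader(io.StringIO(raw), delimiter="~")
header = next(reader)            # consume the header row (point 1 above)
rows = list(reader)              # remaining data records

print(header[0])   # stateid
print(rows[0][1])  # Alabama
print(len(rows))   # 2 records in this sample (50 in the real file)
```

The same `next(reader)` trick is what the `skip=1` / `RECORD 2` options do in the utilities below.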

Let’s start loading this file into table.

1) SQL Editor: Nowadays most SQL editors come with an Import feature to load a data file into a Teradata table. I have used Teradata Studio, Teradata SQL Assistant, SQuirreL, HeidiSQL, etc. to load data files, and it is very simple. The steps involved are also similar across different SQL editors. Here I will show you how to use Teradata Studio to load the data file.

Teradata Studio:
Once you create the table, go to “Data Source Explorer” on the left panel. Browse to your database and then to your table. Right-click on the table name and select the “Load Data” option (refer to the screenshot below). It is a good idea to maximize the “Load Data Wizard” window. Browse to the data file and change the delimiter to the one used in the data file. In our case the file has a header, so I have checked the “Column Labels in First Row” option. Click on Finish. That is it. Now you can check the data in your table.
Similarly, for Teradata SQL Assistant, you can refer to below link:
http://etl-sql.com/teradata-sql-assistant-import-data/

Pros: Extremely user friendly approach.
Cons: Not good for huge files with millions of records.

2) BTEQ Import: A very common utility for loading small data files into a Teradata table. It operates at row level and is easy to develop, execute, and debug.
.LOGON TDPID/USER,PASS;
.IMPORT VARTEXT '~' FILE = '\PATH_TO_FILE\US_States_data.txt', SKIP=1;
.REPEAT *
USING state_id (VARCHAR(100)), state_name (VARCHAR(100)), state_abbr (VARCHAR(100)), state_capital (VARCHAR(100)), state_larg_city (VARCHAR(100)), state_population (VARCHAR(100))
INSERT INTO usa_states (state_id, state_name, state_abbr, state_capital, state_larg_city, state_population)
VALUES (:state_id, :state_name, :state_abbr, :state_capital, :state_larg_city, :state_population);
.QUIT;

Pros: User-friendly script. Operates at row level but is faster than a typical SQL editor import.
Cons: Since it operates at row level, it is not good for huge files with millions of records.

3) Teradata Fastload:
Fastload is a lightning-fast utility to load data files into a Teradata table. It operates at block level, so it packs as many rows as it can accommodate into a block and pushes them to the AMP vdisks. Movement of records to their respective AMPs is done at a later stage, but it is far faster because it utilises AMP memory for that task. Below is a sample script:
LOGON TDPID/USER,PASS;
RECORD 2;
BEGIN LOADING usa_states
ERRORFILES usa_states_ET, usa_states_UV
CHECKPOINT 10;
SET RECORD VARTEXT "~";
DEFINE state_id (VARCHAR(100)),
state_name (VARCHAR(100)),
state_abbr (VARCHAR(100)),
state_capital (VARCHAR(100)),
state_larg_city (VARCHAR(100)),
state_population (VARCHAR(100)),
FILE = '\PATH_TO_FILE\US_States_data.txt';
INSERT INTO usa_states (state_id, state_name, state_abbr, state_capital, state_larg_city, state_population)
VALUES (:state_id, :state_name, :state_abbr, :state_capital, :state_larg_city, :state_population);
END LOADING;
LOGOFF;

Pros: Very fast loading of data files. Loads millions of records in seconds.
Cons: Fastload scripts have a different style of writing, so the developer should be well-versed with the syntax. The target table must be empty. There must be no constraints, such as referential integrity, on the target table columns. Not much data manipulation is allowed in the INSERT query in FASTLOAD.

4) Teradata Multiload:
Multiload is another utility which can be used to load data files into Teradata. It also works at block level, not row level, so it is very quick as well. However, it does not require the target table to be empty. It is slightly slower than Fastload.

.LOGTABLE usa_states_log;
.LOGON TDPID/USER,PASS;
.BEGIN MLOAD TABLES usa_states;
.LAYOUT USA;
.FIELD state_id * VARCHAR(100);
.FIELD state_name * VARCHAR(100);
.FIELD state_abbr * VARCHAR(100);
.FIELD state_capital * VARCHAR(100);
.FIELD state_larg_city * VARCHAR(100);
.FIELD state_population * VARCHAR(100);

.DML LABEL LoadLabel;
INSERT INTO usa_states (state_id, state_name, state_abbr, state_capital, state_larg_city, state_population)
VALUES (:state_id, :state_name, :state_abbr, :state_capital, :state_larg_city, :state_population);

.IMPORT INFILE '\PATH_TO_FILE\US_States_data.txt'
FORMAT VARTEXT '~'
FROM 2
LAYOUT USA
APPLY LoadLabel;

.END MLOAD;
.LOGOFF;

Pros: Quick loading of data files. Loads millions of records in seconds.
Cons: Multiload scripts have a different style of writing, so the developer should be well-versed with the syntax.

5) Teradata Parallel Transporter (TPT)
Refer to below link:
http://etl-sql.com/teradata-parallel-transporter-tpt-basics-example-and-syntax/

6) JDBC Connection to connect to Teradata:
If you are a Java developer, you can create a Java program to connect to Teradata and load data files. Kindly refer to the link below for sample Java programs:
https://developer.teradata.com/doc/connectivity/jdbc/reference/current/samplePrograms.html

Components of Timestamp and Format available in Teradata

In Teradata, a timestamp value in a column is made up of several components: a date part (year, month, and day) and a time part (hour, minutes, and seconds). Below are the formats available for these components:

#   Type         Format   Value      Description
1   Year         YY       14         last 2 digits of the year
2   Year         YYYY     2014       complete year
3   Year         Y4       2014       complete year
4   Month        MM       11         2-digit month
5   Month        MMM      Nov        short month name
6   Month        MMMM     November   full month name
7   Day          DD       12         2-digit day of month
8   Day          DDD      316        day of the year
9   Hour         HH       02         hour
10  Minute       MI       59         minute
11  Second       SS       07         second
12  Time period  T        AM         AM or PM marker
13  Blank        B        (space)    inserts a space

You can combine the formats above to display your date, time, or timestamp values.
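As a rough illustration of how these tokens compose, here is a Python sketch that maps the Teradata tokens above onto their closest strftime equivalents. The mapping is my own analogy for illustration, not a Teradata API; replacement order matters so that longer tokens (YYYY before YY) and produced codes (the %B from MMMM) are never partially consumed.

```python
from datetime import datetime

# T and B are replaced first so the %B produced by MMMM is never clobbered;
# within the rest, longer tokens come before their prefixes (YYYY before YY).
TOKENS = [
    ("T", "%p"), ("B", " "),
    ("YYYY", "%Y"), ("Y4", "%Y"), ("YY", "%y"),
    ("MMMM", "%B"), ("MMM", "%b"), ("MM", "%m"),
    ("DDD", "%j"), ("DD", "%d"),
    ("HH", "%I"), ("MI", "%M"), ("SS", "%S"),
]

def render(ts, fmt):
    # Translate a Teradata-style format string into strftime codes, then apply it
    for td_token, py_code in TOKENS:
        fmt = fmt.replace(td_token, py_code)
    return ts.strftime(fmt)

ts = datetime(2014, 11, 12, 14, 59, 7)
print(render(ts, "Y4-MM-DD"))              # 2014-11-12
print(render(ts, "YYYY-MM-DDBHH:MI:SSBT"))  # 2014-11-12 02:59:07 PM
```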

Teradata DATATYPES , RANGE and STORAGE SPACE

Teradata supports many datatypes for table columns. Some are standard ANSI datatypes and some are Teradata extensions to SQL. Below is a list of common datatypes used in Teradata, with the range of acceptable values and the storage space each requires.

DATATYPE                          RANGE                                          SPACE (BYTES)   STANDARD
NUMERIC(n,m)                      n = 1 to 38 total digits; m = 0 to n           1 to 16 *       ANSI
DECIMAL(n,m)                      n = 1 to 38 total digits; m = 0 to n           1 to 16 *       TD
REAL                              2.26 x 10^-308 to 1.797 x 10^308               8               ANSI
DOUBLE PRECISION                  2.26 x 10^-308 to 1.797 x 10^308               8               ANSI
FLOAT                             2.26 x 10^-308 to 1.797 x 10^308               8               TD
BYTEINT                           -128 to 127                                    1               TD
SMALLINT                          -32768 to 32767                                2               ANSI
INTEGER                           -2147483648 to 2147483647                      4               ANSI
BIGINT                            -9223372036854775808 to 9223372036854775807    8               ANSI
DATE                                                                             4               ANSI
TIME                                                                             6               TD
TIMESTAMP                                                                        10              TD
TIME WITH TIME ZONE                                                              8               ANSI
TIMESTAMP WITH TIME ZONE                                                         12              ANSI
PERIOD(DATE)                                                                     8               TD
PERIOD(TIME)                                                                     12              TD
PERIOD(TIME WITH TIME ZONE)                                                      16              TD
PERIOD(TIMESTAMP)                                                                20              TD
PERIOD(TIMESTAMP WITH TIME ZONE)                                                 24              TD
CHAR(n)                           n up to 64000 characters                       n               ANSI
VARCHAR(n)                        n up to 64000 characters                       n + 2           ANSI

* DECIMAL/NUMERIC storage depends on the total number of digits n: 1 to 2 digits -> 1 byte; 3 to 4 -> 2; 5 to 9 -> 4; 10 to 18 -> 8; 19 to 38 -> 16.
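The DECIMAL/NUMERIC storage rule above depends only on the total number of digits (the precision), not the scale. A minimal sketch of that rule, with a hypothetical helper name:

```python
# Byte cost of a Teradata DECIMAL/NUMERIC by precision, per the table above
def decimal_storage_bytes(precision):
    if not 1 <= precision <= 38:
        raise ValueError("Teradata DECIMAL precision must be 1 to 38")
    if precision <= 2:
        return 1
    if precision <= 4:
        return 2
    if precision <= 9:
        return 4
    if precision <= 18:
        return 8
    return 16

print(decimal_storage_bytes(9))   # 4 bytes, e.g. DECIMAL(9,2)
print(decimal_storage_bytes(18))  # 8
print(decimal_storage_bytes(38))  # 16
```

So DECIMAL(10,2) costs twice the space of DECIMAL(9,2); choosing the smallest adequate precision matters at the 2/4, 9/10, and 18/19 digit boundaries.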

How to handle TIMESTAMP with AM PM values coming from Source in Teradata

In Teradata, by default, times are stored in 24-hour format, i.e. 2:00 PM in Teradata would look like 14:00. However, we may come across situations in which timestamp values coming from the SOURCE have AM/PM at the end of the value, like 2013-04-25 02:25:30 PM. If the column in Teradata is defined as TIMESTAMP(0), the value stored in the table would be 2013-04-25 02:25:30, which is wrong; the correct value is 2013-04-25 14:25:30. So how do we handle such values? To store the time properly in Teradata, we need to specify the proper format for the timestamp column, which in this case would be YYYY-MM-DDBHH:MI:SSBT. The T at the end takes care of the AM/PM marker from the SOURCE. So while inserting such timestamp values into Teradata, just CAST the values to the proper TIMESTAMP format as required.

Try executing below two queries:

SEL CAST('2013-04-25 02:26:25 PM' AS TIMESTAMP(0) FORMAT 'Y4-MM-DDBHH:MI:SSBT');
Output: 2013-04-25 14:26:25 [right]
SEL CAST('2013-04-25 02:26:25 PM' AS TIMESTAMP(0));
Output: 2013-04-25 02:26:25 [wrong]
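The same 12-hour-to-24-hour conversion that the FORMAT 'Y4-MM-DDBHH:MI:SSBT' cast performs can be sketched with Python's strptime, where %I is the 12-hour clock and %p consumes the trailing AM/PM marker:

```python
from datetime import datetime

src = "2013-04-25 02:26:25 PM"

# %I = 12-hour hour, %p = AM/PM marker (the analogue of HH ... T above)
ts = datetime.strptime(src, "%Y-%m-%d %I:%M:%S %p")

# Re-render on the 24-hour clock (%H), as Teradata stores it
print(ts.strftime("%Y-%m-%d %H:%M:%S"))  # 2013-04-25 14:26:25
```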

Teradata SQL UNION and SQL UNION ALL with example

Teradata SQL UNION and UNION ALL are used to retrieve data from multiple SELECT blocks and merge them into one resultset. Below is the SQL UNION syntax:

SELECT EMP_NAME, EMP_CITY FROM EMPLOYEES
UNION
SELECT CUST_NAME, CUST_CITY FROM CUSTOMERS;

The above query retrieves the data from the first SQL SELECT query as well as the second SQL SELECT query, merges the data, and displays the result set. SQL UNION removes duplicate entries: if duplicate rows are present, only one copy appears in the final result set.
Below is the SQL UNION ALL syntax:

SELECT EMP_NAME, EMP_CITY FROM EMPLOYEES
UNION ALL
SELECT CUST_NAME, CUST_CITY FROM CUSTOMERS;

SQL UNION ALL operates similarly to UNION and displays a result set combining the results from multiple SELECT clauses. SQL UNION ALL does not remove duplicate entries, so the final result set may have duplicate rows. Since there is no duplicate-row checking, SQL UNION ALL is faster than SQL UNION.
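The difference boils down to list concatenation versus set union. A sketch with hypothetical sample rows (one row appears in both inputs):

```python
# Hypothetical rows: (name, city); ("Asha", "Delhi") appears in both tables
employees = [("Ravi", "Pune"), ("Asha", "Delhi")]
customers = [("Asha", "Delhi"), ("John", "Mumbai")]

# UNION ALL: plain concatenation, the duplicate row survives
union_all = employees + customers
print(len(union_all))  # 4

# UNION: duplicates removed; this dedup pass is why UNION is slower
union = set(employees) | set(customers)
print(len(union))  # 3
```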

Important point to consider while writing SQL UNION & UNION ALL queries:

The SELECT statements should have an equal number of columns, and the column datatypes should be compatible across all SELECT statements.

The UNION or UNION ALL operator takes the technical metadata (column names, types, and widths) for the complete result set from the first SELECT statement. So you should make sure the columns in the first SELECT clause have the highest column width, in order to avoid truncation of column values in the final result set.

So when the target table is the same and there are multiple SOURCES, SQL UNION or UNION ALL can be used.

Teradata Date Functions


In Teradata, there are various functions available for DATE columns. I have seen many people get confused and make mistakes while handling columns of DATE datatype, so I am listing a few of the most basic methods of handling DATE columns.

To fetch the current date:
SELECT DATE;
Teradata SQL Assistant modifies the DATE column and displays it in the format specified in its settings. To display DATE in its actual format:
SELECT CAST(DATE AS VARCHAR(20));
Display the date in a user-specified format:
SELECT CAST(DATE AS DATE FORMAT 'Y4-MM-DD');
Add 2 days to DATE:
SELECT DATE + INTERVAL '2' DAY;
Add 2 months to DATE:
SELECT DATE + INTERVAL '2' MONTH;
Add 2 years to DATE:
SELECT DATE + INTERVAL '2' YEAR;
Subtract two DATEs:
SELECT DATE - (DATE - 2);
Change DATE to TIMESTAMP:
SELECT CAST(DATE AS TIMESTAMP(0));
Select YEAR from DATE:
SELECT EXTRACT(YEAR FROM DATE);
Select MONTH from DATE:
SELECT EXTRACT(MONTH FROM DATE);
Select DAY from DATE:
SELECT EXTRACT(DAY FROM DATE);
Select the LAST day of the LAST month:
SELECT (DATE - EXTRACT(DAY FROM DATE));
Select the FIRST day of the current month:
SELECT (DATE - EXTRACT(DAY FROM DATE)) + 1;
Select the LAST day of the current month:
SELECT (DATE + INTERVAL '1' MONTH) - EXTRACT(DAY FROM ADD_MONTHS(DATE,1));
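The month-boundary tricks above are plain date arithmetic, and can be verified with Python's datetime module (using a hypothetical "current date"):

```python
from datetime import date, timedelta

today = date(2013, 3, 13)  # hypothetical stand-in for DATE

# Last day of last month: DATE - EXTRACT(DAY FROM DATE)
last_day_prev = today - timedelta(days=today.day)
print(last_day_prev)  # 2013-02-28

# First day of current month: one day after the last day of last month
first_day = last_day_prev + timedelta(days=1)
print(first_day)  # 2013-03-01

# Last day of current month: first day of next month, minus one day
next_month_first = (today.replace(day=1) + timedelta(days=32)).replace(day=1)
last_day_cur = next_month_first - timedelta(days=1)
print(last_day_cur)  # 2013-03-31
```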
SELECT DATE IN A FORMAT OTHER THAN THE ONE SUPPLIED: To convert a STRING to DATE, the string value should be supplied in the exact format required by the output DATE FORMAT. If the supplied string is in some other format, first CAST the STRING to DATE and then apply the desired DATE FORMAT to it.
SELECT CAST('2013-02-12' AS DATE FORMAT 'DD/MM/YY');
The above SQL results in an error, because the supplied string is not in the same format as the required output DATE. To overcome this problem, try the SQL below.
SELECT CAST(CAST('2013-02-12' AS DATE) AS DATE FORMAT 'DD/MM/YY');

DATE columns are stored internally as INTEGER by Teradata. To calculate the integer value for any DATE column, in the same manner in which Teradata stores it, use:

(YEAR-1900)*10000 + Month * 100 + Day

If the DATE is 2013-03-13, then Teradata will save it in integer format and the value will be:
SEL (2013-1900)*10000 + 03*100 + 13; --> [1130313]
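The encoding formula above is easy to wrap in a small function (the helper name is my own, for illustration):

```python
# Teradata's internal DATE encoding: (year - 1900)*10000 + month*100 + day
def teradata_date_int(year, month, day):
    return (year - 1900) * 10000 + month * 100 + day

print(teradata_date_int(2013, 3, 13))   # 1130313
print(teradata_date_int(1999, 12, 31))  # 991231  (pre-2000 dates fit in 6 digits)
```

Note how dates after 1999 gain a leading "1" because (year - 1900) exceeds 99; this is why the encoding is unambiguous across the century boundary.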

Example: If we have a table with a DATE column DOJ which has the value '2013-03-13', then we can fetch records like this:

SELECT * FROM EMPLOYEE WHERE DOJ='2013-03-13';

OR

SELECT * FROM EMPLOYEE WHERE DOJ=1130313;

Create Views in Teradata

Views in Teradata can be easily created or modified. Views are very useful and serve several purposes. A VIEW is a virtual table and doesn’t need any physical space to save data. Only the VIEW definition is stored in the DATA DICTIONARY, and every time the VIEW is queried, its SQL statement is executed and the result is returned to the user. VIEWS in Teradata can be created using the following syntax:

CREATE VIEWS IN TERADATA

CREATE VIEW VIEW_NAME (COL1,COL2,COL3) AS SELECT COLUMN1, COLUMN2, COLUMN3 FROM BASE_TABLE;

REPLACE/UPDATE/MODIFY VIEWS IN TERADATA

REPLACE VIEW VIEW_NAME (COL1,COL2,COL3,COL4) AS SELECT COLUMN1, COLUMN2, COLUMN3,COLUMN4 FROM BASE_TABLE;

**Notice there is no column selected as PRIMARY INDEX in VIEW Definition.

ADVANTAGES OF VIEWS IN TERADATA

  • Adds extra level of security by avoiding access to base tables.
  • Helps in giving more proper column names than the original names present in base tables.
  • Allows to limit access to the records present in the base table by the use of WHERE clause.
  • Allows to limit access to the columns present in the base table by SELECTing only required columns.
  • Combine the result from multiple tables and provide single place to access the records.
  • You can use the VIEW to insert or update the data present in the base table.

VIEWS WITH CHECK OPTION

If you create a VIEW with the CHECK option, it will not allow a user to insert or update data in the base table through the view if that data violates the constraints applied in the VIEW definition. Let’s see the example below:

CREATE VIEW EMP AS SELECT EMP_ID, EMP_NAME, EMP_SAL FROM EMPLOYEE WHERE EMP_SAL>50000 WITH CHECK OPTION;

Now if you try to insert into the base table via this view any data which has EMP_SAL<50000, it will throw the following error:

****Failure 3564 Range Constraint: Check error in field …

Do you use VIEWS in TERADATA ? Why do you use VIEWS ? Do you use CHECK option in VIEWS ?

Compression in Teradata

Implementing COMPRESSION in Teradata is one of the best methods for freeing up space in a Teradata warehouse environment. The compression technique has been in Teradata for some time now, and with every new release Teradata works to make it more and more useful.

Guidelines for implementing Compression in Teradata are as follows:

a) Compression is valid only for FIXED width columns.
b) The length limit for CHAR columns is 255 characters or less.
c) The columns should not be a part of PRIMARY INDEX.
d) Adding COMPRESS to any column will, by default, COMPRESS NULLs.
e) COMPRESS values are CASESPECIFIC, so COMPRESS ‘RAJ’ will not COMPRESS ‘raj’.
f) You can add up to 255 values for any column for COMPRESSION.

 


IMPROVEMENTS IN TERADATA 13.10

a) VARCHAR datatype columns are now also eligible for COMPRESSION.
b) The limit has been increased from 255 characters to 510 characters.
c) Also includes other datatypes like GRAPHIC, VARGRAPHIC.

To check for other methods for freeing space check : How to free some space in Teradata Database

Difference between Derived Columns, Derived Tables and SUBQUERIES

There are various kinds of temporary constructs in any RDBMS system, and it is interesting to know the differences between them. In any SQL query, there can be DERIVED COLUMNS, DERIVED TABLES, and SUBQUERIES. There are big differences between the three and their applications. Below are the main differences between DERIVED COLUMNS, DERIVED TABLES, and SUBQUERIES.

a) Derived columns are used in the SELECT clause; derived tables are used in the FROM clause; and subqueries are used in the WHERE clause.

b) Derived columns are calculated for each row; derived tables are calculated once per SQL statement and then used in the SQL block; subqueries can be calculated for each row or once per SQL statement.

c) Derived columns need permanent space if we are storing them in a table; derived tables and subqueries use only spool space, and no space is utilised in the Data Dictionary.
d) Derived tables need distinct column names in the SELECT clause; there is no such requirement for derived columns or subqueries.

If you can think of any more differences between these three , feel free to leave a comment in order to share the knowledge.

Secondary Index in Teradata with Example

What is a Secondary Index in Teradata? A Secondary Index provides an alternate path to retrieve data. It is used only for data retrieval and has nothing to do with data storage; for data storage, Teradata uses the PRIMARY INDEX. So why do we need a Secondary Index when the Primary Index is available for both data storage and data retrieval? There may be situations when a query does not use the Primary Index column for data retrieval, and in such cases retrieval will be very slow. We can then create a Secondary Index on columns which are not part of the PRIMARY INDEX but are used very often in JOIN conditions or other conditions for data retrieval. Like the PRIMARY INDEX, there are two types of SECONDARY INDEX:

Unique Secondary Index:

CREATE UNIQUE INDEX [COLUMN_NAME] ON TABLENAME;

Non-Unique Secondary Index:

CREATE INDEX [COLUMN_NAME] ON TABLENAME;

Creating a SECONDARY INDEX may help with performance optimization; however, it also comes at some cost in terms of resources. Whenever a SECONDARY INDEX is created on a table, a subtable is created on all the AMPs which holds the following information:

SECONDARY INDEX VALUE || SECONDARY INDEX ROW_ID  || PRIMARY INDEX ROW_ID

So whenever we query using a column defined as a SECONDARY INDEX, all AMPs are asked to check their subtable for that value. If they hold it, the AMPs retrieve the corresponding PRIMARY INDEX row_id from their subtable. Then the AMP holding that PRIMARY INDEX row_id is asked to retrieve the respective record. Hence, data retrieval via a Secondary Index is always a 2-AMP or more operation. For a NUSI [Non-Unique Secondary Index], the subtable row is created on the same AMP that holds the base row; however, for a USI [Unique Secondary Index], subtables hold information about rows on different AMPs. A Secondary Index avoids a FULL TABLE SCAN; however, one should COLLECT STATS on Secondary Index columns in order to allow the Optimizer to use the Secondary Index rather than a full table scan.
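The two-step retrieval described above (subtable lookup, then base-row fetch by row-id) can be modelled with two dictionaries. This is a toy model with hypothetical data and row-id names, not Teradata internals:

```python
# Step-2 target: PI row-id -> base row
base_table = {
    "rid-1": ("Alabama", "AL"),
    "rid-2": ("Alaska", "AK"),
}

# Step-1 target: SI value (state_abbr) -> list of PI row-ids
si_subtable = {
    "AL": ["rid-1"],
    "AK": ["rid-2"],
}

def lookup_by_si(abbr):
    # Step 1: probe the SI subtable; Step 2: fetch base rows by PI row-id
    return [base_table[rid] for rid in si_subtable.get(abbr, [])]

print(lookup_by_si("AK"))  # [('Alaska', 'AK')]
```

Without the subtable, finding "AK" would mean scanning every base row, which is exactly the full table scan the Secondary Index avoids.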

Advantages of Secondary Index:

  • Avoids a FULL TABLE scan by providing an alternate data retrieval path.
  • Enhances performance.
  • Can be dropped and created anytime.
  • A table may have multiple Secondary Indexes defined, whereas only one Primary Index is permissible.

Disadvantages of Secondary Index:

  • Needs extra storage space for SUBTABLE.
  • Needs extra I/O to maintain SUBTABLE.
  • Collect STATS is required in order to avoid FULL TABLE SCAN.

To Drop Secondary Index use below command:

DROP INDEX [COLUMN_NAME] ON TABLENAME;

Teradata Volatile Table with Example

In Teradata, we can create several kinds of tables; one of them is the VOLATILE table. Volatile tables are used to store session-specific data that we do not need to store persistently. Once the session is over, all the data and the table definition are lost. Volatile tables use SPOOL SPACE. We can create a VOLATILE table using the minimal syntax below:

CREATE VOLATILE TABLE TABLE_NAME
(
COLUMN_1 DATATYPE,
COLUMN_2 DATATYPE
) PRIMARY INDEX (COLUMN_1)
ON COMMIT [PRESERVE/DELETE] ROWS;

Few Options which you should consider while creating Volatile Tables:

LOG/NO LOG:

If a volatile table is created with the LOG option, it will use the Transient Journal to save “BEFORE IMAGES” of all transactions. This adds to data safety but will impact performance. If the transaction is not that complex, you should go ahead with the NO LOG option. LOG is the default.

ON COMMIT [PRESERVE/DELETE] ROWS:

This is a very important option, and most of the time you will be using the PRESERVE option. If the PRESERVE option is kept, the volatile table will retain its data after each transaction; if the DELETE option is kept, it will lose the data once the transaction is completed.

Options NOT available for VOLATILE tables:

  • Permanent Journals
  • Foreign Key
  • Compression at Column Level
  • Default Values
  • Name of Indexes

Example:

CREATE VOLATILE TABLE SALARY_STATS ,
NO LOG
(
HIGHEST_SAL INTEGER,
AVG_SAL INTEGER,
LOWEST_SAL INTEGER
)
PRIMARY INDEX (HIGHEST_SAL)
ON COMMIT PRESERVE ROWS;

Once the session is over, all the data and the table definition are lost and are never saved in the Data Dictionary. Volatile tables use spool space, and once the session is over, the occupied spool space becomes available again for other transactions.

SQL SELF JOIN EXAMPLE

In SQL, sometimes we need to join a table to itself in order to fetch some data. We basically use this kind of join when all the data is present in one table, but at least one column needs some processing depending on conditions from another row. In such cases, we generally use a self join.

In this kind of join, we join the table to itself by creating different alias names for the same table. The general query for a self join can be:


SELECT tb1.COL1, tb2.COL2, tb2.COL3 FROM table tb1 , table tb2 WHERE tb1.col1 = tb2.col3;

Suppose we have a table with ZONE_ID and the CITY corresponding to that ZONE_ID. If we want ZONE_ID and CITY along with the CITY name of the next ZONE_ID as one row, we may use a SELF JOIN to implement the query.

SELECT TB1.ZONE_ID, TB1.CITY, TB2.CITY FROM OPRT_ZONE TB1 LEFT OUTER JOIN OPRT_ZONE TB2 ON TB1.ZONE_ID + 1 = TB2.ZONE_ID;
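The ZONE_ID + 1 = ZONE_ID self join above can be sketched over in-memory rows (hypothetical data); the left outer join keeps zones that have no successor:

```python
# Hypothetical OPRT_ZONE rows: (ZONE_ID, CITY)
oprt_zone = [(1, "Pune"), (2, "Delhi"), (3, "Mumbai")]

result = []
for zone_id, city in oprt_zone:
    # Match this row (TB1) against the "next zone" row (TB2)
    next_cities = [c2 for z2, c2 in oprt_zone if zone_id + 1 == z2]
    # LEFT OUTER JOIN semantics: keep the row even with no match (None)
    result.append((zone_id, city, next_cities[0] if next_cities else None))

print(result)
# [(1, 'Pune', 'Delhi'), (2, 'Delhi', 'Mumbai'), (3, 'Mumbai', None)]
```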


SQL CROSS JOIN EXAMPLE

SQL CROSS JOIN is used to match all the rows from one table to all the rows from another table, irrespective of any matching condition. So if Table1 has ‘m’ records and Table2 has ‘n’ records, the CROSS JOIN will result in m*n records in the final result set, provided that no filter condition is specified in the SQL query. SQL CROSS JOIN is not preferred in an actual SQL environment, as it may lead to an unexpected result set. It is also a very resource-intensive process, so a SQL CROSS JOIN query may result in huge processing time. SQL CROSS JOIN has very limited use in real-life scenarios, and we will see later in which situations it is considered a good option. The general query for CROSS JOIN is:

SELECT tb1.column1, tb1.column2, tb2.column1 FROM Table1 tb1, Table2 tb2;

Here we can see that we have not specified any JOIN condition. Also, instead of ‘,’ we could have used the CROSS JOIN keyword. The result will consist of all rows of Table1 joined to all rows of Table2.


SELECT OZON.STATE, OZON.CITY, CUST.CUST_ID, CUST.CUST_NAME, CUST.CUST_OPERATOR
FROM OPRT_ZONE OZON
CROSS JOIN CUSTOMERS CUST;

SQL CROSS JOIN is also called SQL PRODUCT JOIN, as it produces a result set consisting of the product of rows from both tables. A SQL CARTESIAN PRODUCT join is when the output of the PRODUCT join is unrestricted and no filter condition is specified in the SQL query; hence it results in m*n rows.
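The m*n blow-up is exactly a Cartesian product, which is easy to demonstrate with itertools (hypothetical rows; m = 2, n = 3):

```python
import itertools

# Hypothetical tables: OPRT_ZONE has m = 2 rows, CUSTOMERS has n = 3 rows
oprt_zone = [("MH", "Pune"), ("DL", "Delhi")]
customers = [(101, "Asha"), (102, "John"), (103, "Ravi")]

# CROSS JOIN = every zone row paired with every customer row
cross = list(itertools.product(oprt_zone, customers))
print(len(cross))  # 6 = 2 * 3
```

With real table sizes this multiplies fast (a million rows against a million rows is 10^12 result rows), which is why an unintended product join dominates a query's processing time.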

SQL FULL OUTER JOIN Example

SQL FULL OUTER JOIN is used to retrieve all the records from both tables participating in the join, irrespective of whether a match exists in the other table. This means FULL OUTER JOIN returns the matching rows as well as all the non-matching rows from both tables. In an actual data warehouse environment we tend to avoid using FULL OUTER JOIN, as it may give unexpected results. The general SQL query for FULL OUTER JOIN is:

SELECT tb1.column1, tb1.column2 , tb2.column1 FROM table1 tb1 FULL OUTER JOIN table2 tb2 ON tb1.column1 = tb2.column1;

In this case it will retrieve all the rows from both tables tb1 and tb2. In our example, say we need all the details about CUSTOMERS as well as the OPERATOR ZONE. In such a case, we may use a FULL OUTER JOIN SQL query:


SELECT OZON.STATE, OZON.CITY, CUST.CUST_ID, CUST.CUST_NAME, CUST.CUST_OPERATOR FROM OPRT_ZONE OZON FULL OUTER JOIN CUSTOMERS CUST ON CUST.CUST_CITY = OZON.CITY;
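The "matched rows plus unmatched rows from both sides" behaviour can be sketched over in-memory rows (hypothetical data); unmatched sides are padded with None, the way SQL pads them with NULL:

```python
# Hypothetical rows: OPRT_ZONE as (CITY, STATE), CUSTOMERS as (ID, NAME, CITY)
oprt_zone = [("Pune", "MH"), ("Delhi", "DL")]
customers = [(101, "Asha", "Pune"), (102, "John", "Goa")]

matched, result = set(), []
for city, state in oprt_zone:
    hits = [c for c in customers if c[2] == city]
    if hits:
        matched.update(h[0] for h in hits)
        result += [(state, city, h[0], h[1]) for h in hits]   # matched pair
    else:
        result.append((state, city, None, None))              # zone-only row
# Customer-only rows (no zone matched their city)
result += [(None, None, c[0], c[1]) for c in customers if c[0] not in matched]

print(result)
# [('MH', 'Pune', 101, 'Asha'), ('DL', 'Delhi', None, None), (None, None, 102, 'John')]
```

Note the result keeps Delhi (no customer) and John (no zone), which is exactly what distinguishes a FULL OUTER JOIN from LEFT or RIGHT outer joins.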
