What is Secondary Index in Teradata ?
Secondary Index in Teradata provides an alternate path to retrieve the data. It is used only for data retrieval and it has nothing to do with data distribution. For data distribution, Teradata use PRIMARY INDEX. So why do we need Secondary Index , when Primary Index is available for Data Storage as well as Data Retrieval ? There may be some situations when the query may not be using Primary Index column for data retrieval. In such cases, data retrieval will be very slow. In such situations we can create Secondary Index on such columns which are not part of PRIMARY Index however are used very often in JOIN conditions or other conditions for data retrieval. Like PRIMARY INDEX, we have two types of SECONDARY Index too:
Unique Secondary Index
CREATE UNIQUE INDEX [COLUMN_NAME] ON TABLENAME;
Non-Unique Secondary Index
CREATE INDEX [COLUMN_NAME] ON TABLENAME;
Creating SECONDARY INDEX may help in performance optimisation however it also comes at some cost in terms of resources. Whenever a SECONDARY index is created on table , a subtable is created on all the AMPs which hold following information:
SECONDARY INDEX VALUE || SECONDARY INDEX ROW_ID || PRIMARY INDEX ROW_ID
So whenever we query using column defined as SECONDARY INDEX, all AMP's are asked to check for their sub-table if they hold that value. If yes, then AMPs retrieve the corresponding PRIMARY INDEX row_id from their subtable. Then the AMP holding the PRIMARY INDEX row_id is asked to retrieve respective records. Hence, Data Retrieval via Secondary Index is always 2 AMP or more AMP operation. For NUSI [Non Unique Secondary Index] subtable is created in the same AMP holding the PRIMARY row_id. However for USI[Unique Secondary Index], subtables hold the information about rows of different AMPs. Secondary Index avoids FULL TABLE scan. However one should collect STATS on Secondary Index columns in order to allow Optimizer to use Secondary Index and not Full Table Scan.
Advantages of Secondary Index
- Avoids FULL Table Scan by providing alternate data retrieval path.
- Enhances performances.
- Can be dropped and created anytime.
- A table may have multiple Secondary Index defined where as only one Primary Index is permissible.
Disadvantages of Secondary Index
- Needs extra storage space for SUBTABLE.
- Needs extra I/O to maintain SUBTABLE.
- Collect STATS is required in order to avoid FULL TABLE SCAN.
Drop Secondary Index
DROP INDEX [COLUMN_NAME] ON TABLENAME;