In this post, we will walk through a very simple example: we will create a Redshift table with a basic structure and then see which additional properties Redshift adds to it by default. We will briefly discuss those properties here, and in subsequent posts we will see how they affect the query performance of these tables. I will be referring to TPC-H tables and queries in the Redshift-related posts.
Let’s create a sample table now.
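A minimal sketch of such a statement is shown below. The column names and datatypes are an assumption based on the standard TPC-H PART table, so the exact DDL may differ; the point is only that nothing beyond a name and a datatype is given for each column.

```sql
-- Sketch only: columns assumed from the TPC-H PART table definition.
-- No ENCODE, DISTSTYLE, DISTKEY or SORTKEY is specified on purpose.
CREATE TABLE h_part (
    p_partkey     BIGINT,
    p_name        VARCHAR(55),
    p_mfgr        CHAR(25),
    p_brand       CHAR(10),
    p_type        VARCHAR(25),
    p_size        INTEGER,
    p_container   CHAR(10),
    p_retailprice DECIMAL(15,2),
    p_comment     VARCHAR(23)
);
```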
So I am creating the “h_part” table with a few columns, and I have specified only the datatype for each column. This is the minimum you must specify in order to create a table in Redshift. Now let’s check the table definition in Redshift.
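One way to do this, assuming the table lives in a schema on your search_path, is the SHOW TABLE command, which prints the full DDL including the properties Redshift filled in. The PG_TABLE_DEF catalog view gives a per-column view of the same information. The encodings you see may vary with the Redshift version and the column datatypes, so no expected output is shown here.

```sql
-- Print the complete table definition, including defaults added by Redshift.
SHOW TABLE h_part;

-- Per-column view: datatype, encoding, and distribution/sort key flags.
-- PG_TABLE_DEF only lists tables in schemas that are on the search_path.
SELECT "column", type, encoding, distkey, sortkey
FROM pg_table_def
WHERE tablename = 'h_part';
```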
Two important points to notice here:
1) ENCODE: An appropriate encoding (compression technique) is added to each column. Since Redshift is a columnar database, it can apply a compression algorithm suited to each column's datatype rather than one uniform compression scheme for the entire table.
2) DISTSTYLE: A distribution style of “AUTO” is added to the table. It behaves as “ALL” while the table is small and switches to “EVEN” as the table grows. (Redshift may also choose “KEY” for a growing table if it finds a suitable distribution column.)
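To see which style AUTO has currently settled on, a query against the SVV_TABLE_INFO system view is one option. This is only a sketch: the view returns no row for an empty table, so it becomes useful once data has been loaded, at which point the diststyle column shows values such as AUTO(ALL) or AUTO(EVEN).

```sql
-- Effective distribution style, size (in 1 MB blocks) and row count.
-- Returns no rows while h_part is still empty.
SELECT "table", diststyle, size, tbl_rows
FROM svv_table_info
WHERE "table" = 'h_part';
```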
We will not change anything for now. We will keep the table structure as-is and proceed to data loading in the next post. We will talk about these properties in more detail in later posts.