Optimizing the Netezza table is very important aspect to speed up your data loading and unloading process. In this article, we will check out some tricks to optimize Netezza table structure to improve performance.
There are no specific set of rules to optimize Netezza table structure. But here I have listed some of methods that I use while designing table for better performance.
Read:
Optimize Netezza Table Structure
Typically, when you create the aggregate table or summery table’s then Netezza system may take some time to create table based on the data type that you have defined and distribution type that you have defined on the table.
Optimize Netezza Table Structure – Design Considerations
Below are some of the important Netezza table design considerations:
Data type
Below are the some of the Netezza data type consideration while designing table structure:
Read: List of Netezza Data Types and Best Practices
- The main key in the performance improvement is the data type. Do not use VARCHAR with large record set. If your record set is large VARCHAR, then Netezza performance is degraded. For example, VARCHAR (5000) is not recommended on every field.
- Use INTEGER field wherever possible. Netezza shows improved performance with INTEGER data type.
- Replace NUMERIC column with zero (0) scale to integer data type. For examples, cast NUMERIC (9,0) to INTEGER data type and you will see improved performance.
- Use floating point data types (REAL/DOUBLE PRECESION) only if you are working with large numbers or very small fractions.
- Use CHAR if you expect data to be fixed length.
CTAS – Create Table AS
Use “Create Table AS” method whenever you are re-creating table. With CTAS option, data distributes on the data slices without involving Netezza host, hence faster and it is easier way.
Read: Netezza redistribute the data using CTAS – Create Table AS.
Generate Statistics
Generate the statistics on the table before loading or unloading the table. If the table has latest stats then Netezza shows improved performance.
Read: Generate statistics in Netezza
Distribution Key
Choosing right distribution key is very important aspect of the Netezza performance improvement. You should distribute on the unique column so that there should not be any data skews. If your table is taking long time to load then you may want to revisit the table structure for distribution key.
Read: