Improve performance with help of Netezza Query Plan

  • Post author:
  • Post last modified:February 28, 2018
  • Post category:Netezza
  • Reading time:3 mins read

Netezza uses the cost based optimizer to determine the best methods for scan, join order and data movement between the SPUs. Planner uses the Netezza query plan to redistributed or broadcast your data. You can use EXPLAIN VERBOSE <your_quey> command to display the Netezza query plan.

Netezza Query Plan

The planner always tries to avoid redistributing large tables because of the performance impact. Decisions about redistribution are made by the planner and are based on costs like expected table sizes. Optimizer also uses the Netezza query plan to dynamically rewrite queries to improve query performance.

Read:

Netezza Execution Plans

The optimizer uses the following statistics information to determine the optimal execution plan queries:

  • The number of rows in the table
  • Number of unique or distinct values of each available column
  • Number of NULL values in each column
  • The minimum and maximum values of each available column

Optimizer should have up-to-date statistics to generate best execution plan for queries.

Display Execution and Netezza Query Plan

Syntax for EXPLAIN command is:

EXPLAIN [VERBOSE] <your_query>;

Below are the some commonly observed keywords in Netezza query plan:

  • Cost: Resource utilization to execute given query. Cost will be in milliseconds
  • Hash Join [type]: Fetches the rows from table based on ROWID
  • Sequential scan: Reading full table
  • Width: Maximum bytes in each row
  • Group: Grouping of records to eliminating duplicate rows
  • Aggregate: Netezza Aggregate operation
  • Sort: Sorting

Performance Improvement using Netezza Query Plan and Execution

If any query that is taking lot of time to execute, you can check the query execution plan. Follow below steps to improve the query performance:

  • Use the EXPLAIN command to display the execution plan for specific long running queries
  • Analyze the query plan.
  • Review the estimated costs. Check if they are reasonable
  • Check if the system is performing table broadcasts or distributes? If there is a broadcast, is it on a small table or small result set? If there is distribute, validate the distribution for the tables.
  • Review the scan and join Check if largest table is being scan last
  • Check for up to date statistics
  • Run query again if you have validated above steps