Working with Netezza Zone Maps and Best Practices

  • Post author:
  • Post last modified:February 28, 2018
  • Post category:Netezza
  • Reading time:5 mins read

Before going to Netezza zone maps first let’s understand extent.  An extent is smallest unit of storage on the disk. In fact it is smallest disk allocation on storage to store the data. Netezza would maintain the maximum and minimum value of the column that is stored in an extent (each extent is of 3MB in size)

What are Netezza Zone maps?

Netezza Zone maps is an internal mapping structure to the extent (smallest unit of storage) that takes advantage of internal ordering of the data.  These are automatically generated internal table structures that the Netezza system uses to improve the response time of SQL queries when queries data consists of large grouped or nearly ordered date, timestamp, byteint, smallint, integer, and bigint data types. Netezza does not create zone maps for numeric types that are larger than 8 bytes.

Read:

Netezza Zone Maps Example

Zone maps transparently reduce disk scan operations by eliminating the un-referenced data. A zone map contains MIN and MAX of the data in each extent.  Whenever Netezza SQL query executed with WHERE clause, the Netezza system first scans the created zonemaps to find the addresses of minimum and maximum extents that contains the column values. Then Netezza scans the column values in sequential order, and results given to client once the matches found.

Netezza zone maps

Let’s understand the zone maps with an example, consider below query.

SELECT * FROM Patient_table WHERE Date='3-Jan' and Pat_id = 620;

When you execute above query, Netezza look for the zone maps created on the date column and finds out the date range to be considered for query processing. From the above diagram, it is clear that Netezza will not scan the page 1 and 3.

When Netezza Zone Maps are created ?

Netezza creates zone maps in below scenarios

Netezza does not create zone maps for very small table which is < 10 MB (default size). Zone Maps will give excellent results if table is ordered on most restricting column (i.e. column used in WHERE clause). This would reduce the I/O operations since all the rows will be grouped in the same or nearby extents and rest of extent need not be scanned.

Netezza does not create zone map info for numeric types that are larger than 8 bytes. Check if the smaller numeric columns can be converted to integral types to exploit zone maps.

To verify all the columns on which Netezza can create zone maps, use nz_zonemap from the sql toolkit. If you don’t have that installed, get it done ASAP as it has lot of useful syntax and script, those will make your life much easier.

Query performance and Netezza Zone Maps Best Practices

Below are the best practices on Zone Maps:

  • Design ETL architecture and batch jobs with Zone Maps in mind that is, load in sets of data (dates, stores, sites, etc)
  • Primary Key Lookups that is, sort table on primary key, PK lookups behave like an index
  • Validate/Check . If you rebuilds a table with CTAS with different DISTRIBUTION and zone maps are lost.
  • nz_zonemap script data is sorted on your table

If the table size on data slice exceeds 192 GB, you could get error. Follow below mentioned steps to handle this situation:

  • Perform GROOM on the table to remove deleted records. This will free up some space on the data slice.
  • Redistribute the data on different column if existing column is causing high skew
  • Create smaller table out of big table and create a view with a UNION ALL operation
  • Use truncate table statement over delete.
  • Use nz_zonemap script to know how data is sorted in your table

Even query planner wont shows if zone maps are used. Zone map debug tools are very helpful in getting information on zone maps. Use the set enable_zmap_debug=1 in nzsql script to find information on zone maps usage in SQL. This command will enable zone maps debug for only current session. It will be disable automatically once the sql execution is completed.

During poor query performance, always remember to verify if zone maps are disabled zone maps or un-ordered data or deleted records are still available in database (forgot to run groom command?). Generate statistics command will disable the zone maps. If by chance, the generate statistics command is interrupted, that will lead to disabled zone maps. Netezza will never scan the disable zone maps. One should re-generate the statistics on the column to re-enable zone maps.

This Post Has One Comment

  1. Thomas

    Great post! Have nice day ! 🙂 nbvoc

Comments are closed.