DWgeek.com

Netezza Generate Statistics: A Guide and Best Practices

Netezza uses the cost based optimizer to determine best methods determine redistribution, scan, join, join orders. The optimizer uses statistics information to determine the most efficient way to run a query. System uses the Netezza GENERATE STATISTICS command to generate information about each column proportion of the duplicate values,unique values, NULL values and the maximum and minimum values. How Statistics are collected? Netezza system collects statistics in various ways: Statistics are automatically generated for certain database operations Collect full statistics by running Netezza GENERATE STATISTICS command. This operation is required when table changes significantly, say…

Comments Off

August 13, 2016

Netezza

Netezza nzsql Command and its Usage

The Netezza nzsql command invokes a SQL command interpreter on the Netezza host or any Unix/Linux system that can act as a edge node to connect to Netezza host. Read: Netezza Basic Commands Commonly used basic Netezza Linux Commands You can use the nzsql command to perform all the sql related stuffs. E.g. create database objects, run queries, and manage various databases created on that particular server. You cannot manage the object created in different host or server. Read: nzload Command and its Usage in Netezza Netezza nzsql Internal Slash…

4 Comments

August 12, 2016

Netezza

Analyze and Optimize Netezza Query Performance

Analyze and Optimize Netezza Query Performance is very important part if you are performing the reporting on top of Netezza or processing large amount of data. Netezza query tuning is very important part to improve performance. Netezza uses cost based optimizer to determine the best methods for scan and join operations, orders and redistribute or broadcast operations. Netezza system may redistribute the data for some situations like joins, grouping aggregates, create tables and when loading data into Netezza tables. System will decide the redistribution based on the cost like table…

Comments Off

August 11, 2016

Netezza

Working with Materialized Views in Netezza

When you create a materialized views from a base table, the Netezza system stores the view definition for the lifetime of the SPM view and is visible as a materialized view. SPM view data slices are co-located on the same data slices as the corresponding base table data slices hence increases the performance of the query. A materialized views reduces the width of number of columns being scanned in a base table, these type of view contains a small subset of frequently queried columns. When you query the table (table with large number of…

Comments Off

August 7, 2016

Netezza

How to install vmware player and Netezza Emulator on ubuntu

The Netezza emulator is an emulated Netezza appliance that runs both a virtual host and a virtual S-Blade. It is a fully functional system to be used for initial development and testing only. This tool should never be used in the PRODUCTION environment. It is simply a tool for developers only and you cannot use it for performance optimization. You can use this tool to learn the Netezza appliance. Note: This tutorial is tested on the ubuntu 14.04 Useful Netezza Emulator Information Short information that you might want to know before start using Netezza emulator You should treat…

Comments Off

August 4, 2016

Data Warehouse

Data Warehouse Snowflake Schema Model and Design

Data warehouse Snowflake schema is extension of star schema data warehouse design methodology, a centralized fact table references to number of dimension tables, however, one or more dimension tables are normalized i.e. dimension tables are connected with other dimension tables. Primary Keys from the dimensions flows into fact table as foreign key. Star Schema model in Data Warehouse Data Warehouse Fact Constellation Schema and Design Snowflake schema increases the level of normalization in data, the dimension table is normalized into multiple tables. This schema has a disadvantage in terms of data retrieval, we…

Comments Off

August 3, 2016

Data Warehouse

Step by Step Guide to Dimensional Data Modeling

In this post, you will learn about the step by step guide to dimensional data modeling. You will see how to use dimensional modeling technique in real life scenarios. What is Dimensional data Modeling? Dimensional data modeling is one of the data modeling techniques used in data warehouse design. The main goal of this modeling is to improve the data retrieval, it is optimized for the SELECT operation. Dimensional data modelling is best suited for the data warehouse star and snow flake schema. Dimensional data modeling in data warehouse is different than the…

4 Comments

August 1, 2016

Data Warehouse

Data Warehouse Star Schema Model and Design

Data warehouse Star schema is a popular data warehouse design and dimensional model, which divides business data into fact and dimensions. In this model, centralized fact table references many dimension tables and primary keys from dimension table flows into fact table as a foreign key. This entity-relationship diagram looks star, hence the name star schema. This model divides the business data into fact which holds the measurable data, and dimension that holds descriptive attributes related to the fact data. For examples, fact data includes price, quantity, weight measurements and related dimension attributes example includes product color, sales…

Comments Off

July 30, 2016

Netezza

Working with Netezza Stored Procedures

Netezza stored procedures are used to encapsulate the business logic and same time handle the exceptions. SQL provides the power to get and update the database information on the host server, and the procedure language provides the logic for if-then-else branching and application processing on the data. Read: Netezza RECORD Type Variable, Usage and Examples Netezza Stored Procedure ARRAY Variables and Examples For example, you may want to check the table if its existed in database before dropping it. You achieve this by creating stored procedure. e.g. CALL DROP_IF_EXIST(table_name); if you…

Comments Off

July 30, 2016

Netezza

Working with Netezza Clustered Base Tables (CBT)

A Netezza clustered base tables (CBT) are user table that has data which is organized using one to four organizing keys columns. You can specify max four columns in organize on clause and those columns should not be a part of distribute on clause. An organizing key is a column of the table that you specify for clustering the table records; organizing table helps Netezza to save records in same or nearby extents. You can organize the records using "ORGANIZE ON" clause. Netezza does create zone maps on organizing columns, which will accelerate the performance of queries on that…

Comments Off

July 26, 2016