Greenplum Sequence and its Usage

Like any other data warehouse appliances, Greenplum has sequences. Greenplum sequence is an auto number generator. These sequence then can be used in any SQL statements. Greenplum Sequence Overview CREATE SEQUENCE creates a new sequence number generator. This command willalso creates the special single-row table and initialize it. Sequence will be owned by the user creating it. Read: Greenplum Data Loading Sequence Also check: Greenplum Unloading Data Syntax: CREATE SEQUENCE name [Options] Following are the options associated with Greenplum sequence. [INCREMENT [BY] value] [MINVALUE minvalue | NO MINVALUE] [MAXVALUE maxvalue…

Continue ReadingGreenplum Sequence and its Usage
Comments Off on Greenplum Sequence and its Usage

Greenplum Unloading Data Examples

Greenplum can supports unloading large amounts of data. It also supports single file, non-parallel export for small amounts of data. Greenplum unloading data is supported by various methods such as COPY TO command and create writable external tables. In this post you will learn how to unload data from Greenplum Database using parallel unload (writable external tables) and non-parallel unload (COPY) Read: Greenplum Data Loading Options How to Export Vertica Query Results into CSV? Create Writable External Tables Greenplum writable external table uses the Greenplum distributed file server, gpfdist to create file…

Continue ReadingGreenplum Unloading Data Examples
Comments Off on Greenplum Unloading Data Examples

Greenplum Data Loading Options

Being a MPP server, Greenplum supports parallel data loading for large amounts of data. It also supports single file, non-parallel import for small amounts of data. Greenplum data loading is supported by various methods as follows. Read: Greenplum Architecture Data Loading Options Greenplum supports following tools for loading 1. Greenplum data Loading with gpload Command The gpload Greenplum data loading utility is an interface to external table parallel loading feature. gpload uses a load specification or layout defined in a YAML formatted control file to load data into the target table…

Continue ReadingGreenplum Data Loading Options
Comments Off on Greenplum Data Loading Options

Greenplum Architecture

Like IBM Netezza and Amazon Redshift, Greenplum database is a massively parallel processing (MPP) database server. Greenplum architecture is designed to manage large scale data warehouse for analytics and business intelligence needs. Like any other large scale data warehouse appliances, Greenplum works well with Dimensional modeling. Read: Star Schema Model in Data Warehouse Step By Step Guide to Dimensional Modeling Greenplum Architecture Overview The MPP environment shared nothing architecture is made up of two or more processor that work together to perform tasks. Each processor has its own memory, operation…

Continue ReadingGreenplum Architecture
Comments Off on Greenplum Architecture

Netezza Analytic Functions

Netezza analytic functions compute an aggregate value that is based on a group of rows. A Netezza SQL analytic function works on the group of rows and ignores the NULL in the dat.  Netezza server is basically an analytics system and provides many useful functions that can perform day to day aggregations. As these functions are native to Netezza server, hence use of these Netezza analytic functions improves performance of SQL queries as well as server performance. Frequently used Netezza analytical functions are as follows: COUNT Analytic Function Returns the number of…

Continue ReadingNetezza Analytic Functions
Comments Off on Netezza Analytic Functions

Collocated Joins in Netezza: Optimizing Query Performance

In Netezza, if two tables are distributed on same column then such tables are called collocated tables. If you join two collocated tables than each SPU in system works 100% independent to each other as the relevant rows required for joining these tables exist in the same SPU. These types of joins are called collocated joins. Collocated Joins Example Let’s take an examples, assume that we have two tables a patient and hospital, both are distributed on patient id. Netezza distributes the rows with same distribution key (column) to same…

Continue ReadingCollocated Joins in Netezza: Optimizing Query Performance
2 Comments

Netezza Joins and it’s Algorithms

Netezza joins combines the columns of the reports from the more than one table. SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them. The best way to understand the Netezza joins is to get to know the SQL joins and then consider the algorithm that is used by database to resolve them. Read: Netezza nzsql Command and its Usage Netezza Generate Statistics: A Guide and Best Practices Commonly used Netezza Utilities Netezza Joins: Equi Join An equijoin is a join…

Continue ReadingNetezza Joins and it’s Algorithms
Comments Off on Netezza Joins and it’s Algorithms

Netezza Generate Statistics: A Guide and Best Practices

Netezza uses the cost based optimizer to determine best methods determine redistribution, scan, join, join orders. The optimizer uses statistics information to determine the most efficient way to run a query. System uses the Netezza GENERATE STATISTICS command to generate information about each column proportion of the duplicate values,unique values, NULL values and the maximum and minimum values. How Statistics are collected? Netezza system collects statistics in various ways: Statistics are automatically generated for certain database operations Collect full statistics by running Netezza GENERATE STATISTICS command. This operation is required when table changes significantly, say…

Continue ReadingNetezza Generate Statistics: A Guide and Best Practices
Comments Off on Netezza Generate Statistics: A Guide and Best Practices

Netezza nzsql Command and its Usage

The Netezza nzsql command invokes a SQL command interpreter on the Netezza host or any Unix/Linux system that can act as a edge node to connect to Netezza host. Read: Netezza Basic Commands Commonly used basic Netezza Linux Commands You can use the nzsql command to perform all the sql related stuffs. E.g. create database objects, run queries, and manage various databases created on that particular server. You cannot manage the object created in different host or server. Read: nzload Command and its Usage in Netezza Netezza nzsql Internal Slash…

Continue ReadingNetezza nzsql Command and its Usage
4 Comments

Analyze and Optimize Netezza Query Performance

Analyze and Optimize Netezza Query Performance is very important part if you are performing the reporting on top of Netezza or processing large amount of data. Netezza query tuning is very important part to improve performance. Netezza uses cost based optimizer to determine the best methods for scan and join operations, orders and redistribute or broadcast operations. Netezza system may redistribute the data for some situations like joins, grouping aggregates, create tables and when loading data into Netezza tables. System will decide the redistribution based on the cost like table…

Continue ReadingAnalyze and Optimize Netezza Query Performance
Comments Off on Analyze and Optimize Netezza Query Performance