Greenplum Encryption Options and Best Practices

To minimize data breaches, companies are increasingly adding security and cryptographic functions to protect their data at rest. This applies to most big data appliances, such as Greenplum, Netezza, and Redshift. In this post we will see how Greenplum encryption works. Greenplum supports data encryption at various levels: encrypting connections to the database, encryption of data in transit, and encryption of data at rest. Database Connections Encryption: In Greenplum systems, connections between clients and the master database can be encrypted with SSL. This…

Netezza Date Functions and Examples

This article gives detailed descriptions and examples of the standard Netezza date functions that you can use to manipulate date columns in Netezza SQL and Netezza stored procedures. In real-world scenarios, many applications manipulate date and time data types. Date types are highly formatted and very complicated. Each date value contains the century, year, month, day, hour, minute, and second. Each RDBMS may employ different date functions, and there may also be differences in the syntax for each RDBMS even when the function call is the…

Quick and best way to Compare Two Tables in SQL

Say you have a requirement to compare two tables. You have two tables in the same database or server that you wish to compare, to check for changes in the column values or to see if any row is missing from either table. Below are some of the methods you can use to compare two tables in SQL. Compare Two Tables using UNION ALL: UNION allows you to compare data from two similar tables or data sets. It also matches NULL values to other NULL values, which JOIN or WHERE…
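
As a rough illustration of the UNION ALL approach described above, here is a minimal sketch using Python's built-in sqlite3 module rather than any particular warehouse. The table names (table_a, table_b), the columns, and the helper diff_tables are all hypothetical and not taken from the original post. The query stacks both tables, groups on every column, and keeps the groups that occur only once, so a row shows up when it is missing from or different in the other table. Because GROUP BY treats NULLs as equal, the NULL amounts for "bob" match each other.

```python
import sqlite3


def diff_tables(conn, left, right, columns):
    """Return rows that appear in only one of the two tables.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    cols = ", ".join(columns)
    query = f"""
        SELECT MIN(src) AS src, {cols}
        FROM (
            SELECT '{left}' AS src, {cols} FROM {left}
            UNION ALL
            SELECT '{right}' AS src, {cols} FROM {right}
        ) AS combined
        GROUP BY {cols}      -- GROUP BY treats NULLs as equal to each other
        HAVING COUNT(*) = 1  -- keep rows seen on one side only
        ORDER BY {cols}
    """
    return conn.execute(query).fetchall()


# Hypothetical demo data: row 3 differs, row 4 exists only in table_b.
conn = sqlite3.connect(":memory:")
for name in ("table_a", "table_b"):
    conn.execute(f"CREATE TABLE {name} (id INTEGER, name TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [(1, "alice", 10), (2, "bob", None), (3, "carol", 30)],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?, ?)",
    [(1, "alice", 10), (2, "bob", None), (3, "carol", 35), (4, "dave", 40)],
)

differences = diff_tables(conn, "table_a", "table_b", ["id", "name", "amount"])
for row in differences:
    print(row)
```

One caveat of this sketch: it assumes neither table contains fully duplicated rows, since a row duplicated inside a single table would have a count of 2 and be hidden from the result.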

Built-in Greenplum Analytics Functions and Examples

Window functions, or Greenplum analytics functions, compute an aggregated value that is based on a group of rows. These functions allow application developers to more easily write complex online analytical processing (OLAP) queries using standard SQL commands. For example, with Greenplum analytics functions or window expressions, users can calculate moving averages or sums over various intervals, compute ranks based on selected column values, etc. Read: Greenplum Computed Column Support and Alternative, Greenplum Architecture. Greenplum Analytic Functions Examples: Here are examples of some commonly used Greenplum analytics functions: COUNT Analytics functions…

Greenplum Skew and How to Avoid it

Greenplum is an MPP shared-nothing environment. Data is spread across many segments located on multiple segment hosts. If the data is distributed properly, no two segments in the system hold the same data. How evenly the data is distributed is determined by the column(s) provided in the DISTRIBUTED BY clause. Greenplum skew is a condition in which a table's rows are spread unevenly across segments, which degrades performance. The system distributes rows with the same distribution value to the same segment. Hence, the more unique values in the distribution column, the better. In case the data…
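
To make the point about unique values concrete, the following sketch compares a high-cardinality key with a low-cardinality key. It is a toy model only: the SHA-256 based placement, the segment count of 8, and the names rows_per_segment and skew_ratio are assumptions made for illustration and do not reproduce Greenplum's internal hash function or its own skew metrics. The ratio reported is simply the busiest segment's row count divided by the average row count, so 1.0 means perfectly even.

```python
import hashlib


def rows_per_segment(keys, num_segments):
    """Count how many rows land on each segment under a toy hash placement."""
    counts = {segment: 0 for segment in range(num_segments)}
    for key in keys:
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        segment = int.from_bytes(digest[:8], "big") % num_segments
        counts[segment] += 1
    return counts


def skew_ratio(counts):
    """Busiest segment's row count divided by the average row count."""
    average = sum(counts.values()) / len(counts)
    return max(counts.values()) / average


NUM_SEGMENTS = 8  # assumed cluster size for the illustration

# 10,000 rows keyed by a unique id: close to an even spread.
unique_keys = range(10_000)
even = skew_ratio(rows_per_segment(unique_keys, NUM_SEGMENTS))

# 10,000 rows keyed by a two-valued column: at most two segments get any rows.
two_valued_keys = ["M" if i % 2 == 0 else "F" for i in range(10_000)]
skewed = skew_ratio(rows_per_segment(two_valued_keys, NUM_SEGMENTS))

print(f"unique key skew ratio:     {even:.2f}")
print(f"two-valued key skew ratio: {skewed:.2f}")
```

With only two distinct key values, at least six of the eight segments stay empty regardless of the hash used, which is why a low-cardinality distribution column is a poor choice.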

Greenplum Interview Questions and Answers – Part1

Explain Greenplum architecture. Read post: Greenplum Architecture. How is data distributed using the hash algorithm? Read post: How Greenplum Hash Distribution Works. What are the different ways to get data into a Greenplum data warehouse? COPY FROM, gpload, INSERT statement, and CREATE EXTERNAL TABLE. Explain how data is stored in Greenplum. Data is stored based on the selected field(s) used for distribution. When you have a distribution key by hash, the values of the distribution key are run through a hash formula. Then, a map is used to distribute the row to the…

How Greenplum Hash Distribution works?

When you have a distribution key by hash and the values in that column are unique, the data will spread evenly across all segments in the Greenplum system. The Greenplum system distributes rows with the same distribution value to the same segment. This is because the values of the distribution key are run through a hashing algorithm. How Does the Hash Algorithm Work in Distributed Systems? Data is stored based on the selected field(s) used for distribution. When you have a distribution key by hash, the values of the distribution key…
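
The idea that equal distribution values always reach the same segment can be sketched in a few lines. This is a simplified model under stated assumptions: the MD5-plus-modulo mapping, the four-segment cluster, and the names segment_for and distribute are hypothetical, and Greenplum's real hash function and segment map differ. What the sketch does show is the property the excerpt describes: placement depends only on the key value, so it is repeatable.

```python
import hashlib


def segment_for(value, num_segments):
    """Map a distribution key value to a segment index (toy hash + modulo)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_segments


def distribute(rows, key, num_segments):
    """Place each row on the segment chosen by hashing its distribution key."""
    segments = {index: [] for index in range(num_segments)}
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


# Hypothetical order rows distributed by customer_id across 4 segments.
orders = [
    {"order_id": 1, "customer_id": "C100"},
    {"order_id": 2, "customer_id": "C200"},
    {"order_id": 3, "customer_id": "C100"},
    {"order_id": 4, "customer_id": "C300"},
    {"order_id": 5, "customer_id": "C100"},
]
placement = distribute(orders, "customer_id", 4)

for index, rows in placement.items():
    print(index, [row["order_id"] for row in rows])
```

All three orders for customer C100 end up in the same list, whichever segment that happens to be, because the segment is a pure function of the key value.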

Greenplum Table Distribution and Best Practices

Greenplum is a massively parallel processing data store, and data is distributed across segments according to the chosen distribution strategy. Greenplum table distribution uses two types of distribution, hash and random. When you create or alter tables, you have to tell the system which distribution it should use. By default, Greenplum database data distribution uses the hash algorithm. Types of Greenplum Data Distribution: Greenplum database distributes data using two methods. Column Oriented/Hash Distribution: Distributes data evenly across all segments using the column specified in DISTRIBUTED BY…

Greenplum Constraints: Table and Column Constraints

Greenplum constraints are used to apply business rules to database tables. You can define constraints on columns and tables to restrict the data in your tables. Greenplum Database support for constraints is the same as PostgreSQL, with some limitations. Read: Greenplum Sequence and its Usage, Greenplum Data Loading Options. Greenplum constraints include: CHECK, NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY. CHECK Greenplum Constraints and Example: A CHECK constraint allows you to specify that the value in a certain column must satisfy a Boolean expression. The Boolean condition will evaluate to…

Access Greenplum Database with No Password Prompt

Users can access a Greenplum database using a PostgreSQL-compatible psql client. Users always connect to the Greenplum database through the master; the segments cannot accept any client connections. Segments only store user data and process the queries distributed by the master. A couple of options are available to set up a connection with no password prompt. Read: Greenplum Architecture, Greenplum Data Loading Options. Option 1. Export Greenplum Database Environment Variables: In order to access a Greenplum database with no password prompt, you need to set up some environment variables. Environment Variable / Description: PGHOST The…
