An Introduction to Cloudera Hadoop Impala Architecture

Cloudera Impala's architecture differs considerably from that of other SQL engines on HDFS, such as Hive. The Impala server is a distributed, massively parallel processing (MPP) database engine, and its architecture resembles that of other distributed databases such as Netezza and Greenplum. Impala consists of several daemon processes that run on specific hosts within your CDH cluster.
Cloudera Hadoop Impala Architecture Overview
Impala consists of three components: the Impala Daemon,…


Netezza Table Locking and Concurrency

You cannot explicitly lock tables in Netezza. Netezza SQL instead applies implicit table locks when a DDL operation runs against a table. For example, a DROP TABLE command is blocked while DML commands are running on the table, and vice versa. Netezza uses serializable transaction isolation to lock tables and is ACID compliant, which ensures no dirty reads and no non-repeatable reads.…


Identify and Remove Netezza Duplicate Records in Table

Netezza does not enforce primary key or unique constraints, so you can insert duplicate records into a table. If you have loaded the same data into a table twice, you can de-duplicate it in several ways. The methods below explain how to identify and remove Netezza duplicate records.
1. Use an Intermediate Table and the DISTINCT Keyword
You can remove Netezza duplicate records by creating another table using DISTINCT…
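The intermediate-table approach mentioned in this excerpt can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Netezza, and the table names `patient` and `patient_dedup` and the helper `dedupe_with_distinct` are hypothetical. The actual Netezza statements in the full post may differ.

```python
import sqlite3


def dedupe_with_distinct(conn, table, tmp_table):
    """Rebuild `table` without duplicate rows via an intermediate table.

    Assumption: SQLite stands in for Netezza here; table names are
    interpolated directly, so only pass trusted identifiers.
    """
    cur = conn.cursor()
    # 1. Copy only the distinct rows into an intermediate table.
    cur.execute(f"CREATE TABLE {tmp_table} AS SELECT DISTINCT * FROM {table}")
    # 2. Drop the original table that contains the duplicates.
    cur.execute(f"DROP TABLE {table}")
    # 3. Rename the intermediate table to the original name.
    cur.execute(f"ALTER TABLE {tmp_table} RENAME TO {table}")
    conn.commit()


# Hypothetical sample data: the same two rows loaded twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO patient VALUES (?, ?)",
    [(1, "a"), (2, "b"), (1, "a"), (2, "b")],
)

# Identify duplicates: group on all columns and keep groups with count > 1.
dupes = conn.execute(
    "SELECT id, name, COUNT(*) FROM patient "
    "GROUP BY id, name HAVING COUNT(*) > 1 ORDER BY id"
).fetchall()

dedupe_with_distinct(conn, "patient", "patient_dedup")
rows = conn.execute("SELECT id, name FROM patient ORDER BY id").fetchall()
```

The GROUP BY ... HAVING COUNT(*) > 1 query identifies the duplicated rows before the rebuild, and the final SELECT confirms that one copy of each row remains.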


How Netezza Update Records in Tables?

Update operations in Netezza are costly. IBM Netezza does not update records in place; instead, it deletes the record and inserts the updated values. When you run an update from nzsql, Netezza marks the record being updated as logically deleted by setting its deletexid field to the current transaction ID, but does not physically delete it. This ensures that the database system adheres to the ACID properties.
How Netezza Update Records in Tables?
Each record in Netezza contains two slots, one for createxid and another for deletexid. Deletexid allows you…
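The delete-then-insert behaviour described in this excerpt can be illustrated with a toy in-memory model. This is a sketch of the idea only, not Netezza's storage engine: the `ToyTable` and `RowVersion` classes are hypothetical, and using 0 to mean "not deleted" is an assumption of this model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RowVersion:
    value: str
    createxid: int       # transaction ID that created this version
    deletexid: int = 0   # assumption in this toy model: 0 means "not deleted"


class ToyTable:
    """Toy model of update-as-logical-delete-plus-insert."""

    def __init__(self) -> None:
        self.versions: List[RowVersion] = []
        self._next_xid = 1

    def _new_xid(self) -> int:
        xid = self._next_xid
        self._next_xid += 1
        return xid

    def insert(self, value: str) -> int:
        xid = self._new_xid()
        self.versions.append(RowVersion(value, createxid=xid))
        return xid

    def update(self, old: str, new: str) -> int:
        xid = self._new_xid()
        # Iterate over a copy because new versions are appended in the loop.
        for version in list(self.versions):
            if version.value == old and version.deletexid == 0:
                # Logical delete: stamp the old version, keep it on "disk".
                version.deletexid = xid
                # Insert the updated values as a new version.
                self.versions.append(RowVersion(new, createxid=xid))
        return xid

    def visible(self) -> List[str]:
        # Only versions that were never logically deleted are visible.
        return [v.value for v in self.versions if v.deletexid == 0]


table = ToyTable()
table.insert("a")
table.update("a", "b")
```

After the update, the old version is still stored but carries the updating transaction's ID in `deletexid`, while the new version carries the same ID in `createxid`. This is why an update costs roughly a delete plus an insert.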


Netezza Encrypt Password with nzpassword Command Utility

Database user accounts must be authenticated on every access request to the IBM Netezza database. You can secure the password by using the Netezza password encryption facility. Local authentication requires a password for every account that connects to the Netezza server. You must enter the clear-text password when you use Netezza CLI commands. You can set the environment variable NZ_PASSWORD to avoid typing the password every time, but this variable also stores the password in clear text.…


Netezza Fixed-Width File Loading and Examples

Fixed-width text files are a special case of text files in which the format is specified by column widths, pad character, and left or right alignment. In this format, column widths are measured in characters. In this post we will learn about Netezza fixed-width file loading.
Fixed-Width File Overview
All data is a series of byte sequences and has an associated data type, used here as a conceptual or abstract attribute of the data. Fixed-length format files use ordinal positions, which are offsets that identify where fields are within…
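To make the idea of column widths, pad characters, and offsets concrete, here is a minimal Python sketch of reading one fixed-width record. It does not use any Netezza loader; the `parse_fixed_width` function and the `id`, `name`, and `amount` layout are hypothetical and chosen only for illustration.

```python
def parse_fixed_width(line, layout, pad=" "):
    """Split one fixed-width line into named fields.

    `layout` is a list of (name, start_offset, width) tuples, where
    offsets and widths are measured in characters.
    """
    record = {}
    for name, start, width in layout:
        raw = line[start:start + width]
        # Strip the pad character from both sides so that left- and
        # right-aligned fields are handled the same way.
        record[name] = raw.strip(pad)
    return record


# Hypothetical layout: 5-char id, 10-char name, 8-char amount.
layout = [("id", 0, 5), ("name", 5, 10), ("amount", 15, 8)]

# Build a sample line: name is left-aligned, amount is right-aligned.
line = "00042" + "Alice".ljust(10) + "123.45".rjust(8)

record = parse_fixed_width(line, layout)
```

Because every field starts at a known offset, no delimiter is needed; the trade-off is that the layout must be agreed on in advance by the writer and the reader.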


Netezza Best Practices to Improve Performance

Demand for advanced analytics on big data is increasing. Netezza is designed with built-in functionality to perform advanced analytics on very large data sets. To improve performance, you should follow some Netezza best practices. Best practice should not mean hundreds of rules and regulations. It is recommended that you follow basic principles for these Netezza features: distribution, data types, statistics, zone maps, clustered base tables, and the GROOM TABLE command.
Netezza Best Practices on Distributions
In a Netezza data warehouse appliance, good distribution is a fundamental element…


IBM Bluemix Speech TO Text Transcription in Python – Tutorial

Speech recognition and sentiment analysis are important applications of machine learning. In this tutorial, we will create a speech-to-text transcription file with IBM Bluemix in Python and copy that file to Hadoop HDFS for further analysis. Once the data is in HDFS, you can process it to get the desired results.
IBM Bluemix Speech to Text Transcription in Python - Steps
Below are the…


Netezza Cross Database Access and its Restrictions

Netezza cross-database access allows you to reference objects such as tables, views, and synonyms in other databases on the same Netezza server. You can INSERT, UPDATE, or DELETE data in the current database while referring to objects in another database on the same server. For example: TRAINING1.ADMIN(ADMIN)=>SELECT * FROM TRAINING1..TEST1;
Referencing Database Object from other Database
To access objects in other databases on the same Netezza system, you must use three-level…
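The `TRAINING1..TEST1` reference in the example uses a database name, an empty schema slot, and an object name. A small Python helper can sketch how such a reference is assembled. The `qualify` function is hypothetical, and the schema-qualified form it can also produce is an assumption that is not shown in the excerpt.

```python
def qualify(database, obj, schema=""):
    """Build a cross-database object reference.

    With an empty schema this yields the `DATABASE..OBJECT` form used in
    the example above; passing a schema yields `DATABASE.SCHEMA.OBJECT`
    (an assumption, not shown in the excerpt).
    """
    if not database or not obj:
        raise ValueError("database and object names are required")
    return f"{database}.{schema}.{obj}"


# Reproduce the query from the example above.
query = f"SELECT * FROM {qualify('TRAINING1', 'TEST1')};"
```

Here `query` holds the same statement as the nzsql example, built from its parts rather than typed by hand.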


Migrating Netezza to Impala SQL Best Practices

Nowadays many organizations want to migrate their analytics, including real-time or near-real-time workloads, to a Hadoop environment. In this post I will explain some best practices for migrating Netezza SQL to Impala. Impala uses standard SQL, but you might still need to modify the source SQL when bringing a specific application to Impala because of differences in data types, built-in functions, and Hadoop-specific syntax. Even if the SQL works correctly in Impala, you might consider rewriting it to improve performance.…
