Sqoop Export Hive Tables into Netezza

Hadoop systems are best suited for batch processing; reporting is generally not recommended on Hadoop Hive or Impala. To enable faster reporting, organizations sometimes transfer the processed data from the Hadoop ecosystem to high-performance relational databases such as Netezza. In this article, we will check Sqoop export of Hive tables into Netezza with working examples.

Sqoop Export Hive Tables into Netezza

In some cases, data processed by the Hadoop ecosystem may be needed in production systems hosted on relational databases to help run additional critical business functions and generate reports. Sqoop can export…

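As a rough sketch of such an export, the Sqoop command below pushes the HDFS files behind a text-format Hive table into a Netezza table. The connection string, user, table names, warehouse path and mapper count are all illustrative assumptions, and the Netezza JDBC driver is assumed to be available to Sqoop.

    # Export the HDFS files behind a Hive table into a Netezza table.
    # Host, database, credentials, table names and path are placeholders.
    sqoop export \
      --connect jdbc:netezza://netezza-host:5480/sales_db \
      --username nz_user \
      -P \
      --table NZ_SALES \
      --export-dir /user/hive/warehouse/sales_db.db/sales \
      --input-fields-terminated-by '\001' \
      --num-mappers 4

The '\001' delimiter matches Hive's default Ctrl-A field separator for managed text tables; adjust it to whatever delimiter your table actually uses.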

Apache Hive User-defined Functions

Apache Hive is a data warehouse framework on top of the Hadoop ecosystem, and its architecture differs from that of other available Hadoop tools. Being an open source project, Apache Hive has added a lot of functionality since its inception, but it still lacks some basic capabilities that are available in traditional data warehouse systems such as Netezza, Teradata and Oracle. In this post, we will check Apache Hive user-defined functions and how to use them to perform a specific task.

Apache Hive User-defined Functions

When you start…

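For context, registering and calling a custom UDF from the Hive CLI typically looks like the sketch below; the jar path, Java class name and the employees table are hypothetical placeholders.

    # Register a (hypothetical) UDF jar and call the function in a query.
    hive -e "
      ADD JAR /tmp/my_udfs.jar;
      CREATE TEMPORARY FUNCTION to_upper AS 'com.example.udf.ToUpper';
      SELECT to_upper(emp_name) FROM employees LIMIT 10;
    "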

Hadoop Hive Regular Expression Functions and Examples

The Hadoop Hive regular expression functions identify precise patterns of characters in a given string. They are useful for extracting strings from the data and for validating existing data, for example validating dates, range checks, checks for characters, and extracting specific characters from the data. In this article, we will check some commonly used Hadoop Hive regular expressions with examples.

Types of Hadoop Hive regular expression functions

As of now, Hive supports only two regular expression functions: REGEXP_REPLACE and REGEXP_EXTRACT.

Hive REGEXP_REPLACE Function

Searches a string for a…

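As a quick illustration of the two functions (the literal values below are made up), REGEXP_REPLACE strips or replaces characters that match a pattern, while REGEXP_EXTRACT pulls out a capture group.

    # Keep only the digits from 'ABC-123-XYZ' (returns 123), then
    # extract the first run of digits via a capture group (also 123).
    hive -e "
      SELECT regexp_replace('ABC-123-XYZ', '[^0-9]', '');
      SELECT regexp_extract('ABC-123-XYZ', '([0-9]+)', 1);
    "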

Hive CREATE INDEX to Optimize and Improve Query Performance

The main goal of creating an INDEX on a Hive table is to improve data retrieval speed and optimize query performance. For example, say you are executing a Hive query with the filter condition WHERE col1 = 100. Without an index, Hive will load the entire table or partition to process the records; with an index on col1, it would load only part of the HDFS file. Be aware, however, that indexes on Hive tables are generally not recommended. Creating an index can help if you are migrating your existing data warehouse to Hive and…

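On Hive versions that still support indexes (they were dropped in Hive 3.0), a compact index on the filter column might be created roughly as follows; the sales table and col1 column are assumed for illustration.

    # Create a compact index on sales.col1 and then build it.
    hive -e "
      CREATE INDEX idx_sales_col1
      ON TABLE sales (col1)
      AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
      WITH DEFERRED REBUILD;
      ALTER INDEX idx_sales_col1 ON sales REBUILD;
    "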

Hive Create View Syntax and Examples

You can use Hive CREATE VIEW to create a virtual table based on the result set of a complex SQL statement that may involve multiple table joins. The CREATE VIEW statement lets you create a shorthand for a more complex query. An Apache Hive view is purely a logical construct (an alias for a complex query) with no physical data behind it. Note that a Hive view is different from a lateral view.

Read: Hive CREATE INDEX to Optimize and Improve Query Performance, Hadoop Hive Bucket Concept and Bucketing Examples, Hive…

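A minimal sketch of a view over a two-table join, assuming hypothetical sales and products tables; once created, the view is queried like any other table.

    # Create a view that hides the join, then query it like a table.
    hive -e "
      CREATE VIEW IF NOT EXISTS sales_summary AS
      SELECT s.product_id, p.product_name, SUM(s.amount) AS total_amount
      FROM sales s
      JOIN products p ON s.product_id = p.product_id
      GROUP BY s.product_id, p.product_name;
      SELECT * FROM sales_summary LIMIT 10;
    "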

Hadoop – Export Hive Data with Quoted Values into Flat File and Example

In general, quoted values are values which are enclosed in single or double quotation marks. Quoted-value files are usually system generated, where every field in the flat file is enclosed in single or double quotation marks. In this article, we will check how to export Hadoop Hive data with quoted values into a flat file such as a CSV file.

Quoted Value File Overview

In quoted-value files, values are enclosed in quotation marks in case there is an embedded delimiter. For example, a comma separated values file…

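One common way to produce such a file is to stage the data through a table that uses the OpenCSVSerde, which writes fields enclosed in quotes, and then merge the table's files into a single local file. The table names and paths below are assumptions, and the output location depends on your warehouse directory and database.

    # Stage the data through a CSV-SerDe table so fields come out quoted,
    # then merge the table's HDFS files into one local CSV file.
    hive -e "
      CREATE TABLE sales_quoted
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
      WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '\"')
      STORED AS TEXTFILE
      AS SELECT * FROM sales;
    "
    hdfs dfs -getmerge /user/hive/warehouse/sales_quoted /tmp/sales_quoted.csv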

Hadoop Security – Hadoop HDFS File Permissions

Hadoop HDFS file permissions are very similar to POSIX file permissions. In a Linux system, we usually create OS-level users and make them members of an existing operating system group; in Hadoop, we create a directory and associate it with an owner and a group.

Hadoop HDFS File and Directory Permissions

The following sections show Hadoop HDFS file and directory permissions. Just like the Linux operating system, Hadoop uses the r and w notation to denote read and write permissions. There is an execute (x) permission for files, but you cannot execute…

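For reference, permissions are inspected and changed with the usual hdfs dfs commands; the paths, user and group below are placeholders.

    # Show permissions, change owner and group, then tighten the mode bits.
    hdfs dfs -ls /data/sales
    hdfs dfs -chown etl_user:analytics /data/sales
    hdfs dfs -chmod 640 /data/sales/part-00000
    hdfs dfs -chmod -R 750 /data/sales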

Migrating Netezza Data to Hadoop Ecosystem and Sample Approach

In my other post, ‘Migrating Netezza to Impala SQL Best Practices’, we discussed various best practices for migrating Netezza SQL scripts to Impala SQL. In this article, we will discuss the steps for migrating Netezza data to the Hadoop ecosystem.

Migrating Netezza Data to Hadoop Ecosystem – Offload Netezza Data to Hadoop HDFS

Nowadays the Hadoop ecosystem is gaining popularity, and organizations with huge data volumes want to migrate to it for faster analytics, including real-time or near real-time processing.

Steps to Migrating Netezza Data to Hadoop Ecosystem…

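A typical first step is to offload a Netezza table into Hive with a Sqoop import; the connection details, table names and mapper count below are illustrative only.

    # Pull a Netezza table into a new Hive table.
    # Host, database, credentials and table names are placeholders.
    sqoop import \
      --connect jdbc:netezza://netezza-host:5480/sales_db \
      --username nz_user \
      -P \
      --table NZ_SALES \
      --hive-import \
      --create-hive-table \
      --hive-table sales_db.sales \
      --num-mappers 4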

Hive Create Table Command and Examples

The syntax for creating a Hive table is quite similar to creating a table using SQL. This article explains the Hive CREATE TABLE command with examples of creating tables in the Hive command line interface. You will also learn how to load data into the created Hive table.

Hive Create Table Command

The Hive CREATE TABLE statement is used to create a table. You can also create a Hive table while importing data using a Sqoop command; to do so, specify the --create-hive-table option in the Sqoop command. You…

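As a small sketch, the commands below create a comma-delimited text table and load a local file into it; the column layout and file path are assumptions.

    # Create a delimited text table and load a local CSV file into it.
    hive -e "
      CREATE TABLE IF NOT EXISTS sales (
        sales_id    INT,
        product_id  INT,
        amount      DECIMAL(10,2),
        sales_date  DATE
      )
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      STORED AS TEXTFILE;
      LOAD DATA LOCAL INPATH '/tmp/sales.csv' INTO TABLE sales;
    "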

Hadoop Hive Bucket Concept and Bucketing Examples

The Hadoop Hive bucketing concept divides a Hive partition into a number of equal clusters, or buckets. Bucketing is very similar to the Netezza ORGANIZE ON clause for table clustering, and it decomposes Hive partitioned data into more manageable parts. Let us check out an example of Hive bucket usage. Say we have a sales table with sales_date, product_id, product_dtl, etc. The Hive table will be partitioned on sales_date and bucketed on product_id, since using product_id as a second-level partition would have led to too many small partitions in HDFS. To tackle this…

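A sketch of that layout, partitioning on sales_date and bucketing on product_id; the column types, bucket count and source sales table are assumptions.

    # Partition by sales_date and spread each partition across 32 buckets.
    # On Hive releases before 2.0 you would also need
    # SET hive.enforce.bucketing = true; newer releases enforce it by default.
    hive -e "
      CREATE TABLE sales_bucketed (
        product_id  INT,
        product_dtl STRING,
        amount      DECIMAL(10,2)
      )
      PARTITIONED BY (sales_date DATE)
      CLUSTERED BY (product_id) INTO 32 BUCKETS
      STORED AS ORC;
      SET hive.exec.dynamic.partition.mode = nonstrict;
      INSERT INTO TABLE sales_bucketed PARTITION (sales_date)
      SELECT product_id, product_dtl, amount, sales_date FROM sales;
    "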