Hadoop Hive Bucket Concept and Bucketing Examples

The Hadoop Hive bucket concept divides a Hive partition into a number of equal clusters, or buckets. Bucketing is very similar to the Netezza ORGANIZE ON clause used for table clustering. A Hive bucket decomposes Hive partitioned data into more manageable parts. Let us check out an example of Hive bucket usage. Say we have a sales table with sales_date, product_id, product_dtl, etc. The Hive table will be partitioned on sales_date; partitioning on product_id as a second level would have led to too many small partitions in HDFS. To tackle this…
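
As a rough illustration of the idea, here is a minimal sketch of a bucketed, partitioned sales table; the column types, the amount column, the bucket count and the exact DDL are assumptions made for the example rather than taken from the original article.

    -- Sketch only: partition on sales_date and bucket on product_id
    -- instead of creating a second partition level.
    CREATE TABLE sales (
        product_id   INT,
        product_dtl  STRING,
        amount       DECIMAL(10,2)      -- assumed extra column
    )
    PARTITIONED BY (sales_date STRING)
    CLUSTERED BY (product_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Older Hive releases require bucketed inserts to be enabled explicitly.
    SET hive.enforce.bucketing = true;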


Load HBase Table from Apache Hive – Examples

In my other post, “Sqoop import Relational Database Table into HBase Table”, you learned how to import data from relational database systems such as Netezza into HBase. We have also seen how to export HBase table data to a relational database using the Hive framework. In this article, we will check how to load an HBase table from Apache Hive with an example. Why would you want to load an HBase table from Apache Hive? You may offload part of the…
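
A minimal sketch of the approach, assuming a hypothetical Hive staging table and HBase table name; the column mapping shown is illustrative only.

    -- Hive table backed by an HBase table via the HBase storage handler.
    CREATE TABLE hbase_sales (
        rowkey      STRING,
        product_id  STRING,
        amount      STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:product_id,cf:amount')
    TBLPROPERTIES ('hbase.table.name' = 'sales');

    -- Load the HBase table from an ordinary Hive table (assumed staging table).
    INSERT INTO TABLE hbase_sales
    SELECT sales_id, product_id, amount FROM sales_staging;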


Apache Hive Load Quoted Values CSV File and Examples

If you are reading this post, you are probably considering Big Data or have already started using the Big Data ecosystem for your large-scale data processing. Huge data means you may receive all different kinds of structured, unstructured and semi-structured data. Hive is just like your regular data warehouse appliances, and you may receive files with single- or double-quoted values. In this article, we will see how Apache Hive loads quoted values from CSV files and look at some examples. Apache Hive Load Quoted Values CSV File Let us…
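
One common way to handle quoted fields, sketched here with assumed column names and an assumed HDFS location, is the OpenCSVSerde that ships with Hive; treat the separator and quote characters as example settings.

    -- Sketch: external table over a CSV file whose values are wrapped in double quotes.
    -- Note: OpenCSVSerde exposes every column as STRING.
    CREATE EXTERNAL TABLE sales_csv (
        sales_id    STRING,
        product_id  STRING,
        amount      STRING
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
        'separatorChar' = ',',
        'quoteChar'     = '"'
    )
    STORED AS TEXTFILE
    LOCATION '/data/sales_csv';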


Netezza Select Random Rows and Example

If you are working on a data warehouse or any database query, you might have received a request to get a random sample of rows based on some key columns. In this article, we will check Netezza select random rows in nzsql, with explanations and examples. This article also covers Netezza random samples that you may use in other client applications. Netezza Select Random Rows To demonstrate selecting random rows in Netezza, we will use the Netezza random() built-in function. Netezza Select Random Rows Example Suppose you…
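
A minimal sketch of the idea, using a hypothetical sales table; Netezza's random() function returns a value between 0 and 1, so it can be used either to sample rows or to order them randomly.

    -- Roughly 10% random sample of rows.
    SELECT * FROM sales WHERE random() <= 0.1;

    -- Pick 10 rows at random.
    SELECT * FROM sales ORDER BY random() LIMIT 10;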


Sqoop Export HBase Table into Relational Database

You can use Apache Sqoop to export an HBase table into a relational table (RDBMS). Sqoop does not support direct export from HBase to relational databases, so you have to use a workaround to export the data. In this article, we will check Sqoop export of an HBase table into a relational database, with steps and examples. Sqoop Export HBase Table into Relational Database The HBase structure does not map very well to a typical relational database such as Netezza, Oracle, SQL Server, etc. In relational databases, a fixed schema for the tables…
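
A sketch of the Hive side of one such workaround, with assumed table and column names: expose the HBase table to Hive through the HBase storage handler, stage the rows as plain files on HDFS, and then export the staged directory with sqoop export.

    -- Hive view of the existing HBase table (illustrative column mapping).
    CREATE EXTERNAL TABLE hbase_orders (
        rowkey   STRING,
        cust_id  STRING,
        amount   STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:cust_id,cf:amount')
    TBLPROPERTIES ('hbase.table.name' = 'orders');

    -- Stage the data as delimited text; the staged directory can then be pushed
    -- to the relational database with: sqoop export --export-dir /staging/orders ...
    INSERT OVERWRITE DIRECTORY '/staging/orders'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    SELECT rowkey, cust_id, amount FROM hbase_orders;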


Apache Hive Different File Formats: TextFile, SequenceFile, RCFile, AVRO, ORC, Parquet

Apache Hive supports several familiar file formats used in Apache Hadoop. Hive can load and query data files created by other Hadoop components such as Pig or MapReduce. In this article, we will check the different Apache Hive file formats: TextFile, SequenceFile, RCFile, AVRO, ORC and Parquet. Cloudera Impala also supports these file formats. Hive Different File Formats Different file formats and compression codecs work better for different data sets in Apache Hive. Following are the different Apache Hive file formats: Text File Sequence File RC File…
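
In Hive the file format is chosen per table with the STORED AS clause; a short sketch with assumed table and column names is shown below.

    -- Same logical columns, different on-disk formats (sketch only).
    CREATE TABLE sales_text    (sales_id INT, amount DECIMAL(10,2)) STORED AS TEXTFILE;
    CREATE TABLE sales_seq     (sales_id INT, amount DECIMAL(10,2)) STORED AS SEQUENCEFILE;
    CREATE TABLE sales_rc      (sales_id INT, amount DECIMAL(10,2)) STORED AS RCFILE;
    CREATE TABLE sales_avro    (sales_id INT, amount DECIMAL(10,2)) STORED AS AVRO;
    CREATE TABLE sales_orc     (sales_id INT, amount DECIMAL(10,2)) STORED AS ORC;
    CREATE TABLE sales_parquet (sales_id INT, amount DECIMAL(10,2)) STORED AS PARQUET;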


Hadoop Hive WITH Clause Syntax and Examples

With the help of the Hive WITH clause, you can reuse a piece of a query result within the same query construct. You can also improve a Hadoop Hive query using the WITH clause: you can simplify the query by moving complex, repetitive code into the WITH clause and referring to the logical table it creates in your SELECT statements. Hadoop Hive WITH Clause A Hive WITH clause can be added before the SELECT statement of your query to define aliases for complex expressions that are referenced multiple times within the body of the…
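
A minimal sketch, assuming a hypothetical sales table, of how the WITH clause names an intermediate result that the main SELECT can reuse.

    -- Sketch: compute per-product totals once and reuse them.
    WITH product_totals AS (
        SELECT product_id, SUM(amount) AS total_amount
        FROM sales
        GROUP BY product_id
    )
    SELECT product_id, total_amount
    FROM product_totals
    WHERE total_amount > 1000;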


Hadoop Hive Conditional Functions: IF, CASE, COALESCE, NVL, DECODE

Hadoop Hive supports various conditional functions such as IF, CASE, COALESCE, NVL, DECODE, etc. You can use these functions to test equality, apply comparison operators and check whether a value is null. The following diagram shows the various Hive conditional functions: Hive Conditional Functions The table below describes the various Hive conditional functions: Conditional Function Description IF(boolean testCondition, T valueTrue, T valueFalseOrNull); This is one of the most useful Hive conditional functions and is similar to the IF statement in other programming languages. The IF conditional function tests an expression and returns a corresponding…
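
A few hedged examples, against an assumed employees table, of how some of these conditional functions look in a query; the column names are made up for illustration.

    -- Sketch of IF, CASE, COALESCE and NVL.
    SELECT
        IF(salary > 50000, 'high', 'low')                      AS salary_band,
        CASE WHEN dept = 'HR' THEN 'support' ELSE 'other' END  AS dept_group,
        COALESCE(bonus, commission, 0)                         AS incentive,
        NVL(middle_name, 'n/a')                                AS middle_name
    FROM employees;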


Hadoop Hive Date Functions and Examples

Many applications manipulate date and time values. The latest Hadoop Hive query language supports most of the relational database date functions. In this article, we will check commonly used Hadoop Hive date functions and some examples of their usage. Hadoop Hive Date Functions Date types are highly formatted and quite complicated. Each date value contains the century, year, month, day, hour, minute, and second. We shall see how to use the Hadoop Hive date functions with examples. You can use these functions as Hive date conversion functions…
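
A few representative calls, sketched here; the exact output depends on the Hive version and the current date (newer Hive releases allow SELECT without a FROM clause).

    -- Sketch of commonly used Hive date functions.
    SELECT
        current_date                          AS today,
        date_add(current_date, 7)             AS next_week,
        datediff('2016-12-31', '2016-01-01')  AS days_between,
        to_date('2016-03-08 10:20:30')        AS date_part,
        from_unixtime(unix_timestamp())       AS now_string,
        year('2016-03-08')                    AS yr,
        month('2016-03-08')                   AS mon;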


Commonly used Cloudera Impala Date Functions and Examples

This article gives short descriptions and examples of the commonly used Cloudera Impala date functions that you can use to manipulate date columns in Impala SQL. In real-world scenarios, many applications manipulate date and time data types. Impala SQL supports most of the date and time functions that relational databases support. Date types are highly formatted and quite complicated. Each date value contains the century, year, month, day, hour, minute, and second. We shall see how to use the Impala date functions with examples. Cloudera…
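
A few representative Impala calls, sketched here; results depend on the current timestamp, and the reference date used below is just an example.

    -- Sketch of commonly used Impala date/time functions.
    SELECT
        now()                                           AS current_ts,
        to_date(now())                                  AS today,
        date_add(now(), 7)                              AS next_week,
        datediff(now(), CAST('2016-01-01' AS TIMESTAMP)) AS days_since,
        year(now())                                     AS yr,
        month(now())                                    AS mon,
        trunc(now(), 'MM')                              AS month_start;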
