Hadoop Hive Date Functions and Examples

Many applications manipulate date and time values. Recent versions of the Hadoop Hive query language support most of the date functions found in relational databases. In this article, we will look at commonly used Hadoop Hive date functions and some examples of their usage. Hadoop Hive Date Functions: Date types are highly formatted and complicated. Each date value contains the century, year, month, day, hour, minute, and second. We shall see how to use the Hadoop Hive date functions with examples. You can use these functions as Hive date conversion functions…


Hadoop Hive Analytic Functions and Examples

Hadoop Hive analytic functions compute an aggregate value that is based on a group of rows. A Hadoop Hive HQL analytic function works on the group of rows and ignores NULL values in the data if you specify that option. Hadoop Hive Analytic Functions: Recent Hive versions include many useful functions that can perform day-to-day aggregations. Note that Hive is a batch query processing engine, so queries can take a while to execute. Read: Apache Hive ROWNUM Pseudo Column Equivalent; Hadoop Hive Date Functions and Examples; Spark SQL Analytic Functions and Examples…


Hadoop Hive Cumulative Sum, Average and Example

Recent versions of Hive HQL support window analytic functions. You can use the Hadoop Hive analytic functions to calculate a cumulative (running) sum and a cumulative average. The SUM and AVG analytic functions are used along with window options to calculate the cumulative sum or average. Hadoop Hive Cumulative Sum, Average Syntax: Below is the syntax for the Apache Hive cumulative SUM and AVG analytic functions. You can use these functions within a query whenever you need to calculate a cumulative SUM or AVG. SUM([DISTINCT | ALL] expression)…
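To make the running sum and running average concrete, here is a minimal sketch. It is not Hive itself: it uses Python's built-in sqlite3 module as a stand-in, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). The `sales` table, its columns, and the `running_totals` helper are hypothetical names chosen for illustration. The `SUM(...) OVER (ORDER BY ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)` form is standard SQL and should look much the same in a Hive query.

```python
import sqlite3


def running_totals(rows):
    """Return (day, amount, cumulative_sum, cumulative_avg) tuples ordered by day.

    `rows` is a list of (day, amount) pairs loaded into a hypothetical
    in-memory table named `sales`.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (day INTEGER, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        # The window runs from the first row up to the current row,
        # which is what turns SUM and AVG into running aggregates.
        query = """
            SELECT day,
                   amount,
                   SUM(amount) OVER (ORDER BY day
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_sum,
                   AVG(amount) OVER (ORDER BY day
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_avg
            FROM sales
            ORDER BY day
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in running_totals([(1, 10.0), (2, 20.0), (3, 30.0)]):
        print(row)
```

For the three sample rows, the running sum grows as 10, 30, 60 and the running average as 10, 15, 20.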


Run Hive Script File Passing Parameter and Working Example

Hive is used for batch and interactive SQL queries. Variable substitution allows for tasks such as separating environment-specific configuration variables from code. It is especially important when you call HQL scripts from a shell or from Python, because it lets you pass values to the query you are calling. In this article, we will see how to run a Hive script file and pass parameters to it. We will also see working examples. Run Hive Script File Passing Parameter: You can define a variable with set and then use that variable within the script.…
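As a rough illustration of calling an HQL script from Python with parameters, the sketch below only builds the command line; it does not run Hive. It assumes the `hive` CLI is on the path and relies on its `--hivevar name=value` and `-f script` options. The script name `daily_report.hql`, the variable names, and the `build_hive_command` helper are hypothetical.

```python
import shlex


def build_hive_command(script_path, variables):
    """Build the argument list for running a Hive script with variables.

    `variables` maps variable names to values. Inside the script, each one
    can be referenced as ${hivevar:name}.
    """
    args = ["hive"]
    for name, value in variables.items():
        # One --hivevar option per variable, in the form name=value.
        args += ["--hivevar", f"{name}={value}"]
    # -f tells the hive CLI to execute the statements in the given file.
    args += ["-f", script_path]
    return args


if __name__ == "__main__":
    cmd = build_hive_command(
        "daily_report.hql",  # hypothetical script name
        {"run_date": "2020-01-01", "env": "dev"},
    )
    # Print a shell-safe version of the command instead of executing it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be handed to `subprocess.run` without shell quoting problems, should you choose to execute it.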


Hive String Functions and Examples

In this article, we will discuss the various Hive string functions and their usage. The HQL string functions are similar to the SQL string functions. Hive String Functions: The string functions in Hive are listed below. Read: Apache Hive Extract Function Alternative and Examples; Apache Hive group_concat Alternative and Example; Hadoop Hive Regular Expression Functions and Examples; Hadoop Hive Date Functions and Examples. Hive concat(string A, string B, ...) Function: This Hive built-in string function concatenates all the given strings: hive> select CONCAT('concat','->','demo'); OK concat->demo Hive substr(string, int start, int…


Hadoop Hive Table Dynamic Partition and Examples

Partitioning in Hive is used to improve query performance. Hive supports single-column and multi-column partitions. You can add partitions to Hive tables manually, or Hive can partition the data dynamically. You can choose either method based on your needs. In this article, we will discuss Hadoop Hive table dynamic partitioning and demonstrate it with examples. Hadoop Hive Table Dynamic Partition: In Hadoop Hive, data is stored as files on HDFS. Whenever you partition a table in Hive, it creates subdirectories within the main directory using the partition…
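The subdirectory layout can be pictured with a small sketch. Hive names each partition directory `column=value`, nested in partition-column order for multi-column partitions. The helper below simply reproduces that naming in Python; the warehouse path, table name, and column names are hypothetical, and the sketch ignores the escaping Hive applies to special characters in partition values.

```python
def partition_path(table_dir, partition_values):
    """Return the subdirectory that would hold one partition of a table.

    `partition_values` is a list of (column, value) pairs given in the
    table's partition-column order.
    """
    # Each partition level becomes one "column=value" directory.
    parts = [f"{column}={value}" for column, value in partition_values]
    return "/".join([table_dir.rstrip("/")] + parts)


if __name__ == "__main__":
    # Hypothetical table directory and partition columns.
    print(partition_path("/user/hive/warehouse/sales",
                         [("country", "US"), ("year", 2020)]))
```

With dynamic partitioning, Hive derives these values from the data being inserted, so one insert can create many such directories.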


Different Hive Join Types and Examples

A join is a clause that combines rows and specific fields from two or more tables based on their common columns. Joins in Hive are similar to SQL joins. In this article, we will learn about the different Hive join types with examples. Read: Hadoop Hive Bucket Concept and Bucketing Examples; Hive Create Table Command and Examples; Hive Create View Syntax and Examples. Below are the tables that we will use to demonstrate the different join types in Hive:…
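Since Hive joins follow SQL join semantics, the difference between an inner join and a left outer join can be sketched with Python's built-in sqlite3 module as a stand-in for Hive. The `employee` and `department` tables and their rows are hypothetical; they are not the tables used in the article.

```python
import sqlite3


def join_demo():
    """Return (inner_join_rows, left_outer_join_rows) for two small tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript("""
            CREATE TABLE employee (id INTEGER, name TEXT, dept_id INTEGER);
            CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
            INSERT INTO employee VALUES (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cid', NULL);
            INSERT INTO department VALUES (10, 'Sales'), (30, 'Ops');
        """)
        # Inner join: only employees whose dept_id matches a department.
        inner = conn.execute("""
            SELECT e.name, d.dept_name
            FROM employee e
            JOIN department d ON e.dept_id = d.dept_id
            ORDER BY e.id
        """).fetchall()
        # Left outer join: every employee, with NULL where no department matches.
        left = conn.execute("""
            SELECT e.name, d.dept_name
            FROM employee e
            LEFT OUTER JOIN department d ON e.dept_id = d.dept_id
            ORDER BY e.id
        """).fetchall()
        return inner, left
    finally:
        conn.close()


if __name__ == "__main__":
    inner_rows, left_rows = join_demo()
    print("inner:", inner_rows)
    print("left outer:", left_rows)
```

The inner join returns only Ann, because hers is the only dept_id with a matching department. The left outer join keeps all three employees and fills the department name with NULL (None in Python) for Bob and Cid.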


Commonly used Hadoop Hive Commands and Examples

If you are already familiar with SQL, then the Hive command syntax is easy to understand. In this article, we will discuss the commonly used Hadoop Hive commands. Read: Cloudera Impala Generate Sequence Numbers without UDF; Netezza ROWNUM Pseudo Column Alternative; Run Impala SQL Script File Passing argument and Working Example; An Introduction to Hadoop Cloudera Impala Architecture. Commonly used Hadoop Hive commands: Below are the most commonly used Hadoop Hive commands. Hive Create Database: In Hive, a database is a namespace that groups a collection of tables. Below is the syntax to…


How to Learn Apache Hadoop

Most of you want to know what Apache Hadoop is, and how and where to start learning it. Here I’m going to share some of the steps I followed to learn Hadoop. Don’t worry! You don’t have to be a Java programmer to learn Hadoop. You should know a few basic Linux commands. You will learn all remaining programming languages once you log in to a cluster :-) Let’s first understand what Hadoop is. Apache Hadoop is an open source framework to process very large data sets (big data). Hadoop allows the distributed storage and…
