Details about bigdata

Sqoop Command with Secure Password

Sqoop commands let you exchange data between Hadoop and relational databases such as Netezza and Oracle. Sqoop requires a password to connect to these databases, and that password has to be kept secure. In this article, we will discuss various ways to execute a Sqoop command with a secure password. Read: Sqoop import Relational Database Table into HBase Table; Import data using Sqoop; Export data using Sqoop; Sqoop Architecture – Mappers with No Reducers. Below are some of the methods that we can…


Hadoop Hive Analytic Functions and Examples

Hadoop Hive analytic functions compute an aggregate value based on a group of rows. A Hive HQL analytic function works on a group of rows and ignores NULLs in the data if you tell it to. The latest Hive version includes many useful functions for day-to-day aggregation. Note that Hive is a batch query processing engine, so queries can take longer to execute. Read: Apache Hive ROWNUM Pseudo Column Equivalent; Hadoop Hive Date Functions and Examples; Spark SQL Analytic Functions and Examples…


Hadoop Hive Cumulative Sum, Average and Example

The latest version of Hive HQL supports window analytic functions. You can use the Hadoop Hive analytic functions to calculate a cumulative sum (running sum) and a cumulative average. The SUM and AVG analytic functions are used along with window options to calculate these values. Hadoop Hive Cumulative Sum, Average Syntax: Below is the syntax for the Apache Hive cumulative SUM and AVG analytic functions. You can use these functions within a query when you need to calculate a cumulative SUM or AVG. SUM([DISTINCT | ALL] expression)…
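
The excerpt is cut off before the full syntax, so the following is only a minimal Python sketch of what a cumulative sum and cumulative average compute. It assumes a hypothetical `amount` column that is already in the desired order, which corresponds to a window such as `SUM(amount) OVER (ORDER BY ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)`. The function name and sample values are illustrative, and NULL handling is left out.

```python
from itertools import accumulate


def cumulative_sum_and_avg(amounts):
    """Return one (running_sum, running_avg) pair per input value.

    The values are assumed to be sorted already, the way ORDER BY inside
    an OVER clause would sort them.
    """
    running_sums = accumulate(amounts)
    return [
        (total, total / count)
        for count, total in enumerate(running_sums, start=1)
    ]


# Hypothetical 'amount' column, already ordered.
amounts = [10, 20, 30, 40]
for amount, (run_sum, run_avg) in zip(amounts, cumulative_sum_and_avg(amounts)):
    print(amount, run_sum, run_avg)
```

Each output row carries the total and the mean of all rows up to and including the current one, which is what the window frame expresses in HQL.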


Run Hive Script File Passing Parameter and Working Example

Hive is used for batch and interactive SQL queries. Variable substitution allows for tasks such as separating environment-specific configuration variables from code. It is especially important when you call HQL scripts from shell or Python, because it lets you pass values to the query you are calling. In this article, we will see how to run a Hive script file and pass parameters to it, with working examples. Run Hive Script File Passing Parameter: You can use the set command to define a variable and then use that variable within the script.…
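
As a rough illustration of calling an HQL script from Python with parameters, the sketch below builds a Hive CLI command line using the `--hivevar name=value` and `-f script` options. The script name `load_sales.hql` and the variable names are hypothetical. Inside the script, such a variable would be referenced as `${hivevar:run_date}`. The code only builds and prints the command and does not run Hive.

```python
import shlex


def build_hive_command(script_path, variables):
    """Build the argument list for running an HQL script with variables."""
    argv = ["hive"]
    for name, value in variables.items():
        # Each variable becomes its own --hivevar name=value pair.
        argv += ["--hivevar", f"{name}={value}"]
    argv += ["-f", script_path]
    return argv


# Hypothetical script and variables, for illustration only.
cmd = build_hive_command("load_sales.hql", {"run_date": "2018-01-31", "env": "dev"})
print(shlex.join(cmd))
```

On a machine with the Hive CLI installed, the list could be passed to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell quoting problems with the values.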


Hive String Functions and Examples

In this article, we will discuss the various Hive string functions and their usage. The HQL string functions are similar to the SQL string functions. Read: Apache Hive Extract Function Alternative and Examples; Apache Hive group_concat Alternative and Example; Hadoop Hive Regular Expression Functions and Examples; Hadoop Hive Date Functions and Examples. The string functions in Hive are listed below. Hive concat(string A, string B,...) Function: This Hive built-in string function concatenates all the given strings:
hive> select CONCAT('concat','->','demo');
OK
concat->demo
Hive substr(string, int start, int…


Hadoop Hive Table Dynamic Partition and Examples

Partitioning in Hive is used to improve performance. Hive supports single-column and multi-column partitions. You can add partitions to Hive tables manually, or Hive can partition the data dynamically; choose either method based on your needs. In this article, we will discuss the Hadoop Hive table dynamic partition and demonstrate it using examples. Hadoop Hive Table Dynamic Partition: In Hadoop Hive, data is stored as files on HDFS. Whenever you partition a table in Hive, it creates subdirectories within the main directory using the partition…
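
To make the directory layout concrete, here is a small Python sketch of how rows map to partition subdirectories named `column=value` under the table's main directory. It is a model of the idea, not Hive's implementation. The table path, column names, and rows are hypothetical.

```python
from collections import defaultdict


def partition_paths(table_dir, rows, partition_cols):
    """Group rows by the partition subdirectory they would be written to.

    Partition column values go into the directory name (col=value) and are
    dropped from the row data, as Hive does for partitioned tables.
    """
    layout = defaultdict(list)
    for row in rows:
        subdir = "/".join(f"{col}={row[col]}" for col in partition_cols)
        data = {k: v for k, v in row.items() if k not in partition_cols}
        layout[f"{table_dir}/{subdir}"].append(data)
    return dict(layout)


# Hypothetical table location and rows.
rows = [
    {"id": 1, "state": "CA"},
    {"id": 2, "state": "NY"},
    {"id": 3, "state": "CA"},
]
for path, data in partition_paths("/warehouse/sales", rows, ["state"]).items():
    print(path, data)
```

With dynamic partitioning, the set of subdirectories is derived from the values found in the data, as in this sketch, rather than being listed by hand for each load.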


Cloudera Impala Cumulative Sum, Average and Example

You can use the Cloudera Impala analytic functions to calculate a cumulative sum (running sum) and a cumulative average. The SUM and AVG analytic functions are used along with window options to calculate these values. Cloudera Impala Cumulative Sum, Average Syntax: Below is the syntax for the Cloudera Impala cumulative SUM and AVG analytic functions. You can define an ORDER BY clause with a column inside the OVER clause. SUM([DISTINCT | ALL] expression) [OVER (analytic_clause)] AVG([DISTINCT | ALL] expression) [OVER (analytic_clause)] Cloudera Impala Cumulative Sum, Average Examples: Impala cumulative sum and average. Query: select name, amount,…


Different Hive Join Types and Examples

A join is a clause used to combine rows and specific fields from two or more tables based on their common columns. Joins in Hive are similar to SQL joins. In this article, we will learn about the different Hive join types with examples. Read: Hadoop Hive Bucket Concept and Bucketing Examples; Hive Create Table Command and Examples; Hive Create View Syntax and Examples. Below are the tables that we will use to demonstrate the different join types in Hive:…


Commonly used Hadoop Hive Commands and Examples

If you are already familiar with SQL, then Hive command syntax is easy to understand. In this article, we will discuss the commonly used Hadoop Hive commands. Read: Cloudera Impala Generate Sequence Numbers without UDF; Netezza ROWNUM Pseudo Column Alternative; Run Impala SQL Script File Passing argument and Working Example; An Introduction to Hadoop Cloudera Impala Architecture. Below are the most commonly used Hadoop Hive commands. Hive Create Database: A database in Hive is a namespace, that is, a collection of tables. Below is the syntax to…


Hadoop Streaming Map Reduce using Python

In this article, we will look at how to work with Hadoop Streaming MapReduce using Python. Hadoop Streaming: Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer. Any language that supports standard input and output, for example Python or C#, can be used to write the Hadoop MapReduce job. Read: Hadoop HDFS Schema Design for…
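
The mapper/reducer contract described above can be sketched in Python as a word count. This is an illustrative example, not the code from the full article. In a real streaming job the mapper and reducer would be separate scripts that read `sys.stdin` and write to standard output; here they are plain functions over lines of text so the flow can be tried locally. The `run_local` helper is a hypothetical stand-in for the sort that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts per word; input must be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Simulate map -> sort -> reduce on one machine (testing helper)."""
    return list(reducer(sorted(mapper(lines))))


print(run_local(["big data", "big hadoop"]))
```

Because the reducer receives records already sorted by key, it can total each word in a single pass without holding the whole input in memory.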
