Hive Incremental Load Options and Examples

Incremental load is very common in a data warehouse environment, where it is often used to implement slowly changing dimensions. When you migrate your data to Hadoop Hive, you usually need to keep slowly changing tables in sync with the latest data. In this article, we will check Hadoop Hive incremental load options and some examples. Hive Incremental Load Options: There are many methods you can use. Apache Hive has supported ACID since Hive 0.14. Following are a couple of methods that you can use…

Redshift Split String on Delimiter and Examples

Be it a relational database management system or a programming language, a common requirement is to split a string and extract a particular value from the result. In this article, we will check how to split a string on a delimiter in Redshift, with some examples. Redshift Split String: Many relational databases such as Netezza, PostgreSQL, etc. support array functions. You can use those array functions to extract values from the split string result. Unfortunately, Amazon Redshift does not support array functions. Redshift does support the split_part string function; you can use this function to split a string on any…
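
To make the behaviour of split_part concrete, here is a minimal Python sketch that mimics its semantics: the position is 1-based, and an empty string comes back when the position is past the last part. The Python helper and the sample values are illustrative assumptions; this is an emulation for explanation, not Redshift code.

```python
def split_part(text: str, delimiter: str, position: int) -> str:
    """Emulate SPLIT_PART(string, delimiter, position) in plain Python."""
    if position < 1:
        # The position argument counts from 1, so 0 or negatives are rejected.
        raise ValueError("position must be 1 or greater")
    parts = text.split(delimiter)
    # Past the last part, return an empty string instead of failing.
    return parts[position - 1] if position <= len(parts) else ""


# Hypothetical sample value: pick the month out of a date string.
print(split_part("2020-01-15", "-", 2))  # prints 01
```

In SQL the equivalent call would have the same three arguments in the same order.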

How to Export Redshift Data to JSON Format?- Method and Example

JSON is one of the most widely used file formats for storing data that you want to transmit to another server. Many web applications use JSON to transmit application information. The JSON file format is an alternative to XML. In this article, we will check how to export Redshift data to JSON format with some examples. What is a JSON file? JSON, or JavaScript Object Notation, is a minimal, readable format for structuring data. It is used primarily to transmit data between a server and a web application, as an alternative to XML. The JSON file format stores the…

Redshift generate_series Function, Usage and Example

Many charts, tables and dashboards are built on series values such as time series. Amazon Redshift, which is based on a PostgreSQL release older than 8.4, did not support the generate_series function. In this article, we will check how to use the Redshift generate_series function, its usage and an example. Page Content: Introduction; Redshift generate_series() Function; Redshift generate_series Function Syntax; Redshift generate_series Function Example; Generate Series in Reverse Order; Generate Date Series in Redshift; Generate Series Function with INSERT Statement; Conclusion. Introduction: The Redshift generate_series function is a powerful tool that is widely used in…
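
As a rough illustration of what an integer generate_series(start, stop, step) call produces, the following Python sketch emulates the usual PostgreSQL-style semantics: both end points are inclusive, and a negative step counts downward. The Python function is a hypothetical stand-in written for this explanation, not the database function itself.

```python
from typing import List


def generate_series(start: int, stop: int, step: int = 1) -> List[int]:
    """Emulate an inclusive integer series from start to stop."""
    if step == 0:
        raise ValueError("step must not be zero")
    # range() excludes its end point, so move the bound one unit past
    # stop in the direction of travel to make stop inclusive.
    bound = stop + (1 if step > 0 else -1)
    return list(range(start, bound, step))


print(generate_series(1, 5))      # prints [1, 2, 3, 4, 5]
print(generate_series(5, 1, -1))  # prints [5, 4, 3, 2, 1]
```

When the step points away from the stop value, as in generate_series(5, 1), the sketch returns an empty list.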

Connect PostgreSQL using Python and Jdbc Driver- Example

PostgreSQL is one of the most widely used open source relational database management systems (RDBMS). Sometimes, it is simply called Postgres. Many modern databases such as Redshift, Netezza, Vertica, etc. are based on PostgreSQL. Postgres supports both JDBC and ODBC drivers. You can use those drivers from any programming language to connect. In this article, we will check how to connect to PostgreSQL using Python and the JDBC driver with a working example. PostgreSQL JDBC Driver: PostgreSQL offers drivers for the programming languages and tools that are compatible with the JDBC API. You…

What is Hive Lateral View and How to use it?

The best part of Apache Hive is that it supports array types, i.e. you can store array values in Hive table columns. With the help of an array, you can reduce the number of table rows by grouping values together in an array. In this article, we will check what the Hive lateral view is and how to use it with array values. You can use lateral view with either the EXPLODE or the INLINE function. What is Hive Lateral View? Before going into detail, let us check what a lateral view is. In Hive, lateral view…
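
The effect of a lateral view with EXPLODE can be pictured with a small Python sketch: each input row is repeated once per element of its array column, with the element exposed under a new column alias. The function name, column names and sample rows below are assumptions made for illustration; this mimics the row-multiplying behaviour and is not Hive code.

```python
from typing import Any, Dict, List


def lateral_view_explode(
    rows: List[Dict[str, Any]], array_column: str, alias: str
) -> List[Dict[str, Any]]:
    """Emit one output row per array element, joined back to its source row."""
    exploded = []
    for row in rows:
        for item in row[array_column]:
            new_row = dict(row)    # keep all original columns
            new_row[alias] = item  # add the exploded element
            exploded.append(new_row)
    return exploded


# Hypothetical table with an array column named "tags".
rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
]
for r in lateral_view_explode(rows, "tags", "tag"):
    print(r["id"], r["tag"])
# prints:
# 1 a
# 1 b
```

The row with an empty array produces no output here, which matches a plain lateral view; Hive's LATERAL VIEW OUTER would keep such a row with a NULL element instead.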

Hive Insert into Partition Table and Examples

The Hive INSERT command is used to insert data into a Hive table already created using the CREATE TABLE command. Inserting data into a partitioned table is a bit different from a normal insert or a relational database insert command. There are many ways to insert data into a partitioned table in Hive. In this article, we will check Hive insert into partition table and some examples. Hive Insert into Partition Table: As mentioned earlier, inserting data into a partitioned Hive table is quite different compared to relational databases. You…

Apache Hive Type Conversion Functions and Examples

Apache Hive has some very strict rules regarding data types for the function parameters that you provide when executing a function. Hive type conversion functions are used to explicitly convert values to the required type and format. For example, Hive does not implicitly convert DOUBLE to FLOAT. In another post, we discussed Hive date functions and examples. In this article, we will check out Cloudera Hive type conversion functions with some examples. Related Article: Commonly used Apache Hive Date Functions and Examples. Apache Hive Type Conversion Functions…

How to Save Spark DataFrame as Hive Table – Example

Apache Spark is one of the most actively contributed-to frameworks. Many e-commerce, data analytics and travel companies use Spark to analyze huge amounts of data as quickly as possible. Because of its in-memory computation, Apache Spark can provide results 10 to 100X faster compared to Hive. In this article, we will check how to save a Spark DataFrame as a Hive table, with some examples. How to Save Spark DataFrame as Hive Table? Because of its in-memory computation, Spark is used to process complex computations. In case you have…

Hive Insert from Select Statement and Examples

Apache Hive is a data warehouse framework on top of the Hadoop distributed file system (HDFS). It provides a query language called Hive Query Language, HiveQL or HQL. HiveQL syntax is similar to SQL syntax with minor changes. Similar to SQL insert statements, HQL also supports inserting data into tables using various methods. In this article, we will check one of the methods for inserting data into a Hive table: using a SELECT statement or clause. Hive Insert Data into Table Methods: Below are some of the commonly used methods to insert…
