Steps to Connect to Hive Using Beeline CLI

Beeline is a JDBC client that is based on the SQLLine CLI. HiveServer2 supports a command shell Beeline that works with HiveServer2. In this article, we will check how to connect to Hive using Beeline CLI and see some examples to execute HiveQL scripts. Connect to Hive Using Beeline CLI Beeline works on both standalone mode (embedded mode) as well as remote mode. Standalone more or embedded mode, it executes embedded Hive like Hive CLI, and you can use remote mode to connect separate hiveserver2 over thrift. Read: HiveServer2 Beeline…

Continue ReadingSteps to Connect to Hive Using Beeline CLI
Comments Off on Steps to Connect to Hive Using Beeline CLI

Redshift User Defined Functions Examples

Create a user defined function is one of the important features of Amazon Redshift cloud data warehouse. Amazon Redshift supports creating user defined functions. You can create custom user-defined functions (UDF) using either SQL SELECT statements or Python program. In this article, we will check Redshift user defined functions examples and how to create them. Redshift User Defined Functions User defined function (UDF) in Redshift provide a powerful way to perform complex data processing tasks within the Redshift database. They can provide more control and flexibility over data processing workflows, and…

Continue ReadingRedshift User Defined Functions Examples
Comments Off on Redshift User Defined Functions Examples

Redshift Column Compression Types – Compression Encoding

Amazon Redshift, is a fast, fully managed, petabyte-scale data warehousing service. Redshift supports various columnar compression encoding techniques. In this article we will check Redshift column compression types - Compression Encoding techniques for Redshift performance optimization. Amazon Redshift supports data compression for reducing the amount of disk storage used by a table. Amazon Redshift compression encoding can significantly reduce the storage required for a table and improve query performance by reducing I/O operations. The column compression in Redshift also helps storage optimization. Redshift Column Compression Types – Compression Encoding Columnar…

Continue ReadingRedshift Column Compression Types – Compression Encoding
Comments Off on Redshift Column Compression Types – Compression Encoding

Cloudera Impala Extract Numbers using Regular Expressions

Impala supports various built in functions those you can use to extract the numbers from strings values. If you are working on various data sets then there is a possibility that you may get data which is corrupted or merged with other fields. You can separate that data using various string functions available in Impala. In this article, we will discuss about Impala extract numbers using regular expressions and examples. You can read about regular expression in my other post: Cloudera Impala Extract Function and Examples Cloudera Impala String Functions…

Continue ReadingCloudera Impala Extract Numbers using Regular Expressions
Comments Off on Cloudera Impala Extract Numbers using Regular Expressions

Cloudera Impala Create View Syntax and Examples

A View creates a pseudo-table or virtual table. It appears exactly as a regular table, you can use it in SELECT statements, JOINs etc. The Impala CREATE VIEW statement allows you to create a shorthand abbreviation for a more complicated query. The base query can have tables, joins, column alias etc. In this article, we will check Cloudera Impala create view syntax and some examples. Just like views or table in other database, an Impala view contains rows and columns. The fields in a view are fields from one or…

Continue ReadingCloudera Impala Create View Syntax and Examples
Comments Off on Cloudera Impala Create View Syntax and Examples

Different Redshift Join Types and Examples

SQL Join is a clause that is used for combining specific fields from two or more tables based on the common columns available. Joins are used to combine rows from multiple tables. In this article, we will learn about different Redshift join types with some examples. Below are the tables that we will be using to demonstrate different Join types available in Redshift: Students: Id   name   city 1    AAA    London 2    BBB    Mumbai 3    CCC    Bangalore 4    DDD    Mumbai 5    EEE     Bangalore Dept: Did  name    sid 100  CS       1 101 …

Continue ReadingDifferent Redshift Join Types and Examples
2 Comments

Amazon Redshift Extract Numbers using Regular Expressions

Amazon Redshift supports various built in functions, you can use them to extract the numbers, alphanumeric or specific patters from strings. In this article, we will discuss about Redshift extract numbers using regular expressions and examples. In data warehouse environment, you may have different types data files extracted from different data sources. Data might be corrupted or may have unwanted characters, you can clean such a data using Redshift regular expressions. You can read about regular expression in my other post: Redshift Regular Expression Functions and Examples Redshift Extract Numbers…

Continue ReadingAmazon Redshift Extract Numbers using Regular Expressions
Comments Off on Amazon Redshift Extract Numbers using Regular Expressions

Redshift NVL and NVL2 Functions with Examples

Like many other relational database or data warehouse appliances, Redshift supports NVL and NVL2 functions. These functions are mainly used to handle the null values in Redshift tables. Redshift NVL Function An NVL expression returns the value of the first expression in the list that is not null. The NVL function replaces a NULL value with a replacement string that you provide in the function as argument. This function returns the first argument if it is not null, otherwise the second argument. The Redshift NVL function is equivalent to the…

Continue ReadingRedshift NVL and NVL2 Functions with Examples
Comments Off on Redshift NVL and NVL2 Functions with Examples

How Redshift Distributes Table Data? Importance of right Distribution Key

Amazon Redshift uses the Massively parallel processing technique, Redshift automatically distributes data and query load across all nodes available in the cluster. In this article, we will check how Redshift distributes table data and importance of right distribution key. How Redshift distributes Table Data? Amazon Redshift uses the three types of distribution; EVEN, KEY and ALL. When you create tables you will have to tell the system which distribution it should use. You may read about distribution types and best practices: Amazon Redshift Distribution Types and Examples If you specify…

Continue ReadingHow Redshift Distributes Table Data? Importance of right Distribution Key
Comments Off on How Redshift Distributes Table Data? Importance of right Distribution Key

Redshift ROWNUM Pseudo Column Alternative

There is no ROWNUM pseudo column in Redshift. If you are coming from Oracle database background, you will find it difficult in Redshift without ROWNUM pseudo column. The one possible solution to this is ROW_NUMBER() analytical function as Redshift ROWNUM pseudo column alternative. ROWNUM is sometime useful when you are working with multi-level SQL queries. You can restrict the rows using ROW_NUMBER functions. Read: Redshift String Functions and Examples Amazon Redshift Date Functions and Examples Redshift Analytics Functions and Examples Redshift also support a LIMIT clause to restrict the output.…

Continue ReadingRedshift ROWNUM Pseudo Column Alternative
1 Comment