Rename PySpark DataFrame Column – Methods and Examples

A DataFrame in Spark is a dataset organized into named columns. A Spark DataFrame is conceptually equivalent to a table in a relational database or a data frame in R/Python, but with richer optimizations. When you work with DataFrames, you may need to rename a column. In this article, we will check how to rename a PySpark DataFrame column, the methods available to rename a DataFrame column, and some examples. Rename PySpark DataFrame Column: As mentioned earlier, we often need to rename one or more columns of a PySpark (or Spark) DataFrame. Note…

Snowflake Pattern Matching – LIKE, LIKE ANY, CONTAINS, LIKE ALL Conditions

The pattern matching conditions in Snowflake are used to search a string for a given pattern. Snowflake Pattern Matching: A pattern-matching operator searches a string for a pattern specified in the conditional expression and returns either a Boolean (true/false) or the matching value if it finds a match. These conditions are particularly important when you need to search for string patterns in your database column values. Pattern matching conditions are mainly used in WHERE clauses. Following are the commonly used pattern matching…
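To make the LIKE family named in the title concrete, here is a minimal Python sketch of the matching rules, assuming the usual SQL wildcards (`%` for any run of characters, `_` for exactly one character) and case-sensitive comparison. The helper names `like`, `like_any` and `like_all` are hypothetical and only emulate the idea; this is not Snowflake code, and escape characters are not handled.

```python
import re


def _like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches zero or more characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    """Emulate `value LIKE pattern`: the whole string must match."""
    return _like_to_regex(pattern).fullmatch(value) is not None


def like_any(value, patterns):
    """Emulate `value LIKE ANY (...)`: true if at least one pattern matches."""
    return any(like(value, p) for p in patterns)


def like_all(value, patterns):
    """Emulate `value LIKE ALL (...)`: true only if every pattern matches."""
    return all(like(value, p) for p in patterns)
```

For instance, `like("snowflake", "snow%")` is true while `like("snowflake", "Snow%")` is false, because the comparison is case-sensitive in this sketch.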

Redshift NOT NULL Constraint, Syntax and Examples

Like most MPP databases, such as Snowflake, the Amazon Redshift database allows you to define constraints. Redshift does not enforce constraints such as primary key, foreign key and unique key, but it does enforce the NOT NULL constraint. In this article, we will check the Redshift NOT NULL constraint, its syntax and usage. Redshift NOT NULL Constraint: Constraints other than NOT NULL are created as disabled. Amazon Redshift enforces only NOT NULL. You can create a NOT NULL constraint while creating tables. Redshift NOT NULL Constraint Syntax: There…
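What an enforced NOT NULL constraint means in practice can be sketched with Python's built-in `sqlite3` module. This is only a stand-in, chosen because it runs without a cluster: SQLite is not Redshift (it also enforces other constraints that Redshift does not), and the table name `notnull_demo` is hypothetical. The column-level `NOT NULL` in `CREATE TABLE` is standard SQL.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")

# The constraint is declared on the column when the table is created.
conn.execute("CREATE TABLE notnull_demo (id INTEGER NOT NULL, name TEXT)")

# A row with a non-NULL id is accepted.
conn.execute("INSERT INTO notnull_demo (id, name) VALUES (1, 'a')")

# A row with a NULL id violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO notnull_demo (id, name) VALUES (NULL, 'b')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM notnull_demo").fetchone()[0]
```

After this runs, `rejected` is true and only the first row is stored.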

SQL Merge Operation Using Pyspark – UPSERT Example

In relational databases such as Snowflake, Netezza and Oracle, the MERGE statement is used to manipulate the data stored in a table. In this article, we will check how to simulate the SQL MERGE operation using PySpark. The method is the same in Scala with minor modifications. SQL Merge Statement: The MERGE command in relational databases allows you to update old records and insert new records simultaneously. This command is sometimes called UPSERT (UPdate and inSERT). Following is a sample merge statement available in an RDBMS: merge into merge_test using merge_test2 on…
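The update-or-insert behaviour described above can be sketched in plain Python, without Spark. This is not the article's PySpark method, only an illustration of the semantics. The function `merge_upsert`, the `id` key and the sample rows are hypothetical; the variable names `merge_test` and `merge_test2` echo the tables in the sample statement.

```python
def merge_upsert(target, source, key):
    """Return target rows with source rows merged in, matched on `key`.

    Rows whose key already exists in target are updated (WHEN MATCHED);
    rows with a new key are appended (WHEN NOT MATCHED).
    """
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        if row[key] in merged:
            merged[row[key]].update(row)   # matched: update the old record
        else:
            merged[row[key]] = dict(row)   # not matched: insert a new record
    return list(merged.values())


# Hypothetical sample data: rows are dicts, "id" is the merge key.
merge_test = [{"id": 1, "value": "old"}, {"id": 2, "value": "keep"}]
merge_test2 = [{"id": 1, "value": "new"}, {"id": 3, "value": "added"}]

result = merge_upsert(merge_test, merge_test2, key="id")
```

Here the row with `id` 1 is updated, the row with `id` 2 is left alone, and the row with `id` 3 is inserted. The inputs are copied rather than modified in place.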

SQL and Hive GROUP BY Alternative-Example

It is common to write queries using the GROUP BY and HAVING clauses to group records or rows. The GROUP BY clause groups rows in Hive or relational database tables by the values of the columns it lists. However, GROUP BY and DISTINCT operations are costly in both Hive and relational databases. In some cases, you can rewrite the queries to remove the GROUP BY clause. In this article, we will check which GROUP BY alternative methods are available in Hive and SQL. SQL and Hive…

How to Execute Snowflake Commands from Shell Script?- Example

Snowflake is one of the leading cloud data warehouse providers. Snowflake supports many leading programming languages by providing JDBC and ODBC drivers or language-specific connectors (such as the Python connector). In this article, we will check how to execute Snowflake commands from a shell script with some examples. Execute Snowflake Commands from Shell Script: A convenient feature of Snowflake is that it provides an interactive terminal called SnowSQL. You can use it to execute queries, create database objects and perform some admin tasks. You can call SnowSQL from…

Generate Snowflake Objects DDL using GET_DDL Function

Snowflake is a fully managed cloud data warehouse solution available on AWS, Azure and GCP. You don't have to manage hardware; your only task is to manage the databases and tables that you create as part of your project development. In this article, we will check one of the administrative tasks: generating DDL for Snowflake objects such as views and tables using the built-in GET_DDL function. Snowflake Objects DDL using GET_DDL Function: Snowflake provides many useful functions to make developers' and administrators' tasks easy. One such function is the GET_DDL function,…

Snowflake Control Structures – IF, DO, WHILE, FOR

A useful feature of Snowflake is that it supports JavaScript as a programming language for writing stored procedures and user-defined functions. A stored procedure uses JavaScript to combine SQL with control structures such as branching and looping. In this article, we will check Snowflake branching and looping control structures. Snowflake Control Structures: You can use two types of control structures inside stored procedures and user-defined functions: branching structures (sometimes called conditional control structures) and looping structures (sometimes called iterative control structures). Branching…

Snowflake REPLACE Function, Usage and Examples

Like the TRANSLATE function, the REPLACE function is one of the widely used string functions in Snowflake. It is commonly used to manipulate strings or expressions, for example to replace a substring wherever it occurs in the input string or expression. In this article, we will check the REPLACE function, its syntax and usage with some examples. Snowflake REPLACE Function: In general, the SQL REPLACE function replaces each instance of a pattern in the input with the value in the replacement string. Snowflake REPLACE removes all occurrences of a specified substring, and…
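As a rough illustration of this behaviour, the sketch below mimics a three-argument replace in Python, with the replacement treated as optional so that omitting it simply removes the substring. The function name `sf_replace` is hypothetical, and the handling of an empty pattern is an assumption made for this sketch rather than documented Snowflake behaviour.

```python
def sf_replace(subject, pattern, replacement=""):
    """Replace every occurrence of `pattern` in `subject`.

    With no replacement given, all occurrences are removed.
    """
    if pattern == "":
        # Assumption for this sketch: an empty pattern leaves the subject
        # unchanged (Python's str.replace would otherwise insert the
        # replacement between every character).
        return subject
    return subject.replace(pattern, replacement)
```

For example, `sf_replace("a-b-c", "-", "+")` gives `"a+b+c"`, and `sf_replace("a-b-c", "-")` gives `"abc"`.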

Snowflake TRANSLATE Function, Usage and Examples

There are many situations in a data warehouse where you need to replace one character with another, for example to replace a special character such as a symbol in the input expression with a space, or to remove it. In this article, we will check how to use the Snowflake TRANSLATE function to replace characters, with some examples. Snowflake TRANSLATE Function: In general, you can use the TRANSLATE function to translate or replace one or more characters with another set of characters. Snowflake supports the TRANSLATE function, which performs the same job as the translate function…
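The character-by-character mapping described here can be sketched in Python as follows. The helper `sf_translate` is hypothetical and only approximates the idea. It assumes that each character of the source set maps to the character at the same position in the target set, that source characters with no counterpart are removed, and that the first occurrence wins when a source character is repeated.

```python
def sf_translate(subject, source, target):
    """Map each character in `source` to the one at the same index in `target`.

    Source characters beyond the length of `target` are deleted.
    """
    mapping = {}
    for index, ch in enumerate(source):
        if ord(ch) in mapping:
            continue  # assumption: the first occurrence in `source` wins
        # str.translate deletes characters that map to None.
        mapping[ord(ch)] = target[index] if index < len(target) else None
    return subject.translate(mapping)
```

For example, `sf_translate("a-b_c", "-_", "  ")` replaces both symbols with spaces, while `sf_translate("a-b_c", "-_", "")` removes them.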
