Lateral Column Alias in Databricks – Example

The lateral column alias in Databricks allows users to reuse an expression specified earlier in the same SELECT list, eliminating the need to use nested subqueries and Common Table Expressions (CTEs) in many cases. This blog post discusses the use cases of the feature and the benefits it brings to Spark and Databricks users. Page Contents Introduction What is Lateral Column Alias Support? Benefits of using Lateral Column Alias in Databricks SQL Alternative Methods to Lateral Column Alias in Databricks SQL Conclusion Introduction Databricks has introduced a much needed lateral…

Continue ReadingLateral Column Alias in Databricks – Example
Comments Off on Lateral Column Alias in Databricks – Example

Database Migration to Snowflake: Best Practices and Tips

The Snowflake cloud data warehouse has become widely recognized as a flexible, high-performing, and scalable solution for both data warehousing and analytics needs. This article will explore how to migrate a database to Snowflake cloud data warehouse and also provide insights into some best practices for the migration. Page Content Introduction Preparing for Migration Migrating to Snowflake Best Practices for Database Migration to Snowflake Best Practices for File Sizing and Format Best Practices for Data Transfer Best Practices for Running Source and Snowflake Databases Best Practices for Temporary and Transient…

Continue ReadingDatabase Migration to Snowflake: Best Practices and Tips
Comments Off on Database Migration to Snowflake: Best Practices and Tips

Reuse Column Aliases in BigQuery – Lateral Column alias

BigQuery lateral Column alias are columns that are derived from the previously computed columns in same SELECT statement. Derived columns or lateral Column alias are virtual columns that are not physically stored in the table. Their values are re-calculated every time they are referenced in a query. Many PostgreSQL relational databases such as Netezza supports reuse of the column aliases within the same SELECT statement but GCP BigQuery does not support reuse of calculated derived columns. In this article, we will identify the alternate methods reuse column aliases in Google…

Continue ReadingReuse Column Aliases in BigQuery – Lateral Column alias
Comments Off on Reuse Column Aliases in BigQuery – Lateral Column alias

How to use Amazon Redshift Replace Function?

The Amazon Redshift REPLACE function is one of the important string functions. The replace function allows you to manipulate the string in Amazon Redshift. This string function is similar to translate and regexp_replace functions. Amazon Redshift Replace Function Page Contents Introduction to Amazon Redshift Introduction to Amazon Redshift Syntax of the Redshift Replace Function Redshift Replace Function Example Usage of the Redshift Replace Function Best practices for using the Redshift Replace Function Conclusion Introduction to Amazon Redshift Amazon Redshift is a fully managed, cloud-based data warehouse service offered by Amazon…

Continue ReadingHow to use Amazon Redshift Replace Function?
Comments Off on How to use Amazon Redshift Replace Function?

How to Optimize Query Performance on Redshift?

In most of the cases, we pay lots of attention to improve the performance of the web application, but ignore the back-end SQL performance tuning. Amazon Redshift is a fully managed, petabyte-scale, massively parallel data warehouse that offers simple operations and high performance. Amazon Redshift can run a data model such as production transaction system third-normal-form model, star and snowflake schemas, data vault, or simple flat tables. This article takes you through the most common performance-related opportunities when writing query in Amazon Redshift and gives you concrete guidance on how to optimize…

Continue ReadingHow to Optimize Query Performance on Redshift?
Comments Off on How to Optimize Query Performance on Redshift?

Redshift RSQL Control Statements – IF-ELSE-GOTO-LABEL

Amazon Redshift is a data warehousing service provided by Amazon Web Services (AWS). It allows users to store and analyze large amounts of data in a scalable and cost-effective manner. Amazon AWS Redshift RSQL is a command-line client for interacting with Amazon Redshift clusters and databases. Redshift RSQL is similar to Teradata BTEQ and is used to interact with the data stored in a Redshift cluster. In this article, we will check Amazon Redshift RSQL control statements such as IF, ELSE, GOT, LABEL,etc. Redshift RSQL Control Statements Amazon Redshift RSQL…

Continue ReadingRedshift RSQL Control Statements – IF-ELSE-GOTO-LABEL
Comments Off on Redshift RSQL Control Statements – IF-ELSE-GOTO-LABEL

How to Create Synonym in Snowflake?

Synonyms in relational databases allows you to create easily names for long table, view names, or any other objects such as sequence, procedure, function, materialized view. Databases such as Netezza, Oracle support creating and managing synonyms. The synonyms provide an alternate way of referencing tables or views that present in the current or other databases. The Snowflake database does not support creating synonym yet. In this article, we will check an alternate method similar to create synonym in Snowflake. How to Create Synonym in Snowflake? The Snowflake cloud database does…

Continue ReadingHow to Create Synonym in Snowflake?
Comments Off on How to Create Synonym in Snowflake?

Spark SQL Count Distinct Window Function

Windows are commonly used analytical functions in a Spark SQL query. The COUNT is one of such a windows functions that will allow you to count over certain window. Many relational databases such as Oracle support COUNT window function with distinct keyword. However, Spark SQL does not support count distinct window function. Spark SQL Count Distinct Window Function Spark SQL supports count window functions. However, the COUNT window function with distinct keyword is not supported as of now. Following is the example of an Oracle COUNT window function with distinct…

Continue ReadingSpark SQL Count Distinct Window Function
Comments Off on Spark SQL Count Distinct Window Function

Rows Affected by Last Snowflake SQL Query – Example

Many modern databases such as Teradata, SQL Server, etc. support system variables to get rows affected by the last DML command. Then the row count is used to make certain decision to continue execution or abort it. Snowflake scripting provides system variables that hold the information of records affected by an INSERT, UPDATE or DELETE queries in the current session. In this article, we will check how to get rows affected by last Snowflake SQL Query. Rows Affected by Last Snowflake DML Statement Rows Affected by Last Snowflake SQL Query After each DML command is…

Continue ReadingRows Affected by Last Snowflake SQL Query – Example
Comments Off on Rows Affected by Last Snowflake SQL Query – Example

Blocks in Snowflake Scripting – Examples

Snowflake scripting allows you to write procedural code within the Snowflake scripting block. In this article, we will check how to use blocks in a Snowflake scripting. Blocks in Snowflake Scripting Blocks in Snowflake Scripting Block in a Snowflake scripting is defined by a keywords such as DECLARE, BEGIN, EXCEPTION and END. You usually write your procedural code into different part. Structure of Block in a Snowflake Scripting Snowflake scripting block has the following basic structure: DECLARE ... variable declarations, ... ... cursor declarations, etc ... BEGIN ... Snowflake Scripting…

Continue ReadingBlocks in Snowflake Scripting – Examples
Comments Off on Blocks in Snowflake Scripting – Examples