How to Query S3 External Files in Snowflake?

You can integrate the cloud storages, such as AWS S3, Azure Blob and GCP cloud storage with Snowflake cloud data warehouse very easily. Snowflake does support external tables, you can create external tables on the top of the files stored on the external storages such as S3, blob or GCP storage. In this article, we will check how to query S3 external files directly in Snowflake. How to Query S3 External Files in Snowflake? Snowflake Cloud data warehouse supports using standard SQL to query data files located in an internal…

Continue ReadingHow to Query S3 External Files in Snowflake?
Comments Off on How to Query S3 External Files in Snowflake?

Snowflake Error handling – Procedure and Functions

Snowflake Cloud data warehouse supports stored procedures to support migration from other relational databases. Snowflake allows you to write stored procedure using JavaScript APIs. It also supports error handling using try/catch block. However, it also supports built-in functions such as try_cast to handle error during type conversion. In this article, we will check methods for handling error in Snowflake procedures and functions. Snowflake Error handling The Snowflake allows you to handle errors with the help of try/catch block when you write a procedure or function using the JavaScript API. Snowflake…

Continue ReadingSnowflake Error handling – Procedure and Functions
2 Comments

Redshift Nested Window Functions and Examples

Amazon Redshift cloud data warehouse supports many useful windows or analytical functions. Many Redshift reporting queries use the analytic functions such as cumulative sum and average. But, because of Redshift limitation, you cannot call an analytics function within another. Whenever you try to call an analytics function within another analytics function, you will end up with an error such as "Invalid operation: aggregate function calls may not have nested aggregate or window function". In this article, we will check how to use the nested window functions in Amazon Redshift with…

Continue ReadingRedshift Nested Window Functions and Examples
Comments Off on Redshift Nested Window Functions and Examples

How to Write Parameterized Queries in Snowflake?

Snowflake supports almost all features that are available in legacy relational databases such as Oracle, Teradata, etc. Along with those features, Snowflake provides many extensions that will allow you to write efficient queries. One of such features is ability to SET and pass variables to the query, In my other articles, we have discussed how to create parameterized views in Snowflake. In this article, we will check how to write parameterized queries in Snowflake with the help of the SET command. Why Do We Need Queries with Parameters in Snowflake?…

Continue ReadingHow to Write Parameterized Queries in Snowflake?
Comments Off on How to Write Parameterized Queries in Snowflake?

How to Get Row Count of Database Tables in Snowflake?

Counting the number of records from the database tables is one of the mandatory checks when you migrate data from one server to another. Snowflake is very rich in metadata information. It captures many useful information such as details about tables, columns, views, etc. In this article, we will write few useful queries to get row count of Snowflake database tables. Get Row Count of Database Tables in Snowflake You can look for object metadata information either in INFROMATION_SCHEMA for a particular database or utilize the ACCOUNT_USAGE that Snowflake provides…

Continue ReadingHow to Get Row Count of Database Tables in Snowflake?
Comments Off on How to Get Row Count of Database Tables in Snowflake?

How to Handle Duplicate Records in Snowflake Insert?

Snowflake does not enforce constraints on tables. Handling duplicate records in Snowflake insert is one of the common requirements while performing a data load. Snowflake supports many methods to identify and remove duplicate records from the table. In this article, we will check how to handle duplicate records in the Snowflake insert statement. It is basically one of the alternative methods to enforce the primary key constraints on Snowflake table. Handle Duplicate Records in Snowflake Insert Snowflake allows you to identify a column as a primary key, but it doesn't…

Continue ReadingHow to Handle Duplicate Records in Snowflake Insert?
Comments Off on How to Handle Duplicate Records in Snowflake Insert?

Snowflake Nested Window Functions and Examples

Snowflake supports many useful windows or analytical functions. Many reporting queries use the analytic functions such as cumulative sum and average. But, whenever you try to call an analytics function within another analytics function, you will end up with an error such as "may not be nested inside another window function.". In this article, we will check how to use the nested window functions in Snowflake with an alternate example. Snowflake does not allow you to define the nested window function. You will have to use alternative methods such as…

Continue ReadingSnowflake Nested Window Functions and Examples
Comments Off on Snowflake Nested Window Functions and Examples

How to Search String in Spark DataFrame? – Scala and PySpark

Being a data engineer, you may work with many different kinds of datasets. You will always get a requirement to filter out or search for a specific string within a data or DataFrame. For example, identify the junk string within a dataset. In this article, we will check how to search a string in Spark DataFrame using different methods. How to Search String in Spark DataFrame? Apache Spark supports many different built in API methods that you can use to search a specific strings in a DataFrame. Following are the…

Continue ReadingHow to Search String in Spark DataFrame? – Scala and PySpark
Comments Off on How to Search String in Spark DataFrame? – Scala and PySpark

How to Find Tables Size in Spark SQL? – Scala Example

Be it relational database, Hive, or Spark SQL, Finding the table size is one of the common requirements. Relational databases such as Snowflake, Teradata, etc support system tables. You can use those system tables to identify the size of tables. But, there are no system tables in Spark SQL. You can make use of the Spark catalog API to find the tables size in the Spark SQL database. Find Tables Size in Spark SQL Starting version 2.0, Spark supports catalog API. It has many useful methods such as listtables, listdatabases,…

Continue ReadingHow to Find Tables Size in Spark SQL? – Scala Example
Comments Off on How to Find Tables Size in Spark SQL? – Scala Example

How to Print SQL Query in Snowflake Stored Procedure?

As explained in my other article, to support migration from other relational databases, Snowflake supports the stored procedures. Snowflake uses JavaScript as a procedural language. It provides many features including control structures - branching, looping, Dynamic SQL, error handling, etc. But, JavaScript API, does not provide any print statement support to display the content of variable or SQL query itself. In this article, we will check how to print SQL query in Snowflake stored procedure. Print SQL Query in Snowflake Stored Procedure Snowflake stored procedure support many useful JavaScript API's.…

Continue ReadingHow to Print SQL Query in Snowflake Stored Procedure?
Comments Off on How to Print SQL Query in Snowflake Stored Procedure?