Working with Amazon Redshift Stored Procedure

Amazon Redshift supports stored procedures written in PL/pgSQL, a dialect of the PostgreSQL database language. Stored procedures in Redshift are used to encapsulate logic on the database side and can be executed by multiple client applications, improving code reusability and reducing the amount of duplicated code in your applications.

In this article, we will check how to create a stored procedure in Redshift using simple examples.

Amazon Redshift Stored Procedure

Overview

Amazon Redshift stored procedures are used to encapsulate the data migration, data validation and business specific logic’s and same time handle the exceptions if any in your data or custom exception handling. Redshift support PostgreSQL and that provides the power to get and update the database information on the Redshift database, and the procedure language provides the logic for if-then-else branching and application processing on the data.

You can use your conditional logic’s such as if-else, while statements. Stored procedure also support building SQL logic dynamically and execute them.

For example, let us say you are working on ETL logic to migrate data. You would be working on many intermediate tables in your logic. You may want to check the intermediate table if its exists in database before dropping it. And you achieve this by creating stored procedure that checks for existence of table in your schema. e.g. CALL DROP_IF_EXIST (table_name);

Redshift Procedural Language – plpgsql

As mentioned earlier, Redshift allows you to create a stored procedures using postgre plpgsql language. plpgsql is a scripting language embedded in SQL based on Postgres PL/pgSQL language. As a procedural language, it has branch, loop, subprogram structures. Subprograms. Procedures take an argument and declare internal variables. These procedures once stored in specific database can be called multiple times.

plpgsql Supported procedural logic’s

Redshift plpgsql language provides following procedural logic:

Conditionals (if/else)
Looping (while, for)
Execution of SQL
Variables
Returning a scalar result or a result set
Input arguments

Redshift Stored Procedure Syntax

You can create a Redshift stored procedures using CREATE OR REPLACE PROCEDURE command. The plpgsql provides a block-quoting mechanism. i.e. Stored producers are defined as a block or body or numerous lines of text.

Following is the syntax to create stored procedures in Redshift database:

CREATE [ OR REPLACE ] PROCEDURE sp_procedure_name 
  ( [ [ argname ] [ argmode ] argtype [, ...] ] )
AS $$
  procedure_body
$$ LANGUAGE plpgsql 
;

Let us check each part of syntax :

CREATE OR REPLACE PROCEDURE is used to create or replace a stored procedure in Redshift. If the stored procedure already exists, the OR REPLACE keyword will replace the existing procedure with the new definition.
sp_procedure_name is the name of the stored procedure that you want to create.
argname, argmode and argtype : represents parameter name, its input mode such as IN, OUT or INOUT and parameter data type.
AS $$ is the start of the stored procedure body. The stored procedure body contains the variable declaration and SQL statements including loops that you want the stored procedure to execute.
END; is the end of the stored procedure body.
LANGUAGE plpgsql; specifies the language in which the stored procedure is written. Redshift supports Python, SQL for writing user-defined functions.

Redshift Stored Procedure Example

Following is the simple stored procedure example:

CREATE OR REPLACE PROCEDURE redshift_sample_sp()
AS $$
BEGIN
  RAISE INFO 'This is sample Redshift Stored Procedure';
END;
$$
LANGUAGE plpgsql
;

Executing Redshift stored Procedures

You can execute Redshift stored procedures using CALL command.

For example, you can execute above created stored procedure using below command:

dev=# call redshift_sample_sp();
INFO:  This is sample Reshift Stored Procedure
CALL

Exceptions and Error Handing in Redshift Stored Procedures

In case if any error encountered during the execution of a stored procedure ends the current execution flow and ends the transaction. If there are any delete or update statements will rollback to original state.

As an Amazon Redshift stored procedure best practices, you should capture any exception that occurred in your stored procedure using EXCEPTION block. The only supported condition is OTHERS, which matches every error type except query cancellation.

CREATE OR REPLACE PROCEDURE redshift_sample_sp()
AS $$
BEGIN
  RAISE INFO 'This is sample Redshift Stored Procedure';
  RAISE EXCEPTION 'There are no SQL statements';
EXCEPTION
  WHEN OTHERS THEN
    RAISE INFO 'Exception Occurred';
END;
$$
LANGUAGE plpgsql
;

Here is the output of above exception handling example:

dev=# call redshift_sample_sp();
INFO:  This is sample Redshift Stored Procedure
INFO:  Exception Occurred
ERROR:  There are no SQL statements

In an Amazon Redshift stored procedures, the only supported handler_statement is RAISE.

Redshift Drop Procedure

You can drop stored procedure using DROP PROCEDURE command:

DROP PROCEDURE redshift_sample_sp()

Redshift Stored Procedure Advantages

There are several advantages of using stored procedures. Following are some of important Redshift stored procedure benefits.

Improved performance: Redshift Stored procedure performance is improved by reducing network time between an application client system and the database server.
Easy to Maintain: Stored procedures reside only one location. Hence versioning and updates need only be made in one place to keep the application up-to-date for all users thus Managing Redshift stored procedures is easy.
Security: You can grant a user permission to run a stored procedure without granting that user explicit access to underlying tables or views.

Redshift Stored Procedures Limitations

Below are some limitations or restrictions on using Stored procedures in Redshift:

Procedure source code cannot exceed 2MB.
Only one cursor can be opened per session in stored procedure.
The maximum number of levels for nested calls is 16.
The maximum number of procedure parameters is 32 for input arguments and 32 for output arguments.
You can declare maximum 1024 variables inside stored procedures.
Statements such as VACUUM, ALTER TABLE APPEND, and CREATE EXTERNAL TABLE are not supported inside Redshift SP.

Hope this helps 🙂