Unique Features of Snowflake Data Warehouse


Snowflake is an analytical data warehouse solution in the cloud. It is faster and easier to use than many other relational databases. In this article, we will check a few noteworthy, unique features of the Snowflake Data Warehouse compared to other relational databases in the same category.


Snowflake cloud data warehouse comes with many unique features that are not found in traditional relational databases.

Following are some of the key features of the Snowflake cloud data warehouse:

  • Unique Snowflake cloud architecture
  • Database and object cloning
  • Support for semi-structured data (JSON/XML)
  • Virtual compute warehouses
  • Object recovery using UNDROP
  • Continuous Data Protection – Time Travel
  • Change data capture (CDC) using streams and tasks
  • Secure data sharing
  • Snowpipe
  • Automatic optimization
  • Query result caching

Now, let us check these features in brief.

Unique Snowflake Cloud Architecture

Snowflake is an analytic data warehouse specifically designed for the cloud and provided as Software-as-a-Service (SaaS). It is designed from scratch to provide better performance on the cloud. Snowflake’s architecture is a hybrid of traditional shared-disk and shared-nothing database architectures.

Snowflake is a faster, easier-to-use cloud data warehouse compared to other relational databases. The Snowflake database supports ANSI SQL along with many extended functions and added features.

Database and Object Cloning

Cloning is another important feature of the Snowflake cloud data warehouse. You can use the Snowflake clone feature to create an instant copy of any Snowflake object such as a database, schema, or table. Snowflake creates the copy of an object in near real time.

A clone is a metadata-only operation; it does not actually duplicate the stored data or consume compute resources. Cloning comes in handy in situations such as creating a test environment from the production database. In that case, you just have to clone the production database.
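
For example, a minimal sketch of cloning, assuming a hypothetical production database named prod_db and a table named customers:

create database test_db clone prod_db;        -- instant, metadata-only copy of the database
create table customers_dev clone customers;   -- clone a single table for development

The clone initially shares the underlying storage with the source, so additional storage is consumed only for data that changes after the clone is created.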

Working with Semi-structured Data (JSON/XML)

Snowflake allows you to import semi-structured data such as JSON, Avro, ORC, Parquet, and XML. It provides a special column type, VARIANT, which allows you to store semi-structured data.

Snowflake also provides functions to load, query, and export semi-structured data such as JSON.
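
As a minimal sketch, assuming a hypothetical table named events, you can store a JSON document in a VARIANT column and query individual fields using path notation:

create table events (payload variant);

insert into events
  select parse_json('{"user": "alice", "action": "login"}');

-- extract fields from the JSON document and cast them to strings
select payload:user::string as user_name,
       payload:action::string as action
from events;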

Related Article,

Virtual Compute Warehouse

Snowflake uses “virtual warehouses” to process every query that you submit. Each virtual warehouse is an MPP compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider.

You can create multiple virtual warehouses of different sizes as per your requirements and workload, for example, separate virtual compute warehouses for testing and production jobs. Each virtual warehouse is an independent compute cluster that does not share computing resources with other virtual warehouses. As a result, each virtual warehouse has no impact on the performance of the others.
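
As a minimal sketch, assuming a hypothetical warehouse named test_wh, you could create a small, auto-suspending warehouse for test jobs like this:

create warehouse test_wh
  with warehouse_size = 'XSMALL'   -- smallest (and cheapest) size
       auto_suspend = 300          -- suspend after 5 minutes of inactivity
       auto_resume = true;         -- resume automatically when a query arrives

use warehouse test_wh;             -- run subsequent queries on this warehouse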

Recover Snowflake Objects Using UNDROP

This is one of the unique features native to Snowflake. You can recover a Snowflake object that was accidentally dropped. A dropped object can be restored using the UNDROP command, as long as that object is still within the recovery window, i.e. it has not yet been purged by the system.

For example, consider following set of commands.

drop database test_db;
undrop database test_db;
drop table your_table;
undrop table your_table;

UNDROP will fail if an object with the same name already exists in the system. Hence, it is important to rename the existing table first and then recover the dropped table using the UNDROP operation.
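
For example, a minimal sketch of that workflow, assuming a hypothetical table name your_table:

-- a new table was created with the same name after the drop
alter table your_table rename to your_table_new;

-- now the dropped table can be restored under its original name
undrop table your_table;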

Continuous Data Protection – Time Travel

Continuous Data Protection (CDP) is a set of features that help protect data stored in Snowflake against human error such as accidental data deletion, malicious acts such as data modification, and software or hardware failures.

Snowflake maintains historical data through Snowflake Time Travel. You can query or restore data as it existed at an earlier point in time within the retention period.
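
As a minimal sketch, assuming a hypothetical table named orders, you can query the table as it existed an hour ago, or clone that earlier state into a new table:

-- query the table as it existed one hour (3600 seconds) ago
select * from orders at (offset => -3600);

-- restore that earlier state into a new table
create table orders_restored clone orders at (offset => -3600);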

Change Data Capture (CDC) using Streams and Tasks

For any data warehouse, it is important to have a change data capture (CDC) mechanism for larger tables. It is not feasible to reload an entire table every time its data changes. Therefore, a change data model is required so that you can load only the changed data.

Snowflake tackles the change data problem with the concept of streams. A stream object keeps track of all DML changes made to a table, along with metadata about each change, so that you can use it to extract only the changed data. A task can then be scheduled to consume the stream and load the changes.
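
As a minimal sketch, assuming hypothetical objects named orders, orders_history, orders_stream, and load_changes, and the warehouse test_wh from the earlier example, a stream plus a scheduled task could look like this:

create stream orders_stream on table orders;

create task load_changes
  warehouse = test_wh
  schedule = '5 MINUTE'
when system$stream_has_data('orders_stream')
as
  insert into orders_history
  select * from orders_stream;   -- consuming the stream advances its offset

alter task load_changes resume;  -- tasks are created in a suspended state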

Secure Data Sharing within Snowflake Cloud Accounts

Secure Data Sharing enables account-to-account sharing of data through Snowflake database tables, secure views, and secure UDFs.

For example, you can share particular test tables with your client so that they can verify the data before you deploy code to production.
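
As a minimal sketch, assuming hypothetical names sales_share, sales_db, test_orders, and a consumer account named partner_account, a provider could share a table like this:

create share sales_share;

grant usage on database sales_db to share sales_share;
grant usage on schema sales_db.public to share sales_share;
grant select on table sales_db.public.test_orders to share sales_share;

-- make the share visible to the consumer account
alter share sales_share add accounts = partner_account;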

Snowflake Snowpipe

Snowpipe enables loading data from files as soon as they’re available in a stage.

Snowpipe can load data from files in micro-batches, making it available to users within minutes, rather than manually executing COPY statements on a schedule to load larger batches.
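
As a minimal sketch, assuming a hypothetical external stage @my_stage and a target table raw_events, a pipe that loads new JSON files automatically could be defined like this:

create pipe my_pipe
  auto_ingest = true        -- load files as soon as the stage receives event notifications
as
  copy into raw_events
  from @my_stage
  file_format = (type = 'JSON');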

Auto Optimization in Snowflake

You need not worry about optimizations such as modifying the table structure; Snowflake takes care of them for you. You just have to focus on writing queries, while Snowflake handles optimizations such as micro-partitioning and clustering.

Snowflake Caching Results

When a query is executed, the result is cached for a period of time (currently 24 hours). At the end of the time period, the result is purged from the system.

Snowflake uses persisted query results to avoid re-generating results when nothing has changed. So Snowflake returns the result almost instantly when you execute the same query again within 24 hours, provided the underlying data has not changed.
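
As a minimal sketch, assuming a hypothetical table named sales, the second execution of the identical query below is served from the result cache, and RESULT_SCAN lets you post-process the cached result:

select region, sum(amount) as total from sales group by region;

-- running the identical query again within 24 hours returns the persisted result

-- post-process the most recent result without re-running the query
select * from table(result_scan(last_query_id())) where region = 'EMEA';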

Hope this helps 🙂
