Snowflake Architecture – Cloud Data Warehouse

Snowflake is an analytic cloud data warehouse provided as Software-as-a-Service (SaaS). It is a faster, easier-to-use cloud data warehouse compared with other relational databases. The Snowflake database supports ANSI SQL with added functionality. In this article, we will look at the Snowflake architecture and how it differs from other relational databases. Snowflake Architecture: Snowflake runs on cloud platforms such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. It uses virtual compute instances for its compute needs and a storage service for persistent storage of data. As per the official…

Teradata Type Conversion Functions and Examples

Teradata is one of the most widely used MPP databases. Like many relational databases, Teradata supports many useful functions, including functions that convert a value from one data type to another. In this article, we will check commonly used Teradata type conversion functions with some examples. The type conversion functions share a common calling convention: the first argument is the value to be formatted or converted, and the second argument is a template that defines the output or input format. These conversion functions should be used…
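
The "value first, format template second" calling convention described above can be sketched in Python. This is only an illustration of the convention, not Teradata code: the helper names `parse_with_template` and `format_with_template` are hypothetical, and Python's `strptime`/`strftime` directives stand in for Teradata's own format templates, which use a different syntax.

```python
from datetime import datetime


def parse_with_template(value: str, template: str):
    """Convert a string to a date; the template describes the INPUT format."""
    return datetime.strptime(value, template).date()


def format_with_template(value, template: str) -> str:
    """Convert a date to a string; the template describes the OUTPUT format."""
    return value.strftime(template)


# First argument: the value to convert. Second argument: the format template.
parsed = parse_with_template("2019-12-25", "%Y-%m-%d")
formatted = format_with_template(parsed, "%d/%m/%Y")
print(parsed, formatted)  # 2019-12-25 25/12/2019
```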

How to Check Integer Type Values in Teradata? Example

Data validation is one of the most important tasks in a data warehouse environment. It includes integer type checks, count checks, and so on; for instance, checking the row count after a data migration. In another article, we discussed how to identify decimal type values. In this article, we will see how to check integer type values in Teradata with some examples. An integer type check can also be referred to as integer data validation. Teradata Integer Type Values Check: Just like many other relational databases, Teradata also provides many…
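
As a rough, database-independent sketch of the two validations mentioned above (an integer type check and a row count check), the logic might look like the following in Python. The function names and sample values are hypothetical and chosen only for illustration; this is not the Teradata approach the article goes on to describe.

```python
import re

# Optional sign followed by one or more digits, and nothing else.
_INTEGER = re.compile(r"[+-]?\d+")


def is_integer(text: str) -> bool:
    """Return True when the string holds an integer value."""
    return _INTEGER.fullmatch(text.strip()) is not None


def row_counts_match(source_rows, target_rows) -> bool:
    """Count check: compare row counts before and after a migration."""
    return len(source_rows) == len(target_rows)


values = ["42", "-7", "3.14", "12a", " 100 "]
flags = [is_integer(v) for v in values]
print(flags)  # [True, True, False, False, True]
```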

Teradata isnumeric Function Alternatives and Examples

Teradata is one of the widely used MPP databases. It can be used to combine many data sources. When you work with heterogeneous data sets, you may have to get rid of unwanted characters, such as $ in a price column. If your application requires numeric values, you may need to filter out non-numeric values. In this article, we will check Teradata isnumeric function alternatives with some examples. We will also see how to check whether a string is numeric. Teradata isnumeric Function: The…
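
The clean-up described above (strip unwanted characters such as $, then keep only numeric values) can be sketched in Python as follows. This is a minimal illustration of the idea rather than a Teradata solution; `clean_price` and `is_numeric` are hypothetical helpers, and removing thousands separators is an added assumption that the excerpt does not mention.

```python
import re

# Optional sign, then digits with an optional fractional part.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def clean_price(text: str) -> str:
    """Drop the currency symbol and (assumed) thousands separators."""
    return text.replace("$", "").replace(",", "").strip()


def is_numeric(text: str) -> bool:
    """Return True when the whole string is a plain decimal number."""
    return _NUMERIC.fullmatch(text.strip()) is not None


prices = ["$19.99", "250", "N/A", "$1,200", ""]
numeric_prices = [clean_price(p) for p in prices if is_numeric(clean_price(p))]
print(numeric_prices)  # ['19.99', '250', '1200']
```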

Hive Table Sampling – Concept and Example

Relational databases such as SQL Server support writing queries against a relatively small number of rows from a very large table. In this article, we will check the Hive table sampling concept, methods, and some examples. Hive Table Sampling Concept: The Hive TABLESAMPLE clause allows users to write queries against samples of the data instead of the whole table. Sampling comes in handy when you are working with large tables and queries take a long time to return results. The TABLESAMPLE clause can be added to any table in the FROM…
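
To make the idea of querying a sample instead of the whole table concrete, here is a small Python sketch of bucket-style sampling, loosely modeled on Hive's `TABLESAMPLE (BUCKET x OUT OF y ON col)` form. It is an illustration under stated assumptions: `bucket_sample` is a hypothetical helper, the rows are made-up dictionaries, and a plain modulo on an integer key stands in for whatever hashing Hive applies internally.

```python
def bucket_sample(rows, key, bucket, num_buckets):
    """Return the rows whose key falls into the given 1-based bucket.

    Assumption: bucket assignment is `key % num_buckets`, a simple
    stand-in for Hive's internal hashing of the sampling column.
    """
    if not 1 <= bucket <= num_buckets:
        raise ValueError("bucket must be between 1 and num_buckets")
    return [row for row in rows if row[key] % num_buckets == bucket - 1]


# A stand-in "table" of 20 rows; sampling bucket 1 of 4 keeps about a quarter.
table = [{"id": i, "name": f"row{i}"} for i in range(1, 21)]
sample = bucket_sample(table, "id", bucket=1, num_buckets=4)
print([row["id"] for row in sample])  # [4, 8, 12, 16, 20]
```

Because the bucket is derived from the key, the same rows come back on every run, which makes a sampled query repeatable while scanning only a fraction of the data.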

Apache Hive Integer Value Check – Examples

In a data warehouse environment, there are many ways to check for an integer value. Using such checks, you can usually remove unwanted records and save some I/O. For example, you can filter out non-numeric values before comparing a column with integer types. In this article, we will see different methods to check integer values in Hive. Apache Hive Integer Value Check: Many relational databases provide extended SQL functions to help data warehouse developers. Built-in functions such as isnumeric are used to check whether a given string value is a number…

Working with Hive Macros, Syntax and Examples

Many relational databases, such as Teradata, support macros. In an RDBMS, macros are stored in the data dictionary. Users can share macros and execute them as required. Hive macros are a bit different from those of relational databases. In this article, we will check what macros are, their syntax, how to use them, and some macro examples. What are Macros in Hive? Macros in Hive are a set of SQL statements that are stored and executed by calling the macro function name. Macros exist for the duration of the current session. Macros are…

Spark DataFrame Column Type Conversion using CAST

In another post, we discussed how to check whether a Spark DataFrame column is of integer type. Some applications expect a column to be of a specific type. For example, some machine learning models accept only integer types. In this article, we will check how to perform Spark DataFrame column type conversion using the Spark DataFrame CAST method. Spark DataFrame Column Type Conversion: You can use the Spark CAST method to convert a DataFrame column's data type to the required format. Test Data Frame: Following is the test data frame (df) that…

Spark DataFrame Integer Type Check and Example

Apache Spark is one of the easiest frameworks for dealing with different data sources. You can combine heterogeneous data sources with the help of DataFrames. Some applications, for example machine learning models, require only integer values. You should check the data types of the DataFrame before feeding it to ML models, or type cast the columns to an integer type. In this article, we will check how to perform a Spark DataFrame integer type check and how to convert columns using the CAST function in Spark. Spark DataFrame Integer Type Check Requirement: As mentioned…

How to Create Spark SQL User Defined Functions? Example

A user defined function (UDF) is a function written to perform a specific task when no built-in function is available for it. In a Hadoop environment, you can write user defined functions using Java, Python, R, etc. In this article, we will check how to create Spark SQL user defined functions with a Python user defined function example. Spark SQL User-defined Functions: When you migrate your relational data warehouse to Hive and use Spark as an execution engine, you may miss some of the built-in function support. Some user defined functions…
