Hive on Error Stop Script Execution – Options

When you build a data warehouse on top of Hadoop HDFS using the Hive framework, you may have to execute individual HiveQL (SQL) queries or a HiveQL script containing many statements. Both the Hive CLI and Beeline provide an option to execute a script file. In some scenarios you may want to stop script execution as soon as any SQL statement fails. In this article, we will check how to stop script execution on error in Hive. We shall see both Hive and Beeline CLI options to exit script execution in case…
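As a rough sketch of the idea in this excerpt, the Python helpers below build the command lines for running a script file with either client. They assume the Hive CLI property `hive.cli.errors.ignore` and the Beeline option `--force`, both of which are meant to decide whether a script continues after a failed statement. The function names, the JDBC URL, and the script path are hypothetical and used only for illustration.

```python
import subprocess


def hive_script_command(script_path, stop_on_error=True):
    """Build a Hive CLI command that runs a HiveQL script file.

    hive.cli.errors.ignore=false asks the CLI to stop at the first failure.
    """
    ignore_errors = "false" if stop_on_error else "true"
    return [
        "hive",
        "--hiveconf", f"hive.cli.errors.ignore={ignore_errors}",
        "-f", script_path,
    ]


def beeline_script_command(jdbc_url, script_path, stop_on_error=True):
    """Build a Beeline command that runs a HiveQL script file.

    --force=false asks Beeline not to continue after an error.
    """
    force = "false" if stop_on_error else "true"
    return ["beeline", "-u", jdbc_url, f"--force={force}", "-f", script_path]


def run_script(command):
    """Run the command and return True only if it exits with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


# Hypothetical usage (needs a working Hive or Beeline installation):
# ok = run_script(hive_script_command("etl.hql"))
```

Checking the exit status in the calling program lets a scheduler or wrapper script decide whether later steps should run.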


Hive Dynamic SQL Support and Alternative

Dynamic SQL queries are created and executed on the fly. Dynamic SQL lets SQL statements be defined and executed at run time, i.e. you can build SQL queries based on user input and execute them to produce the required output. For example, you can pass a session-specific value to HQL queries dynamically at runtime. In this article, we will check how to build Apache Hive dynamic SQL queries and how to execute them. Hive Dynamic SQL Support: Apache Hive version 1.x and Cloudera Impala do not support dynamic SQL, you…
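One way to picture the "build the query at run time" idea is a small client-side template renderer. The sketch below mimics Hive's `${hivevar:name}` placeholder syntax in plain Python. It is an illustration only, not how Hive itself performs substitution. The function name, table name, and values are hypothetical.

```python
import re

# Matches placeholders written in Hive's ${hivevar:name} style.
_PLACEHOLDER = re.compile(r"\$\{hivevar:([A-Za-z_][A-Za-z0-9_]*)\}")


def render_hiveql(template, variables):
    """Replace each ${hivevar:name} placeholder with its value from `variables`.

    This is plain text substitution with no escaping, so values should come
    from a trusted source.
    """
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for hivevar '{name}'")
        return str(variables[name])

    return _PLACEHOLDER.sub(replace, template)


query = render_hiveql(
    "SELECT * FROM ${hivevar:table_name} WHERE load_date = '${hivevar:load_date}'",
    {"table_name": "sales", "load_date": "2020-01-01"},
)
print(query)  # SELECT * FROM sales WHERE load_date = '2020-01-01'
```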


Methods to Access Hive Tables from Python

Apache Hive is a database framework on top of the Hadoop Distributed File System (HDFS) for querying structured and semi-structured data. Just like in a regular RDBMS, you access HDFS files in the form of tables. You can create tables, views, etc. in Apache Hive. You can analyze structured data using the HiveQL language, which is similar to Structured Query Language (SQL). In this article, we will check different methods to access Hive tables from a Python program. The methods we are going to discuss here will help you to connect to Hive tables and get…
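The excerpt does not say which methods the article covers, so the sketch below assumes one common option: the third-party PyHive package, which exposes a DB-API 2.0 style connection to HiveServer2. The host, port, and table names are hypothetical. Because `fetch_rows` relies only on the DB-API interface, the demo at the bottom uses the standard library's `sqlite3` as a stand-in so it can run without a cluster.

```python
import sqlite3  # used only for the stand-in demo at the bottom


def connect_to_hive(host, port=10000, username=None, database="default"):
    """Open a HiveServer2 connection with PyHive (assumed to be installed)."""
    from pyhive import hive  # imported lazily; third-party package

    return hive.Connection(host=host, port=port, username=username, database=database)


def fetch_rows(connection, query):
    """Run a query on any DB-API 2.0 connection and return all rows."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        cursor.close()


# Stand-in demo: sqlite3 also follows DB-API 2.0, so it exercises fetch_rows
# the same way a Hive connection would.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE employees (id INTEGER, name TEXT)")
demo.execute("INSERT INTO employees VALUES (1, 'alice')")
rows = fetch_rows(demo, "SELECT id, name FROM employees")

# Hypothetical usage against a real cluster:
# rows = fetch_rows(connect_to_hive("hs2.example.com"), "SELECT * FROM employees LIMIT 10")
```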


Methods to Access Hive Tables from Apache Spark

Nowadays, with growing data sizes, Apache Spark is gaining importance. It is an open-source, general-purpose, fast distributed computing framework. Apache Spark has been advertised as running some workloads up to 100 times faster than Hadoop MapReduce. Considering its speed, you can use Apache Spark to access the Hive metastore and process the required data. In this post, we will check methods to access Hive tables from Apache Spark. Why Apache Spark? As mentioned earlier, Apache Spark is claimed to be up to 100 times faster than Hadoop MapReduce when data is in memory, and about 10 times faster when it works from disk. Spark…
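A minimal sketch of one such method, assuming PySpark is installed and Spark is configured to reach the Hive metastore: a `SparkSession` built with `enableHiveSupport()` can query Hive tables through `spark.sql`. The helper names and the database and table names are hypothetical.

```python
def create_hive_session(app_name="hive-access-example"):
    """Create a SparkSession with Hive support (PySpark assumed to be installed)."""
    from pyspark.sql import SparkSession  # imported lazily; third-party package

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


def build_select(database, table, columns=("*",), limit=None):
    """Build a simple SELECT statement for a Hive table."""
    query = f"SELECT {', '.join(columns)} FROM {database}.{table}"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return query


def load_hive_table(spark, database, table, columns=("*",), limit=None):
    """Return a Spark DataFrame backed by the given Hive table."""
    return spark.sql(build_select(database, table, columns, limit))


# Hypothetical usage (needs Spark with access to the Hive metastore):
# spark = create_hive_session()
# load_hive_table(spark, "default", "employees", limit=10).show()
```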


Execute Hive Beeline JDBC String Command from Python

To perform any analysis, you need to have data in place. To collect data, you may have to connect your application to different data sources. In this article, we will discuss one such approach: executing a Hive Beeline JDBC string command from a Python application. This is one of the simplest ways to connect to a Kerberos-secured HiveServer2 using the Beeline shell. I was working on a machine learning project to predict query execution time on a Hadoop Hive cluster. We were gathering various features from the HiveQL…
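The approach described here can be sketched with Python's `subprocess` module. The example assumes that `beeline` is on the `PATH`, that a valid Kerberos ticket already exists (for example from `kinit`), and that Beeline's `csv2` output format is acceptable for parsing. The host, principal, and function names are hypothetical.

```python
import csv
import io
import subprocess


def kerberos_jdbc_url(host, port, database, principal):
    """Build a HiveServer2 JDBC string that carries the Kerberos principal."""
    return f"jdbc:hive2://{host}:{port}/{database};principal={principal}"


def beeline_query_command(jdbc_url, query):
    """Build a Beeline command that runs one query and prints CSV output."""
    return ["beeline", "-u", jdbc_url, "--silent=true", "--outputformat=csv2", "-e", query]


def parse_csv2(output):
    """Parse CSV text whose first row is a header into a list of dicts."""
    return list(csv.DictReader(io.StringIO(output)))


def run_query(jdbc_url, query):
    """Run the query through Beeline and return the parsed rows."""
    completed = subprocess.run(
        beeline_query_command(jdbc_url, query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Beeline exits with an error
    )
    return parse_csv2(completed.stdout)


# Hypothetical usage (needs Beeline and a Kerberos ticket):
# url = kerberos_jdbc_url("hs2.example.com", 10000, "default", "hive/_HOST@EXAMPLE.COM")
# rows = run_query(url, "SELECT id, name FROM employees LIMIT 5")
```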


Difference Between Hive CLI and Beeline Client – Hive vs Beeline

Beeline was developed to interact with the new server, HiveServer2. Hive CLI is an Apache Thrift-based client, whereas Beeline is a JDBC client based on the SQLLine CLI. In this article, we will check the differences between the Hive CLI and the Beeline client. Below are some of the differences between the two. These differences will help you if you are migrating from the old Hive CLI to the new Beeline client. Server Connection: Hive…


Beeline Hive Command Options and Examples

You can run Hive-specific commands in the Beeline shell, just as you can with the Apache Hive command options. As in the Hive CLI, you terminate each command with a semicolon (";"). In this article, we will check the Beeline Hive command options with some examples. Read: Execute Hive Beeline JDBC String Command from Python. Beeline Hive Command Options: below are the Hive command options that Beeline supports.

Command: Description
set <key>=<value>: Sets the value of a configuration variable (key).
set -v: Prints all Hadoop and Hive configuration variables that are used.
set: This…


Run HiveQL Script File Passing Parameter using Beeline CLI and Examples

Hive is used for batch and interactive SQL queries. HiveServer2 supports Beeline, a command shell that is a JDBC client based on the SQLLine CLI. You can run a HiveQL script file and pass parameters to it using the Beeline CLI. Variable substitution allows for tasks such as separating environment-specific configuration variables from code. You can substitute values for the variables that you have used in a HiveQL query. Read: Hive Dynamic SQL Support and Alternative; HiveServer2 Beeline Command Line Shell Options and Examples. Run HiveQL Script File Passing Parameter using…
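A small sketch of how parameters might be passed from a Python wrapper, assuming Beeline's `--hivevar name=value` option and a script that refers to the values as `${hivevar:name}`. The function name, JDBC URL, script path, and parameter names are hypothetical.

```python
def beeline_script_with_params(jdbc_url, script_path, params):
    """Build a Beeline command that runs a script with --hivevar parameters."""
    command = ["beeline", "-u", jdbc_url]
    for name, value in params.items():
        # One --hivevar pair per parameter, in the order given by `params`.
        command += ["--hivevar", f"{name}={value}"]
    command += ["-f", script_path]
    return command


command = beeline_script_with_params(
    "jdbc:hive2://localhost:10000/default",
    "report.hql",
    {"env": "dev", "run_date": "2020-01-01"},
)
print(" ".join(command))
```

Keeping environment-specific values in the `params` dictionary means the same script file can be reused across environments without editing it.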


Steps to Connect to Hive Using Beeline CLI

Beeline is a JDBC client based on the SQLLine CLI, and it is the command shell that HiveServer2 supports. In this article, we will check how to connect to Hive using the Beeline CLI and see some examples of executing HiveQL scripts. Connect to Hive Using Beeline CLI: Beeline works in both standalone (embedded) mode and remote mode. In standalone or embedded mode, it runs an embedded Hive, like the Hive CLI; in remote mode, it connects to a separate HiveServer2 process over Thrift. Read: HiveServer2 Beeline…
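The two modes differ mainly in the JDBC URL handed to Beeline. The sketch below assumes the usual convention that a bare `jdbc:hive2://` URL selects embedded mode, while a URL with a host and port selects remote mode. The helper names and the example host are hypothetical.

```python
def hive_jdbc_url(host=None, port=10000, database="default"):
    """Return an embedded-mode URL when no host is given, else a remote one."""
    if host is None:
        return "jdbc:hive2://"  # embedded (standalone) mode
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_connect_command(host=None, port=10000, database="default", username=None):
    """Build the Beeline command line for connecting in either mode."""
    command = ["beeline", "-u", hive_jdbc_url(host, port, database)]
    if username is not None:
        command += ["-n", username]  # -n passes the user name
    return command


print(beeline_connect_command())
print(beeline_connect_command("hs2.example.com", username="etl_user"))
```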


HiveServer2 Beeline Command Line Shell Options and Examples

HiveServer2 supports Beeline, a command shell that is a JDBC client based on the SQLLine CLI. The Beeline shell works in both embedded mode and remote mode. In embedded mode, it runs an embedded Hive (similar to the Hive command line), whereas remote mode is for connecting to a separate HiveServer2 process over Thrift. In this article, we will check commonly used HiveServer2 Beeline command line shell options with examples. You can run all Hive command line and interactive options from Beeline…
