Register Hive UDF Jar into PySpark – Steps and Examples
Apache Spark is one of the most widely used processing engines because of its fast, in-memory computation. Many organizations use Hive and Spark together: Hive as a data source and Spark as a processing engine. You can interact with Hadoop in any of your favorite programming languages and write custom UDFs in Java, Python, or Scala. To use those UDFs, you have to register them in Hive so that you can call them like normal built-in functions. In this article, we will check a couple of methods to register a Hive UDF jar in PySpark.
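As a quick preview of what registration looks like, here is a minimal PySpark sketch using the classic ADD JAR plus CREATE TEMPORARY FUNCTION approach. The jar path (/path/to/hive-udfs.jar), the function name (my_upper), and the UDF class (com.example.udf.MyUpper) are hypothetical placeholders; substitute your own jar and class.

```python
from pyspark.sql import SparkSession

# Start a Spark session with Hive support enabled,
# which is required to register and run Hive UDFs.
spark = (
    SparkSession.builder
    .appName("hive-udf-example")
    .enableHiveSupport()
    .getOrCreate()
)

# Make the UDF jar available to the session.
# The path below is a placeholder for your actual jar location.
spark.sql("ADD JAR /path/to/hive-udfs.jar")

# Map a SQL function name to the UDF class inside the jar.
# The class name is a hypothetical example.
spark.sql("CREATE TEMPORARY FUNCTION my_upper AS 'com.example.udf.MyUpper'")

# Once registered, the UDF can be called like any built-in function.
spark.sql("SELECT my_upper('hello') AS result").show()
```

The sections that follow walk through this and other registration methods step by step.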