HPL/SQL
Since Hive v2.0.0, the Hadoop Procedure Language SQL (HPL/SQL) (http://www.hplsql.org/) available to provide store procedure programming in Hive. HPL/SQL supports Hive, Spark SQL, and Impala, and is compatible with Oracle, DB2, MySQL, and TSQL standard. One of its benefits is making the migration of existing database-stored procedures to Hive easy and efficient. Using HPL/SQL does not require Java skills to implement what can be done through UDF mentioned. Compared with UDF, HPL/SQL's performance is a little slower and it is still new for production usage.
The following is an example of creating a stored procedure. HPL/SQL supports the creation of both Function
and Procedure
:
$ cat getEmpCnt.pl CREATE PROCEDURE getCount() BEGIN DECLARE cnt INT = 0; SELECT COUNT(*) INTO cnt FROM employee; PRINT 'Users cnt: ' || cnt; END; call getCount(); -- Call a procedure
In order to run a procedure, we need to set up the database connection in hplsql-site.xml
by providing the hiveserver2
connection...