Performance utilities
HQL provides the EXPLAIN and ANALYZE statements, which can be used as utilities to check and diagnose query performance. In addition, Hive logs contain detailed information for performance investigation and troubleshooting.
EXPLAIN statement
Hive provides an EXPLAIN statement to return a query execution plan without running the query. We can use it to analyze queries if we have concerns about their performance. The EXPLAIN statement also helps us compare the plans of two or more queries written for the same purpose. The syntax for it is as follows:
EXPLAIN [FORMATTED|EXTENDED|DEPENDENCY|AUTHORIZATION] hql_query
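As a minimal sketch of this syntax, the following statements request a plan for the same query, first in the default form and then with one of the optional keywords. The employee table and its dept column are hypothetical names assumed only for illustration. The plan output depends on the Hive version and execution engine, so it is not shown here.

```sql
-- Assumption: a hypothetical table `employee` with a `dept` column exists.
-- Return the default execution plan without running the query.
EXPLAIN
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;

-- Request the same plan with extra operator details.
EXPLAIN EXTENDED
SELECT dept, count(*) AS headcount
FROM employee
GROUP BY dept;
```

Running both forms against one query is a quick way to see how much extra detail a keyword adds before applying it to longer queries.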
The following keywords can be used:
FORMATTED: This provides a formatted JSON version of the query plan.
EXTENDED: This provides additional information for the operators in the plan, such as file pathnames.
DEPENDENCY: This provides a JSON format output that contains a list of tables and partitions that the query depends on. It has been available since Hive...