HCatalog
HCatalog (see https://cwiki.apache.org/confluence/display/Hive/HCatalog) is a metadata management system for Hadoop data. It stores consistent schema information for Hadoop ecosystem tools, such as Pig, Hive, and MapReduce. By default, HCatalog supports data in the RCFile, CSV, JSON, SequenceFile, and ORC file formats, as well as custom formats for which an InputFormat, OutputFormat, and SerDe are implemented. With HCatalog, users can directly create, edit, and expose (via its REST API) metadata, and changes take effect immediately in all tools sharing that metadata. HCatalog began as an Apache project separate from Hive and became part of the Hive project in 2013, starting with Hive v0.11.0. HCatalog is built on top of the Hive metastore and incorporates support for HQL DDL. For Pig, it provides read and write interfaces, HCatLoader and HCatStorer, which implement Pig's load and store interfaces. HCatalog also provides an interface for MapReduce...
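The REST access mentioned above is served by WebHCat (formerly known as Templeton). As a rough illustration, the following Python sketch builds the URL that asks WebHCat to describe a table's metadata. It only constructs the URL and does not contact a server. The host (localhost), the database and table names, and the helper function table_ddl_url are hypothetical and chosen for illustration; the port 50111 is WebHCat's usual default and may differ in your cluster.

```python
from urllib.parse import quote, urlencode

# Assumption: a WebHCat server on a hypothetical host, using the default port 50111.
WEBHCAT_BASE = "http://localhost:50111/templeton/v1"


def table_ddl_url(database: str, table: str, user: str,
                  base: str = WEBHCAT_BASE) -> str:
    """Build the WebHCat URL that describes one table's metadata.

    This is an illustrative helper, not part of HCatalog itself.
    """
    # Percent-encode the names so unusual characters cannot break the path.
    path = (
        f"{base}/ddl/database/{quote(database, safe='')}"
        f"/table/{quote(table, safe='')}"
    )
    # WebHCat identifies the caller through the user.name query parameter.
    return f"{path}?{urlencode({'user.name': user})}"


if __name__ == "__main__":
    # Hypothetical database, table, and user names.
    print(table_ddl_url("default", "employee", "hive"))
```

Sending an HTTP GET request to the resulting URL (for example, with curl) should return the table description as JSON, provided that WebHCat is running and the table exists.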