Apache HCatalog is a table management layer that exposes Hive metadata to other Hadoop applications. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop Distributed File System (HDFS), so users need not worry about where or in what format their data is stored. HCatalog can present data stored as RCFile, text files, or sequence files in a tabular view. It also provides REST APIs so that external systems can access these tables’ metadata.

HCatalog is built on top of the Hive metastore and incorporates components from the Hive DDL (Data Definition Language). HCatalog provides read and write interfaces for Pig and MapReduce and uses Hive’s command line interface for issuing data definition and metadata exploration commands. It also presents a REST interface that gives external tools access to Hive DDL operations, such as “create table” and “describe table”.
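
As a rough illustration of the REST interface, the sketch below builds the URL that a “describe table” request would use. It assumes the WebHCat (Templeton) server that ships with HCatalog, its default port 50111, and its `/templeton/v1/ddl` resource path; the host, database, table, and user names are hypothetical placeholders, and the helper `webhcat_table_url` is introduced only for this example.

```python
from urllib.parse import quote, urlencode


def webhcat_table_url(host, database, table, user, port=50111):
    """Build the URL for describing one table over the REST interface.

    Assumptions: a WebHCat (Templeton) server is listening on `port`
    (50111 is its usual default) and accepts the `user.name` query
    parameter for simple, non-Kerberos authentication.
    """
    # Percent-encode the names so unusual characters cannot break the path.
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe="")
    )
    query = urlencode({"user.name": user})
    return "http://{}:{}{}?{}".format(host, port, path, query)


if __name__ == "__main__":
    # Hypothetical values; replace them with those of a real cluster.
    print(webhcat_table_url("localhost", "default", "employee", "hadoop"))
```

Sending an HTTP GET to the resulting URL (for example with `urllib.request.urlopen`) should return the table description as JSON, provided the server is running and the table exists.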

HCatalog Study

  1. hcatalog-installation
  2. hcatalog-command-line-interface-cli-usage
  3. hcatalog-creating-table
  4. hcatalog-script
  5. hcatalog-load-operation
  6. hcatalog-alter-table
  7. dropping-table
  8. creating-view-and-indexes