Designing tables
In HBase, when modeling the schema for a table, the designer should keep the following in mind, among other things:
The number of column families and which data goes to which column family
The maximum number of columns in each column family
The type of data to be stored in the column
The number of historical values that need to be maintained for each column
The structure of a rowkey
Once we have answers to these questions, certain practices help ensure an optimal table design. Some of these design practices are as follows:
Data for a given column family goes into a single store on HDFS (one store per column family in each region). This store might consist of multiple HFiles, which are eventually merged into a single HFile by compaction.
Columns in a column family are stored together on disk, so columns with different access patterns should be kept in different column families.
If we design tables with fewer columns and many rows (a tall table), we might achieve O(1) operations, but we also compromise on atomicity...
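The tall versus wide trade-off can be illustrated with a minimal sketch in plain Java. It does not use the HBase client API: the class TallVsWide, its methods, and the "user-messageId" rowkey layout are hypothetical names chosen for illustration, and a sorted in-memory map stands in for a table. The point it tries to show is that a wide design keeps all of a user's data in one row, and HBase guarantees atomicity for mutations within a single row, whereas a tall design spreads the same data over several rows, each directly addressable by its rowkey.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model, NOT the HBase API: rowkey -> (column qualifier -> value).
class TallVsWide {

    // A TreeMap keeps rowkeys sorted, loosely mirroring how HBase orders rows.
    static TreeMap<String, Map<String, String>> newTable() {
        return new TreeMap<>();
    }

    // Wide design: one row per user, one column per message.
    static void putWide(Map<String, Map<String, String>> table,
                        String user, String msgId, String body) {
        table.computeIfAbsent(user, k -> new TreeMap<>()).put(msgId, body);
    }

    // Tall design: one row per (user, message) pair, with a single fixed column.
    // The "user-msgId" rowkey layout is an assumption made for this sketch.
    static void putTall(Map<String, Map<String, String>> table,
                        String user, String msgId, String body) {
        table.computeIfAbsent(user + "-" + msgId, k -> new TreeMap<>()).put("body", body);
    }

    // Number of rows that hold data for the given user in either design.
    static long rowsTouched(Map<String, Map<String, String>> table, String user) {
        return table.keySet().stream()
                .filter(k -> k.equals(user) || k.startsWith(user + "-"))
                .count();
    }

    public static void main(String[] args) {
        TreeMap<String, Map<String, String>> wide = newTable();
        TreeMap<String, Map<String, String>> tall = newTable();

        String[][] messages = {
            {"alice", "m1", "hello"}, {"alice", "m2", "how are you"},
            {"alice", "m3", "see you"}, {"bob", "m1", "hi"}
        };
        for (String[] m : messages) {
            putWide(wide, m[0], m[1], m[2]);
            putTall(tall, m[0], m[1], m[2]);
        }

        // Wide: all of alice's messages sit in one row, so one row-level
        // mutation could cover them together.
        System.out.println("wide rows for alice: " + rowsTouched(wide, "alice")); // 1
        // Tall: the same data spans three rows, so updating all of it
        // means three separate row operations.
        System.out.println("tall rows for alice: " + rowsTouched(tall, "alice")); // 3
        // Tall: a single message is fetched directly by its rowkey.
        System.out.println("tall lookup: " + tall.get("alice-m3").get("body")); // see you
    }
}
```

In this sketch, updating all three of alice's messages is one row operation in the wide table and three in the tall table, which is where the atomicity compromise comes from. The map lookups here are only a stand-in and say nothing about the actual cost of reads in HBase.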