Column-oriented databases save their data grouped by columns. Subsequent column values are stored contiguously on disk. Columnar storage for database tables is one of an important factor in optimizing analytic query performance in the database.In this article, we will check how column oriented database stores data. Also we will check the difference between row oriented database and columnar database – columnar database vs document database
What is Column Oriented Database?
The column-oriented databases save their data grouped by columns. This differs from the usual row-oriented approach of traditional databases, which store entire rows contiguously. In a typical traditional relational database table, each row contains field values for a single record.
Typical Row Oriented Database Storage
Below diagram depicts how traditional row oriented database stores the data in database:
In row-oriented database storage, data blocks store values sequentially for each consecutive column making up the entire row of the table. If in case allocated block size is smaller than the size of a record that being stored, storage for an entire record may take more than one block. If block size is larger than the size of a record, storage for an entire record may take less than one block, this results in an inefficient use of the disk space. Row oriented databases are typically best suited for online transaction processing (OLTP) such as banking application, as most transaction are reading and writing the value of entire rows.
How Column Oriented Database Stores Data?
We know how oriented database stores data in previous section, now let is check how column oriented or columnar database stores data.
Below diagram depicts how column oriented or columnar database stores data:
In columnar storage, each data block holds column field values for as many as three times as many records as row-based storage. This helps storing large number of columns will very large number of rows store more efficiently.
The reason to store values on a per-column basis is based on the assumption that, for specific user queries, not all of the values in the records are needed. This condition holds good in online analytics database (OLAP), that is the reason columnar database is more efficient in OLAP applications.
Advantages of Column Oriented Databases
Below are advantages of using column oriented databases:
- Efficient usage of storage,
- Reduced I/O for queries having fewer columns.
- Aggregation queries having fewer columns runs fast.
- Columnar Compression – Some database like Amazon Redshift supports columnar compression.
Disadvantages of Column Oriented Databases
There are some of drawback of using columnar or column orient databases. Below are some of the disadvantages:
- Columnar database is not suited for incremental data loading.
- Online Transaction Processing (OLTP) application are not suitable in column oriented databases.
- User queries against only a few rows cannot give you any benefits in columnar databases.
Related Articles