Sunday, 24 December 2017

Cassandra Data Structure



To understand Cassandra's internal data structure, it helps to first explore its data model. The Cassandra data model consists of keyspaces, column families, keys and columns.

Here is a small comparison with a relational database system:
   Keyspaces in Cassandra are similar to databases
   Column families are similar to the tables in a database
   Primary keys are similar to the primary keys of tables
   Column keys and values are similar to column names and values in any database


The primary key in the above model is special: its partition key part decides which partition a row belongs to in Cassandra.

To make it simple: we have a keyspace -> under the keyspace we have column families -> partitions -> column keys and values.
Let's create each of these parts and explore the internals.

1) Let's create a keyspace named 'office' with the replication strategy 'Simple' (SimpleStrategy) and a replication factor of 1, and then create the table 'user' under the keyspace 'office'.
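As a rough sketch of what step 1 might look like, the CQL statements are kept below as plain Python strings so they can be printed and pasted into cqlsh. The keyspace name, strategy and replication factor come from the text; the columns of 'user' ('id' and 'name') are assumptions made only for illustration.

```python
# CQL for step 1, held as strings so it can be printed and pasted into cqlsh.
# Keyspace name, strategy and replication factor are taken from the text;
# the columns of 'user' (id, name) are hypothetical.
CREATE_KEYSPACE = (
    "CREATE KEYSPACE office "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
)

CREATE_TABLE = (
    "CREATE TABLE office.user ("
    "id int PRIMARY KEY, "  # a single-column primary key is also the partition key
    "name text"
    ");"
)

for statement in (CREATE_KEYSPACE, CREATE_TABLE):
    print(statement)
```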

   


 2) Inserting an entry into the 'user' table
   


   NOTE: After inserting, go to the data folder and list the directory for the table 'user'. Surprisingly, there will not be any data files. Recall the write process in Cassandra: a write always goes first to the commit log and then to the in-memory memtable. Only when the memtable fills up, or is flushed manually, is it written to disk as an SSTable. Until then, the data stays in the memtable and is read from memory.
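The write path described in the note can be mimicked with a toy model. This is only a sketch of the idea, not Cassandra's implementation: the class name 'ToyNode', the 'memtable_limit' threshold and the sample row are all hypothetical, and real reads merge data from several sources rather than stopping at the first hit.

```python
class ToyNode:
    """Toy model of the write path: commit log -> memtable -> SSTable on flush."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in-memory rows, keyed by partition key
        self.sstables = []     # stand-ins for on-disk SSTables, oldest first
        self.memtable_limit = memtable_limit  # hypothetical "memtable is full" threshold

    def write(self, key, columns):
        self.commit_log.append((key, columns))             # 1. commit log first
        self.memtable.setdefault(key, {}).update(columns)  # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:      # 3. flush when "full"
            self.flush()

    def flush(self):
        # Plays the role of 'nodetool flush': memtable contents become an SSTable.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        # Simplified read: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None


node = ToyNode()
node.write(1, {"name": "alice"})
sstables_before_flush = len(node.sstables)  # nothing on "disk" yet
node.flush()
sstables_after_flush = len(node.sstables)   # one SSTable after the manual flush
print(sstables_before_flush, sstables_after_flush, node.read(1))
```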

No SSTable for the column family 'user'
  
  
  I did a manual flush using the ./nodetool flush command, after which the SSTables for the column family 'user' appeared under the 'office' keyspace.


In the above example, the file 'mc-1-big-Data.db' holds the actual data of the SSTable.

3) There is a utility, 'tools/bin/sstabledump', for reading an SSTable.


       
I am able to read the SSTable using the sstabledump utility.
This is very useful for understanding the data structure of Cassandra.

Let's examine the dumped data above.

We can see that the grouping is based on the primary key (here, the partition key).
The data is grouped under the partitions '1' and '2', and each partition has its own column keys and values.
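The grouping by partition key can be illustrated with a few lines of Python. This is a simplified sketch, not the tool's exact output format: the partition keys 1 and 2 follow the text, while the column 'name' and its values are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical rows for the 'user' table; 'id' plays the role of the partition key.
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]


def group_by_partition(rows, partition_key):
    """Group rows by partition key; the remaining columns become the row's cells."""
    partitions = defaultdict(list)
    for row in rows:
        cells = {name: value for name, value in row.items() if name != partition_key}
        partitions[row[partition_key]].append(cells)
    return dict(partitions)


partitions = group_by_partition(rows, "id")
print(json.dumps(partitions, indent=2))
```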

Table with clustering key:

 

I created a table 'nodedata' which stores node/system related information.
The primary key can be customized. In the above example, I used 'nodeip' as the partition key (for grouping); any keys that follow the partition key are 'clustering keys', which are used for sorting. In this example, the entries are grouped by 'nodeip' and sorted by 'timestamp'.



In the above example, we can see that the primary key has two sections:

primary key (nodeIp, timestamp)

where the first field 'nodeIp' is the partition key and the second field 'timestamp' is used for sorting within the partition. I created three entries under the partition '172.30.56.60' and manually flushed the memtable to an SSTable to understand the internal data structure.

We can see from the above example that the data is grouped by the partition key and sorted by the timestamp, which appears in the dump under the name 'clustering'.
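To make "grouped by the partition key, sorted by the clustering key" concrete, here is a minimal sketch. The class 'ToyTable', the integer timestamps and the 'cpu' column are assumptions for illustration; only the partition '172.30.56.60' and the key names come from the text.

```python
from collections import defaultdict


class ToyTable:
    """Toy table with PRIMARY KEY (nodeip, timestamp)."""

    def __init__(self):
        # partition key -> {clustering key -> cells}
        self.partitions = defaultdict(dict)

    def insert(self, nodeip, timestamp, cells):
        # Writing the same (nodeip, timestamp) again overwrites the earlier row.
        self.partitions[nodeip][timestamp] = cells

    def rows(self, nodeip):
        # Rows of one partition, ordered by the clustering key.
        partition = self.partitions.get(nodeip, {})
        return [(ts, partition[ts]) for ts in sorted(partition)]


table = ToyTable()
# Three entries under one partition, inserted out of order (values are hypothetical).
table.insert("172.30.56.60", 300, {"cpu": 42})
table.insert("172.30.56.60", 100, {"cpu": 17})
table.insert("172.30.56.60", 200, {"cpu": 23})

ordered = table.rows("172.30.56.60")
print(ordered)
```

Whatever order the entries arrive in, reading the partition back yields them in clustering-key order.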

Compound primary keys :


This is one of the more complex forms of primary key, and it makes us dive deeper into Cassandra's internal data structure.

Instead of a single key forming the partition key or a single key forming the clustering key, we can create composite partition and clustering keys to suit the business use case.

I would like to show one example, in which a file's data is grouped by its blocks.

(i.e) PRIMARY KEY ((filename, blockId), timestamp, data)


This forms a composite partition key, in which the partition key is a combination of the two fields 'filename' and 'blockId'. The remaining fields, 'timestamp' and 'data', are the clustering keys.


I flushed and read the SSTable dump, and we can see that the partition grouping is based on both 'filename' and 'blockId'.
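A composite partition key can be pictured as a tuple used as the grouping key. The sketch below assumes the key layout PRIMARY KEY ((filename, blockId), timestamp, data) from the text; the file name, block numbers, timestamps and data values are made up for illustration.

```python
from collections import defaultdict

# (filename, blockId) -> list of (timestamp, data) clustering tuples, kept sorted
partitions = defaultdict(list)


def insert(filename, block_id, timestamp, data):
    partition_key = (filename, block_id)  # both fields together pick the partition
    clustering = (timestamp, data)        # ordered by timestamp first, then data
    partitions[partition_key].append(clustering)
    partitions[partition_key].sort()


# Hypothetical sample data: one file, two blocks.
insert("a.txt", 1, 20, "x")
insert("a.txt", 2, 10, "y")
insert("a.txt", 1, 10, "z")

for key, rows in sorted(partitions.items()):
    print(key, rows)
```

Rows sharing a file name but differing in block id land in different partitions, because the whole tuple is the partition key.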

I believe this sstabledump output clearly explains the internal data structure of Cassandra.

Let's go ahead and see how it actually works with statistical kind of data :)


 
