3 top features every Big Data developer should know about Cassandra

Cassandra is a NoSQL database. It can store structured and semi-structured data without forcing every row into the same fixed format. Every Big Data developer should learn these three features.

Column Structure

Cassandra stores data in columns. Each column holds a name and a value, and each group of columns (a row) is identified by a row key.
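The column layout described above can be pictured with a small Python sketch. This is only an in-memory illustration, not Cassandra's storage engine or driver API; the names `Column`, `put`, `get` and the `patients` data are hypothetical and chosen just for this example.

```python
from dataclasses import dataclass, field
import time


@dataclass
class Column:
    """One cell: a column name, its value, and a write timestamp."""
    name: str
    value: str
    timestamp: float = field(default_factory=time.time)


# Hypothetical column family: maps each row key to that row's columns,
# keyed by column name. Rows do not need to share the same set of columns.
patients = {}


def put(family, row_key, name, value):
    """Store one column under the given row key."""
    family.setdefault(row_key, {})[name] = Column(name, value)


def get(family, row_key, name):
    """Return the value of one column, or None if the row has no such column."""
    column = family.get(row_key, {}).get(name)
    return column.value if column else None


put(patients, "patient-1", "name", "Asha")
put(patients, "patient-1", "ward", "B2")
put(patients, "patient-2", "name", "Ravi")  # this row has no "ward" column

print(get(patients, "patient-1", "ward"))
print(get(patients, "patient-2", "ward"))
```

The row key is the only thing every row must have; the second row simply has fewer columns than the first.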

Clusters

Data is replicated across the nodes of a cluster. The nodes are arranged like a ring, so a Cassandra cluster is also called a ring.

  1. Running Cassandra on a single node is not very useful, because there is no other node to hold a replica.
  2. In a multi-node environment, the data on each node is replicated to other nodes.
  3. Nodes communicate peer to peer, so when the original node is down, a replica answers the request.
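The three points above can be sketched as a toy simulation. This is not how Cassandra is implemented; the `Node` and `Ring` classes, the node names, and the simple byte-sum hash are assumptions made only to show a replica answering when the original node is down.

```python
class Node:
    """A toy cluster member that can be switched off."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Ring:
    """Toy cluster: nodes sit in ring order; each key is copied to `replicas` nodes."""

    def __init__(self, names, replicas=2):
        self.nodes = [Node(n) for n in names]
        self.replicas = replicas

    def owners(self, key):
        # Pick the first owner with a simple deterministic hash (an assumption
        # of this sketch), then walk the ring clockwise for the other copies.
        start = sum(key.encode()) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def write(self, key, value):
        # Store the value on every owner that is currently up.
        for node in self.owners(key):
            if node.up:
                node.data[key] = value

    def read(self, key):
        # The first live owner that holds the key answers the request.
        for node in self.owners(key):
            if node.up and key in node.data:
                return node.name, node.data[key]
        raise RuntimeError("no live replica holds " + key)


ring = Ring(["node-a", "node-b", "node-c"], replicas=2)
ring.write("patient-1", "Asha")

original = ring.owners("patient-1")[0]
original.up = False  # simulate the original node going down

answered_by, value = ring.read("patient-1")
print(answered_by, value)
```

With `replicas=1` the same read would fail once the original node is down, which is the single-node problem from point 1.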

The image below shows three column families: Hospital, Kitchen and Sports. In the same way, Cassandra stores each type of data in its own column family.

[Image: Column]

Keyspace

The keyspace is the outermost container in Cassandra; all of your data lives inside a keyspace. A single keyspace per cluster is normally enough. To host more than one application, you can create multiple keyspaces in the same cluster.

  • Suppose the replication factor is 3. That means each row has 3 replicas.
  • Replicas are placed according to the replica placement strategy you specify when you create the keyspace.
  • Each node is assigned a range of keys. When a request arrives, the node whose range contains the requested key, or one of its replicas, responds to the user.
  • Each row belongs to one column family. You can create multiple column families per keyspace.
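
Key ranges and a replication factor of 3 can be illustrated with a short Python sketch. It places replicas on the next nodes clockwise around the ring, which is similar in spirit to Cassandra's SimpleStrategy, but it is only a model: the 16-bit token space, the SHA-256 based `token` function, evenly spaced node tokens, and the names `build_ring` and `replicas_for` are all assumptions of this example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 16  # deliberately small token space, for readability only


def token(key):
    """Map a row key to a ring position (0 .. RING_SIZE - 1) with a stable hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:2], "big")


def build_ring(node_names):
    """Give every node an evenly spaced token; a node owns the range ending at its token."""
    step = RING_SIZE // len(node_names)
    return [((i + 1) * step - 1, name) for i, name in enumerate(node_names)]


def replicas_for(key, ring, replication_factor=3):
    """Find the node whose range covers the key, then take the next nodes clockwise."""
    tokens = [t for t, _ in ring]
    # Keys past the last token wrap around to the first node.
    first = bisect.bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(first + i) % len(ring)][1] for i in range(count)]


ring = build_ring(["n1", "n2", "n3", "n4", "n5"])
placement = replicas_for("patient-1", ring)
print(placement)
```

The same key always maps to the same three nodes, so any of them can serve a request for that row.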

Author: Srini

Experienced software developer. Skills in Development, Coding, Testing and Debugging. Good Data analytic skills (Data Warehousing and BI). Also skills in Mainframe.