Cassandra is a NoSQL database. Unlike a relational database, it does not force all data into one fixed format, so different rows can carry different columns. Big data developers should learn the following features.
Feature 1: Column Structure
Data is stored in columns. Each column holds a name and its value, and related columns are grouped together. Each group of columns is identified by a row key.
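The row key and column idea can be sketched in plain Python. This is only an in-memory toy model, not Cassandra's real storage format. The names patients, get_column, and read_column, and the sample values, are hypothetical and chosen for illustration.

```python
# Toy model of one column family: row key -> {column name -> value}.
# All names and values here are made up for illustration.
patients = {
    "patient-001": {"name": "Asha", "ward": "Cardiology"},
    # Rows do not need the same columns: this one has an extra "allergy" column.
    "patient-002": {"name": "Ravi", "ward": "Orthopedics", "allergy": "penicillin"},
}


def get_column(column_family, row_key, column_name):
    """Return one column value for a row key, or None if it is absent."""
    return column_family.get(row_key, {}).get(column_name)


def read_column(column_family, column_name):
    """Collect one column across every row that defines it."""
    return {
        row_key: row[column_name]
        for row_key, row in column_family.items()
        if column_name in row
    }


wards = read_column(patients, "ward")
allergy = get_column(patients, "patient-002", "allergy")
```

Looking up a value always starts from the row key, which is why the row key matters so much in this model.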
Feature 2: Clusters
Data is replicated across nodes instead of being kept on a single machine. The nodes are arranged logically like a ring, so a Cassandra cluster is also called a ring.
- Running Cassandra on a single node gives you no replication, so it is useful only for development and testing.
- In a multi-node environment, the data on each node is replicated to other nodes.
- Nodes communicate peer to peer. When the node holding the original copy is down, a replica answers the request.
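The replication behavior described above can be sketched as a small simulation. This is a simplified toy, assuming a fixed node order and an MD5 hash to pick the first replica. The Ring class and the node names are hypothetical, and real Cassandra uses a partitioner and token assignments instead.

```python
import hashlib


class Ring:
    """Toy ring: each key is stored on `replication_factor` consecutive nodes."""

    def __init__(self, node_names, replication_factor):
        if replication_factor > len(node_names):
            raise ValueError("replication factor cannot exceed the node count")
        self.nodes = sorted(node_names)          # fixed ring order (assumption)
        self.rf = replication_factor
        self.storage = {name: {} for name in self.nodes}
        self.down = set()                        # nodes currently unavailable

    def replicas_for(self, key):
        """Hash the key to a starting node, then walk the ring clockwise."""
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.rf)]

    def write(self, key, value):
        """Store the value on every replica node for this key."""
        for node in self.replicas_for(key):
            self.storage[node][key] = value

    def read(self, key):
        """Answer from the first replica that is still up."""
        for node in self.replicas_for(key):
            if node not in self.down:
                return node, self.storage[node].get(key)
        raise RuntimeError("all replicas for this key are down")


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
ring.write("user:42", "hello")
original = ring.replicas_for("user:42")[0]
ring.down.add(original)                          # simulate the original node failing
answering_node, value = ring.read("user:42")
```

Even with the original node marked as down, the read still succeeds because another replica holds the same value.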
In the image below, there are three column families: Hospital, Kitchen, and Sports. In the same way, Cassandra stores each type of data in its own column family.
Feature 3: KeySpace
The keyspace is the outermost container in Cassandra, and all of your data is stored inside a keyspace. A single keyspace per cluster is normally enough. To serve more applications, you can also create multiple keyspaces per cluster.
- The replication factor is set per keyspace. With a replication factor of 3, each row has 3 replicas.
- Replicas are placed according to the replica placement strategy you specify when you create the keyspace.
- Each node is assigned a key range. When a user request arrives, the node whose range contains the requested key, or one of its replicas, responds.
- Each row belongs to exactly one column family. You can create multiple column families per keyspace.
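The points above can be sketched in two small Python helpers. The first one only builds a CQL CREATE KEYSPACE statement as a string and executes nothing. The second one models key ranges with made-up tokens. The keyspace name, the token values, and the node names are assumptions for illustration, and the placement logic only mimics the idea of walking the ring clockwise from the node that owns the key.

```python
import bisect


def create_keyspace_cql(name, replication_factor, strategy="SimpleStrategy"):
    """Build a CQL CREATE KEYSPACE statement as text (nothing is executed)."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': '{strategy}', "
        f"'replication_factor': {replication_factor}}};"
    )


# Hypothetical ring: each node owns the keys up to and including its token.
TOKENS = [25, 50, 75, 100]
NODES = ["node-a", "node-b", "node-c", "node-d"]


def replicas_for_token(token, replication_factor):
    """Find the node whose range contains the token, then walk clockwise."""
    first = bisect.bisect_left(TOKENS, token) % len(NODES)
    return [NODES[(first + i) % len(NODES)] for i in range(replication_factor)]


statement = create_keyspace_cql("hospital", 3)
replicas = replicas_for_token(60, 3)   # 60 falls in the range owned by node-c
```

Here token 60 lies between 50 and 75, so node-c owns it and the next two nodes on the ring hold the other replicas.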