Before starting a Cassandra cluster, you must choose how the
data will be divided across the nodes in the cluster. This involves choosing a
partitioner for the cluster. Cassandra uses a ring architecture. The ring is
divided into as many ranges as there are nodes, with each node being
responsible for one or more ranges of the overall data. Before a node can join
the ring, it must be assigned a token. The token determines the node’s position
on the ring and the range of data it is responsible for.
Once the partitioner is chosen, it cannot realistically be changed without
reloading all the data. This makes it very important to choose and configure
the correct partitioner before initializing the cluster.
The important distinction between the partitioners is order
preservation (OP). Users can define their own partitioners by implementing IPartitioner, or they can use one of the
native partitioners.
Random Partitioner
RandomPartitioner is the default choice for Cassandra. It uses an MD5 hash
function to map row keys to tokens, so the keys are distributed evenly across
the cluster. The hash of the row key determines which node the row is placed
on. The consistent hashing algorithm used by random partitioning ensures that
when nodes are added to the cluster, the minimum possible set of data is
affected. The hashing algorithm creates an MD5 hash value of the row key in the
range 0 to 2^127 - 1. Each node in the cluster is then assigned a token that
represents a hash value in that range. This token determines which row keys are
placed on the node. For example, the row below with row key ‘Prajeesh’ is
assigned a hash value such as 98002736AD65AB, which determines the node that
holds the range in which the row is stored.
Prajeesh | India | Scrum Master | Prowareness
Notice that the keys are not in order. With RandomPartitioner, the keys are evenly
distributed across the ring using hashes, but you sacrifice order, which means
any range query needs to query all nodes in the ring.
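The mapping from row key to token to node can be sketched in a few lines of Python. This is only an illustration of the idea, not Cassandra's actual code: the node names, the evenly spaced tokens, and the helper functions md5_token and owner are all assumptions made for the example, and the digest is reduced into the token range with a simple modulo.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # size of the token space assumed in this sketch


def md5_token(row_key: str) -> int:
    """Map a row key to a token in [0, 2**127) using its MD5 digest."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(node_tokens: dict, row_key: str) -> str:
    """Return the node holding the first token >= the key's token,
    wrapping around to the lowest token at the end of the ring."""
    tokens = sorted(node_tokens)
    index = bisect.bisect_left(tokens, md5_token(row_key))
    return node_tokens[tokens[index % len(tokens)]]


# Four hypothetical nodes with evenly spaced tokens.
ring = {i * RING_SIZE // 4: f"node{i}" for i in range(4)}

for key in ["Albert", "Amy", "Prajeesh"]:
    print(key, md5_token(key), owner(ring, key))
```

Adding a fifth token to ring changes the owner only for keys whose tokens fall between the new token and the token before it on the ring. Every other key stays where it was, which is the property consistent hashing is meant to provide.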
Order Preserving Partitioners
The Order Preserving
Partitioners preserve the order of the row keys as they are mapped into the
token space. This allows range scans over rows, meaning you can scan rows as
though you were moving a cursor through a traditional index. For example, if
your application has user names as the row key, you can scan rows for users
whose names fall between Albert and Amy. This type of query would not be
possible with randomly partitioned row keys, since the keys are stored in the
order of their MD5 hash (not sequentially).
An advantage of using OPP is that range queries are
simplified, since the query need not consult every node in the ring to fetch the
data. It can go directly to the nodes that hold the requested range of row keys.
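A small Python sketch makes the difference visible. It is an illustration only: the list of user names and the range_scan helper are made up for the example, and a sorted in-memory list stands in for keys stored in order on the ring.

```python
import bisect
import hashlib

row_keys = ["Zoe", "Albert", "Amy", "Alice", "Bob", "Alfred", "Prajeesh"]

# Order-preserving layout: keys are kept in key order.
ordered = sorted(row_keys)


def range_scan(sorted_keys, start, end):
    """Return every key k with start <= k <= end from a sorted key list."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_right(sorted_keys, end)
    return sorted_keys[lo:hi]


print(range_scan(ordered, "Albert", "Amy"))
# ['Albert', 'Alfred', 'Alice', 'Amy']

# Hash layout: the same keys kept in the order of their MD5 digests.
hashed = sorted(row_keys, key=lambda k: hashlib.md5(k.encode("utf-8")).digest())
print(hashed)
```

In the ordered list the matching names form one contiguous slice, so the scan touches only that slice. In the hashed list the same names are scattered among the others, so finding them means looking at every key.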
A disadvantage of using OPP is that the ring can become
unbalanced over time. If your application tends to write or update a sequential
block of rows at a time, the writes are not distributed across the
cluster but all land on a single node. That node ends up holding more data than
the rest, disturbing the even distribution of data across nodes.
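The hot spot is easy to reproduce in a small Python sketch. Everything here is assumed for illustration: four nodes, made-up key boundaries for the ordered layout, and row keys of the form event-00000, event-00001, and so on.

```python
import bisect
import hashlib
from collections import Counter

NODES = 4

# Ordered layout: hypothetical upper bounds (inclusive) for nodes 0 to 2;
# node 3 takes every key greater than "t".
boundaries = ["g", "n", "t"]


def ordered_node(key: str) -> int:
    """Pick the node whose key range contains the row key."""
    return bisect.bisect_left(boundaries, key)


def hashed_node(key: str) -> int:
    """Pick the node from the MD5 hash of the row key.
    The 128-bit hash space is cut into NODES equal ranges."""
    token = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return (token * NODES) >> 128


# A sequential block of row keys, as written by a time-ordered workload.
keys = [f"event-{i:05d}" for i in range(1000)]

print("ordered:", Counter(ordered_node(k) for k in keys))
print("hashed: ", Counter(hashed_node(k) for k in keys))
```

With the ordered layout all 1000 keys start with "event-" and fall into the first range, so node 0 receives every write. With the hashed layout the same keys are spread over all four nodes.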