Comparing architectural aspects of distributed systems- Part 1

4 min readJul 13, 2020

In my quest to explore the internals (the architectural aspects) of three of the famous distributed systems there are in today’s world, I spent a solid 2 days just to compare and contrast the architectural decisions. Why not share these amazing learning with you folks??

Disclaimer — I am not going to get into the depth of the below listed concepts. I have shared relevant links that can help if you are new to these topics.

3 distributed systems that form an integral part of a majority of the cloud native architectures today

A distributed Message broker (Kafka), a distributed database (Cassandra) and a distributed cache (Redis)

Partitioning strategy

Kafka — Individual partitions represent brokers that handle a portion of the entire traffic. Can be visualized as virtual queues within topics

A multi-broker kafka cluster with consumer groups

Cassandra — Partitions are Shards of the data. The Shards store portions of the entire lot.

Depiction of Cassandra DB instances/servers placed in a hash ring. Each node represents a shard (for primary partition’s data + replicas’ data)

Redis- Similar to the database

Partition Arrangement strategy

Kafka — Consistent Hashing. The client application can choose a key in the message attribute that would be hashed and would help add the message to the ring.

Cassandra — Cassandra is said to use its own hashing algorithm (murmur hashing). The hashed value would determine the placement of the data in a server node.

Cache — Consistent Hashing as a design (not sure if Redis uses this). The key in a KEY-VALUE pair can be subject to the hash algorithm and the value would determine the server node to which this KV would be saved

Data replication strategy (replication factor)

Kafka — The partitions (along with their data) are replicated to other brokers. These are called the partition replicas. Replication factor is equivalent to the total number of partitions/brokers configured.

depiction of a distributed kafka cluster with multiple brokers( partitions and replicas)

Cassandra — Data is replicated to the peer nodes. The consistency requirement determines how many copies of the data should be created. Cassandra provides a consistency factor that can be tuned (separately for reads and writes). If reads require a quorum of 3 nodes then the minimum replication of data would be to 3 nodes.

Depiction of simple strategy of data replication.

Cache (Redis) — Data is replicated to the nodes identified as replicas. The consistency requirement determines how many copies of the data should be created.

Node coordination/orchestration strategy/protocol

Kafka — Zoo-keeper pattern. A coordinator would be performing the tasks of maintaining the whereabouts of the brokers including their statuses ,ability to keep up with the ingest streams, mapping between the leader and followers and much more.

Cassandra — Gossip protocol ( information exchange between the nodes that are in the ring). This happens every 1 second in Cassandra. The nodes communicate in an effective way to perform the coordination of operations and replication.

Redis Cache — Redis Sentinel does the job of a zoo keeper

Read Strategy

Kafka — Read always happens from the leader partition ( told to have changed in the latest versions of kafka where the consumer can also read from any of the replicas. I am not sure yet if these are done to mimic the read-only replica pattern used in the relational DBs)

Cassandra — Read can happen from any of the designated partitions for read
Note:
If the read consistency level is set to N then the same queried value is read from N nodes before the data is returned to the client.

Redis Cache — Read can happen either from the leader node or from any of the read replicas (that contain the replicated data)

Write Strategy

Kafka —Write happens at the leader partition.

Note (Edit):

In an design that expects quick write operations, the data would be asynchronously replicated to the followers. In a design where the durability of data is important, synchronous replication is used. The leader commits the write to the client only after the acknowledgement from all the In-Sync Replicas (ISR).

Cassandra- Write can be done to any node ( no master -slave concept)
Note:
If the write consistency level (CL) is set to N then the same data would be written to N nodes before a success is returned to the client. This is a key point in a trade-off between consistency and availability in CAP theorem.

Redis cache- Write happens to the leader node/server that the hash resolves to.

Comparison of the systems to be continued in Part-2.

Please note: I have just consolidated a lot of information from various sources (as a part of exploration) and do not intend to claim any of this as my own findings :)