SQS versus Kinesis

2014-10-24

A lot of people confuse SQS and Kinesis. In some ways both act as a queue, but there is a massive difference between the two.

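SQS is a queue. It promises at-least-once delivery, which means a message can occasionally be delivered more than once, and it makes a best effort to preserve order rather than strictly adhering to FIFO. Once a consumer has processed a message and deleted it, the message is gone.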
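To make the queue behaviour concrete, here is a minimal in-memory sketch in Python. It is not the SQS API: `ToyQueue` and its methods are made-up names, and `expire_in_flight` is a crude stand-in for what SQS calls the visibility timeout. It only shows why a message that was received but never deleted turns up again.

```python
from collections import OrderedDict


class ToyQueue:
    """In-memory stand-in for a queue with at-least-once delivery (not the SQS API)."""

    def __init__(self):
        self._visible = OrderedDict()   # message id -> body, waiting to be received
        self._in_flight = {}            # message id -> body, received but not yet deleted
        self._next_id = 0

    def send(self, body):
        self._visible[self._next_id] = body
        self._next_id += 1

    def receive(self):
        """Hand out the oldest visible message and hide it until it is deleted."""
        if not self._visible:
            return None
        msg_id, body = self._visible.popitem(last=False)
        self._in_flight[msg_id] = body
        return msg_id, body

    def delete(self, msg_id):
        """Acknowledge a message; only now is it gone for good."""
        self._in_flight.pop(msg_id, None)

    def expire_in_flight(self):
        """Simulate a timeout: unacknowledged messages become visible again."""
        self._visible.update(self._in_flight)
        self._in_flight.clear()


queue = ToyQueue()
queue.send("resize image 1")
queue.send("resize image 2")

first_id, first_body = queue.receive()
queue.delete(first_id)              # processed and acknowledged

second_id, second_body = queue.receive()
queue.expire_in_flight()            # consumer crashed before deleting
redelivered_id, redelivered_body = queue.receive()   # same message, delivered again
```

Because the second message is handed out twice, a consumer of a queue like this should be safe to run more than once on the same message.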

Kinesis is a distributed data stream service. A simplistic and hand-wavy way to think of Kinesis is as one large log file: items that you write to the stream become lines in this log file. When you want to process the stream, you get a pointer into the log file. When you read a line from the log file, the pointer moves to the next line.

Kinesis is stateless in the sense that it does not maintain the pointer for you; it is up to your reading process to maintain it. Say you are reading off a Kinesis stream and your process goes down. When you bring the reader back up, Kinesis does not know where it left off: unless the reader saved its position somewhere, it has to pick a starting point again, such as the start of the stream, rather than resuming from the last line before the crash.

There is no concept of popping items out of Kinesis. Data stays in the stream until it expires (after the retention period, 24 hours by default); you only manipulate the pointer to this data. Hence, if you want to reprocess the stream, you can replay it, i.e., start from the beginning and do the whole thing over and over again.

AWS provides a client library for Kinesis that maintains this state for you. The client library uses DynamoDB to persist the state.
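The log-file picture can be sketched in a few lines of Python. This is a toy model, not the Kinesis API: `ToyStream`, `consume` and the plain dict used as a checkpoint store are all hypothetical names, and a real stream has details this ignores. The point is only that the reader, not the stream, owns the pointer.

```python
class ToyStream:
    """In-memory stand-in for an append-only stream (not the Kinesis API)."""

    def __init__(self):
        self._records = []

    def put(self, record):
        self._records.append(record)

    def read(self, position, limit=10):
        """Return records from `position` on, plus the next position to read from."""
        batch = self._records[position:position + limit]
        return batch, position + len(batch)


def consume(stream, checkpoint_store, reader_name, limit=10):
    """Read one batch, resuming from the saved position if there is one."""
    position = checkpoint_store.get(reader_name, 0)    # 0 = start of the stream
    batch, next_position = stream.read(position, limit)
    checkpoint_store[reader_name] = next_position       # the reader saves its own pointer
    return batch


stream = ToyStream()
for line in ["a", "b", "c", "d"]:
    stream.put(line)

checkpoints = {}   # hypothetical durable store, standing in for something like DynamoDB

first = consume(stream, checkpoints, "reader-1", limit=2)
# Simulated crash and restart: the saved checkpoint lets the reader resume.
second = consume(stream, checkpoints, "reader-1", limit=2)
# Replay: a reader with no checkpoint starts from the beginning again.
replay = consume(stream, {}, "reader-1", limit=4)
```

Nothing is ever removed from `ToyStream`; reading twice with the same position returns the same records, which is what makes replay possible.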

This should give you a fair idea of when to use Kinesis and when to opt for SQS.
