SQS versus Kinesis

by abhirama

There is some confusion around SQS versus Kinesis. Both are QAAS™ (queue as a service) offerings from AWS (this statement is not entirely true; you will see why as you read on). This is an attempt at clearing the fog.

SQS is a queue in the old-fashioned sense, and it promises at-least-once delivery. You create a queue, enqueue items, and dequeue items. One point to note is that dequeueing an item does not delete it from the queue; you have to explicitly delete the item after you dequeue it. This goes against the intuitive understanding of dequeue, at least for me. What you can configure is that once an item is dequeued, it is not available for dequeueing again for a specified period of time (SQS calls this the visibility timeout). If you have not deleted the item by the time that period ends, it becomes available again.
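To make the dequeue-then-delete behaviour concrete, here is a minimal in-memory sketch. It is a toy model, not the AWS API: `ToyQueue`, its methods, and the `now` argument (a fake clock, so no real waiting is needed) are all made up for illustration.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for an SQS-style queue (hypothetical, not the AWS API)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}      # message id -> body
        self._hidden_until = {}  # message id -> time at which it is visible again
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._hidden_until[msg_id] = 0.0  # visible immediately
        return msg_id

    def receive(self, now):
        """Return (id, body) of the first visible message, or None.

        The message is hidden for `visibility_timeout`, not removed.
        """
        for msg_id, body in self._messages.items():
            if self._hidden_until[msg_id] <= now:
                self._hidden_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Explicit delete: the only way a message leaves the queue."""
        self._messages.pop(msg_id, None)
        self._hidden_until.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("resize image 42")

first = queue.receive(now=0.0)    # handed out, now hidden for 30 seconds
hidden = queue.receive(now=10.0)  # still inside the timeout, nothing to receive
again = queue.receive(now=31.0)   # timeout expired without a delete: redelivered
queue.delete(again[0])            # consumer finished its work, so it deletes
gone = queue.receive(now=100.0)   # the queue is empty now
```

The redelivery at `now=31.0` is where at-least-once delivery comes from: a consumer that crashes before deleting simply causes the item to show up again for someone else.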

Kinesis is a queue, but it is not a queue, and that is not a paradox: it is a high-throughput stream handler. A conceptual way to think of Kinesis is as one huge log file, with the items you enqueue as lines in this log file. You have a pointer into this log file; when you read one line (dequeue), the pointer moves to the next line. Kinesis is stateless in the sense that it does not maintain the pointer for you; it is up to you to maintain it. What this means is that, say you are reading off a Kinesis stream and your process goes down. Unless you saved the pointer somewhere, when you bring the process up again it will start processing from wherever it originally started, not from the last line it read before the crash.

There is no concept of taking items out of Kinesis. The data is always there (for the retention period, which is a day by default); you only manipulate the pointer to this data. Hence, if you want to reprocess the stream, you can replay it. AWS provides a client library for Kinesis which maintains the state for you. This client library uses DynamoDB to persist the state.
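The log-file picture, sketched as a toy in-memory model. None of this is the Kinesis API: `ToyStream`, `consume`, and `checkpoint_store` are hypothetical names, and the plain dict merely stands in for durable storage (the kind of thing the AWS client library keeps in DynamoDB).

```python
class ToyStream:
    """Append-only log; reading never removes anything (hypothetical, not the Kinesis API)."""

    def __init__(self):
        self._records = []

    def put(self, data):
        self._records.append(data)
        return len(self._records) - 1  # position of the new record

    def read(self, position, limit=10):
        """Return (records, next_position); the caller owns the pointer."""
        batch = self._records[position:position + limit]
        return batch, position + len(batch)


# Stand-in for durable storage; survives a consumer "crash" in this sketch.
checkpoint_store = {}


def consume(stream, consumer_name, limit=10):
    """Read one batch from the saved position, then save the new position."""
    position = checkpoint_store.get(consumer_name, 0)  # no checkpoint: start at 0
    batch, next_position = stream.read(position, limit)
    # A real consumer would process the batch here, before checkpointing.
    checkpoint_store[consumer_name] = next_position
    return batch


stream = ToyStream()
for line in ["a", "b", "c", "d"]:
    stream.put(line)

first_batch = consume(stream, "worker-1", limit=3)
# Pretend the process crashed here: only checkpoint_store is left.
second_batch = consume(stream, "worker-1", limit=3)

# Replay: the data is still there, so point back at the beginning.
replayed, _ = stream.read(0, limit=10)
```

Without `checkpoint_store`, the second call would have no idea where the first one stopped, which is exactly the statelessness described above.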

This should give you a fair idea of when to use Kinesis and when to opt for SQS.
