Month: October, 2014

SQS versus Kinesis

There is some confusion around SQS versus Kinesis. Both are QAAS™ (queue as a service) offerings from AWS, although this statement is not entirely true, as you will see when you read on. This is an attempt at defogging that confusion.

SQS is a queue in the old-fashioned sense, and it promises at-least-once delivery. You create a queue, enqueue items, and dequeue items. One point to note is that dequeueing an item does not delete it from the queue; you have to delete the item explicitly after you dequeue it. This goes against the intuitive understanding of dequeue, at least for me. What you can do is configure a period of time (the visibility timeout) during which a dequeued item is not available for dequeueing again.
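The dequeue-then-delete behaviour can be sketched with a toy in-memory queue. This is only an illustration of the semantics described above, not the SQS API: the class `ToyQueue`, its methods, and the `visibility_timeout` parameter are names assumed for this example, and time is passed in explicitly so the behaviour is easy to follow.

```python
import itertools


class ToyQueue:
    """In-memory stand-in for an SQS-style queue (not the real SQS API)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}      # message id -> body
        self._hidden_until = {}  # message id -> time it becomes visible again
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        self._hidden_until[msg_id] = 0.0
        return msg_id

    def receive(self, now):
        """Return (id, body) of one visible message, or None. Does not delete it."""
        for msg_id, body in self._messages.items():
            if self._hidden_until[msg_id] <= now:
                # Hide the message for a while instead of removing it.
                self._hidden_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Explicit acknowledgement: only now is the message gone for good."""
        self._messages.pop(msg_id, None)
        self._hidden_until.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("resize image 42")

first = queue.receive(now=0.0)          # a consumer gets the message
during = queue.receive(now=10.0)        # still hidden, so nothing is returned
redelivered = queue.receive(now=31.0)   # never deleted, so it comes back
queue.delete(redelivered[0])            # acknowledge it this time
after_delete = queue.receive(now=100.0)
```

The redelivery at `now=31.0` is what at-least-once delivery looks like in practice: a consumer that crashes before deleting an item causes that item to be handed out again, so consumers should tolerate seeing the same item twice.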

Kinesis is a queue and yet not a queue, and that is not a paradox: it is a high-throughput stream handler. A conceptual way to think of Kinesis is as one huge log file, with the items you enqueue as lines in that file. You hold a pointer into this log file; when you read one line from it (dequeue), the pointer moves to the next line. Kinesis is stateless in the sense that it does not maintain the pointer for you; it is up to you to maintain it. This means that if you are reading off a Kinesis stream and your process goes down, then when you bring it up again it will start processing from where it originally started, not from the last line it read before the crash, unless you saved the pointer yourself. There is no concept of taking items out of Kinesis. The data stays there for the retention period (a day), and you only manipulate the pointer to it. Hence, if you want to reprocess the stream, you can replay it. AWS provides a client library for Kinesis that maintains this state for you; the library uses DynamoDB to persist the state.
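The log-file-and-pointer picture can be sketched in a few lines. This is a toy model rather than the Kinesis API or the AWS client library: `ToyStream`, `CheckpointingReader`, and the plain dict used as a checkpoint store are all assumptions made for illustration.

```python
class ToyStream:
    """Append-only log; reading never removes records (not the Kinesis API)."""

    def __init__(self):
        self._records = []

    def put(self, record):
        self._records.append(record)
        return len(self._records) - 1  # position of the record in the log

    def read_from(self, position, limit=10):
        return self._records[position:position + limit]


class CheckpointingReader:
    """Keeps its own pointer in an external store, here a plain dict."""

    def __init__(self, stream, store, name):
        self.stream = stream
        self.store = store  # stand-in for a durable table
        self.name = name

    def poll(self, limit=10):
        position = self.store.get(self.name, 0)  # no checkpoint: start of log
        batch = self.stream.read_from(position, limit)
        self.store[self.name] = position + len(batch)  # save the new pointer
        return batch


stream = ToyStream()
for line in ["a", "b", "c", "d"]:
    stream.put(line)

store = {}
reader = CheckpointingReader(stream, store, "worker-1")
first_batch = reader.poll(limit=2)

# Simulate a crash: a new reader sharing the same store resumes after "b".
restarted = CheckpointingReader(stream, store, "worker-1")
resumed_batch = restarted.poll(limit=2)

# A reader with no saved pointer starts over, which is also how a replay works.
replayed = CheckpointingReader(stream, {}, "worker-1").poll(limit=10)
```

For brevity the sketch saves the pointer as soon as a batch is read. A real consumer would more likely save it only after the batch has been processed, so that a crash in the middle of processing leads to a re-read rather than a skipped batch.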

This should give you a fair idea of when to use Kinesis and when to opt for SQS.

Release early, release often

Releasing early and often can make the difference between life and death for new-age internet companies. Many of the successful dotcoms, such as Amazon, Google, and Etsy, do hundreds of deployments per day. If you are a small organization, the magnitude and frequency of your deployments might not rival those of these big organizations, but it is always a good idea to release early and often.

If you plan to carry out multiple deployments a day, it is critical that you do not have downtime during these deployments. While your shiny new code is being deployed, your end users should still be able to access the site. This is usually done by routing all your traffic through a load balancer. End users do not hit your application servers directly; they hit the load balancer, and it is the load balancer’s responsibility to route traffic to your application servers. In addition to being the recommended architecture for server-side software, this gives you the agility to deploy code without having to worry about downtime. You take a server out of the load balancer, deploy your code on it, add it back to the load balancer, and then do the same with the other servers in the group. Two of the most popular load balancers out there, Nginx and HAProxy, allow you to do this dynamically (you can add and remove back-end servers while the load balancer is up and running), and I am sure that other load balancers let you do it too. If you are running on AWS, Elastic Load Balancer lets you do this.
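The take-out, deploy, add-back loop described above can be sketched as follows. This is a minimal simulation, not Nginx, HAProxy, or Elastic Load Balancer configuration: `ToyLoadBalancer`, `rolling_deploy`, `deploy_v2`, and the server names are hypothetical.

```python
class ToyLoadBalancer:
    """Tracks which back-end servers currently receive traffic."""

    def __init__(self, servers):
        self.in_rotation = list(servers)

    def remove(self, server):
        self.in_rotation.remove(server)

    def add(self, server):
        self.in_rotation.append(server)


def rolling_deploy(balancer, servers, deploy):
    """Deploy to one server at a time so the rest keep serving traffic."""
    serving_during_deploy = []
    for server in servers:
        balancer.remove(server)
        # Record how many servers were still taking traffic at this step.
        serving_during_deploy.append(len(balancer.in_rotation))
        deploy(server)
        balancer.add(server)
    return serving_during_deploy


versions = {"app1": "v1", "app2": "v1", "app3": "v1"}
balancer = ToyLoadBalancer(versions)


def deploy_v2(server):
    versions[server] = "v2"  # placeholder for the real deployment step


capacity = rolling_deploy(balancer, list(versions), deploy_v2)
```

With three servers, two remain in rotation at every step, so the site stays reachable throughout. A real rollout would probably also wait for in-flight requests to drain and run a health check before adding a server back.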

Also, your deployments should be as simple as possible; even a monkey should be able to deploy your code to production. The more complicated it is to deploy software, the less enthusiastic developers are about doing it. Using a continuous integration tool like Jenkins helps to make this as painless as possible.

Enable a kill switch for all new features. Your app should have the ability to turn on and off features in seconds. This gives you the power to turn off a feature if you detect any problems with it in the early days.
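A kill switch can be as simple as a flag that is checked at request time. The sketch below assumes an in-process `FeatureFlags` class and a hypothetical `recommendations` feature; in practice the flag values would live somewhere every application server can read them, such as a shared configuration store.

```python
class FeatureFlags:
    """Runtime on/off switches; unknown features default to off."""

    def __init__(self):
        self._enabled = {}

    def enable(self, name):
        self._enabled[name] = True

    def disable(self, name):
        self._enabled[name] = False

    def is_enabled(self, name):
        return self._enabled.get(name, False)


def render_product_page(flags):
    sections = ["details"]
    if flags.is_enabled("recommendations"):  # hypothetical new feature
        sections.append("recommendations")
    return sections


flags = FeatureFlags()

flags.enable("recommendations")
with_feature = render_product_page(flags)

flags.disable("recommendations")  # kill switch: no redeploy needed
without_feature = render_product_page(flags)
```

Defaulting unknown flags to off means that a feature whose flag was never set stays dark rather than switching on by accident.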

Also, gradually releasing features is a good idea. This lets you weed out performance and other issues on a small scale before they become site-wide issues. Release to 1% of your users and then slowly ramp up to 100%, keeping a close eye on the feature all the time.
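One common way to implement a percentage ramp-up is to hash each user into a stable bucket and compare the bucket with the current rollout percentage. The sketch below assumes string user ids and a hypothetical feature name `new-checkout`; hashing with SHA-256 is just one reasonable choice, not a prescribed method.

```python
import hashlib


def bucket_for(user_id, feature):
    """Map a user to a stable bucket in 0..99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id, feature, percent):
    """True if the user falls inside the first `percent` buckets."""
    return bucket_for(user_id, feature) < percent


users = [f"user-{n}" for n in range(10_000)]
at_1 = {u for u in users if in_rollout(u, "new-checkout", 1)}
at_10 = {u for u in users if in_rollout(u, "new-checkout", 10)}
at_100 = {u for u in users if in_rollout(u, "new-checkout", 100)}
```

Because a user's bucket never changes, everyone who saw the feature at 1% still sees it at 10%, so the ramp-up only ever adds users. Including the feature name in the hash keeps the same users from always being the first to receive every new feature.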

If you are working on a feature, you do not have to wait for the feature to be complete before you release. Release each logical step as you finish it. Your end users might not see the feature yet, but this helps you avoid the situation of one big-bang release and everything going down. These incremental releases help to detect bugs early and build confidence for the final release.

Set alerts for all your key performance and business metrics. If any of these metrics goes awry after a deployment, you get an alert and can set things right. In addition to alerting, the ability to graph these metrics is tremendously useful. After a deployment, you can check your dashboard to see whether the deployment has had an adverse effect on your response times, key business metrics, etc. This adds to your confidence to do multiple deployments without having to worry about your new code degrading performance or adversely affecting the business.
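A post-deployment check can be as small as comparing a metric's recent average with its pre-deployment baseline. The sketch below is a simplified assumption of how such an alert might work: the function `check_metric`, the 20% threshold, and the sample response times are all made up for illustration.

```python
from statistics import mean


def check_metric(name, baseline, current, max_increase=0.20):
    """Return an alert message if `current` exceeds the baseline average
    by more than `max_increase` (a fraction), otherwise None."""
    before = mean(baseline)
    after = mean(current)
    if before > 0 and (after - before) / before > max_increase:
        return f"ALERT {name}: {before:.0f} -> {after:.0f}"
    return None


# Hypothetical response times in milliseconds, before and after a deployment.
baseline_ms = [120, 130, 125, 125]
healthy_ms = [128, 131, 122, 127]
degraded_ms = [180, 210, 190, 220]

ok = check_metric("response_time_ms", baseline_ms, healthy_ms)
alert = check_metric("response_time_ms", baseline_ms, degraded_ms)
```

This check only fires when a metric rises, which suits response times and error rates. For business metrics where a drop is the bad sign, such as orders per minute, the comparison would need to be reversed.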

These are some of the simple tips that help you deploy code to production on a regular basis. I have not touched on more advanced topics like automated testing, integration testing, and automated provisioning.