SQS versus Kinesis

SQS and Kinesis are often confused. Both can act as a queue of sorts, but there is a massive difference between the two.

SQS is a queue. Standard queues promise at-least-once delivery with best-effort ordering; FIFO queues additionally guarantee strict ordering and exactly-once processing.
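As a quick illustration, here is what that contract looks like with boto3; the queue URL is a placeholder, and at-least-once delivery means your handler has to tolerate the occasional duplicate:

```python
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
# Hypothetical queue URL; substitute your own.
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"

def handle(body: str) -> None:
    print("processing", body)  # stand-in for real work; must tolerate duplicates

# Producer side: push a message onto the queue.
sqs.send_message(QueueUrl=queue_url, MessageBody="hello")

# Consumer side: receive, process, then explicitly delete. Until a message
# is deleted, SQS may deliver it again -- that is "at least once" in practice.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10)
for msg in resp.get("Messages", []):
    handle(msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```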

Kinesis is a distributed stream processor. A simplistic and hand-wavy way to think of Kinesis is as one large log file; items you write to the stream are lines appended to this log file. When you want to process the stream, you get a pointer into the log file, and every time you read a line, the pointer moves to the next one.

Kinesis is stateless in the sense that it does not maintain this pointer for you; your reading process has to. Say you are reading off a Kinesis stream and your process goes down: when you bring the reader up again, it has no memory of where it stopped, and unless you persisted a checkpoint, it will start processing from the beginning, not from the last line before the crash. There is no concept of popping items out of Kinesis; the data is always there (retained for 24 hours by default, extendable to seven days), and you only move the pointer over it. Hence, if you want to reprocess the stream, you can replay it, i.e., start from the beginning and do the whole thing over again. AWS provides the Kinesis Client Library (KCL), which maintains this state for you, persisting checkpoints in DynamoDB.
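To make the pointer mechanics concrete, here is a minimal consumer sketch using boto3's low-level Kinesis API. The stream name is a placeholder, and a real consumer would read every shard, not just the first:

```python
import time
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
stream = "my-stream"  # hypothetical stream name

# Get a pointer (shard iterator) into the "log file". TRIM_HORIZON starts
# at the oldest available record -- this is why a naive reader that crashes
# reprocesses from the beginning unless it checkpoints somewhere.
shard_id = kinesis.describe_stream(StreamName=stream)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=stream, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]

while iterator:
    resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in resp["Records"]:
        print(record["SequenceNumber"], record["Data"])
        # To survive restarts, persist record["SequenceNumber"] yourself;
        # the KCL does exactly this for you, in DynamoDB.
    iterator = resp["NextShardIterator"]  # the pointer moves forward
    time.sleep(1)  # stay under the per-shard read limits
```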

This should give you a fair idea of when to use Kinesis and when to opt for SQS.

Release early, release often

Releasing early and often can make the difference between life and death for internet companies. Successful ones like Amazon, Google, and Etsy do hundreds of deployments per day. If you are a small organization, your deployment frequency and scale might not rival theirs, but it is still a good idea to release early and often.


If you plan to carry out multiple deployments a day, you cannot afford downtime during these deployments. While your shiny new code is being deployed, your end users should still be able to access the site; this is usually achieved by routing all traffic through a load balancer. The end user never hits your application server directly; she hits the load balancer, and it is the load balancer's responsibility to route traffic to the application servers. In addition to being the recommended architecture for server-side software, this gives you the agility to deploy code without worrying about downtime. You take a server out of the load balancer, deploy your code on it, add it back, and repeat with the other servers in the group. Two of the most popular load balancers, Nginx and HAProxy, let you do this dynamically; while the load balancer is up and running, you can add and remove backend servers. If you are running on AWS, Elastic Load Balancer lets you do the same, as sketched below.
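Here is a rough sketch of that rolling-deploy loop against a classic Elastic Load Balancer via boto3. The load balancer name, instance IDs, and deploy script are all placeholders, and a production version would poll instance health instead of sleeping:

```python
import subprocess
import time
import boto3

elb = boto3.client("elb", region_name="us-east-1")
LB_NAME = "web-lb"                      # hypothetical load balancer name
INSTANCES = ["i-0abc123", "i-0def456"]  # hypothetical instance IDs

def deploy_to(instance_id: str) -> None:
    # Placeholder for your actual deploy step (e.g. a Jenkins job or ssh script).
    subprocess.run(["./deploy.sh", instance_id], check=True)

for instance_id in INSTANCES:
    # Take the server out of rotation so users never hit mid-deploy code.
    elb.deregister_instances_from_load_balancer(
        LoadBalancerName=LB_NAME, Instances=[{"InstanceId": instance_id}]
    )
    deploy_to(instance_id)
    # Put it back; the ELB health check will start routing traffic again.
    elb.register_instances_with_load_balancer(
        LoadBalancerName=LB_NAME, Instances=[{"InstanceId": instance_id}]
    )
    time.sleep(30)  # crude wait for health checks before the next server
```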

Also, your deployments should be as simple as possible; even a monkey should be able to deploy your code to production. The more complicated it is to deploy software, the less enthusiastic developers will be about doing it. A continuous integration tool like Jenkins helps make this a painless process.

Enable a kill switch for all new features. Your app should be able to turn features on and off in seconds; this gives you the power to switch off a feature the moment you detect a problem with it in its early days.
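A kill switch does not need to be fancy. Here is a minimal sketch that re-reads flags from a JSON file on every check; swapping the file for a shared store like Redis or DynamoDB is the natural next step:

```python
import json

FLAGS_FILE = "feature_flags.json"  # hypothetical flag store, e.g. {"new_checkout": true}

def is_enabled(feature: str) -> bool:
    """Check a flag at request time so flipping it takes effect in seconds."""
    # Re-reading on every call means no redeploy is needed to kill a feature.
    try:
        with open(FLAGS_FILE) as f:
            return bool(json.load(f).get(feature, False))
    except (OSError, ValueError):
        return False  # fail closed: if the flag store is broken, features stay off

# Usage at the call site:
if is_enabled("new_checkout"):
    print("serving the new checkout flow")
else:
    print("serving the old checkout flow")
```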

Gradually releasing features is also a good idea; it lets you vet performance and other issues on a small scale before they become site-wide problems. Release to 1% of your users and slowly ramp up to 100%, keeping a close eye on the feature the whole time.
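One simple way to do this is deterministic bucketing: hash the user and feature into one of 100 buckets and enable the feature for the first N of them. A sketch:

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically bucket a user into the first `percent` of 100 buckets."""
    # Hashing user + feature together means each feature rolls out to a
    # different slice of users, and a given user sticks to the same bucket.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent

# Start at 1%, then bump the number as confidence grows.
print(in_rollout("user-42", "new_checkout", 1))
print(in_rollout("user-42", "new_checkout", 100))  # always True at 100%
```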

If you are working on a feature, you do not have to wait for feature completion to release. Release as you finish each logical chunk. Your end users might not see the feature yet, but this keeps you away from one big-bang release where everything goes down at once. These incremental releases help you detect bugs early and build confidence for the final version.

Create alerts for all your critical performance and business metrics. If any of these metrics go awry after a deployment, you get an alert and can set things right. In addition to alerting, being able to graph these metrics is tremendously useful. After a deployment, you can check your dashboard to see whether it has hurt your response time or critical business metrics; this adds to your confidence to deploy multiple times a day without worrying that new code is degrading performance or adversely affecting business.
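If you are on AWS, CloudWatch alarms are one way to wire this up. Here is a sketch that alarms on ELB latency; the load balancer name and SNS topic ARN are placeholders:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm if average latency stays above 500 ms for three consecutive minutes;
# the SNS topic then fans out to email, pager, Slack, etc.
cloudwatch.put_metric_alarm(
    AlarmName="web-latency-high",
    Namespace="AWS/ELB",
    MetricName="Latency",
    Dimensions=[{"Name": "LoadBalancerName", "Value": "web-lb"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=0.5,  # seconds
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```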

These are some simple practices that help you deploy code to production regularly. I have not touched on more advanced topics like automated testing, integration testing, and automated provisioning.
