Zoho’s domain was inaccessible for a while. This is an embarrassing event for a software organization.
Whenever I hear of events like this, I am reminded of a couple of pages in “The Black Swan”. Taleb calls it “A new kind of ingratitude”.
The idea presented by Taleb essentially boils down to a person who takes steps to prevent something catastrophic from happening. Since that person has taken steps to prevent the catastrophe, the catastrophe never occurs. Thus the person never gets his due and dies a silent hero.
This is a fascinating thought that keeps recurring in all aspects of life. Whenever it floods, we make a big deal of politicians who roll up their sleeves and get into action. What about the politician who took the necessary steps to prevent the flooding?
Whenever there is a production issue at work and a team goes out of their way to put out the fire, that team is lauded. What about those teams that took steps to prevent something like this from occurring in the first place?
Software security is one big area that falls in this category. If you have a great security team, life goes on humming silently. You need the right tech leadership to recognize this; otherwise, it falls squarely into the ingratitude category.
This is a very obvious thought but Taleb has done a great job of giving structure to this idea. If you keep your eyes open, you see this happening around you all the time.
Imagine a person who walks from her home to the office. Frequently she is late to work as she takes time to cover the distance. She wants to improve her pace. She goes to a walking expert to get tips on increasing her walking speed.
A radical solution to the problem is to use some other means of transportation instead of walking. If you go to a walking expert, you are going to get tips on improving your walking speed. The expert is not going to ask you to forgo walking and use a different mode of transportation. Also, if you are deeply attached to the idea of walking, you might not think of a solution beyond walking. Improving your walking speed is a micro solution, whereas using some other means of transportation is a macro solution.
The above is a contrived example, but it is something we come across in our professional and personal lives, both as problem solvers and as the ones facing a problem. Programmers sometimes try to optimize the hell out of a piece of code while the right approach might be to throw away the code and use something else. Organizations seek to nail down a process to the last mile while a sensible solution might be to do away with the process entirely.
We lean towards micro solutions when we are either deeply entwined in a problem or are the domain expert in that particular area. In these situations, we tend to think within the bounds of a problem and not outside.
When you come up with a solution, bracket it as micro or macro. Being aware is the first step towards becoming better at anything. Also, an outside view helps. Find someone who is not an expert in the domain or one who is not acutely aware of the problem. Run your solution through them. They might lead you to a macro solution or make you aware that what you have is a micro solution. Taking your time and mind off a problem helps too; that is how Archimedes had his eureka moment.
Last but not least, take a walk.
If you have a producer with an uneven rate of production and a consumer that cannot keep pace with the producer at its peak, use a queue.
If you have a workload that need not be addressed synchronously, use a queue.
If your customer-facing application is riddled with workloads that can be deferred, move these to a queue thus making the customer-facing application lean and mean.
Think of a queue as a shock absorber.
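To make the shock absorber idea concrete, here is a minimal sketch using Python's standard `queue` and `threading` modules. The function name `run_burst`, the burst size, and the per-item delay are all made up for illustration; an in-memory queue stands in for whatever queueing system you would actually use.

```python
import queue
import threading
import time

def run_burst(burst_size=50, seconds_per_item=0.001):
    """Push a burst into a queue and let a slower consumer drain it."""
    buffer = queue.Queue()  # unbounded in-memory queue, for illustration only
    processed = []

    def consumer():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: the producer is done
                break
            time.sleep(seconds_per_item)  # simulate slow processing
            processed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    # The producer emits the whole burst without waiting for the consumer.
    for i in range(burst_size):
        buffer.put(i)
    backlog = buffer.qsize()  # approximate count of items waiting after the burst

    buffer.put(None)
    worker.join()
    return processed, backlog
```

The producer finishes almost immediately, the backlog sits in the queue, and the consumer works through it at its own pace without losing or reordering anything.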
There are workloads that need to be processed immediately with sub-millisecond latency, and then there are ones where you have the luxury of taking the time. It is advisable not to mix these in an application. The second kind of workload can be addressed by moving it to a queue and having a consumer process it.
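A rough sketch of that separation, assuming a hypothetical signup flow: `handle_signup`, `run_deferred`, and the `send_welcome_email` task name are invented for this example. The customer-facing function does only what the response depends on and enqueues the rest.

```python
import queue

def handle_signup(email, users, deferred):
    """Customer-facing path: do only what the response depends on."""
    users.add(email)                             # must finish before we respond
    deferred.put(("send_welcome_email", email))  # can wait, so enqueue it
    return "signed up"

def run_deferred(deferred, handlers):
    """Background path: process whatever has been enqueued so far."""
    handled = 0
    while True:
        try:
            task, payload = deferred.get_nowait()
        except queue.Empty:
            return handled
        handlers[task](payload)  # look up the handler by task name
        handled += 1
```

In a real system `run_deferred` would live in a separate worker process reading from a proper queueing service; here both halves share one in-memory `queue.Queue` to keep the sketch runnable.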
For example, consider a scenario where you are consuming messages and persisting them in a data store. These messages come in at a variable rate, and at peak, the data store cannot handle the load. You have two options: scale the data store to meet the peak load, or slap a queue in between to absorb the shock. A queue solves this problem in a KISS manner.
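Here is one way the data store scenario might look, as a sketch only. SQLite stands in for the data store, and the `messages` table, the `drain_to_store` function, and the batch size are assumptions made for the example. The consumer pulls messages off the queue and writes them in batches, so the store sees a steady rate of writes rather than the raw spikes.

```python
import queue
import sqlite3

def drain_to_store(buffer, conn, batch_size=100):
    """Move everything currently in the queue into the store, one batch per transaction."""
    written = 0
    while True:
        batch = []
        while len(batch) < batch_size:
            try:
                batch.append((buffer.get_nowait(),))
            except queue.Empty:
                break
        if not batch:
            return written
        with conn:  # commits the batch, or rolls it back on error
            conn.executemany("INSERT INTO messages (body) VALUES (?)", batch)
        written += len(batch)

# Example setup: an in-memory database and a queue holding a burst of messages.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (body TEXT)")
buffer = queue.Queue()
for i in range(250):
    buffer.put(f"msg-{i}")
```

Calling `drain_to_store(buffer, conn)` then writes the 250 messages in three transactions. The batch size is the knob that keeps the load within what the store can handle.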
Queues enable applications to be highly available while giving enough room to maneuver. As long as the queue itself is highly available and durable, the chance of message loss is almost nil. Because messages wait in the queue until they are consumed, you need not perfect your consumer’s high availability; if the consumer goes down for a while, it can catch up when it returns.
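The leeway comes from removing a message only after it has been processed successfully. The sketch below imitates that with a SQLite table acting as the queue; `open_queue`, `enqueue`, `process_next`, and the `jobs` table are hypothetical names, and real queueing systems offer their own acknowledgement mechanisms instead.

```python
import sqlite3

def open_queue(path):
    """Open (or create) a table-backed queue at the given path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    conn.commit()
    return conn

def enqueue(conn, body):
    with conn:  # commit so the job survives a crash of the producer
        conn.execute("INSERT INTO jobs (body) VALUES (?)", (body,))

def process_next(conn, handler):
    """Process the oldest job; delete it only if the handler succeeds."""
    row = conn.execute("SELECT id, body FROM jobs ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    job_id, body = row
    handler(body)  # if this raises, the job stays in the table for a retry
    with conn:
        conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    return True
```

If the handler blows up, the job is still there the next time a consumer calls `process_next`. With a file path instead of `":memory:"`, the jobs also survive a restart of the process.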
With applications embracing the microservices paradigm, there is a lot of API back-and-forth. Not all API consumption has to happen in real time. Whatever can be deferred should use a queue as the transport mechanism.
A queue introduces a bit more complexity into an application, but the advantages it brings to the table make it a worthwhile investment.
Whenever a new process is introduced, there is always going to be some discomfort. The cause falls into one of two categories:
1. Uneasiness due to newness.
2. A problem with the process itself.
Category one is due to human nature. Deviation from an established routine causes queasiness and a yearning for the old way. It takes over-communication, repetition and sometimes “just giving it time” to tide over this initial phase; this is usually a short-lived phenomenon.
Category two is the troublesome one. When someone complains about a newly introduced process, it is essential to get to the source of this discomfort. Prod as to whether the reason for disapproval falls into category one or two.
A suitable process should roughly follow the idea of libertarian paternalism popularised by behavioural economist Richard Thaler. The process should be a nudge towards better behavior rather than a dictatorial dictum. A process whose intention is to police people does not end well.
A new process introduces some amount of friction, but this friction has to be local, not global. This friction should not slow down the task at a global level; instead, it should aid speed, agility, and stability.
Take the checklist process as an example. It nudges people towards being more aware and aids better behavior. It does introduce friction at the local level, but on the whole, globally, the task speeds up with a much better result on average.
It always helps to think along these lines to figure out whether a new process is worth its salt. Instead of introducing a new process and then reneging, put in the effort to evaluate the efficacy of a process beforehand.