We can roughly classify concurrency models into:
1. Thread based concurrency.
2. Event based concurrency.
Imagine that you run a store with only one customer service representative. As soon as a customer walks in, the representative greets the customer with a quick hello: "If you need any help, give me a shout, and I will help you out." She then waits for the customer to seek help, aiming to complete each interaction as quickly as possible and return to waiting for the next one. When a customer asks for help, she quickly answers the query and goes back to waiting. If a customer asks where the washroom is, she points in the right direction and reverts to waiting. If a customer asks her for the price of a product, she quickly conveys the price and goes back to waiting. The point to note here is that there is only one customer service representative for the entire store servicing all customers. This model works exceptionally well when the representative is fast and the answers to the queries are quick. Concurrency based on events works like this.
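To make this concrete, here is a minimal sketch of event based concurrency using Python's asyncio. A single event loop plays the lone representative; the customer names and queries are made up for illustration.

```python
import asyncio

async def serve_customer(name: str, query: str) -> None:
    # A quick answer; the await hands control back to the event loop,
    # letting it attend to other customers in the meantime.
    await asyncio.sleep(0.01)  # pretend to look up the answer
    print(f"answered {name}'s question about {query}")

async def main() -> None:
    # All customers are serviced concurrently by a single thread.
    await asyncio.gather(
        serve_customer("alice", "the washroom"),
        serve_customer("bob", "a product's price"),
        serve_customer("carol", "store hours"),
    )

asyncio.run(main())
```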
Now consider the situation where you have five customer service representatives in your store. As soon as a customer walks in, a representative is assigned exclusively to that customer. When another customer walks in, one more representative is picked from the pool and assigned to that customer. The critical point to note here is that there is a one-to-one relationship between the customer service representative and the customer. While a representative is servicing a customer, she does not bother about other customers; she is exclusive to that customer. Since our pool has five representatives, we can serve at most five customers at a time. What do we do when the sixth customer walks into the store? We can wait until one of the customers walks out, or we can have a rule saying that a representative services a customer for a fixed period, after which she is assigned to another waiting customer and reassigned to the original customer once that time elapses. Concurrency based on threads works like this.
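Here is a minimal sketch of this model using Python's standard thread pool. The five workers play the representatives, and the customer IDs and timings are made up for illustration. Note that ThreadPoolExecutor simply queues the sixth customer until a worker frees up; the time-sliced reassignment described above is what the OS scheduler does when raw threads outnumber cores.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def serve_customer(customer_id: int) -> str:
    # A representative is tied up exclusively with this customer.
    time.sleep(1)
    return f"customer {customer_id} served"

# Five representatives in the pool.
with ThreadPoolExecutor(max_workers=5) as pool:
    # Six customers, five representatives: the sixth waits in the queue
    # until one of the first five is done.
    for result in pool.map(serve_customer, range(1, 7)):
        print(result)
```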
Coming back to the scenario where the sixth customer walks in: we either have to ask the sixth customer to wait until a representative is free, or we have to wean a representative away from one of the existing customers and assign her to the new customer. When this happens, the customer who was initially being serviced by this representative has to wait, and after the allotted time elapses, we have to assign the representative back to the original customer. When a lot of customers walk in and you have a fixed number of representatives, quite a bit of coordination is needed to service all customers satisfactorily. In a computer, the CPU scheduler takes care of switching between tasks. Switching is a comparatively time-consuming operation and an overhead of the thread based concurrency model when compared to an event based one.
In the single representative scenario, what happens if one of the customers starts a long conversation with the representative? The representative will be stuck with that customer, and if other customers have queries, they will have to wait for the representative to finish the ongoing conversation. Also, what if one of the customers sends the representative on a long-running errand, like fetching something from a depot a mile away? Until the representative returns, all other customers have to wait to get their queries resolved. One demanding customer can hold up all the other customers and stall the entire store's operation.
Hence, when working with event based concurrency, it is essential not to (the sketch after this list shows why):
1. Carry out CPU intensive tasks akin to having a long-running conversation with the representative.
2. Carry out blocking IO tasks similar to sending the representative to the depot.
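Here is a small sketch of the problem, again using asyncio; the errand duration is arbitrary. A single blocking call (time.sleep) freezes the whole loop, while its non-blocking counterpart (asyncio.sleep) hands control back so other customers keep getting served.

```python
import asyncio
import time

async def long_errand_blocking() -> None:
    time.sleep(2)  # blocks the whole loop: no other task can run

async def long_errand_nonblocking() -> None:
    await asyncio.sleep(2)  # yields to the loop: other tasks keep running

async def quick_query(name: str) -> None:
    print(f"answered {name}")

async def main() -> None:
    # With long_errand_blocking, the quick queries are stuck behind the
    # errand for two full seconds; swap in long_errand_nonblocking and
    # they are answered immediately.
    await asyncio.gather(
        long_errand_blocking(),
        quick_query("alice"),
        quick_query("bob"),
    )

asyncio.run(main())
```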
NGINX and Redis are probably the most widely used software that leverage event based concurrency. The workloads they cater to are short and quick, so event based concurrency makes perfect sense for them.
Taking the case of NGINX used as a reverse proxy, what does it do? It picks a client connection from the listen queue, does some operations on it, forwards it to the upstream server, and then waits for the upstream to respond. While waiting for the upstream, NGINX can pick more client connections from the queue and repeat the above. When the upstream sends a response, it relays it back to the client. Since all of these are short-lived operations, this fits beautifully into an event based concurrency model. Good old Apache HTTP Server creates a thread/process for each connection to do the same, so Apache is constrained by the number of threads in its pool. If the number of incoming requests exceeds the number of threads in the pool, it has to deal with switching and coordination. NGINX does not have this overhead, which makes it comparatively faster than Apache in real-world workloads. All of this is a bit simplistic and hand-wavy but should convey the idea.
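A toy version of this flow, sketched with asyncio, might look like the following; the addresses and ports are placeholders and error handling is omitted. The key lines are the awaits: while one connection waits on its upstream, the loop is free to accept and service other clients.

```python
import asyncio

UPSTREAM_HOST, UPSTREAM_PORT = "127.0.0.1", 8080  # hypothetical upstream

async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    request = await reader.read(65536)  # read the client's request
    up_reader, up_writer = await asyncio.open_connection(
        UPSTREAM_HOST, UPSTREAM_PORT)   # connect to the upstream
    up_writer.write(request)            # forward the request upstream
    await up_writer.drain()
    # While we await the upstream's reply, the loop can accept and
    # service other client connections.
    response = await up_reader.read(65536)
    writer.write(response)              # relay the reply to the client
    await writer.drain()
    writer.close()
    up_writer.close()

async def main() -> None:
    server = await asyncio.start_server(handle_client, "127.0.0.1", 9090)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```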
A single event loop runs on one core and hence cannot, by itself, leverage the multiple CPU cores that all modern processors have. To get around this, you create one event loop per core, usually called a worker; NGINX's worker_processes directive, typically set to the number of cores, is an example of this. Also, most software that leverages event based concurrency adopts a hybrid model: event based concurrency for short-lived, quick operations, with long-running tasks off-loaded to a thread/process.
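Here is a sketch of the hybrid model, again with asyncio (asyncio.to_thread needs Python 3.9+; the errand and queries are made up). The loop keeps answering quick queries while the blocking depot run happens on a separate thread.

```python
import asyncio
import time

def fetch_from_depot() -> str:
    time.sleep(2)  # a long, blocking errand
    return "item from the depot"

async def quick_query(name: str) -> None:
    print(f"answered {name}")

async def main() -> None:
    # Run the blocking call on a worker thread so the event loop
    # stays free to answer the quick queries below right away.
    errand = asyncio.create_task(asyncio.to_thread(fetch_from_depot))
    await quick_query("alice")
    await quick_query("bob")
    print(await errand)

asyncio.run(main())
```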
I have glossed over a lot of details and nuances to explain a complex topic like concurrency in simple terms. Treat this as a starting point for digging deeper into this fascinating world.