Today, external calls are a given when developing software: your code talks to external HTTP services, databases, and caches. These communications happen over networks that are fast and work well most of the time. Once in a while, networks show their true colors: they become slow, congested, and unreliable. The external services themselves can also get overloaded, slow down, and start throwing errors. The code you write to interface with external services should hold steady under these circumstances.
In this post, I will go through some of the basics one should keep in mind while calling external services. I will use the Python Requests library to demonstrate this with external HTTP calls. The concepts remain almost the same irrespective of the programming language, library, or the kind of external service. This post is not a Python Requests tutorial.
I have created a Jupyter Notebook so that you can read and run the code interactively. Click here, then click on the file WildWildWorldOfExternalCalls.ipynb to launch the Jupyter Notebook. If you are not familiar with executing code in a Jupyter Notebook, read about it here. You can find the Notebook source here.
Let us call api.github.com using Requests.
import requests
r = requests.get("https://api.github.com")
External calls happen in two stages. First, the library asks for a socket connection from the server and waits for the server to respond. Then, it asks for the payload and waits for the server to respond. In both of these interactions, the server might choose not to respond. If you do not handle this scenario, you will be stuck indefinitely, waiting on the external service.
Timeouts to the rescue. Most libraries have a default timeout, but it may not be what you want. Requests, notably, does not time out at all unless you set a timeout explicitly.
import requests
r = requests.get("https://api.github.com", timeout=(3.2, 3.2))
The first element in the timeout tuple is the time we are willing to wait to establish a socket connection with the server. The second is the time we are willing to wait for the server to respond once we make a request.
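The two waits can also be seen one level down, at the socket itself. The following is a minimal sketch using only the standard library, not what Requests does internally: it assumes a local listening socket that never answers (a stand-in for an unresponsive server), and the 3.2 and 0.5 second values are arbitrary choices for illustration.

```python
import socket

# A listening socket that never accepts or answers. The operating system still
# completes the TCP handshake, so connecting succeeds but no response arrives.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
silent_server.listen(1)
host, port = silent_server.getsockname()

# Stage 1: establish the connection, waiting at most 3.2 seconds.
client = socket.create_connection((host, port), timeout=3.2)
connected = True

# Stage 2: send a request and wait for the first bytes, at most 0.5 seconds.
client.settimeout(0.5)
client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
try:
    client.recv(1024)
    read_timed_out = False
except socket.timeout:
    read_timed_out = True
finally:
    client.close()
    silent_server.close()

print("connected:", connected, "| read timed out:", read_timed_out)
```

The connection succeeds, but the read gives up after the configured wait, which is the situation the second element of the timeout tuple guards against.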
Let us see the socket timeout in action by connecting to api.github.com on an arbitrary port. Since the port is (hopefully) not open, the server will not accept the connection, resulting in a socket timeout.
import requests
from timeit import default_timer as timer
from requests import exceptions as e

start = timer()
try:
    requests.get("https://api.github.com:88", timeout=(3.4, 20))
except e.ConnectTimeout:
    end = timer()
    print("Time spent waiting for socket connection -", end - start, "Seconds")

start = timer()
try:
    requests.get("https://api.github.com:88", timeout=(6.4, 20))
except e.ConnectTimeout:
    end = timer()
    print("Time spent waiting for socket connection -", end - start, "Seconds")
The output.
Time spent waiting for socket connection - 3.42826354 Seconds
Time spent waiting for socket connection - 6.4075264999999995 Seconds
As you can see from the output, Requests waited till the configured socket timeout to establish a connection and then errored out.
Let us move on to the read timeout.
We will use the httpbin service, which lets us delay the response by a configurable number of seconds.
import requests
from timeit import default_timer as timer
from requests import exceptions as e

try:
    start = timer()
    r = requests.get("https://httpbin.org/delay/9", timeout=(6.4, 6))
except e.ReadTimeout:
    end = timer()
    print("Timed out after", end - start, "Seconds")
The output.
Timed out after 6.941002429 Seconds
In the above, we ask httpbin to delay the response by 9 seconds, while our read timeout is 6 seconds. As the output shows, Requests gave up after waiting the configured 6 seconds for a response; the measured time is slightly higher because it also includes establishing the connection.
Let us change the read timeout to 11 seconds. We no longer get a ReadTimeout exception.
import requests
r = requests.get("https://httpbin.org/delay/9", timeout=(6.4, 11))
A common misconception about the read timeout is that it caps the total time the code spends receiving and processing the response. That is not the case. The read timeout is the longest the client will wait between bytes from the server; in most cases, that means the wait between sending the request and receiving the first byte of the response. After that, as long as the server keeps sending data without a long enough pause, our code can stay stuck reading the response for hours.
Let me illustrate this.
import requests
from timeit import default_timer as timer
from requests import exceptions as e

start = timer()
r = requests.get("https://httpbin.org/drip?duration=30&delay=0", timeout=(6.4, 6))
end = timer()
print("Time spent waiting for the response – ", end - start, "Seconds")
The output.
Time spent waiting for the response – 28.210101459 Seconds
We are asking httpbin to send data for 30 seconds by passing the duration parameter. The Requests read timeout is 6 seconds. As is evident from the output, the code spends much more than 6 seconds on the response, because no single gap between bytes exceeds the read timeout.
If you want to bound the total processing time, say to 20 seconds, you will have to run the call in a separate thread or process and stop it after 20 seconds.
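import requests
from multiprocessing import Process
from timeit import default_timer as timer

def call():
    r = requests.get("https://httpbin.org/drip?duration=30&delay=0", timeout=(6.4, 20))

# The guard keeps child processes from re-running this block on platforms
# that start them by importing the module (the "spawn" start method).
if __name__ == "__main__":
    p = Process(target=call)
    start = timer()
    p.start()
    p.join(timeout=20)  # wait at most 20 seconds for the call to finish
    p.terminate()       # stop the call if it is still running
    end = timer()
    print("Time spent waiting for the response – ", end - start, "Seconds")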
The output.
Time spent waiting for the response – 20.012269603 Seconds
Even though the server keeps sending the HTTP response for 30 seconds, our code stops waiting after 20 seconds.
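A lighter alternative to terminating a process is to read the response in chunks and check a deadline between chunks. The sketch below shows only that deadline logic and uses the standard library alone. `read_within_deadline`, `DeadlineExceeded`, and `dripping_source` are hypothetical names introduced here; the chunk iterator is assumed to come from something like Requests' `iter_content()` on a response opened with `stream=True`.

```python
import time

class DeadlineExceeded(Exception):
    """Raised when the whole response takes longer than the allowed budget."""

def read_within_deadline(chunks, budget_seconds):
    """Collect chunks from an iterator, giving up once the budget is spent.

    The check runs between chunks, so a single slow chunk is still bounded
    only by the read timeout of the underlying call.
    """
    deadline = time.monotonic() + budget_seconds
    received = []
    for chunk in chunks:
        received.append(chunk)
        if time.monotonic() > deadline:
            raise DeadlineExceeded(f"gave up after {len(received)} chunks")
    return b"".join(received)

def dripping_source(count, pause_seconds):
    """Hypothetical stand-in for a server that drips one byte at a time."""
    for _ in range(count):
        time.sleep(pause_seconds)
        yield b"*"

# A fast source finishes well inside the budget.
fast_result = read_within_deadline(dripping_source(3, 0.0), budget_seconds=5)
print(fast_result)

# A slow source (about 1 second in total) exceeds a 0.2 second budget.
try:
    read_within_deadline(dripping_source(20, 0.05), budget_seconds=0.2)
except DeadlineExceeded as exc:
    print("DeadlineExceeded:", exc)
```

This approach avoids a separate process, but it only works when the data arrives in pieces; a server that goes silent mid-response is still caught by the read timeout rather than by the deadline.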
In many real-world scenarios, we might be calling an external service multiple times in a short duration. In such a situation, it does not make sense for us to open the socket connection each time. We should be opening the socket connection once and then re-using it subsequently.
import requests
import logging

logging.basicConfig(level=logging.DEBUG)
for _ in range(5):
    r = requests.get('https://api.github.com')
The output.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
As you can see from the output, Requests started a new connection each time. This is wasteful: every call pays the cost of setting up a connection again.
We can prevent this by using HTTP Keep-Alive, which a Requests Session enables, as shown below.
import requests
import logging

logging.basicConfig(level=logging.DEBUG)
s = requests.Session()
for _ in range(5):
    r = s.get('https://api.github.com')
The output.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
Now, Requests established the socket connection only once and re-used it subsequently.
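The same effect can be reproduced without network access. The following is a rough sketch that uses the standard library in place of Requests: a local HTTP server counts how many connections it receives while a client makes three calls, once over a single kept-alive connection and once with a fresh connection per call. `CountingHandler` and `connections_used` are hypothetical names made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections_opened = 0

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        CountingHandler.connections_opened += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

def connections_used(reuse, calls=3):
    """Make `calls` requests and return how many connections the server saw."""
    CountingHandler.connections_opened = 0
    shared = http.client.HTTPConnection(host, port, timeout=5) if reuse else None
    for _ in range(calls):
        conn = shared if shared is not None else http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/")
        conn.getresponse().read()  # drain the body so the connection is reusable
        if shared is None:
            conn.close()
    if shared is not None:
        shared.close()
    return CountingHandler.connections_opened

with_reuse = connections_used(reuse=True)
without_reuse = connections_used(reuse=False)
server.shutdown()
server.server_close()
print("connections with reuse:", with_reuse, "| without reuse:", without_reuse)
```

Three calls over one kept-alive connection reach the server as a single connection, while three separate connections are counted as three, mirroring the difference in the logs above.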
In a real-world scenario, where multiple threads call external services simultaneously, one should use a connection pool.
import requests
from requests.adapters import HTTPAdapter
import threading
import logging

logging.basicConfig(level=logging.DEBUG)
s = requests.Session()

def call(url):
    s.get(url)

s.mount("https://", HTTPAdapter(pool_connections=1, pool_maxsize=2))

t0 = threading.Thread(target=call, args=("https://api.github.com", ))
t1 = threading.Thread(target=call, args=("https://api.github.com", ))
t0.start()
t1.start()
t0.join()
t1.join()

t2 = threading.Thread(target=call, args=("https://api.github.com", ))
t3 = threading.Thread(target=call, args=("https://api.github.com", ))
t2.start()
t3.start()
t2.join()
t3.join()
The output.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.github.com:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (2): api.github.com:443
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
DEBUG:urllib3.connectionpool:https://api.github.com:443 "GET / HTTP/1.1" 200 496
As we have created a pool of size two, Requests created only two connections and re-used them, even though we made four external calls.
Pools also help you play nice with external services, which often cap the number of connections a single client can open. If you breach this threshold, the external service may start refusing connections.
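The limiting idea behind a pool can be sketched without any HTTP library. The example below is an illustration of capping concurrent calls with a semaphore, not how Requests or urllib3 implement their pools. `CallLimiter` and `fake_external_call` are hypothetical names, and the limit of 2 is an assumed value.

```python
import threading
import time

class CallLimiter:
    """Allows at most max_in_flight calls to run at the same time."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak = 0  # highest concurrency observed, kept for illustration

    def run(self, func, *args):
        with self._slots:  # blocks while max_in_flight calls are running
            with self._lock:
                self._in_flight += 1
                self.peak = max(self.peak, self._in_flight)
            try:
                return func(*args)
            finally:
                with self._lock:
                    self._in_flight -= 1

def fake_external_call():
    """Hypothetical stand-in for a real network call."""
    time.sleep(0.05)

limiter = CallLimiter(max_in_flight=2)
threads = [
    threading.Thread(target=limiter.run, args=(fake_external_call,))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrent calls:", limiter.peak)
```

Eight threads make calls, yet no more than two are ever in flight at once. With Requests itself, passing `pool_block=True` to `HTTPAdapter` makes threads wait for a free pooled connection instead of opening extra ones.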
When calling an external service, you might get an error. Some of these errors are transient, so it makes sense to retry. The retries should happen with an exponential back-off.
Exponential back-off is a technique in which clients retry failed requests with increasing delays between the retries. It keeps a struggling external service from being overwhelmed by retries, which is another instance of playing nice with external services.
import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter
import logging

logging.basicConfig(level=logging.DEBUG)
s = requests.Session()
retries = Retry(total=3,
                backoff_factor=0.1,
                status_forcelist=[500])
s.mount("https://", HTTPAdapter(max_retries=retries))
s.get("https://httpbin.org/status/500")
The output.
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): httpbin.org:443
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /status/500 HTTP/1.1" 500 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='/status/500'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.connectionpool:Retry: /status/500
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /status/500 HTTP/1.1" 500 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='/status/500'): Retry(total=1, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.connectionpool:Retry: /status/500
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /status/500 HTTP/1.1" 500 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='/status/500'): Retry(total=0, connect=None, read=None, redirect=None, status=None)
DEBUG:urllib3.connectionpool:Retry: /status/500
DEBUG:urllib3.connectionpool:https://httpbin.org:443 "GET /status/500 HTTP/1.1" 500 0
In the above, we are asking httpbin to respond with an HTTP 500 status code. We configured Requests to re-try thrice, and from the output, we can see that Requests did just that.
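To make the growing delays concrete, here is a minimal, library-independent sketch of retrying with exponential back-off. It is not urllib3's implementation, whose exact delay formula may differ. `backoff_delays` and `call_with_retries` are hypothetical helpers, and the base delay of 0.1 seconds and cap of 10 seconds are assumed values.

```python
import time

def backoff_delays(retries, base=0.1, cap=10.0):
    """Delay before each retry: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]

def call_with_retries(func, retries=3, base=0.1, cap=10.0, sleep=time.sleep):
    """Call func, retrying on any exception with exponentially growing waits."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])

# A hypothetical flaky call that fails twice and then succeeds.
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

waits = []  # record the waits instead of sleeping, to keep the demo fast
result = call_with_retries(flaky, retries=3, sleep=waits.append)
print(result, waits)
```

Each retry waits twice as long as the one before it, up to the cap. Real code would usually retry only on errors known to be transient rather than on every exception.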
Client libraries do a fantastic job of abstracting away the flakiness of external calls, and in doing so lull us into a false sense of security. But all abstractions leak at one time or another. The defenses above will help you tide over these leaks.
No post on external services can be complete without talking about the Circuit Breaker design pattern. The pattern helps one build a mental model of many of the things we talked about and gives a common vocabulary to discuss them. Most popular programming languages have libraries that implement Circuit Breakers. I believe Netflix popularised the term Circuit Breaker with its library Hystrix.
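For readers new to the pattern, the following is a bare-bones sketch of a circuit breaker, written for illustration only and not modelled on Hystrix or any particular library. After a number of consecutive failures the breaker "opens" and rejects calls immediately; once a cool-down passes, it lets one trial call through. The class names, the threshold of 2, and the 10 second cool-down are all assumptions made for this example.

```python
import time

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result

# Demo with a fake clock so that no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: now[0])

def failing_call():
    raise ConnectionError("service down")

for _ in range(2):
    try:
        breaker.call(failing_call)
    except ConnectionError:
        pass

try:
    breaker.call(lambda: "ok")
    rejected = False
except CircuitOpenError:
    rejected = True  # the breaker fails fast without calling the service

now[0] = 11.0  # pretend the cool-down has passed
recovered = breaker.call(lambda: "ok")
print("rejected while open:", rejected, "| after cool-down:", recovered)
```

Failing fast spares an already struggling service from more traffic and spares the caller from waiting on timeouts it is likely to hit anyway.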
Follow @abhyrama
Image by RENE RAUSCHENBERGER from Pixabay