Aiohttp with custom DNS servers, Unbound and Docker

Offload your Python aiohttp application by adding caching DNS resolvers to your local system.

13 July 2023 Updated 13 July 2023
In Async
https://www.pexels.com/nl-nl/@cihankahraman

Using aiohttp looks so easy, but it is not. It's confusing. The 'Client Quickstart' documentation begins with the following:

Note

Don’t create a session per request. Most likely you need a session per application which performs all requests together.

More complex cases may require a session per site, e.g. one for Github and other one for Facebook APIs. Anyway making a session for every request is a very bad idea.

A session contains a connection pool inside. Connection reusage and keep-alives (both are on by default) may speed up total performance.

Hmmm ... ok ... repeat please ...

Anyway, the problem: I must check many different sites, and I also want to use custom DNS servers. This means a session per site. I do not know the sites in advance, I just get a list of URLs. So we go for a session per URL. And everything runs in Docker.

In this post we feed the aiohttp AsyncResolver with the IP addresses of two Unbound caching DNS resolvers running on our local system, one forwarding to Cloudflare and one forwarding to Quad9. As always my development system is Ubuntu 22.04.

The Python application

Below is our (incomplete) Python application. It uses aiohttp to check sites (URLs). The script runs in a Python Docker container. I am not going to bore you here with setting up a Docker Python image and container.

Note that we create an AsyncResolver for every TCPConnector for every session using the IP addresses of Cloudflare or Quad9.

# check_urls.py
import asyncio
import aiodns
import aiohttp
import logging
import os
import socket
import sys

def get_logger(
    console_log_level=logging.DEBUG,
    file_log_level=logging.DEBUG,
    log_file=os.path.splitext(__file__)[0] + '.log',
):
    logger_format = '%(asctime)s %(levelname)s [%(filename)-30s%(funcName)30s():%(lineno)03s] %(message)s'
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)
    if console_log_level:
        # console
        console_handler = logging.StreamHandler(sys.stdout)
        console_handler.setLevel(console_log_level)
        console_handler.setFormatter(logging.Formatter(logger_format))
        logger.addHandler(console_handler)
    if file_log_level:
        # file
        file_handler = logging.FileHandler(log_file)
        file_handler.setLevel(file_log_level)
        file_handler.setFormatter(logging.Formatter(logger_format))
        logger.addHandler(file_handler)
    return logger

logger = get_logger(file_log_level=None)


async def check_url(task_number, url, nameservers=None):
    logger.debug(f'[{task_number}] url = {url}, nameservers = {nameservers}')

    resolver = None
    if nameservers:
        resolver = aiohttp.resolver.AsyncResolver(nameservers=nameservers)

    connector = aiohttp.TCPConnector(
        limit=1,
        use_dns_cache=True,
        ttl_dns_cache=300,
        family=socket.AF_INET,
        resolver=resolver
    )

    async with aiohttp.ClientSession(
        connector=connector,
        # if we want to reuse the connector with other sessions, 
        # we must not close it: connector_owner=False
        connector_owner=True,
    ) as session:
        async with session.get(
            url,
        ) as client_response:
            logger.debug(f'[{task_number}] status = {client_response.status}')
            logger.debug(f'[{task_number}] url = {client_response.url}')
            logger.debug(f'[{task_number}] content_type = {client_response.headers.get("Content-Type", None)}')
            logger.debug(f'[{task_number}] charset = {client_response.charset}')


async def main():
    logger.debug(f'()')
    dns_cloudflare = ['1.1.1.1', '1.0.0.1']
    dns_quad9 = ['9.9.9.9', '149.112.112.112']

    sites = [
        ('http://www.example.com', dns_cloudflare),
        ('http://www.example.org', dns_quad9)
    ]

    tasks = []
    for task_number, site in enumerate(sites):
        url, nameservers = site
        task = asyncio.create_task(check_url(task_number, url, nameservers))
        tasks.append(task)

    for task in tasks:
        await task
    logger.debug(f'ready')

asyncio.run(main())
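
The loop at the end of main() awaits the tasks one by one, so the first check that raises stops the script before the other results are reported. Below is a minimal sketch of an alternative, assuming we want to collect every result and cap the number of simultaneous checks. The names run_checks and max_concurrent are made up for this illustration and are not part of the script above.

```python
import asyncio


async def run_checks(check, sites, max_concurrent=10):
    """Run check(task_number, url, nameservers) for every site.

    At most max_concurrent checks run at the same time. A failing check
    does not stop the others: its exception is returned in its slot.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(task_number, url, nameservers):
        # wait for a free slot before starting the actual check
        async with semaphore:
            return await check(task_number, url, nameservers)

    coroutines = [
        guarded(task_number, url, nameservers)
        for task_number, (url, nameservers) in enumerate(sites)
    ]
    # return_exceptions=True: collect errors instead of raising the first one
    return await asyncio.gather(*coroutines, return_exceptions=True)
```

In main() this could replace the two for-loops with something like `results = await run_checks(check_url, sites)`, after which every item in results is either the return value of check_url or the exception it raised.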

Problem: Directly connected to remote DNS servers

Although the above works, it has some problems. If we check a lot of URLs, we fire a lot of (separate) requests at the DNS servers of Cloudflare or Quad9.

We can re-use the TCPConnector, e.g. by creating a pool of TCPConnectors, and use the DNS caching of the connectors. This is a big improvement but it still is far from perfect because our connectors remain 'directly' connected to the outside world (via the resolver).
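
A pool of connectors could look roughly like the sketch below: one connector per set of nameservers, created on first use and shared by all sessions that need it. ConnectorPool and make_connector are hypothetical names, and I dropped limit=1 here on the assumption that a shared connector should allow more than one connection. Treat it as a starting point, not as the way to do it.

```python
import socket


class ConnectorPool:
    """Keep one connector per set of nameservers and hand it out again on request."""

    def __init__(self, factory):
        # factory(nameservers) must return a new connector
        self._factory = factory
        self._connectors = {}

    def get(self, nameservers):
        key = tuple(nameservers)
        if key not in self._connectors:
            self._connectors[key] = self._factory(list(key))
        return self._connectors[key]

    async def close(self):
        # the sessions do not own the connectors, so we close them here
        for connector in self._connectors.values():
            await connector.close()
        self._connectors.clear()


def make_connector(nameservers):
    # Imported here so that the pool class itself works without aiohttp installed.
    import aiohttp

    return aiohttp.TCPConnector(
        use_dns_cache=True,
        ttl_dns_cache=300,
        family=socket.AF_INET,
        resolver=aiohttp.resolver.AsyncResolver(nameservers=nameservers),
    )

# Possible use, with pool = ConnectorPool(make_connector) created in main():
#
#     async with aiohttp.ClientSession(
#         connector=pool.get(nameservers),
#         connector_owner=False,  # the pool, not the session, closes the connector
#     ) as session:
#         ...
#
# and at the end of main():  await pool.close()
```

Because aiohttp connectors and resolvers are tied to the event loop, pool.get() should be called from inside a coroutine, not at module level.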

Solution: Local caching DNS servers

We can do better by running one or more caching DNS servers on our local system, and feeding the AsyncResolvers with the IP addresses of our caching DNS servers.

Caching DNS server: Unbound

There are a lot of Docker DNS server images and I selected 'Unbound DNS Server Docker Image', see links below. Why? It is easy to use, and by default it forwards queries to a remote DNS server, Cloudflare. A nice feature is that we can use DNS over TLS (DoT). This means we shield the requests from (ISP) tracking.

Because we want more than one local DNS server, we first copy some configuration files outside the container. In the directory where we start the DNS server, we create a new directory:

my_conf

Then we start the DNS server:

docker run --name=my-unbound mvance/unbound:1.17.0

And in another terminal, we copy some files from inside the container to our system:

mkdir my_conf
docker cp my-unbound:/opt/unbound/etc/unbound/forward-records.conf my_conf
docker cp my-unbound:/opt/unbound/etc/unbound/a-records.conf my_conf

Stop the DNS server by hitting 'CTRL-C'.

I created the following docker-compose.yml file:

version: '3'

services:
  unbound_cloudflare_service:
    image: "mvance/unbound:1.17.0"
    container_name: unbound_cloudflare_container
    networks:
     - dns
    volumes:
      - type: bind
        read_only: true
        source: ./my_conf/forward-records.conf
        target: /opt/unbound/etc/unbound/forward-records.conf
      - type: bind
        read_only: true
        source: ./my_conf/a-records.conf
        target: /opt/unbound/etc/unbound/a-records.conf
    restart: unless-stopped

networks:
  dns:
    external: true
    name: unbound_dns_network

volumes:
  mydata:

We connect from the Python application Docker container to the DNS server container using a Docker network. This means that there is no need to specify ports in the docker-compose.yml file. Publishing no ports means better security. To create the Docker network 'unbound_dns_network':

docker network create unbound_dns_network

To start the DNS server:

docker-compose up

Check if the DNS server is working

For this I use 'netshoot: a Docker + Kubernetes network trouble-shooting swiss-army container', see links below. When we start it, we also connect to the 'unbound_dns_network':

docker run --rm -it --net=unbound_dns_network nicolaka/netshoot

Then we use 'dig' to check if our DNS server is working.

Note that we are referring to the Docker Compose service name, 'unbound_cloudflare_service', here:

dig @unbound_cloudflare_service -p 53 google.com

Result:

---
; <<>> DiG 9.18.13 <<>> @unbound_cloudflare_service -p 53 google.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55895
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;google.com.            IN    A

;; ANSWER SECTION:
google.com.        300    IN    A    142.250.179.174

;; Query time: 488 msec
;; SERVER: 172.17.10.3#53(unbound_cloudflare_service) (UDP)
;; WHEN: Wed Jul 12 16:32:56 UTC 2023
;; MSG SIZE  rcvd: 55

The 'ANSWER SECTION' gives the IP address. The query time is 488 milliseconds. If we run the command again, we get the same result but the query time will be (close to) 0. Note that the IP address of our local DNS server service is also shown:

172.17.10.3

Anyway, our local DNS server is working!
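
If you prefer to see the caching effect from Python instead of dig, the sketch below sends a hand-built DNS query over UDP and measures the round trip. It uses only the standard library. The function names build_query and time_query are my own, and the code is deliberately minimal: it checks the query ID of the reply but does not parse the answer.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (0x0100 = recursion desired), 1 question, 0 other records
    header = struct.pack('!HHHHHH', query_id, 0x0100, 1, 0, 0, 0)
    # Question name: every label is prefixed with its length, closed by a zero byte
    qname = b''.join(
        bytes([len(label)]) + label.encode('ascii')
        for label in hostname.rstrip('.').split('.')
    ) + b'\x00'
    # Question type A (1), class IN (1)
    return header + qname + struct.pack('!HH', 1, 1)


def time_query(server_ip, hostname, port=53, timeout=5.0):
    """Send one query over UDP and return the round trip time in milliseconds."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(query, (server_ip, port))
        data, _ = sock.recvfrom(512)
        elapsed = (time.perf_counter() - start) * 1000
    if data[:2] != query[:2]:
        raise ValueError('reply does not match our query id')
    return elapsed

# Possible use, from a container attached to 'unbound_dns_network':
#
#     server_ip = socket.gethostbyname('unbound_cloudflare_service')
#     print(time_query(server_ip, 'google.com'))  # first call: forwarded upstream
#     print(time_query(server_ip, 'google.com'))  # second call: served from the cache
```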

We can repeat these steps for Quad9. Create a new directory and copy the files from the Cloudflare setup.

Edit the docker-compose.yml file and replace 'cloudflare' with 'quad9'.

Edit the 'forward-records.conf' file:

  • Comment out the lines for Cloudflare
  • Uncomment the lines for Quad9

And bring it up!

Using our local DNS servers in our Python script

This is the last step. We must do two things:

  • Add the 'unbound_dns_network' to our Python Docker container.
  • Translate the names 'unbound_cloudflare_service' and 'unbound_quad9_service' to IP addresses.

Adding the 'unbound_dns_network' to our Python Docker container is easy. We do this the same way as we did in the Unbound docker-compose.yml file.

We already know the IP addresses of our local DNS server services, but they can change. Instead of hard-coding the IP addresses, we translate the service names to IP addresses in our Python script. In the script above, we change the following code from:

    dns_cloudflare = ['1.1.1.1', '1.0.0.1']
    dns_quad9 = ['9.9.9.9', '149.112.112.112']

to:

    dns_cloudflare = [socket.gethostbyname('unbound_cloudflare_service')]
    dns_quad9 = [socket.gethostbyname('unbound_quad9_service')]

Of course this only works if the local DNS server services are up-and-running.
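
When a service is not running, socket.gethostbyname() raises socket.gaierror and the script stops. Below is a minimal sketch of a helper that handles this, assuming we are willing to fall back to another list of nameservers in that case. Whether falling back is acceptable depends on your situation: you may prefer to fail loudly. The name resolve_nameservers and its parameters are made up for this example.

```python
import socket


def resolve_nameservers(service_names, fallback, lookup=socket.gethostbyname):
    """Translate service names to IP addresses.

    Returns the fallback list when none of the names can be resolved,
    for example because the local DNS containers are not running.
    """
    addresses = []
    for name in service_names:
        try:
            addresses.append(lookup(name))
        except socket.gaierror:
            # name is unknown: the service is down or not on our Docker network
            continue
    return addresses or list(fallback)

# Possible use in main():
#
#     dns_cloudflare = resolve_nameservers(
#         ['unbound_cloudflare_service'], fallback=['1.1.1.1', '1.0.0.1'])
```

The lookup parameter is only there to make the function easy to test without a running Docker network.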

Now all DNS requests from our Python application are routed to our local DNS servers!

Summary

We wanted to be friendly and not overload remote DNS servers with too many requests. We also wanted to remove the direct connection from our Python application to remote DNS servers. We achieved this by spinning up local DNS server services and connecting our Python script to them, using a Docker network.
We created an extra dependency, local DNS servers, but also reduced a dependency. If a remote DNS server is down (for some time), our Python application keeps working for names that are still in the local cache.

Links / credits

Docker Container Published Port Ignoring UFW Rules
https://www.baeldung.com/linux/docker-container-published-port-ignoring-ufw-rules

netshoot: a Docker + Kubernetes network trouble-shooting swiss-army container
https://github.com/nicolaka/netshoot

Unbound
https://nlnetlabs.nl/projects/unbound/about

Unbound DNS Server Docker Image
https://github.com/MatthewVance/unbound-docker
