
A database switch with HAProxy and the HAProxy Runtime API

Using the HAProxy Runtime API, we can add and remove backend database servers manually and with a script or program.

13 August 2024 Updated 13 August 2024
Post main image: https://unsplash.com/@georgiadelotz

One of my projects needed a high availability database (don't we all want this ...). This means (asynchronous) replication and a multi-node setup. Several solutions exist; I will write about them in another post. In these scenarios we have multiple replicas of the main database, and when a problem occurs, we switch from one database to another.

To switch databases without touching the client or the databases themselves, the client must access the database not directly but via a proxy. And that is exactly what this post is about: we implement a 'manual switch' between the databases. This is just a demonstration of how this can work.

As always I do this on Ubuntu 22.04.

Database proxy: HAProxy

There are many choices for a database proxy. Here we use HAProxy. This software is often used to distribute load between web applications, acting as a load balancer, but it has many more options.

HAProxy health checks are not usable for our purpose

HAProxy can check the health of backend servers and use rules to select the best backend server for a request. That's great, but we have a number of databases and only the one we specify should be accessed by the client.

HAProxy also supports an external agent for health checks. We can create this agent ourselves. In advanced scenarios the agent can gather much more information about the total system, and make better decisions than HAProxy.

It appears that using a HAProxy external agent is problematic for our purpose, one of the reasons being that the initial state is not 'maintenance' or 'down' and there is no way to change this. The HAProxy external agent is best suited to load balancing, not to on/off switching.

HAProxy Runtime API

HAProxy also has an API, the HAProxy Runtime API. With this API we get the control we want. We can add new backend servers, change the state of the servers, and delete backend servers. That's exactly what we are looking for!

Project overview

We have two databases, and only one may be accessed by the client. We achieve this by putting a database proxy service in between, and of course we need some kind of manager to control the database proxy. Below is a diagram of what we want to achieve.

  +--------+      +----------+                    +-------------+
  | client |      | db-proxy |                    | backend-db1 |
  |        |      |          |                    |             |
  |        |------|8320      |------+-------------|8310         |
  |        |      |          |      |             |             |
  +--------+      |          |      |             +-------------+
                  |          |      |
                  |          |      |             +-------------+
   statistics ----|8360      |      |             | backend-db2 |
                  |          |      |             |             |
                  |  api     |-------------+------|8311         |
                  |  9999    |      |      |      |             |
                  +----------+      |      |      +-------------+
                       ^            |      |   
                       |            |      |    
                       |      +------------------+
                       |      |   db-proxy-man   |
                       +----- |                  |
                              |                  |
                              +------------------+
                                       ^
                                       |
                                select backend      

Short description of the components in this demonstration:

  • client
    Simple program sending a request to db-proxy:8320 and printing the response. The program is nothing more than a Shell script one-liner, sending a request every second.
  • db-proxy
    This is the HAProxy. We enable the HAProxy Runtime API on port 9999.
  • db-proxy-man
    From here we send commands to the API at db-proxy:9999. We can do this manually or create a Shell script or Python program to add our selected database server to the (empty) list of backend servers.
  • backend-db1, backend-db2
    Our database servers. We do not use real database servers but echo servers. They echo what is received with a prefix.

The files in this project are:

.
├── docker-compose.yml
├── Dockerfile
└── haproxy.cfg

You may want to add a .env file and set DOCKER_COMPOSE_PROJECT to the name of your project, e.g. db-proxy-demo.

The docker-compose.yml file

The docker-compose.yml file contains all the elements mentioned above. We add a 'hostname' to each service for easy referencing (when we use Docker Swarm, we omit the 'hostname').

# docker-compose.yml
version: "3.7"

x-service_defaults: &service_defaults
  restart: always
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "5"

services:
  client:
    << : *service_defaults
    image: busybox:1.35.0
    hostname: client
    command:
      - /bin/sh
      - -c
      - "i=0; while true; do echo \"sending: hello $$i \"; echo \"hello $$i\" | nc db-proxy 8320; i=$$((i+1)); sleep 1; done"
    networks:
      - frontend-db-network

  db-proxy:
    << : *service_defaults
    image: haproxy:1.001-d
    hostname: db-proxy
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      # http://127.0.0.1:8360/stats
      - "127.0.0.1:8360:8360"
    networks:
      - db-proxy-network
      - frontend-db-network
      - backend-db1-network
      - backend-db2-network

  db-proxy-man:
    << : *service_defaults
    image: busybox:1.35.0
    hostname: db-proxy-man
    # keep alive
    command: tail -f /dev/null
    networks:
      - db-proxy-network
      - backend-db1-network
      - backend-db2-network

  backend-db1:
    << : *service_defaults
    image: venilnoronha/tcp-echo-server
    hostname: backend-db1
    # echo with a prefix
    entrypoint: /bin/main 8310 "backend-db1 at 8310:"
    networks:
      - backend-db1-network

  backend-db2:
    << : *service_defaults
    image: venilnoronha/tcp-echo-server
    hostname: backend-db2
    # echo with a prefix
    entrypoint: /bin/main 8311 "backend-db2 at 8311:"
    networks:
      - backend-db2-network

networks:
  frontend-db-network:
    external: true
    name: frontend-db-network
  db-proxy-network:
    external: true
    name: db-proxy-network  
  backend-db1-network:
    external: true
    name: backend-db1-network
  backend-db2-network:
    external: true
    name: backend-db2-network  

Personal opinion: never let Docker-Compose create networks. Do this manually:

docker network create frontend-db-network
docker network create db-proxy-network
docker network create backend-db1-network
docker network create backend-db2-network

The haproxy.cfg file and Dockerfile

We do not add backend database servers here, instead we add them using the HAProxy Runtime API.

# haproxy.cfg
global
  maxconn 100
  stats socket ipv4@*:9999 level admin
  log stdout format raw local0 debug
  #log stdout format raw local0 info

defaults
  log global
  retries 2
  timeout client 30m
  timeout connect 15s
  timeout server 30m
  timeout check 15s

frontend db
  mode tcp
  option tcplog
  bind :8320
  default_backend db_servers

backend db_servers
  balance roundrobin
  mode tcp
  option tcplog
  # servers here are added using the HAProxy Runtime API

frontend stats
  bind :8360
  mode http
  stats enable
  stats uri /stats
  stats refresh 5s
  #stats auth username:password
  stats admin if LOCALHOST

You may want to change the log level from 'debug' to 'info'.

The line to make the HAProxy Runtime API available:

  stats socket ipv4@*:9999 level admin

The Dockerfile is used to build the image; there is not much to it:

# Dockerfile 
FROM haproxy:lts-alpine3.20
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

Bring up the project

First, we build the image:

docker-compose build --no-cache

Then we start it:

docker-compose up

You will see the following in the terminal:

backend-db2_1   | listening on [::]:8311, prefix: backend-db2 at 8311:
backend-db1_1   | listening on [::]:8310, prefix: backend-db1 at 8310:
client_1        | sending: hello 0 
db-proxy_1      | [NOTICE]   (1) : haproxy version is 3.0.3-95a607c
db-proxy_1      | [WARNING]  (1) : config : parsing [/usr/local/etc/haproxy/haproxy.cfg:25] : backend 'db_servers' : 'option tcplog' directive is ignored in backends.
db-proxy_1      | [NOTICE]   (1) : New worker (8) forked
db-proxy_1      | [NOTICE]   (1) : Loading success.
db-proxy_1      | 172.17.15.2:38293 [13/Aug/2024:08:05:01.543] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1        | sending: hello 1 
db-proxy_1      | 172.17.15.2:46745 [13/Aug/2024:08:05:02.548] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1        | sending: hello 2 
db-proxy_1      | 172.17.15.2:37693 [13/Aug/2024:08:05:03.552] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1        | sending: hello 3 
...

Note that the 'client' is sending requests to our db-proxy. These requests are not answered, because the backend has no servers yet.

To see the status page, point your browser to:

http://127.0.0.1:8360/stats

Before rebuilding, remove the containers:

docker-compose down

Sending commands to the HAProxy Runtime API

According to the documentation:

  • A server added using the API is called a dynamic server.
  • The backend must be configured to use a dynamic load balancing algorithm.
  • A dynamic server is not restored after a load balancer reload operation.
  • "Currently a dynamic server is statically initialized with the "none" init-addr method. This means that no resolution will be undertaken if a FQDN is specified as an address, even if the server creation will be validated."

To send commands to the API we use the 'nc' (netcat) command present in the busybox image:

echo "<your-command>" | nc db-proxy 9999

We only use a few commands to add and remove servers. Note that we named our backend 'db_servers' in haproxy.cfg.

To show the servers state:

show servers state

To add a server:

add server <backend-name>/<server-name> <addr>:<port>
set server <backend-name>/<server-name> state ready

The first command adds a server, but it will be in maintenance mode. The second command makes the server ready.

Again, see also above: The HAProxy Runtime API does not resolve host names of dynamic servers for us! This means we must use an IP address here, not the host name!

To remove a server:

set server <backend-name>/<server-name> state maint
del server <backend-name>/<server-name>

Now for real: Adding and removing servers

To send commands to the API we enter the db-proxy-man busybox container:

docker exec -it $(docker container ls | grep db-proxy-man | awk '{print $1; exit}') sh

During the commands you can watch the status page in your browser.

We assign the following names to our servers:

backend_db1_8310
backend_db2_8311

As already mentioned above, the API does not resolve host names of dynamic servers to IP addresses. We use the 'nslookup' command to get the IP address:

nslookup backend-db1

Result:

Server:		127.0.0.11
Address:	127.0.0.11:53

Non-authoritative answer:
Name:	backend-db1
Address: 172.17.13.2
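Since the Runtime API will not do this lookup for us, a script can resolve the hostname itself. A minimal sketch using the Python standard library (the address in the comment is just the example value from the nslookup output above):

```python
import socket

def resolve_ipv4(hostname: str) -> str:
    """Resolve a (Docker) hostname to its IPv4 address, like 'nslookup'."""
    return socket.gethostbyname(hostname)

# inside the Docker network, for example:
# resolve_ipv4('backend-db1')  ->  e.g. '172.17.13.2'
```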

Let's check the servers state:

echo "show servers state" | nc db-proxy 9999

Result:

1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port

Only the header line is shown. No servers are present.

Now add a server:

echo "add server db_servers/backend_db1_8310 172.17.13.2:8310" | nc db-proxy 9999

Result:

New server registered.

In the browser you will see that this server was added with the Status 'MAINT'.

Next, we enable the server by changing the state to 'ready':

echo "set server db_servers/backend_db1_8310 state ready" | nc db-proxy 9999

You can also see these changes in the Docker-Compose log:

...
client_1        | sending: hello 61 
db-proxy_1      | 172.17.15.2:38285 [13/Aug/2024:10:05:54.779] db db_servers/<NOSRV> -1/-1/0 0 SC 2/1/0/0/0 0/0
db-proxy_1      | Connect from 172.17.13.1:54778 to 172.17.13.4:8360 (stats/HTTP)
db-proxy_1      | Server db_servers/backend_db1_8310 is UP/READY (leaving forced maintenance).
db-proxy_1      | [WARNING]  (8) : Server db_servers/backend_db1_8310 is UP/READY (leaving forced maintenance).
client_1        | sending: hello 62 
backend-db1_1   | request: hello 62
backend-db1_1   | response: backend-db1 at 8310: hello 62
db-proxy_1      | 172.17.15.2:34601 [13/Aug/2024:10:05:55.781] db db_servers/backend_db1_8310 1/0/0 30 -- 2/1/0/0/0 0/0
client_1        | backend-db1 at 8310: hello 62
client_1        | sending: hello 63 
backend-db1_1   | request: hello 63
backend-db1_1   | response: backend-db1 at 8310: hello 63
client_1        | backend-db1 at 8310: hello 63
db-proxy_1      | 172.17.15.2:42305 [13/Aug/2024:10:05:56.784] db db_servers/backend_db1_8310 1/0/0 30 -- 2/1/0/0/0 0/0
client_1        | sending: hello 64 
...

After changing the server state to 'ready', we see that backend-db1 is now responding to 'client' requests, and the 'client' receives the data from backend-db1. Great!

Again check the servers state:

echo "show servers state" | nc db-proxy 9999

Result, showing that our server was added:

1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
3 db_servers 1 backend_db1_8310 172.17.13.2 2 0 1 1 10 1 0 0 0 0 0 0 - 8310 - 0 0 - - 0
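This output is easy to parse programmatically: the first line is the output format version, the second is the '#'-prefixed header, and every following line is one server. A small Python sketch (the function name is my own):

```python
def parse_servers_state(raw: str) -> list[dict[str, str]]:
    """Turn 'show servers state' output into one dict per server."""
    lines = [line for line in raw.splitlines() if line.strip()]
    # line 0: output format version, line 1: '#'-prefixed header
    header = lines[1].lstrip('# ').split()
    return [dict(zip(header, line.split())) for line in lines[2:]]
```

With the output shown above, `parse_servers_state(raw)[0]['srv_addr']` would give '172.17.13.2'.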

Now let's remove this server, by first putting the server in maintenance state:

echo "set server db_servers/backend_db1_8310 state maint" | nc db-proxy 9999

And next, remove the server:

echo "del server db_servers/backend_db1_8310" | nc db-proxy 9999

You can see the changes in the logs and the browser.

Now that we know how the commands work, we can write a Shell script or a Python program to send commands to the API.

Using Python to access the HAProxy Runtime API

This is a blog about Python, but I am not going into the details of a Python program to do the above.
I only show the function you can use to access the API:

import socket
import typing as _t

BUFFER_SIZE = 4096

class HAProxyRuntimeAPIUtils:
    def __init__(
        self,
        host: _t.Optional[str] = 'db-proxy',
        port: _t.Optional[int] = 9999,
    ):
        self.host = host
        self.port = port

    def send(self, send_data: str) -> _t.Optional[str]:
        send_bytes = bytes(send_data + '\n', 'utf-8')
        client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        client_socket.connect((self.host, self.port))
        client_socket.sendall(send_bytes)  # sendall: plain send() may send only part of the data
        recv_data = client_socket.recv(BUFFER_SIZE).decode('utf-8').strip()
        client_socket.close()
        return recv_data
    
    ...    

And the program should do something like this:

    # set the selected_server

    # keep repeating as db-proxy may restart
    while True:

        try:
            # get all servers for the backend 
            ...

            # delete all servers that are not the selected_server
            ...

            # if not present, add the selected_server, with the proper addr, and set to ready
            ...

If the IP address of the backend servers can change, then we must also check for address changes here.
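A single pass of this loop could look like the sketch below. It only assumes the send() function shown earlier (injected here as a callable so it is easy to test); everything else, including the function name, is my own:

```python
import typing as _t

def ensure_selected_server(
    send: _t.Callable[[str], str],  # e.g. HAProxyRuntimeAPIUtils().send
    backend: str,
    server: str,
    addr: str,
) -> None:
    """One pass of the manager loop: keep only `server` in `backend`."""
    raw = send('show servers state')
    rows = [
        line.split() for line in raw.splitlines()
        if line.strip() and not line.startswith('#') and line.strip() != '1'
    ]
    # column 1 is be_name, column 3 is srv_name (see the header shown earlier)
    current = [row[3] for row in rows if row[1] == backend]

    # delete all servers that are not the selected server
    for name in current:
        if name != server:
            send(f'set server {backend}/{name} state maint')
            send(f'del server {backend}/{name}')

    # if not present, add the selected server and set it to ready
    if server not in current:
        send(f'add server {backend}/{server} {addr}')
        send(f'set server {backend}/{server} state ready')
```

Calling this every few seconds inside the 'while True' loop gives the behavior sketched above, and it also survives a db-proxy restart because the missing server is simply re-added.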

Summary

We used HAProxy to create a switch for our databases. HAProxy is a very complex piece of software and, although the HAProxy documentation is extensive, it is also unclear at times. I believe this has to do with the continuous development of HAProxy. Using the HAProxy Runtime API is easy, but the information that can be retrieved about the backend servers is limited.

But in the end, HAProxy just works and should be part of every system developer's toolbox.

Links / credits

Dynamic DNS Resolution with HAProxy and Docker
https://stackoverflow.com/questions/41152408/dynamic-dns-resolution-with-haproxy-and-docker

HAProxy
https://en.wikipedia.org/wiki/HAProxy

HAProxy Runtime API
https://www.haproxy.com/documentation/haproxy-runtime-api/

venilnoronha/tcp-echo-server
https://hub.docker.com/r/venilnoronha/tcp-echo-server
