A database switch with HAProxy and the HAProxy Runtime API
Using the HAProxy Runtime API, we can add and remove backend database servers manually, or with a script or program.
One of my projects needed a high availability database (don't we all want this ...). This means (asynchronous) replication and a multi-node setup. Several solutions exist; I will write about them in another post. In these scenarios we have multiple replicas of the main database, and when a problem occurs, we switch from one database to another.
To do this switching of databases without touching the client and databases, the client must access the database not directly but via a proxy. And that is exactly what this post is about. We implement a 'manual switch' between the databases. This is just a demonstration of how this can work.
As always I do this on Ubuntu 22.04.
Database proxy: HAProxy
There are many choices for a database proxy. Here we use HAProxy. This software is often used to distribute load between web applications, acting as a load balancer, but it has many more options.
HAProxy health checks are not usable for our purpose
HAProxy can check the health of backend servers and use rules to select the best backend server for a request. That's great, but we have a number of databases and only the one we select should be accessed by the client.
HAProxy also supports an external agent for health checks. We can create this agent ourselves. In advanced scenarios the agent can gather much more information about the total system, and make better decisions than HAProxy.
It appears that using a HAProxy external agent is problematic for our purpose, one of the reasons being that the initial state is not 'maintenance' or 'down' and there is no way to change this. The HAProxy external agent is best used with load balancing, not for on/off switching.
HAProxy Runtime API
HAProxy also has an API, the HAProxy Runtime API. With this API we get the control we want. We can add new backend servers, change the state of the servers, and delete backend servers. That's exactly what we are looking for!
Project overview
We have two databases, and only one of them can be accessed by the client. We do this by putting a database proxy service in between, and of course we need some kind of manager to control the database proxy. Below is a diagram of what we want to achieve.
+--------+      +----------+                    +-------------+
| client |      | db-proxy |                    | backend-db1 |
|        |      |          |                    |             |
|        |------|8320      |------+-------------|8310         |
|        |      |          |      |             |             |
+--------+      |          |      |             +-------------+
                |          |      |
                |          |      |             +-------------+
 statistics ----|8360      |      |             | backend-db2 |
                |          |      |             |             |
                |   api    |-------------+------|8311         |
                |   9999   |      |      |      |             |
                +----------+      |      |      +-------------+
                     ^            |      |
                     |            |      |
                     |         +------------------+
                     |         |   db-proxy-man   |
                     +-------- |                  |
                               |                  |
                               +------------------+
                                        ^
                                        |
                                 select backend
Short description of the components in this demonstration:
- client
Simple program sending a request to db-proxy:8320 and printing the response. The program is nothing more than a Shell script one-liner, sending a request every second.
- db-proxy
This is the HAProxy. We enable the HAProxy Runtime API on port 9999.
- db-proxy-man
From here we send commands to the API, at db-proxy:9999. We can do this manually or create a Shell script or Python program to add our selected database server to the (empty) list of backend servers.
- backend-db1, backend-db2
Our database servers. We do not use real database servers but echo servers. They echo what is received with a prefix.
The files in this project are:
.
├── docker-compose.yml
├── Dockerfile
└── haproxy.cfg
You may want to add a .env file and set COMPOSE_PROJECT_NAME to the name of your project, e.g. db-proxy-demo.
The docker-compose.yml file
The docker-compose.yml contains all the elements we mentioned above. We add the 'hostname' for easy referencing (when we use Docker Swarm, we omit the 'hostname').
# docker-compose.yml
version: "3.7"

x-service_defaults: &service_defaults
  restart: always
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "5"

services:
  client:
    <<: *service_defaults
    image: busybox:1.35.0
    hostname: client
    command:
      - /bin/sh
      - -c
      - "i=0; while true; do echo \"sending: hello $$i \"; echo \"hello $$i\" | nc db-proxy 8320; i=$$((i+1)); sleep 1; done"
    networks:
      - frontend-db-network

  db-proxy:
    <<: *service_defaults
    image: haproxy:1.001-d
    hostname: db-proxy
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      # http://127.0.0.1:8360/stats
      - "127.0.0.1:8360:8360"
    networks:
      - db-proxy-network
      - frontend-db-network
      - backend-db1-network
      - backend-db2-network

  db-proxy-man:
    <<: *service_defaults
    image: busybox:1.35.0
    hostname: db-proxy-man
    # keep alive
    command: tail -f /dev/null
    networks:
      - db-proxy-network
      - backend-db1-network
      - backend-db2-network

  backend-db1:
    <<: *service_defaults
    image: venilnoronha/tcp-echo-server
    hostname: backend-db1
    # echo with a prefix
    entrypoint: /bin/main 8310 "backend-db1 at 8310:"
    networks:
      - backend-db1-network

  backend-db2:
    <<: *service_defaults
    image: venilnoronha/tcp-echo-server
    hostname: backend-db2
    # echo with a prefix
    entrypoint: /bin/main 8311 "backend-db2 at 8311:"
    networks:
      - backend-db2-network

networks:
  frontend-db-network:
    external: true
    name: frontend-db-network
  db-proxy-network:
    external: true
    name: db-proxy-network
  backend-db1-network:
    external: true
    name: backend-db1-network
  backend-db2-network:
    external: true
    name: backend-db2-network
Personal opinion: never let Docker-Compose create networks. Do this manually:
docker network create frontend-db-network
docker network create db-proxy-network
docker network create backend-db1-network
docker network create backend-db2-network
The haproxy.cfg file and Dockerfile
We do not add backend database servers here, instead we add them using the HAProxy Runtime API.
# haproxy.cfg
global
    maxconn 100
    stats socket ipv4@*:9999 level admin
    log stdout format raw local0 debug
    #log stdout format raw local0 info

defaults
    log global
    retries 2
    timeout client 30m
    timeout connect 15s
    timeout server 30m
    timeout check 15s

frontend db
    mode tcp
    option tcplog
    bind :8320
    default_backend db_servers

backend db_servers
    balance roundrobin
    mode tcp
    option tcplog
    # servers here are added using the HAProxy Runtime API

frontend stats
    bind :8360
    mode http
    stats enable
    stats uri /stats
    stats refresh 5s
    #stats auth username:password
    stats admin if LOCALHOST
You may want to change the log level from 'debug' to 'info'.
The line to make the HAProxy Runtime API available:
stats socket ipv4@*:9999 level admin
The Dockerfile is used to build the image, there is not much to it:
# Dockerfile
FROM haproxy:lts-alpine3.20
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
Bring up the project
First, we build the image:
docker-compose build --no-cache
Then we start it:
docker-compose up
You will see the following in the terminal:
backend-db2_1 | listening on [::]:8311, prefix: backend-db2 at 8311:
backend-db1_1 | listening on [::]:8310, prefix: backend-db1 at 8310:
client_1 | sending: hello 0
db-proxy_1 | [NOTICE] (1) : haproxy version is 3.0.3-95a607c
db-proxy_1 | [WARNING] (1) : config : parsing [/usr/local/etc/haproxy/haproxy.cfg:25] : backend 'db_servers' : 'option tcplog' directive is ignored in backends.
db-proxy_1 | [NOTICE] (1) : New worker (8) forked
db-proxy_1 | [NOTICE] (1) : Loading success.
db-proxy_1 | 172.17.15.2:38293 [13/Aug/2024:08:05:01.543] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1 | sending: hello 1
db-proxy_1 | 172.17.15.2:46745 [13/Aug/2024:08:05:02.548] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1 | sending: hello 2
db-proxy_1 | 172.17.15.2:37693 [13/Aug/2024:08:05:03.552] db db_servers/<NOSRV> -1/-1/0 0 SC 1/1/0/0/0 0/0
client_1 | sending: hello 3
...
Note that the 'client' is sending requests to our db-proxy. These requests are not answered, because the backend has no servers yet (shown as '<NOSRV>' in the log).
To see the status page, point your browser to:
http://127.0.0.1:8360/stats
Before rebuilding, remove the containers:
docker-compose down
Sending commands to the HAProxy Runtime API
According to the documentation:
- A server added using the API is called a dynamic server.
- The backend must be configured to use a dynamic load balancing algorithm.
- A dynamic server is not restored after a load balancer reload operation.
- "Currently a dynamic server is statically initialized with the "none" init-addr method. This means that no resolution will be undertaken if a FQDN is specified as an address, even if the server creation will be validated."
To send commands to the API we use the 'nc' (netcat) command present in the busybox image:
echo "<your-command>" | nc db-proxy 9999
We only use a few commands to add and remove servers. Note that we named our backend 'db_servers' in haproxy.cfg.
To show the servers state:
show servers state
To add a server:
add server <backend-name>/<server-name> <addr>:<port>
set server <backend-name>/<server-name> state ready
The first command adds a server but it will be in maintenance mode. The second command makes the server ready.
Again, see also above: The HAProxy Runtime API does not resolve host names of dynamic servers for us! This means we must use an IP address here, not the host name!
To remove a server:
set server <backend-name>/<server-name> state maint
del server <backend-name>/<server-name>
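In a script, the two command pairs above can be wrapped in small helpers, so the commands are always sent in the right order. This is only a minimal sketch: the function names add_server_commands and remove_server_commands are made up for this example, and the strings they return are the commands shown above.

```python
from typing import List


def add_server_commands(backend: str, server: str, addr: str, port: int) -> List[str]:
    """Commands that add a dynamic server and take it out of maintenance."""
    target = f"{backend}/{server}"
    return [
        f"add server {target} {addr}:{port}",
        f"set server {target} state ready",
    ]


def remove_server_commands(backend: str, server: str) -> List[str]:
    """Commands that put a server in maintenance and then delete it."""
    target = f"{backend}/{server}"
    return [
        f"set server {target} state maint",
        f"del server {target}",
    ]


if __name__ == "__main__":
    # print the commands for the first echo server of this demo
    for command in add_server_commands("db_servers", "backend_db1_8310", "172.17.13.2", 8310):
        print(command)
```

Each returned string is one command, to be sent to db-proxy:9999 on its own.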
Now for real: Adding and removing servers
To send commands to the API we enter the db-proxy-man busybox container:
docker exec -it $(docker container ls | grep db-proxy-man | awk '{print $1; exit}') sh
During the commands you can watch the status page in your browser.
We assign the following names to our servers:
backend_db1_8310
backend_db2_8311
As already mentioned above, the API does not resolve the host names of dynamic servers to IP addresses. We use the 'nslookup' command to get the IP address:
nslookup backend-db1
Result:
Server: 127.0.0.11
Address: 127.0.0.11:53
Non-authoritative answer:
Name: backend-db1
Address: 172.17.13.2
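In a Python program the same lookup can be done with the standard library instead of 'nslookup'. A minimal sketch, assuming the program runs in a container attached to the same Docker network as the backend (like db-proxy-man). The names resolve_backend and add_server_command are hypothetical helpers for this example.

```python
import socket


def resolve_backend(host: str) -> str:
    """Return the IPv4 address for host. An IPv4 address is returned unchanged."""
    return socket.gethostbyname(host)


def add_server_command(backend: str, server: str, host: str, port: int) -> str:
    """Build the 'add server' command with the resolved IP address."""
    addr = resolve_backend(host)
    return f"add server {backend}/{server} {addr}:{port}"


# Usage inside db-proxy-man (the address depends on your Docker network):
#   add_server_command("db_servers", "backend_db1_8310", "backend-db1", 8310)
```

socket.gethostbyname raises socket.gaierror when the name cannot be resolved, so a real program should catch that and retry later.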
Let's check the servers state:
echo "show servers state" | nc db-proxy 9999
Result:
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
Only the header line is shown. No servers are present.
Now add a server:
echo "add server db_servers/backend_db1_8310 172.17.13.2:8310" | nc db-proxy 9999
Result:
New server registered.
In the browser you will see that this server was added with the Status 'MAINT'.
Next, we enable the server by changing the state to 'ready':
echo "set server db_servers/backend_db1_8310 state ready" | nc db-proxy 9999
You can also see these changes in the Docker-Compose log:
...
client_1 | sending: hello 61
db-proxy_1 | 172.17.15.2:38285 [13/Aug/2024:10:05:54.779] db db_servers/<NOSRV> -1/-1/0 0 SC 2/1/0/0/0 0/0
db-proxy_1 | Connect from 172.17.13.1:54778 to 172.17.13.4:8360 (stats/HTTP)
db-proxy_1 | Server db_servers/backend_db1_8310 is UP/READY (leaving forced maintenance).
db-proxy_1 | [WARNING] (8) : Server db_servers/backend_db1_8310 is UP/READY (leaving forced maintenance).
client_1 | sending: hello 62
backend-db1_1 | request: hello 62
backend-db1_1 | response: backend-db1 at 8310: hello 62
db-proxy_1 | 172.17.15.2:34601 [13/Aug/2024:10:05:55.781] db db_servers/backend_db1_8310 1/0/0 30 -- 2/1/0/0/0 0/0
client_1 | backend-db1 at 8310: hello 62
client_1 | sending: hello 63
backend-db1_1 | request: hello 63
backend-db1_1 | response: backend-db1 at 8310: hello 63
client_1 | backend-db1 at 8310: hello 63
db-proxy_1 | 172.17.15.2:42305 [13/Aug/2024:10:05:56.784] db db_servers/backend_db1_8310 1/0/0 30 -- 2/1/0/0/0 0/0
client_1 | sending: hello 64
...
After changing the server state to 'ready', we see that backend-db1 is now responding to 'client' requests, and the 'client' receives the data from backend-db1. Great!
Again check the servers state:
echo "show servers state" | nc db-proxy 9999
Result, showing that our server was added:
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord srv_use_ssl srv_check_port srv_check_addr srv_agent_addr srv_agent_port
3 db_servers 1 backend_db1_8310 172.17.13.2 2 0 1 1 10 1 0 0 0 0 0 0 - 8310 - 0 0 - - 0
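For use in a program, this output can be turned into a list of dictionaries, one per server, keyed by the column names of the header line. The sketch below assumes the layout shown above (a version line, a header line starting with '#', then one line per server). The function name parse_servers_state is made up for this example.

```python
from typing import Dict, List


def parse_servers_state(output: str) -> List[Dict[str, str]]:
    """Parse the text returned by 'show servers state' into one dict per server."""
    columns: List[str] = []
    servers: List[Dict[str, str]] = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # header line: the column names, separated by spaces
            columns = line.lstrip("#").split()
            continue
        if not columns:
            # lines before the header, such as the version line "1"
            continue
        servers.append(dict(zip(columns, line.split())))
    return servers


if __name__ == "__main__":
    sample = (
        "1\n"
        "# be_id be_name srv_id srv_name srv_addr\n"
        "3 db_servers 1 backend_db1_8310 172.17.13.2\n"
    )
    for server in parse_servers_state(sample):
        print(server["srv_name"], server["srv_addr"])
```

All values stay strings, so compare srv_port with '8310', not with the integer 8310.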
Now let's remove this server, by first putting the server in maintenance state:
echo "set server db_servers/backend_db1_8310 state maint" | nc db-proxy 9999
And next, remove the server:
echo "del server db_servers/backend_db1_8310" | nc db-proxy 9999
You can see the changes in the logs and the browser.
Now that we know how the commands work, we can write a Shell script, Bash script, or a Python program to send commands to the API.
Using Python to access the HAProxy Runtime API
This is a blog about Python but I am not going into details of a Python program to do the above.
I only show the function you can use to access the API:
import socket
import typing as _t

BUFFER_SIZE = 4096


class HAProxyRuntimeAPIUtils:
    def __init__(
        self,
        host: _t.Optional[str] = 'db-proxy',
        port: _t.Optional[int] = 9999,
    ):
        self.host = host
        self.port = port

    def send(self, send_data: str) -> _t.Optional[str]:
        # a command must end with a newline
        send_bytes = bytes(send_data + '\n', 'utf-8')
        chunks = []
        # the 'with' block closes the socket, also when an error occurs
        with socket.create_connection((self.host, self.port)) as client_socket:
            # sendall() keeps sending until all bytes are written
            client_socket.sendall(send_bytes)
            # read until HAProxy closes the connection (it does so after the
            # response in the default, non-interactive mode), because a
            # single recv() can return only part of a long response
            while True:
                chunk = client_socket.recv(BUFFER_SIZE)
                if not chunk:
                    break
                chunks.append(chunk)
        return b''.join(chunks).decode('utf-8').strip()
...
And the program should do something like this:
# set the selected_server
# keep repeating as db-proxy may restart
while True:
    try:
        # get all servers for the backend
        ...
        # delete all servers that are not the selected_server
        ...
        # if not present, add the selected_server, with the proper addr, and set to ready
        ...
    except OSError:
        # db-proxy is not reachable (e.g. it is restarting), try again later
        ...
If the IP address of the backend servers can change, then we also check for changes here.
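The decision part of this loop can be sketched as a pure function that takes the current servers and returns the commands to send. This is only an illustration under a few assumptions: plan_commands is a made-up name, the servers are passed as dictionaries keyed by the column names of 'show servers state' (be_name, srv_name, srv_addr, srv_port), and sending 'state ready' to a server that is already ready is assumed to be harmless.

```python
from typing import Dict, List


def plan_commands(
    servers: List[Dict[str, str]],
    backend: str,
    selected_name: str,
    selected_addr: str,
    selected_port: int,
) -> List[str]:
    """Return the commands that leave only the selected server in the backend."""
    commands: List[str] = []
    selected_present = False
    for server in servers:
        if server.get("be_name") != backend:
            # ignore servers of other backends
            continue
        name = server["srv_name"]
        same_address = (
            server["srv_addr"] == selected_addr
            and server["srv_port"] == str(selected_port)
        )
        if name == selected_name and same_address:
            selected_present = True
            continue
        # not the selected server, or its address changed: remove it
        commands.append(f"set server {backend}/{name} state maint")
        commands.append(f"del server {backend}/{name}")
    target = f"{backend}/{selected_name}"
    if not selected_present:
        commands.append(f"add server {target} {selected_addr}:{selected_port}")
    # also sent when the server is already present, in case an earlier run
    # stopped between 'add server' and 'set server ... state ready'
    commands.append(f"set server {target} state ready")
    return commands


if __name__ == "__main__":
    # empty backend, as right after a db-proxy restart
    for command in plan_commands([], "db_servers", "backend_db1_8310", "172.17.13.2", 8310):
        print(command)
```

Keeping the decision separate from the socket code makes it easy to test without a running HAProxy. A server whose IP address changed is removed and added again under the same name.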
Summary
We used HAProxy to create a switch for our databases. HAProxy is a very complex piece of software and, although the HAProxy documentation is extensive, it is also unclear at times. I believe this has to do with the continuous development of HAProxy. Using the HAProxy Runtime API is easy, but the information that can be retrieved about the backend servers is limited.
But in the end, HAProxy just works and should be part of every system developer's toolbox.
Links / credits
Dynamic DNS Resolution with HAProxy and Docker
https://stackoverflow.com/questions/41152408/dynamic-dns-resolution-with-haproxy-and-docker
HAProxy
https://en.wikipedia.org/wiki/HAProxy
HAProxy Runtime API
https://www.haproxy.com/documentation/haproxy-runtime-api/
venilnoronha/tcp-echo-server
https://hub.docker.com/r/venilnoronha/tcp-echo-server