Zero downtime deployments with zero k8s

My production setup is still a simple VM, but I wanted zero-downtime deployments anyway. This is how I made that work.

Don’t worry, despite the snarky title this won’t be an anti-k8s rant. In fact, I like k8s and run a small cluster at home, but for my production server I’m still on an old-fashioned VM.

My setup is really simple:

  • PostgreSQL on the host
  • A Spring Boot application (Adara) running in a container, with --net=host, listening on a port on localhost.
  • Apache HTTP with mod_proxy, mostly used for SSL termination.

I realise this setup is thoroughly old-school, but it’s also one I’m deeply familiar with, and it simply works.

Now, why am I even bothering to run my app in a Docker container, when I could just as easily have run it from systemd? Well, mostly to make deployments and rollbacks simpler, but also… because I wanted zero-downtime deployments!

But how can you do zero-downtime deployments in a setup like this? Let me show you.

The Makefile

Make is a beautifully versatile tool. Most of my application is built with Maven, but I use Make to easily script around it. Make is the least opinionated build tool I know: it will pretty much let you do anything you want, as long as the commands it runs use proper Unix exit codes.

# -q makes docker build print only the ID of the image it built
DOCKER_CMD = docker build -q -t adara:$(GIT_SHA) -t adara:latest adara-backend
GIT_SHA = $(shell git rev-parse --short HEAD)
SERVER := adara.staging

# Note: IMAGE_SHA and the build and git-tag targets are not part of this excerpt.

# These targets don't produce a file with the same name
.PHONY: deploy-server docker push run-deployment

# ⑦
deploy-server: docker push run-deployment git-tag

# ⑥
docker: docker-images/docker-image-$(GIT_SHA).tar.gz

# ③
docker-images/docker-image-$(GIT_SHA).tar.gz: docker-images/docker-image-$(GIT_SHA).sha
	docker save adara:$(GIT_SHA) | gzip > docker-images/docker-image-$(GIT_SHA).tar.gz

# ②
docker-images/docker-image-$(GIT_SHA).sha: docker-images build
	$(DOCKER_CMD) > docker-images/docker-image-$(GIT_SHA).sha

# ①
docker-images:
	mkdir -p docker-images

# ④
push: docker
	ssh $(SERVER) mkdir -p adara-deployment
	scp docker-images/docker-image-$(GIT_SHA).tar.gz $(SERVER):adara-deployment/
	ssh $(SERVER) "cat adara-deployment/docker-image-$(GIT_SHA).tar.gz | gunzip | docker load"

# ⑤
run-deployment:
	scp adara-deployment/server/deploy.sh "$(SERVER):adara-deployment/deploy.sh"
	rsync -av target/deploy/ $(SERVER):adara
	ssh $(SERVER) 'docker tag $(IMAGE_SHA) adara:$(GIT_SHA)'
	ssh $(SERVER) 'docker tag $(IMAGE_SHA) adara:latest'
	ssh -t $(SERVER) 'adara-deployment/deploy.sh $(GIT_SHA)'

If you’re not familiar with Makefiles, this can look a bit daunting, so let me go through it step by step. Note that Make builds a dependency tree, and then starts at the lowest levels.

So, if A depends on B, and B depends on C, first C will be built, then B, then A. To make this a little clearer, I’ve labeled the steps in the order in which they’re executed.

  1. This simply creates the docker-images folder where we’ll store the image before sending it to the server.
  2. This builds the image (it depends on the build step which runs the Maven build).
  3. Then, we use the docker save command to get a tar file of the image. We run that through gzip for compression.
  4. This target has several steps. We use ssh to create a folder on the server, and then copy the tar.gz file we created into it. Finally, we use docker load to load the image into the Docker daemon on the server, so it can be used to run containers.
  5. This is where the actual deployment happens. We copy a shell script to the server, and then tag the Docker image we just loaded, so it can be referred to by Git SHA. Finally, we run the shell script we just uploaded.
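
If the dependency ordering is the unfamiliar part, it can help to see it in isolation. The following is a minimal sketch, not part of my deployment: it writes a throwaway Makefile with three made-up targets (a, b and c, matching the example above) and asks Make for a. It assumes make is installed on the machine where you try it.

```shell
#!/bin/bash

# Write a throwaway Makefile. The "target: prerequisite ; recipe" form keeps
# the recipe on the same line, so no tab characters are needed here.
makefile=$(mktemp)
cat > "$makefile" <<'EOF'
.PHONY: a b c
a: b ; @echo "building a"
b: c ; @echo "building b"
c: ; @echo "building c"
EOF

# Ask for a; Make walks down to c first and then works its way back up.
build_order=$(make -f "$makefile" a)
echo "$build_order"

rm -f "$makefile"
```

The lines come out as c, b, a, which is the same bottom-up order the numbered labels in the real Makefile follow.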

The actual deployment

So, most of the previous was just getting the new version of our software onto the server, but we still need to run it.

To do that, these things need to happen:

  1. Find a free port on localhost for the new container to listen on
  2. Start the new container on that port
  3. Wait for the new container to become healthy
  4. Switch the Apache mod_proxy config to start sending requests to the new port
  5. Bring down the old container

This is all achieved by this script:

#!/bin/bash

# ①
function find_port() {
  local base_port=8080
  local increment=1
  local port=$base_port

  # netstat prints the socket state in uppercase (LISTEN). Matching ":<port> "
  # keeps 8080 from also matching 18080 or 80801.
  while netstat -taln | grep LISTEN | grep -q ":$port "; do
      port=$((port + increment))
  done

  echo "$port"
}

# ②
kill_old_pending() {
  # The format field is case-sensitive: it has to be .ID, not .id
  old_pending=$(docker container ls --all --filter=name=adara-pending --format "{{.ID}}")

  if [ -n "$old_pending" ]; then
    # kill any old pending containers
    echo "killing old pending container $old_pending"
    docker rm -f $old_pending
  fi
}

function start_adara() {

  # ③
  old_container=$(docker container ls --all --filter=name=adara-backend --format "{{.ID}}")

  # ④
  port="$(find_port)"

  # SERVER_PORT is the environment variable form of Spring Boot's server.port
  container_id=$(docker run --net=host --name=adara-pending --restart unless-stopped -v /opt/adara:/opt/adara -v /var/www:/var/www -d -e SERVER_PORT=$port adara:$version)
  echo "started container $container_id on port $port"

  # ⑤
  echo "waiting for container to come up..."
  if ! curl --silent --retry-connrefused --retry 45 --retry-delay 1 --fail http://localhost:$port/actuator/health > /dev/null; then
    echo "container failed to start"
    exit 1
  fi

  # ⑥
  echo "retrieving status"
  status=$(curl --silent http://localhost:$port/actuator/health | jq -r .status)

  echo "got status: $status"
  # The actuator reports a healthy application as UP (uppercase)
  if [ "$status" != "UP" ]; then
    echo "unexpected status: $status"
    exit 1
  fi

  # ⑦
  echo "updating apache config"
  for template in /etc/apache2/sites-available/*.conf.template
  do
    # skip the unexpanded pattern if there are no template files
    [ -e "$template" ] || continue
    # foo.conf.template -> foo.conf
    target="${template%.template}"
    echo "writing file $target"
    # The variable names must match the ones used in the templates
    sudo sh -c "ADARA_PORT=$port APACHE_LOG_DIR=/var/log/apache2 envsubst < '$template' > '$target'"
  done

  sudo systemctl reload apache2

  # ⑧
  # On the very first deployment there is no old container to stop
  if [ -n "$old_container" ]; then
    echo "stopping old container..."
    docker stop $old_container
    docker rm $old_container
  fi

  # ⑨
  docker container rename $container_id adara-backend

  echo "deployment done"
}

version=$1

if [ -z "$version" ]; then
  version="latest"
fi

kill_old_pending
start_adara

  1. This is a utility function, which uses netstat to find a free port, starting at 8080.
  2. If a previous deployment failed for some reason, we might have an old container called adara-pending still running. If that’s the case, kill it.
  3. Find the ID of the currently running container, and store it for later.
  4. Find a free port, and start a new container on that port, under the temporary name adara-pending.
  5. Run curl with the --retry-connrefused --retry 45 --retry-delay 1 options to poll the actuator health endpoint. This makes the script wait until the actuator is available and responding.
  6. Grab the actual response, and use jq to get the status field. If the status isn’t ‘UP’ by that point, assume the deployment failed and exit with an error.
  7. Update the Apache config files to point to the new port.
  8. Bring down the old container.
  9. Rename the new container from adara-pending to adara-backend
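
One thing to watch out for in step 1: a plain grep for the port number also matches ports that merely contain it, so a listener on 18080 can make 8080 look taken. The following is a minimal sketch of a stricter check, not the function from my script. The names port_in_use and first_free_port and the sample netstat output are made up for illustration; it only assumes the usual netstat -taln column layout, with the local address in the fourth column and the state in the last.

```shell
#!/bin/bash

# Succeeds if the netstat-style output on stdin shows a listener on port $1.
# Comparing the end of the local-address column avoids the substring problem.
port_in_use() {
  awk -v port="$1" '
    $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}

# Prints the first port >= $1 that has no listener in the snapshot $2.
first_free_port() {
  local port=$1 snapshot=$2
  while printf '%s\n' "$snapshot" | port_in_use "$port"; do
    port=$((port + 1))
  done
  echo "$port"
}

# Made-up sample data, so the sketch runs without touching the network stack
sample='tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
tcp6       0      0 :::8081                 :::*                    LISTEN
tcp        0      0 127.0.0.1:18082         0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:8083           10.0.0.9:51234          ESTABLISHED'

first_free_port 8080 "$sample"   # prints 8082
```

On a real server you would pass a live snapshot instead, for example first_free_port 8080 "$(netstat -taln)".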

With that, the deployment is successful, and we’re running the new container.
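
Steps 5 and 6 lean on curl’s built-in retry options. If you ever need to wait on something curl can’t check for you, the same idea is easy to write by hand. The following is a sketch under a few assumptions: wait_until, health_is_up and flaky are hypothetical names, and flaky only stands in for a slowly starting service so that the example runs without a container.

```shell
#!/bin/bash

# Runs a command until it succeeds, at most $1 times, sleeping $2 seconds
# after each failed attempt. Returns 0 on the first success, 1 otherwise.
wait_until() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Hypothetical check that combines steps 5 and 6: it only succeeds when the
# endpoint answers successfully and reports status UP. jq -e turns a false
# result into a non-zero exit code.
health_is_up() {
  curl --silent --fail "http://localhost:$1/actuator/health" \
    | jq -e '.status == "UP"' > /dev/null
}

# Stand-in for a slowly starting service: fails twice, then succeeds.
calls=0
flaky() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

if wait_until 5 0 flaky; then
  echo "healthy after $calls attempts"   # prints "healthy after 3 attempts"
fi
```

In a deployment script the call would look something like wait_until 45 1 health_is_up "$port", which gives the new container roughly 45 seconds to report UP.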

Apache config

Now, step 7 needs a bit more explanation. It uses envsubst to do simple templating. For each of my Apache configuration files, I have a .template file, which is just the config file, but instead of the concrete port, it references the ADARA_PORT environment variable. If we run this through envsubst, those values will be substituted, resulting in a valid config file.

<IfModule mod_ssl.c>
<VirtualHost *:443>
  ServerName foobar.com

  ServerAdmin info@foobar.com
  CustomLog ${APACHE_LOG_DIR}/foobar.com.log combined

  SSLCertificateFile /etc/letsencrypt/live/foobar.com/fullchain.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/foobar.com/privkey.pem

  Protocols h2 http/1.1

  ProxyPreserveHost on
  ProxyPass / http://localhost:$ADARA_PORT/
  ProxyPassReverse / http://localhost:$ADARA_PORT/

</VirtualHost>
</IfModule>

Some thoughts

At this point, you may be thinking: “Why the hell don’t you just run k8s?”, and that’s a very valid question. At the moment, my needs are simply too small to justify running a cluster.

And yes, this took some manual work on my part, but the upside is that I know exactly how this solution works. It’s simple, predictable and straightforward.

Plus: I learned a bunch of new things, and isn’t that what it’s ultimately all about?

This post is licensed under CC BY 4.0 by the author.