Easy "Maintenance Mode" in Kubernetes

November 11, 2017
ops kubernetes

Despite our best efforts, systems sometimes require downtime. The systems I’ve seen build this capability into different layers of the stack, mostly into the application itself or into the web server (such as nginx). Many of these solutions still require the application to be running in order to serve an appropriate maintenance page. At Cratejoy, this requirement proved problematic.

When we first migrated to Kubernetes, we controlled our “Maintenance Mode” using a ConfigMap that was mounted into our nginx containers. nginx checked the mounted value, and if “Maintenance Mode” was active, it returned our maintenance page directly. Because of the way we set up our Pods (an nginx container proxying to an application container), the application still had to be running for the maintenance page to be served properly. This also required us to keep an ever-increasing number of maintenance configurations, and to duplicate our maintenance pages across services.

Introducing Service-switching

To avoid this, we decided to create a separate ‘Maintenance’ service and use native Kubernetes capabilities to serve it when necessary. This isolated service contains only an nginx container, with our generic maintenance page preloaded. For any given service, we can instantly activate “Maintenance Mode” by changing the selector in the corresponding Service definition so that it targets the Pods of our “maintenance” service instead of the application's Pods.
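As a rough sketch, the maintenance workload might look something like the Deployment below. The image name `example/maintenance-nginx` is hypothetical (it stands in for an nginx image with the maintenance page baked in), and the replica count is an assumption. The `app: maintenance` label is the part that matters, since that is what a Service selector matches on.

```yaml
# Hypothetical maintenance Deployment: a single nginx container serving a static page.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: maintenance
  labels:
    app: maintenance
spec:
  replicas: 2                  # assumption: more than one replica for availability
  selector:
    matchLabels:
      app: maintenance
  template:
    metadata:
      labels:
        app: maintenance       # the label a Service selector will target
    spec:
      containers:
      - name: nginx
        image: example/maintenance-nginx:1.0   # hypothetical image with the page preloaded
        ports:
        - containerPort: 80
```

One thing to keep in mind: a Service selector only matches Pods in its own namespace, so these Pods need to run in the same namespace as the Service being switched.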

For example, assuming we have a simple service:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: helloworld
  name: helloworld
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: helloworld

We can switch it to maintenance by editing it to the following:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: helloworld
  name: helloworld
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: maintenance  # <--- This is what we changed

We’re now free to completely shut down our application, and still successfully serve our maintenance status page.
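If editing the full manifest by hand feels error-prone, the same switch could be expressed as a small patch file. This is only a sketch: the file name `maintenance-patch.yaml` is hypothetical, and it assumes the maintenance Pods carry the `app: maintenance` label and listen on port 80.

```yaml
# maintenance-patch.yaml (hypothetical file name)
# Apply with something like:
#   kubectl patch service helloworld --patch-file maintenance-patch.yaml
# Only the selector value changes; ports and metadata are left untouched.
spec:
  selector:
    app: maintenance
```

To leave maintenance mode, apply the same patch with the selector set back to `app: helloworld`.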

Customizing Maintenance for a Service

You probably noticed that this approach serves the same maintenance page for all of our services. This may not always be desirable, but fortunately it is easy to change.

If we want to serve a custom maintenance page for an application, we can:

  1. Clone our original Maintenance Service. Note that we will need to customize the selector as well.
  2. Create a ConfigMap containing the custom page that we wish to serve, and mount it in place of our default page.
  3. Direct our traffic to this custom maintenance service, instead of the default.
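The three steps above could be sketched roughly as follows. All of the names here (`helloworld-maintenance`, `helloworld-maintenance-page`) are hypothetical, and the sketch assumes the stock `nginx` image, which serves static files from `/usr/share/nginx/html`.

```yaml
# Step 2: the custom page, stored in a ConfigMap (hypothetical name and content).
apiVersion: v1
kind: ConfigMap
metadata:
  name: helloworld-maintenance-page
data:
  index.html: |
    <html>
      <body>
        <h1>helloworld is down for maintenance</h1>
      </body>
    </html>
---
# Step 1: a clone of the maintenance workload with its own label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-maintenance
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld-maintenance
  template:
    metadata:
      labels:
        app: helloworld-maintenance
    spec:
      containers:
      - name: nginx
        image: nginx:stable
        ports:
        - containerPort: 80
        volumeMounts:
        - name: page
          mountPath: /usr/share/nginx/html   # replaces the default page directory
          readOnly: true
      volumes:
      - name: page
        configMap:
          name: helloworld-maintenance-page
---
# Step 3: point the application's Service at the custom maintenance Pods.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: helloworld
  name: helloworld
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: helloworld-maintenance
```

Mounting the page from a ConfigMap means the custom variant can reuse a stock image, so only the page content differs between services.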

Hopefully this works out for you, or was at least helpful. If you have any questions, don’t hesitate to shoot me an email, or follow me on twitter @nrmitchi.