Non-Terminating Namespace Removal

September 3, 2020
kubernetes devops sre

Namespaces are a core component of the Kubernetes landscape, often used as a base level of resource isolation. Because a namespace contains many other resources, the shutdown behaviour associated with terminating one is complex, and namespaces can often get “stuck” in a Terminating state. I ran into this recently with a variety of namespaces in a DigitalOcean Kubernetes cluster.

Much advice on the internet recommends editing the namespace to remove any finalizers on the object. This may work in many distributions, but it was not an option for me. DigitalOcean appears to have protections against this (likely in the form of a mutating admission webhook) which dynamically added the finalizer back. Because of this, I had to track down which finalizer was failing and determine how to correct it.
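For completeness, the commonly recommended removal looks something like the snippet below: strip `spec.finalizers` from the Namespace object and write the result back through the `/finalize` subresource. (As noted, it did not stick in my case, since the finalizer was re-added.) The namespace name and the inline JSON stub are hypothetical; on a real cluster you would pipe `kubectl get ns <name> -o json` into the same jq filter.

```shell
# The usual cluster-side pipeline is:
#   kubectl get namespace "$NS" -o json \
#     | jq '.spec.finalizers = []' \
#     | kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f -
# The jq step is the whole trick; demonstrated here on a stub object:
echo '{"kind":"Namespace","spec":{"finalizers":["kubernetes"]}}' \
  | jq -c '.spec.finalizers = []'
# → {"kind":"Namespace","spec":{"finalizers":[]}}
```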

Note: A finalizer is a marker on a resource that tells a controller to perform cleanup, deletion of related resources, or other actions before the resource itself can be removed.

As a part of debugging this, your first step should be looking at the current state of the namespace. Although I’m sure there is a better way, I used kubectl edit ns <name>. This quickly confirmed that the namespace termination was stuck on the finalizer, and the error showed that the finalizer was failing to list custom metrics. At the time this felt unrelated, but it becomes clear when you think about what must be cleaned up when a namespace is deleted.
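For illustration, a namespace stuck this way looks roughly like the following in kubectl edit (or kubectl get ns <name> -o yaml). The namespace name is hypothetical and the condition message is illustrative rather than copied verbatim; exact condition types vary by Kubernetes version.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: stuck-namespace          # hypothetical
spec:
  finalizers:
    - kubernetes                 # the finalizer that kept being re-added
status:
  phase: Terminating
  conditions:
    - type: NamespaceDeletionDiscoveryFailure
      status: "True"
      reason: DiscoveryFailed
      message: 'Discovery failed for some groups, 1 failing:
        unable to retrieve the complete list of server APIs:
        custom.metrics.k8s.io/v1beta1: the server is currently
        unable to handle the request'
```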

When a namespace is removed, the finalizer must ensure that all resources that it contains are removed. While a namespace still has resources within it, it is in the Terminating state. You can check what resources are in a namespace with this snippet:

kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n <namespace>

Note that this is much more encompassing than kubectl get all -n <namespace>.

This is, effectively, the same thing that the finalizer does: it must fetch every resource type. If any api-resource cannot be listed, the finalizer cannot continue. This is what was happening in my case.
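A quick way to find the failing API group is to run kubectl get apiservices and look for entries whose AVAILABLE column is not True; an unavailable aggregated APIService is exactly what breaks the finalizer’s discovery step. The output below is a hand-written sample (the service name matches a typical prometheus-adapter install), used here just to show the filter:

```shell
# On a live cluster: kubectl get apiservices | grep False
# Sample output, filtered the same way:
printf '%s\n' \
  'NAME                            SERVICE                         AVAILABLE' \
  'v1.apps                         Local                           True' \
  'v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   False (FailedDiscovery)' \
  | grep False
# → v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   False (FailedDiscovery)
```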

This call was failing because custom metrics could not be queried. My custom metrics were being provided by the prometheus-adapter, which luckily had a fairly clear error in its logs:

unable to update list of all metrics: unable to fetch metrics for query "{namespace!=\"\",__name__!~\"^container_.*\"}": Get http://prometheus-server.monitoring:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C__name__%21~%22%5Econtainer_.%2A%22%7D&start=1599086760.41: dial tcp i/o timeout

The call to Prometheus was timing out, leading me to check on Prometheus itself, where it was immediately clear from the logs that it was in a rough state. I restarted Prometheus, and the namespace terminated shortly after.
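The fix itself was unremarkable. Assuming a Deployment named prometheus-server in the monitoring namespace (the names come from my setup, as seen in the log above; adjust for yours), it amounts to something like:

```shell
# Restart Prometheus, then watch the stuck namespace disappear.
kubectl -n monitoring rollout restart deployment prometheus-server
kubectl get ns stuck-namespace --watch   # hypothetical namespace name
```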

Our chain of failures here was:

  • Terminating Namespace, to
  • Kubernetes Finalizer, to
  • Listing api-resources, to
  • Listing Custom Metrics, to
  • Prometheus Custom Metric Adapter, to
  • Prometheus

In the end, a slow-responding Prometheus was preventing an unrelated namespace from terminating. As Kubernetes installations get more complex, I expect that we will start seeing more cross-cutting issues like this, which is why careful tracing and debugging skills are important. As the system gets more complex, a quick Google search is unlikely to have the exact answer needed. Before this incident, I would not have suspected that a failing custom metrics provider could prevent resource deletion.