Troubleshooting an issue on a distributed system like Kubernetes can be challenging at times. There are just so many things that can go wrong with a distributed system. It can be even more challenging when a particular error has multiple reasons for occurring.
One such error is ImagePullBackOff. It typically shows up when the kubelet instructs the container runtime to pull an image and the pull from the container registry fails for one of several reasons.
This article provides an in-depth overview of the possible causes for your pod entering the ImagePullBackOff state while starting your container. More importantly, you’ll learn how to troubleshoot and solve this notorious error.
What does an ImagePullBackOff error mean?
The ImagePull part of the ImagePullBackOff error means that your Kubernetes container runtime was unable to pull the image from a private or public container registry. The BackOff part indicates that Kubernetes keeps retrying the pull with an increasing back-off delay. The delay grows with each attempt until it reaches the limit of five minutes.
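To make the retry schedule concrete, here is a minimal shell sketch of a capped exponential back-off. The 10-second starting delay and the doubling factor are assumptions based on the kubelet's usual defaults; only the five-minute cap comes from the text above. The function name backoff_schedule is made up for illustration.

```shell
# Prints the delay (in seconds) before each of the first N retries,
# assuming a 10-second starting delay that doubles up to a 300-second cap.
backoff_schedule() {
  attempts=$1
  delay=10   # assumed initial delay
  cap=300    # the five-minute limit
  i=0
  out=""
  while [ "$i" -lt "$attempts" ]; do
    out="$out${out:+ }$delay"
    delay=$((delay * 2))
    [ "$delay" -gt "$cap" ] && delay=$cap
    i=$((i + 1))
  done
  echo "$out"
}

backoff_schedule 7   # prints: 10 20 40 80 160 300 300
```

Under these assumptions, a pod that has been failing for a while is retried only once every five minutes, which is why a fix can take a few minutes to show up in the pod status.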
Saying that the container runtime (be it Docker, containerd, etc.) fails to pull the image from the registry is a broad statement, so let’s look at the possible causes for this issue.
Here are some of the possible causes behind your pod getting stuck in the ImagePullBackOff state:
- Image doesn’t exist.
- Image tag or name is incorrect.
- Image is private, and there is an authentication failure.
- Network issue.
- Registry name is incorrect.
- Container registry rate limits.
How can you troubleshoot ImagePullBackOff?
Let’s try to troubleshoot each of the possible causes in that bulleted list.
Image doesn’t exist, or the name is incorrect
In most cases, the error comes from either a typo in the image name or an image that was never pushed to the container registry, so the pod refers to an image that doesn’t exist. Let’s try to replicate this by creating a pod with a fake image name.
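One way to reproduce this, assuming the pod name myapp and the image myimage/myimage:latest that appear in the output below, is to write a minimal pod manifest and apply it to a test cluster. The file name myapp-pod.yaml is an arbitrary choice.

```shell
# Write a minimal pod manifest whose image does not exist in any registry.
cat > myapp-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: myimage/myimage:latest   # fake image, used only to trigger the error
EOF

# Then, against a test cluster:
#   kubectl apply -f myapp-pod.yaml
```

After the first failed pull, the pod status should move from ErrImagePull to ImagePullBackOff.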
As you can see, the pod is stuck in the ImagePullBackOff state because the image doesn’t exist and cannot be pulled.
$ kubectl get pod
NAME | READY | STATUS | RESTARTS | AGE |
---|---|---|---|---|
myapp | 0/1 | ImagePullBackOff | 0 | 4m59s |
To understand the root cause and find more details about this error, use the kubectl describe command. Its output is verbose, so we’ll show only the parts that are relevant to our discussion.
In the following output, the Message column under Events contains the actual error: the pull was denied because the repository does not exist or may require docker login. For this made-up image name, that confirms that the image doesn’t exist.
$ kubectl describe pod myapp
........
Events:
Type | Reason | Age | From | Message |
---|---|---|---|---|
---- | ------ | ---- | ---- | ------- |
Normal | Scheduled | 2m54s | default-scheduler | Successfully assigned default/myapp to minikube |
Normal | Pulling | 71s (x4 over 2m53s) | kubelet | Pulling image "myimage/myimage:latest" |
Warning | Failed | 67s (x4 over 2m49s) | kubelet | Failed to pull image "myimage/myimage:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for myimage/myimage, repository does not exist or may require 'docker login': denied: requested access to the resource is denied |
Warning | Failed | 67s (x4 over 2m49s) | kubelet | Error: ErrImagePull |
Warning | Failed | 54s (x6 over 2m48s) | kubelet | Error: ImagePullBackOff |
Normal | Backoff | 41s (x7 over 2m48s) | kubelet | Back-off pulling image "myimage/myimage:latest" |
Tag doesn’t exist
There could be cases where the image tag you’re trying to pull has been retired, or you entered the wrong tag name. In those cases, your pod will again get stuck in the ImagePullBackOff state, as seen in the following output.
We have deliberately entered the wrong tag name, lates instead of latest, to replicate this issue.
$ kubectl get pod
NAME | READY | STATUS | RESTARTS | AGE |
---|---|---|---|---|
nginx | 0/1 | ImagePullBackOff | 0 | 3m3s |
In the following output, the message indicates that the tag lates doesn’t exist for the image nginx, so the image pull is unsuccessful.
$ kubectl describe pod nginx
........
Events:
Type | Reason | Age | From | Message |
---|---|---|---|---|
---- | ------ | ---- | ---- | ------- |
Normal | Scheduled | 26s | default-scheduler | Successfully assigned default/nginx to minikube |
Normal | Backoff | 20s | kubelet | Back-off pulling image "nginx:lates" |
Warning | Failed | 20s | kubelet | Error: ImagePullBackOff |
Normal | Pulling | 6s (x2 over 25s) | kubelet | Pulling image "nginx:lates" |
Warning | Failed | 2s (x2 over 20s) | kubelet | Failed to pull image "nginx:lates": rpc error: code = Unknown desc = Error response from daemon: manifest for nginx:lates not found: manifest unknown: manifest unknown |
Warning | Failed | 2s (x2 over 20s) | kubelet | Error: ErrImagePull |
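Once the typo is spotted, the tag can be corrected without recreating the pod. The sketch below wraps kubectl set image in a small helper; the helper name fix_image_tag is hypothetical, and the container name passed to it is an assumption (pods created with kubectl run usually name the container after the pod).

```shell
# Replace the image of one container in an existing pod.
# Usage: fix_image_tag <pod> <container> <image>
fix_image_tag() {
  kubectl set image "pod/$1" "$2=$3"
}
```

For the pod above, the call would look like `fix_image_tag nginx nginx nginx:latest`. The kubelet picks up the corrected image on its next retry, so the status may take up to the current back-off delay to change.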
Private image registry and wrong credentials provided
Most enterprises use an internal private container registry instead of Docker Hub because they don’t want to expose their internal applications to anyone outside their organization. Even with Docker Hub or any other publicly accessible, password-protected registry, you must provide valid credentials to Kubernetes through a secret so that it can pull the image from the registry.
In the following example, we replicate this issue by spinning up a pod that uses an image from a private registry.
We have neither added a secret to Kubernetes nor referenced one in the pod definition. The pod again gets stuck in the ImagePullBackOff status, and the message confirms that access to pull the image from the registry is denied:
$ kubectl describe pod mypod
Events:
Type | Reason | Age | From | Message |
---|---|---|---|---|
---- | ------ | ---- | ---- | ------- |
Normal | Scheduled | 39s | default-scheduler | Successfully assigned default/mypod to minikube |
Normal | Pulling | 20s (x2 over 37s) | kubelet | Pulling image "docker.io/hiyou/image" |
Warning | Failed | 16s (x2 over 33s) | kubelet | Failed to pull image "docker.io/hiyou/image": rpc error: code = Unknown desc = Error response from daemon: pull access denied for hiyou/image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied |
Warning | Failed | 16s (x2 over 33s) | kubelet | Error: ErrImagePull |
Normal | Backoff | 3s (x2 over 32s) | kubelet | Back-off pulling image "docker.io/hiyou/image" |
Warning | Failed | 3s (x2 over 32s) | kubelet | Error: ImagePullBackOff |
To resolve this error, create a secret of type docker-registry with the kubectl create secret docker-registry command, supplying the registry server, username, and password.
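As a minimal sketch, the secret can be created with a small wrapper around kubectl create secret docker-registry. The helper name create_pull_secret and the secret name used in the usage note are hypothetical; the --docker-server, --docker-username, and --docker-password flags are the ones kubectl provides for this secret type.

```shell
# Create a docker-registry secret that the kubelet can use to pull private images.
# Usage: create_pull_secret <secret-name> <registry-server> <username> <password>
create_pull_secret() {
  kubectl create secret docker-registry "$1" \
    --docker-server="$2" \
    --docker-username="$3" \
    --docker-password="$4"
}
```

For Docker Hub, a call might look like `create_pull_secret regcred https://index.docker.io/v1/ "$REGISTRY_USER" "$REGISTRY_PASSWORD"`, where both variables are placeholders you set yourself. Reading the password from an environment variable keeps it out of your shell history.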
Then reference the secret in your pod definition through the imagePullSecrets field of the pod spec.
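A pod definition that references such a secret could look like the sketch below. It assumes a secret hypothetically named regcred already exists in the pod's namespace; the pod name and image are taken from the output above, and the file name mypod.yaml is arbitrary.

```shell
# Write a pod manifest that pulls a private image using an existing secret.
cat > mypod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: mypod
      image: docker.io/hiyou/image
  imagePullSecrets:
    - name: regcred   # hypothetical secret name; must exist in the pod's namespace
EOF

# Then, against your cluster:
#   kubectl apply -f mypod.yaml
```

The secret has to live in the same namespace as the pod; a secret in another namespace is not visible to it.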
Network issue
If there is a widespread network issue across the nodes of your Kubernetes cluster, the container runtime cannot reach the container registry to pull the image. Let’s try to replicate that scenario.
$ kubectl describe pod nginx
Events:
Type | Reason | Age | From | Message |
---|---|---|---|---|
---- | ------ | ---- | ---- | ------- |
Normal | Scheduled | 35s | default-scheduler | Successfully assigned default/mypod to minikube |
Normal | Pulling | 19s (x2 over 32s) | kubelet | Pulling image "nginx:latest" |
Warning | Failed | 19s (x2 over 32s) | kubelet | Failed to pull image "nginx:latest": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:latest": failed to resolve reference "docker.io/library/nginx:latest": failed to do request: Head https://registry-1.docker.io/v2/library/nginx/manifests/latest: dial tcp: lookup registry-1.docker.io on 192.168.64.1:53: server misbehaving |
Warning | Failed | 19s (x2 over 32s) | kubelet | Error: ErrImagePull |
Normal | Backoff | 5s (x2 over 32s) | kubelet | Back-off pulling image "nginx:latest" |
Warning | Failed | 5s (x2 over 32s) | kubelet | Error: ImagePullBackOff |
In the preceding output, the message shows that the DNS lookup for registry-1.docker.io failed (server misbehaving), which points to a network issue.
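To check connectivity yourself, a quick probe of the registry's /v2/ endpoint can help. The sketch below is an assumption-laden helper (the name registry_reachable is made up) and is only meaningful when run from a cluster node, for example inside a minikube ssh session, because your workstation may have a different network path. It treats both 200 and 401 as success, since an unauthenticated request to Docker Hub's registry is answered with 401.

```shell
# Returns 0 if the registry answers on its /v2/ endpoint, 1 otherwise.
# Usage: registry_reachable <registry-host>
registry_reachable() {
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "https://$1/v2/" 2>/dev/null) || return 1
  case "$code" in
    200|401) return 0 ;;   # the registry responded, with or without an auth challenge
    *)       return 1 ;;
  esac
}
```

A call such as `registry_reachable registry-1.docker.io && echo reachable` that prints nothing suggests a DNS, proxy, or firewall problem on that node rather than a problem with the image itself.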
Container registry rate limits
Most container registries enforce rate limits (i.e., a cap on the number of image pulls in a given period) to protect their infrastructure. For example, at the time of writing, Docker Hub limits anonymous users to 100 image pulls and authenticated free users to 200 image pulls per six hours. If you exceed your limit, further pulls are blocked, resulting in an ImagePullBackOff error.
To resolve this for Docker Hub, you would need to upgrade to a Pro or Team account. Many other popular container image registries, like GCR or ECR, impose similar limitations.
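Docker Hub reports the remaining quota in response headers, so you can check how close you are to the limit. The sketch below only parses such headers; the helper name pulls_remaining is hypothetical, and the header format (for example ratelimit-remaining: 76;w=21600) is taken from Docker's documentation as I understand it, so treat it as an assumption.

```shell
# Extracts the number of pulls left from HTTP response headers that contain
# a line such as "ratelimit-remaining: 76;w=21600".
# Usage: pulls_remaining "<raw header text>"
pulls_remaining() {
  printf '%s\n' "$1" | tr -d '\r' |
    sed -n 's/^[Rr]ate[Ll]imit-[Rr]emaining: *\([0-9][0-9]*\).*/\1/p'
}

# One way to obtain the headers (needs curl, jq, and network access):
#   TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
#   HEADERS=$(curl -s --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest)
#   pulls_remaining "$HEADERS"
```

A value at or near zero would explain pulls that fail only intermittently or only on busy nodes that share one outbound IP address.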
Final thoughts
In this article, you learned some possible reasons why a pod would get stuck in an ImagePullBackOff state. You worked through several examples to understand the error better and troubleshoot it with commands like kubectl describe. If you’re confident there is no typo in the image, registry, or tag name, then kubectl describe will reveal the chain of events that led to the failure. If you can pull the image with docker pull but your cluster can’t, that probably means there’s a network issue.
To streamline your troubleshooting process by building custom workflows and UIs that help monitor and debug issues, consider using Airplane. Airplane is the developer platform for building custom internal tools.
The basic building blocks of Airplane are Tasks, which are single or multi-step functions that anyone on your team can use. Airplane also offers engineering workflows for use cases that are engineering-centric, such as database backups and incident monitoring. If you're looking to build a dashboard to monitor errors, you can use Airplane Views, which allows you to build custom UIs within minutes.
To build your first engineering-centric dashboard, sign up for a free account or book a demo.