GitLab fails to pull image from Docker repository
Today, when running CI for one of my projects, I got a pipeline error on my master branch. At first I was a bit nervous, because I'd already deployed these changes. But when I looked into the error itself, I was relieved: it was not the tests that failed, but the download of the Docker image.
The error I got is the following:
Running with gitlab-runner 16.3.0~beta.108.g2b6048b4 (2b6048b4)
on green-5.saas-linux-small-amd64.runners-manager.gitlab.com/default xS6Vzpvo, system ID: s_6b1e4f06fcfd
feature flags: FF_USE_IMPROVED_URL_MASKING:true, FF_RESOLVE_FULL_TLS_CHAIN:false
Preparing the "docker+machine" executor
00:35
Using Docker executor with image php:8.1 ...
Starting service percona:ps-5.7.31 ...
Pulling docker image percona:ps-5.7.31 ...
WARNING: Failed to pull image with policy "always": error pulling image configuration: Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/86/864a2ef94e17e2cfc4a3f14eddd58281dfb36bae929923c3aff4e38fe253b8e8/data?verify=1694594427-4vOhdSXfTT5o8grnJsREe8TF3zk%3D: dial tcp 104.16.103.207:443: i/o timeout (manager.go:237:30s)
ERROR: Job failed: failed to pull image "percona:ps-5.7.31" with specified policies [always]: error pulling image configuration: Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/86/864a2ef94e17e2cfc4a3f14eddd58281dfb36bae929923c3aff4e38fe253b8e8/data?verify=1694594427-4vOhdSXfTT5o8grnJsREe8TF3zk%3D: dial tcp 104.16.103.207:443: i/o timeout (manager.go:237:30s)
The essential part of the error is:
ERROR: Job failed: failed to pull image "percona:ps-5.7.31" with specified policies [always]: error pulling image configuration: Get...
What is the issue?
The issue is that the Docker pull failed during the CI run because Docker could not fetch the image data from the registry. Either the GitLab runner failed to reach the registry, or the Docker registry itself was down. It is also possible that Cloudflare failed, because Docker serves the image data through Cloudflare (note the production.cloudflare.docker.com host in the error). It is rare for services this big to fail, but it happens sometimes.
And because my GitLab runner's pull_policy is the default one, which is always, GitLab tries to pull the image each time a job starts, and this time the pull failed with dial tcp 104.16.103.207:443: i/o timeout.
How to fix this issue?
Try to restart your job.
These outages, especially in big services like Docker, GitLab, and Cloudflare, don't last long; they are usually fixed quickly.
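Instead of restarting by hand, the job can be told to retry itself. The fragment below is a minimal sketch using the retry keyword of .gitlab-ci.yml. The script line is a hypothetical placeholder, and the choice of runner_system_failure is an assumption: I have not confirmed which failure category GitLab assigns to a failed image pull, so drop the when key if the job is not retried.

```yaml
Test:
  image:
    name: php:8.1
  # Hypothetical placeholder; use your real test command here.
  script: echo "run the tests here"
  retry:
    # Retry the job up to two more times (2 is the maximum GitLab accepts).
    max: 2
    # Assumption: the pull failure is reported as a runner system failure.
    # Remove "when" to retry on any kind of failure instead.
    when: runner_system_failure
```

Keeping when narrow means a genuinely failing test is not run three times before the pipeline goes red.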
You may want to change the default pull policy to if-not-present, which pulls the image only if it's not found in the runner's local cache.
A few important things:
- Changing the pull_policy is not available on shared GitLab runners.
- Caching is used even if you use always as pull_policy.
How to change the pull_policy?
To change the pull_policy, do the following:
job1:
  script: echo "A single pull policy."
  image:
    name: ruby:3.0
    pull_policy: if-not-present

Test:
  image:
    name: php:8.1
    pull_policy: if-not-present
  services:
    - name: percona:ps-5.7.31
      pull_policy: if-not-present
      alias: db
    - name: redis:3.2
      pull_policy: if-not-present
      alias: redis
As you can see, you can specify pull_policy for the job image and for each service.
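The pull_policy key also accepts a list of policies instead of a single value. The sketch below assumes the runner allows both policies and tries them in the order given, so a failed always pull would fall back to a locally cached image. I have not verified that behaviour on a shared runner, and the script line is a hypothetical placeholder.

```yaml
Test:
  image:
    name: php:8.1
    # Assumption: policies are attempted in order, so a failed "always"
    # pull falls back to the image already cached on the runner.
    pull_policy: [always, if-not-present]
  services:
    - name: percona:ps-5.7.31
      pull_policy: [always, if-not-present]
      alias: db
  # Hypothetical placeholder; use your real test command here.
  script: echo "run the tests here"
```

This keeps the fresh-image behaviour of always on normal days and only uses the cached copy when the registry cannot be reached.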
Note that if the pull policy you request is not allowed by the runner, you'll get an error like this:
ERROR: Job failed (system failure): the configured PullPolicies ([always]) are not allowed by AllowedPullPolicies ([never]).
For example, on my shared GitLab runner, I got the following error when I tried to change the pull policy:
ERROR: Preparation failed: failed to pull image 'percona:ps-5.7.31': pull_policy ([if-not-present]) defined in GitLab pipeline config is not one of the allowed_pull_policies ([always])
For more information, read the GitLab documentation about pull policy and the section "Set the if-not-present pull policy".