Loading And Pre-Loading
Loading and pre-loading support the air-gapped use case: you populate the registry in a connected environment and then move it into an air-gapped environment.
You can pre-load the cache two ways:
- As a startup task before running the service: bin/ociregistry serve --preload-images <file>. The server will load the image cache and then serve.
- By using the binary as a CLI: bin/ociregistry load --image-file <file>. The executable will load the cache and then exit back to the command prompt.
In both cases, you create a file with a list of image references. Example:
cat <<EOF >| imagelist
quay.io/jetstack/cert-manager-cainjector:v1.11.2
quay.io/jetstack/cert-manager-controller:v1.11.2
quay.io/jetstack/cert-manager-webhook:v1.11.2
registry.k8s.io/metrics-server/metrics-server:v0.6.2
registry.k8s.io/ingress-nginx/controller:v1.8.1
registry.k8s.io/pause:3.8
docker.io/kubernetesui/dashboard-api:v1.0.0
docker.io/kubernetesui/metrics-scraper:v1.0.9
docker.io/kubernetesui/dashboard-web:v1.0.0
EOF
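Before handing a list like this to the loader, it can help to sanity-check it. The following is a minimal Go sketch, not the server's own parser: the function name parseImageList is hypothetical, and it assumes only what the example above shows, namely one image reference per line. It trims whitespace, skips blank lines, and drops exact duplicates.

```go
package main

import (
	"fmt"
	"strings"
)

// parseImageList is a hypothetical helper (not part of ociregistry).
// It treats text as one image reference per line, trims surrounding
// whitespace, skips blank lines, and drops exact duplicates while
// preserving the original order.
func parseImageList(text string) []string {
	seen := make(map[string]bool)
	var refs []string
	for _, line := range strings.Split(text, "\n") {
		ref := strings.TrimSpace(line)
		if ref == "" || seen[ref] {
			continue
		}
		seen[ref] = true
		refs = append(refs, ref)
	}
	return refs
}

func main() {
	// Sample content in the same shape as the imagelist file above.
	list := `
registry.k8s.io/pause:3.8
docker.io/kubernetesui/dashboard-api:v1.0.0

registry.k8s.io/pause:3.8
`
	for _, ref := range parseImageList(list) {
		fmt.Println(ref)
	}
}
```

Run against the sample, this prints each distinct reference once, in file order.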
Because the image cache consists entirely of files and sub-directories under the image cache directory, you can tar that directory at any time, copy it elsewhere, and untar it. An Ociregistry server instance started with its image path pointing to the copied directory will then serve the same images.
Image Store
The image store is persisted to the file system. This includes blobs and manifests. Let's say you run the server with --image-path=/var/lib/ociregistry, which is the default. Then:
/var/lib/ociregistry
├── blobs
├── img
└── lts
- blobs is where the blobs are stored.
- img stores the non-latest-tagged image manifests.
- lts stores the latest-tagged image manifests. (See About "Latest" below.)
Everything is stored by digest. When the server starts, it loads everything into an in-memory representation. While the server is running, each new pull through it updates both the in-memory representation of the image store and the persistent state on the file system.
The software uses a data structure called a ManifestHolder to hold all the image metadata and the actual manifest bytes from the upstream registry. These are simply serialized to the file system as JSON. (So you can find and inspect them if needed for troubleshooting with grep, cat, and jq.)
A ManifestHolder looks like this:
type ManifestHolder struct {
    Type                 ManifestType
    Digest               string
    ImageUrl             string
    Bytes                []byte
    V1ociIndex           v1oci.Index
    V1ociManifest        v1oci.Manifest
    V2dockerManifestList v2docker.ManifestList
    V2dockerManifest     v2docker.Manifest
    Created              string
    Pulled               string
}
The Bytes field has the actual manifest bytes from the upstream. You can see the supported manifest types: V1ociIndex, V1ociManifest, V2dockerManifestList, and V2dockerManifest.
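For troubleshooting, a serialized ManifestHolder can also be inspected from Go. The sketch below is illustrative only and rests on several assumptions: that the files are written with Go's standard encoding/json using the field names shown above (the struct shows no JSON tags), that Bytes is therefore stored as a base64 string, and that the JSON form of Type is unknown, so it is kept as raw JSON. The holderSummary type, the decodeHolder function, and all sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// holderSummary is a hypothetical, trimmed-down view of a serialized
// ManifestHolder. Fields not listed here are ignored during decoding.
type holderSummary struct {
	Type     json.RawMessage // JSON form of ManifestType is not documented here, so keep it raw
	Digest   string
	ImageUrl string
	Bytes    []byte // encoding/json decodes a base64 string into []byte
	Created  string
	Pulled   string
}

// decodeHolder parses the JSON content of one manifest file.
func decodeHolder(data []byte) (holderSummary, error) {
	var h holderSummary
	err := json.Unmarshal(data, &h)
	return h, err
}

func main() {
	// Made-up sample; real files contain actual digests and manifest bytes.
	sample := []byte(`{
		"Type": 1,
		"Digest": "sha256:0000",
		"ImageUrl": "registry.example/foo:v1",
		"Bytes": "e30=",
		"Created": "",
		"Pulled": ""
	}`)

	h, err := decodeHolder(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("image: ", h.ImageUrl)
	fmt.Println("digest:", h.Digest)
	fmt.Println("manifest bytes:", string(h.Bytes))
}
```

In practice the sample literal would be replaced by the content of a file read from the img or lts directory, for example with os.ReadFile.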
Loading behavior
Loading is additive, meaning if you run the load command to load 100 images, then run it again to load 100 different images, your image cache will have 200 images. If you load 100 images, and then later load the same 100 images again, the server will detect during the second load that there is nothing to do. And of course, if you first load A, B, and C, and then later load C, D, and E, then the cache will hold A, B, C, D, and E.
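Additive loading behaves like a set union. The following Go sketch models the cache as a simple map keyed by image reference; this is a simplification for illustration (the real store is keyed by digest and persisted to disk), and the names loadImages and cachedRefs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// loadImages adds each reference to the cache unless it is already
// present, and returns how many entries were newly added.
// The map is a stand-in for the real image cache.
func loadImages(cache map[string]bool, refs []string) int {
	added := 0
	for _, ref := range refs {
		if cache[ref] {
			continue // already cached: nothing to do
		}
		cache[ref] = true
		added++
	}
	return added
}

// cachedRefs returns the cache contents in sorted order.
func cachedRefs(cache map[string]bool) []string {
	refs := make([]string, 0, len(cache))
	for ref := range cache {
		refs = append(refs, ref)
	}
	sort.Strings(refs)
	return refs
}

func main() {
	cache := make(map[string]bool)
	fmt.Println(loadImages(cache, []string{"A", "B", "C"})) // 3
	fmt.Println(loadImages(cache, []string{"C", "D", "E"})) // 2 (C is already cached)
	fmt.Println(loadImages(cache, []string{"C", "D", "E"})) // 0 (nothing to do)
	fmt.Println(cachedRefs(cache))                          // [A B C D E]
}
```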
About "Latest"
Internally, latest-tagged images are stored side-by-side with non-latest images and treated as separate manifests. This enables the server to support cases that occur in development environments where latest images are in a constant state of flux. Storing latest images this way works in tandem with the --always-pull-latest flag as follows:
| Action | --always-pull-latest | Result |
|---|---|---|
| Pull foo:latest | false (the default) | The image is pulled exactly once. All subsequent pulls return the same image regardless of what happens in the upstream. |
| Pull foo:latest | true | The image is pulled from the upstream on each pull from the pull-through server for each client. Each pull completely replaces the prior pull. In other words, for latest images the server is a stateless proxy. (This could consume a fair bit of network bandwidth.) |
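The decision in the table can be expressed as a small predicate. This Go sketch is an illustration of the documented behavior, not the server's actual code: the function names are hypothetical, a reference counts as latest only when it ends in an explicit :latest tag, and it assumes that non-latest images already in the cache are served from the cache regardless of the flag.

```go
package main

import (
	"fmt"
	"strings"
)

// isLatest reports whether the reference carries an explicit :latest tag.
// Untagged references are not considered here (an assumption of this sketch).
func isLatest(ref string) bool {
	return strings.HasSuffix(ref, ":latest")
}

// shouldPullUpstream is a hypothetical model of the table above.
// cached says whether the image is already in the local store;
// alwaysPullLatest mirrors the --always-pull-latest flag.
func shouldPullUpstream(ref string, cached, alwaysPullLatest bool) bool {
	if isLatest(ref) && alwaysPullLatest {
		return true // latest images are re-pulled on every request
	}
	return !cached // otherwise pull only on the first request
}

func main() {
	fmt.Println(shouldPullUpstream("foo:latest", false, false)) // true: first pull
	fmt.Println(shouldPullUpstream("foo:latest", true, false))  // false: served from cache
	fmt.Println(shouldPullUpstream("foo:latest", true, true))   // true: always re-pulled
	fmt.Println(shouldPullUpstream("foo:v1.0.0", true, true))   // false: flag affects latest only (assumed)
}
```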