I've been working on adding basic image support to my experimental container manager and, to my surprise, the task turned out to be more complex than I initially expected. I spent some time looking for ways to manage container images directly from my application code. There are plenty of tools out there (docker, containerd, podman, buildah, cri-o, etc.) providing image management capabilities. However, if you don't want to depend on an external daemon running in your system, or you don't feel like shelling out to a command-line tool from your code, the options are limited at best.
I've reviewed a bunch of these tools, focusing on the underlying means they use to deal with images, and at last I found two appealing libraries. The first one is the github.com/containers/image library "[...] aimed at working in various way with containers' images and container image registries". The second one is github.com/containers/storage "[...] which aims to provide methods for storing filesystem layers, container images, and containers". The libraries are meant to be used in conjunction and form a very powerful image management tandem. Unfortunately, I could not find much documentation for them, especially of the getting-started kind.
Without the docs, the only way for me to learn how to use the libraries was to analyze the code of their dependents (most prominently, buildah and cri-o). It took me a while to put together a working example that is capable of:
- pulling images from remote repositories;
- storing images locally;
- creating and mounting containers (i.e. writable instances of images).
In the rest of the article, I'll try to show how to use the libraries to perform these tasks and highlight the most interesting parts of this journey.
Disclaimer: This is by no means an attempt to fully or even partially document the libraries!
Prerequisites
The storage library is responsible not only for storing things on disk but also for mounting overlay filesystems. It implements various storage drivers (overlay, btrfs, vfs, etc.) and requires cgo for building. While the full list of system dependencies can be looked up here, for the set of use cases from the article only the following packages are needed (assuming pre-installed go and git):
# Debian 10
apt-get install pkg-config libgpgme-dev libdevmapper-dev
# CentOS 8 (with enabled EPEL Repository)
yum install libassuan-devel gpgme-devel device-mapper-devel
Creating storage
Both the storage and image libraries by default expect some configuration files to be present under the /etc/containers path. A reference set of config files can be obtained by installing buildah. However, most of the settings can be overridden in code:
package main

import "github.com/containers/storage"

func main() {
	// Reads /etc/containers/storage.conf (if present).
	// Arguments: rootless mode off, rootless UID unused.
	options, err := storage.DefaultStoreOptions(false, 0)
	if err != nil {
		panic(err)
	}

	// Any of the defaults can be overridden before creating the store:
	// options.RunRoot = "/path/to/root"
	// options.GraphRoot = "/path/to/graph/root"
	// options.GraphDriverName = "vfs" | "overlay" | etc
	// ...

	store, err := storage.GetStore(options)
	if err != nil {
		panic(err)
	}

	// Status reports key-value pairs describing the selected driver.
	status, err := store.Status()
	if err != nil {
		panic(err)
	}
	println(status)
}
The storage.DefaultStoreOptions() function tries to support different defaults for root and rootless modes. However, the rootless mode requires some extra hassle with user namespaces, and the set of supported drivers becomes pretty limited. Apparently, vfs is used out of the box, and it's not a true overlay filesystem. On systems with rootless overlay filesystem support (Ubuntu?) the behavior may vary. Using fuse-overlayfs is probably another alternative for the rootless mode. All the examples in the rest of the article require sudo privileges and mostly use the overlay driver.
The storage library uses the reexec trick to run some of its subtasks in separate processes. Thus, before going full-on with storage, we need to initialize it properly:
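package main

import "github.com/containers/storage/pkg/reexec"

func main() {
	// Init returns true when this process was started as a re-executed
	// subtask; in that case the subtask has already run and we must stop.
	if reexec.Init() {
		return
	}

	// your code here
}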
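To make the reexec trick less magical, here is a minimal standard-library sketch of the general idea: the program starts a copy of itself with a special argv[0], and the copy dispatches on that value before doing anything else. This is only an illustration and not the storage library's code; the registry, the helper names (register, initWith, respawn), and the "demo-subtask" name are all made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// handlers maps a fake argv[0] value to the subtask it should trigger.
// The registry and all names here are hypothetical, for illustration only.
var handlers = map[string]func(){}

// register associates a subtask entry point with a name.
func register(name string, fn func()) {
	handlers[name] = fn
}

// initWith runs the handler registered for argv0, if there is one, and
// reports whether a handler was found. It plays the role that
// reexec.Init() plays in the snippet above.
func initWith(argv0 string) bool {
	fn, ok := handlers[argv0]
	if !ok {
		return false
	}
	fn()
	return true
}

// respawn starts the current binary again with argv[0] replaced by name.
func respawn(name string) error {
	self, err := os.Executable()
	if err != nil {
		return err
	}
	cmd := &exec.Cmd{
		Path:   self,           // the binary to execute: ourselves
		Args:   []string{name}, // argv[0] selects the subtask
		Stdout: os.Stdout,
		Stderr: os.Stderr,
	}
	return cmd.Run()
}

func main() {
	register("demo-subtask", func() {
		fmt.Println("subtask running in pid", os.Getpid())
	})

	// In the re-executed copy argv[0] is "demo-subtask": run it and stop.
	if initWith(os.Args[0]) {
		return
	}

	fmt.Println("main program running in pid", os.Getpid())
	if err := respawn("demo-subtask"); err != nil {
		fmt.Fprintln(os.Stderr, "respawn failed:", err)
		os.Exit(1)
	}
}
```

The early return is the important part: without it, the child process would fall through and run the regular program logic a second time.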
Pulling images
It's time to take a look at the image library. It offers lots of functionality, but here we are mostly interested in its transport and copy capabilities. To pull an image, we need to know the transport (or docker will be used by default), the repository (or hub.docker.com will be used by default), the image name, and the image tag (or latest will be used by default). Obviously, the image library requires its storage counterpart to store pulled images. Here is how we can pull an image from a remote repository (see a working example here):
package main

import (
	"context"
	"os"

	"github.com/containers/image/copy"
	"github.com/containers/image/signature"
	"github.com/containers/image/storage"
	"github.com/containers/image/transports/alltransports"
	"github.com/containers/image/types"
)

func main() {
	// Errors are ignored below for brevity only.
	imageName := "docker://alpine:latest"
	srcRef, _ := alltransports.ParseImageName(imageName)

	// Carries various default locations.
	systemCtx := &types.SystemContext{}
	policy, _ := signature.DefaultPolicy(systemCtx)
	policyCtx, _ := signature.NewPolicyContext(policy)

	// Prefer the normalized docker reference as the local name.
	dstName := imageName
	if srcRef.DockerReference() != nil {
		dstName = srcRef.DockerReference().String()
	}

	store := createStorage() // see previous section
	dstRef, _ := storage.Transport.ParseStoreReference(store, dstName)

	// Copy from the remote source into the local store.
	copyOptions := &copy.Options{ReportWriter: os.Stdout}
	manifest, _ := copy.Image(
		context.Background(),
		policyCtx,
		dstRef,
		srcRef,
		copyOptions,
	)
	println(string(manifest))
}
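The defaults described above (docker transport, latest tag) can be illustrated with a tiny standard-library sketch. It is not how the image library parses names (alltransports.ParseImageName has its own rules and supports many more transports), and it deliberately ignores digests; the imageRef type and splitImageName function are hypothetical names used only here.

```go
package main

import (
	"fmt"
	"strings"
)

// imageRef is a hypothetical, simplified view of an image name.
type imageRef struct {
	Transport string
	Name      string // repository and image name, e.g. "quay.io/foo/bar"
	Tag       string
}

// splitImageName breaks "transport://name:tag" into its parts, falling
// back to the "docker" transport and the "latest" tag when they are
// omitted. Digest references (name@sha256:...) are not handled.
func splitImageName(s string) imageRef {
	ref := imageRef{Transport: "docker", Tag: "latest"}

	if i := strings.Index(s, "://"); i >= 0 {
		ref.Transport, s = s[:i], s[i+3:]
	}

	// A colon separates the tag only when it comes after the last slash;
	// otherwise it belongs to a registry port such as "localhost:5000".
	if i := strings.LastIndex(s, ":"); i > strings.LastIndex(s, "/") {
		ref.Tag, s = s[i+1:], s[:i]
	}

	ref.Name = s
	return ref
}

func main() {
	for _, name := range []string{
		"docker://alpine:latest",
		"alpine",
		"localhost:5000/alpine",
	} {
		fmt.Printf("%-24s -> %+v\n", name, splitImageName(name))
	}
}
```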
Listing layers and images
The storage library organizes images on disk in a well-known layered form originally popularized by docker. It's fairly simple to list the existing layers and images (and containers, see the next section) along with their location, digest, and size information:
package main

import "github.com/davecgh/go-spew/spew"

func main() {
	store := createStorage()

	layers, _ := store.Layers()
	spew.Dump(layers)

	images, _ := store.Images()
	spew.Dump(images)
}
Creating containers
From the storage library README file:
A container is a read-write layer which is a child of an image's top layer, along with information which the library can manage for the convenience of its caller. This information typically includes configuration information for running the specific container. Multiple containers can be derived from a single image.
To create a container, we need to call store.CreateContainer(), and the only required parameter is the image ID:
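package main

import "github.com/davecgh/go-spew/spew"

func main() {
	store := createStorage()

	imageID := "<image ID goes here>"

	// Arguments: container ID (generated when empty), names, image ID,
	// layer ID, metadata, and container options.
	cont, _ := store.CreateContainer("", nil, imageID, "", "", nil)
	spew.Dump(cont)
}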
Mounting containers
Once the container is created, we still need to mount it using one of the storage drivers. The result of mounting is a writable location somewhere in the filesystem:
package main

func main() {
	store := createStorage()

	// The second argument is the mount label; it is left empty here.
	mountPoint, _ := store.Mount("<container ID goes here>", "")
	println(mountPoint)
}
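For the overlay driver, mounting boils down to an overlay filesystem mount that stacks the read-only image layers under the container's writable layer. The sketch below only builds the option string such a mount takes; it does not call the storage library and does not mount anything. The directory names follow the diff and merged folders that appear in the demo, while the work directory, the overlayOptions helper, and all the paths are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// overlayOptions builds the option string for an overlay mount.
// lowerDirs are the read-only layer directories, topmost first;
// layerDir is the container's own layer directory, which is assumed
// to hold "diff" (collects the writes) and "work" (overlay scratch space).
func overlayOptions(lowerDirs []string, layerDir string) string {
	return fmt.Sprintf("lowerdir=%s,upperdir=%s,workdir=%s",
		strings.Join(lowerDirs, ":"),
		path.Join(layerDir, "diff"),
		path.Join(layerDir, "work"),
	)
}

func main() {
	// Hypothetical locations, not taken from a real store.
	lower := []string{"/tmp/demo/image-top/diff", "/tmp/demo/image-base/diff"}
	layerDir := "/tmp/demo/container-layer"

	opts := overlayOptions(lower, layerDir)
	merged := path.Join(layerDir, "merged")

	// The equivalent shell command (needs root and existing directories):
	fmt.Printf("mount -t overlay overlay -o %s %s\n", opts, merged)
}
```

Everything written under the merged directory ends up in the upper (diff) directory, which is why the container's changes survive an unmount.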
Demo
Combining all the knowledge from above and using the demo program, we can come up with the following scenario:
# Pull image
$ ./goimagego pull docker://alpine:latest
Pulling image docker://alpine:latest
Getting image source signatures
Copying blob c9b1b535fdd9 skipped: already exists
Copying config e7d92cdc71 done
Writing manifest to image destination
Storing signatures
Image pulled - {
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 1511,
"digest": "sha256:e7d92cdc71feacf90708cb59182d0df1b911f8ae022d29e8e95d75ca6a99776a"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 2802957,
"digest": "sha256:c9b1b535fdd91a9855fb7f82348177e5f019329a58c53c47272962dd60f71fc9"
}
]
}
# Create a new container using the image from above
$ ./goimagego container e7d92cdc71feacf90708cb59182d0df1b911f8ae022d29e8e95d75ca6a99776a
Container:
(*storage.Container)(0xc000089790)({
ID: (string) (len=64) "f7c2136928fbe8e963b594833f0101b964edb5ec299444c2508ce1d3d15ef3c2",
Names: ([]string) <nil>,
ImageID: (string) (len=64) "e7d92cdc71feacf90708cb59182d0df1b911f8ae022d29e8e95d75ca6a99776a",
LayerID: (string) (len=64) "5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e",
Metadata: (string) "",
BigDataNames: ([]string) <nil>,
BigDataSizes: (map[string]int64) {
},
BigDataDigests: (map[string]digest.Digest) {
},
Created: (time.Time) 2020-02-29 17:31:14.129769831 +0000 UTC,
UIDMap: ([]idtools.IDMap) <nil>,
GIDMap: ([]idtools.IDMap) <nil>,
Flags: (map[string]interface {}) (len=2) {
(string) (len=12) "ProcessLabel": (string) "",
(string) (len=10) "MountLabel": (string) ""
}
})
# Mount the created container
$ ./goimagego mount f7c2136928fbe8e963b594833f0101b964edb5ec299444c2508ce1d3d15ef3c2
/var/lib/containers/storage/overlay/5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e/merged
$ df /var/lib/containers/storage/overlay/5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e/merged
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 10474496 5666980 4807516 55% /var/lib/containers/storage/overlay/5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e/merged
$ ./goimagego unmount f7c2136928fbe8e963b594833f0101b964edb5ec299444c2508ce1d3d15ef3c2
Unmounted!
After unmounting, the location /var/lib/containers/storage/overlay/5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e/merged will not exist anymore. However, the topmost (i.e. container's) layer will still be there, containing all the changes made to the filesystem. It's the so-called diff folder, and it usually resides next to the original merged folder. In my case it was /var/lib/containers/storage/overlay/5d531d0194a79ad4a40232cae1346012739380c0d7c354485d70ddc298ea0d9e/diff. To clean up after the container properly, we need to wipe the whole /var/lib/containers/storage/overlay/<layer_id> hierarchy, where <layer_id> is the container's LayerID from the output above (not the container ID).
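One detail of the demo is easy to miss: the image ID passed to the container command is the hex part of the config digest from the manifest printed by pull. A small standard-library sketch can extract it. The manifest and descriptor types below are hand-written for this example and cover only the fields visible in the demo output, and whether the ID matches the config digest for every manifest type is an assumption this article does not verify.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// descriptor and manifest mirror only the fields shown in the demo output.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

type manifest struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

// configID returns the config digest without its "sha256:" prefix.
func configID(raw []byte) (string, error) {
	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", err
	}
	return strings.TrimPrefix(m.Config.Digest, "sha256:"), nil
}

func main() {
	// Trimmed copy of the manifest printed in the demo.
	raw := []byte(`{
	  "schemaVersion": 2,
	  "config": {
	    "size": 1511,
	    "digest": "sha256:e7d92cdc71feacf90708cb59182d0df1b911f8ae022d29e8e95d75ca6a99776a"
	  },
	  "layers": [
	    {
	      "size": 2802957,
	      "digest": "sha256:c9b1b535fdd91a9855fb7f82348177e5f019329a58c53c47272962dd60f71fc9"
	    }
	  ]
	}`)

	id, err := configID(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```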
Miscellaneous
A lot of things can go wrong while working with images, containers, storage drivers, etc. So, never skip or suppress errors =) However, sometimes error messages aren't that helpful. In such situations, I found setting the log level to DEBUG extremely useful. Both image and storage use sirupsen/logrus, so putting logrus.SetLevel(logrus.DebugLevel) at the beginning of your program sheds tons of light on the actual execution problems.
And if you're getting completely desperate, even with DEBUG logs enabled, delve is there to save you!
See also
- A tale of two Go container libraries - choosing a library to work with images in production by Eric Stroczynski (Slim.AI)
- How are docker images built? A look into the Linux overlay file-systems and the OCI specification by Nicola Apicella