
AWS preps its own library of public Docker container images

With Docker restricting how quickly users can pull down images from its Docker Hub for free, Amazon Web Services is finally creating its own repository of public container images. But there's a bigger problem here for all container and cloud users.
Written by Steven Vaughan-Nichols, Senior Contributing Editor

When you run applications on the cloud, odds are you're actually running them in containers. And many of us aren't creating our own containers of common applications, such as the Apache Web Server, MySQL DBMS, or the Traefik cloud-native edge router. Instead, we simply grab them from the Docker Hub or another repository of popular container images. Unfortunately for users who don't want to pay for their images, Docker is not a charity. In November, Docker began limiting Docker Hub pull requests for anonymous and free authenticated users. To address this issue, Amazon Web Services (AWS) has started working on its own public container registry.

You may think this is much ado about nothing. I mean how many container images can one company pull anyway? The answer is that Amazon Elastic Container Registry (ECR) customers alone download billions of images each week. That's no typo. Billions.

Today's production software chain often consists of grabbing a popular container image, running it for a few minutes or hours, and then dumping it. If you need it again, you just repeat the process.

That's great for you, but it's not so great for Docker. As Jean-Laurent de Morlhon, Docker's VP of software engineering, explained: "The vast majority of Docker users pulled images at a rate you would expect for normal workflows. However, there is an outsized impact from a small number of anonymous users. For example, roughly 30% of all downloads on Hub come from only 1% of our anonymous users." So, since bandwidth isn't free, Docker is rate-limiting its free and anonymous users. 

This began on Nov. 2. Anonymous and free users are now limited to 5,000 Docker Hub pulls per six hours. These limits will be gradually tightened over a number of weeks. Eventually, anonymous users will be restricted to 100 container pulls per six hours and free users limited to 200 container pulls per six hours. All paid Docker accounts -- Pro, Team, and Legacy subscribers -- are exempt from rate-limiting. No pull rate restrictions will be applied to namespaces approved as non-commercial open-source projects.
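
If you're curious how close you are to that cap, Docker documents a rate-limit preview endpoint that reports your current allowance in response headers. Here's a rough Python sketch of that check; it assumes the third-party requests library is installed, and a HEAD request against the preview manifest should not itself count against your quota:

```python
# Minimal sketch: query Docker Hub's documented rate-limit preview endpoint
# to see the pull allowance attached to an anonymous token.
import requests

# Request an anonymous pull token scoped to the special preview repository.
token_resp = requests.get(
    "https://auth.docker.io/token",
    params={
        "service": "registry.docker.io",
        "scope": "repository:ratelimitpreview/test:pull",
    },
)
token = token_resp.json()["token"]

# A HEAD request returns the ratelimit-limit / ratelimit-remaining headers,
# e.g. "100;w=21600" -- 100 pulls per 21,600-second (six-hour) window.
head_resp = requests.head(
    "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest",
    headers={"Authorization": f"Bearer {token}"},
)
print("limit:    ", head_resp.headers.get("ratelimit-limit"))
print("remaining:", head_resp.headers.get("ratelimit-remaining"))
```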

It doesn't cost much to use Docker's paid accounts -- $5 a month for individual Pro accounts and $7 a month per user on Team accounts. Nevertheless, AWS will soon be offering its own container image repository. 

This will enable developers to share and deploy container images publicly. The new registry will allow developers to store, manage, share, and deploy container images for anyone to discover and download. Developers will be able to use AWS to host both their private and public container images, eliminating the need to go outside the AWS ecosystem. Public images will be geo-replicated for reliable availability around the world and offer fast downloads to quickly serve up images on-demand. 

Curiously, this move comes only months after Docker and AWS announced they would make Docker application developers' lives easier by streamlining the process of deploying and managing containers from Docker Compose, Docker Desktop, and Docker Hub to Amazon Elastic Container Service (Amazon ECS) and Amazon ECS on AWS Fargate.

Users outside AWS will be able to browse and pull AWS containerized images for their own applications. Developers will be able to use the new registry to distribute public container images and related files like Kubernetes Helm charts and policy configurations for use by any developer. AWS public images such as the ECS agent, Amazon CloudWatch agent, and AWS Deep Learning Container images will also be available.


One thing AWS does not appear to be offering at this time is automatically secured images. Docker and Snyk, an open-source security company, have partnered to find and eliminate security problems in the Docker Official Images. Since you literally don't know what's in a container image unless you bother to check it for problems yourself, this new Docker and Snyk offering is reason enough in itself to pay for a Docker account.

That said, developers sharing public images on AWS will get 50GB of free storage each month and will pay nominal charges after that. Anyone who pulls images anonymously will get 500GB of free data bandwidth each month; beyond that, you'll need to sign up for an AWS account. So AWS's container repository will have limits of its own.

Still, simply authenticating with an AWS account increases free data bandwidth up to 5TB each month when pulling images from the internet. And finally, workloads running in AWS will get unlimited data bandwidth from any region when pulling publicly shared AWS images.

AWS users, of course, aren't the only ones facing problems with the new Docker rules. Google Cloud users, for example, may run right into it without even realizing that they're headed for trouble. Michael Winser, Google's Cloud CI/CD product lead, wrote: "In many cases, you may not be aware that a Google Cloud service you are using is pulling images from Docker Hub. For example, if your Dockerfile has a statement like 'FROM debian:latest' or your Kubernetes Deployment manifest has a statement like 'Image: postgres:latest' it is pulling the image directly from Docker Hub."
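
To see why those short names are easy to miss, consider how Docker-compatible tooling expands them. Here's a rough Python sketch of the resolution rule (not any runtime's actual parser): an image reference without an explicit registry host defaults to Docker Hub, and a single-segment official image name gets the implicit library/ namespace.

```python
# Rough sketch of how Docker-compatible tooling expands a short image name.
# With no registry host, the client defaults to Docker Hub (docker.io), and
# single-segment "official" images get the implicit library/ namespace.
def resolve_image_reference(ref: str) -> str:
    # Split off the tag, but only if the colon comes after the last slash
    # (so a registry port such as localhost:5000 isn't mistaken for a tag).
    slash, colon = ref.rfind("/"), ref.rfind(":")
    if colon > slash:
        name, tag = ref[:colon], ref[colon + 1:]
    else:
        name, tag = ref, "latest"

    first = name.split("/")[0]
    # A first path component with a dot, a port, or "localhost" names a registry.
    if "." in first or ":" in first or first == "localhost":
        return f"{name}:{tag}"

    if "/" not in name:
        name = f"library/{name}"      # official image namespace
    return f"docker.io/{name}:{tag}"


print(resolve_image_reference("debian:latest"))       # docker.io/library/debian:latest
print(resolve_image_reference("postgres"))            # docker.io/library/postgres:latest
print(resolve_image_reference("gcr.io/my-proj/app"))  # gcr.io/my-proj/app:latest
```

In other words, unless a reference spells out another registry, every one of those pulls lands on Docker Hub and counts against its new limits.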

The solution? Besides more complex approaches, Winser suggests you simply "upgrade to a paid Docker Hub account."

The Open Container Initiative (OCI) has pointed out, though, that this problem is greater than just Docker changing its rules. "The overarching public content distribution problem isn't limited to who should bear the cost for public content, but rather also encompasses who bears the responsibility of assuring the content is accessible and secure for your environment, 100% of the time. . . . The problem isn't limited to production container images but extends to all package manager content (debs, RPMs, RubyGems, node modules, etc)."

The long-term answer to this, the OCI suggests, is to configure "a workflow that imports the content, security scans the content based on your organization's scanning policies, runs functional and integration tests to assure this most recent version of the content meets all expectations, then promote the validated content to a location your team(s) can utilize."
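
What might that look like in practice? Here's a bare-bones Python sketch of such an import-scan-test-promote pipeline. The internal registry name and the scan and test hooks are placeholders you'd swap for your own scanner and test suite, not any particular product's API; only the docker pull/tag/push steps are standard CLI commands.

```python
# Minimal sketch of the OCI-suggested flow: import a public image, scan it
# against your own policies, test it, and only then promote it to an internal
# registry your teams actually pull from.
import subprocess

UPSTREAM = "docker.io/library/debian:latest"                # public source image
INTERNAL = "registry.example.internal/base/debian:latest"   # hypothetical internal location


def run(*cmd: str) -> None:
    # Fail loudly if any step of the promotion pipeline breaks.
    subprocess.run(cmd, check=True)


def scan_image(image: str) -> None:
    # Placeholder: call your organization's scanner (Snyk, Trivy, Clair, ...)
    # and raise if the image violates your scanning policy.
    pass


def run_integration_tests(image: str) -> None:
    # Placeholder: start the image and run whatever functional and
    # integration tests prove this version meets your expectations.
    pass


run("docker", "pull", UPSTREAM)           # import the public content
scan_image(UPSTREAM)                      # security-scan per local policy
run_integration_tests(UPSTREAM)           # functional / integration checks
run("docker", "tag", UPSTREAM, INTERNAL)  # promote the validated image...
run("docker", "push", INTERNAL)           # ...to the internal registry
```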

For years now we've been relying on the kindness of open-source companies to provide us with free and, we hoped, trustworthy programs. Now that our automated workflows have moved the source code far outside our hands, we must take a more active and responsible role in both obtaining and deploying not just Docker container images, but all the content we now thoughtlessly depend on for our mission-critical programs.
