The latest addition to my homelab is Harbor as a container registry. Here is how I set it up.

Motivations

I had the following targets:

  • Run applications air-gapped, so the registry should be able to proxy upstream images
  • Host my own images on my own registry
  • Cache pulled images, as my internet connection is quite slow

Solutions I thought about

It was quite obvious that I would have to set up some kind of container registry in my home environment, particularly because of the caching requirement - caching only makes sense locally.

I knew that Gitea/Forgejo can host packages such as Docker images, but this was not an option, for several reasons. First, my Forgejo server is one of those really small LXC containers. I passed in an 8GB disk, and although it hosts a bunch of repositories, it currently uses about half of that. Running a registry would need loads of disk space, which that machine simply cannot provide. Second, I don't think Forgejo has a caching concept here. So having an image:latest tag that updates on every push to the repo is not that great, as old (and now untagged) images would pile up rather than be dropped. Finally, Forgejo does not provide the proxy functionality I was looking for.

I knew of quite a few other options, but none was really to my liking. Then, by coincidence, I was pointed to Harbor at work, so I decided to finally give it a try.

Host setup

I set up an LXC host and mounted a directory from my NAS to /data/registry on it. The /data/registry folder is where the images are actually stored, so it is the large one. The base disk of the LXC container is 16GB and currently uses a bit less than 8GB of that.

Make sure that /data is writable by UID 10000 (the user the Harbor containers run as) - or whichever host UID that is mapped to.
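As a sketch, the preparation boils down to two commands; prepare_data_dir is a hypothetical helper, and UID 10000 is the user the Harbor containers run as:

```shell
# Hypothetical helper: create the image directory and hand /data to the
# Harbor user (UID/GID 10000). If your LXC maps UIDs, chown the mapped
# host UID instead (e.g. 110000 for a 100000-offset mapping).
prepare_data_dir() {
    dir="${1:-/data}"
    mkdir -p "$dir/registry"
    chown -R 10000:10000 "$dir" 2>/dev/null \
        || echo "chown failed; run as root (or chown the mapped UID)"
}

# usage on the Harbor host (as root):
# prepare_data_dir /data
```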

Harbor Installation

Installation itself is quite straight-forward. I downloaded the latest release (2.12.2, currently) from GitHub, giving me the file harbor-online-installer-v2.12.2.tgz. Note that I opted for the online installer, as this server definitely has to be online anyway. It is supposed to be the proxy I am looking for, so air-gapping this one does not make a lot of sense.

Unpacking the archive1 gave me a directory ./harbor with all relevant stuff in it. I moved that to /root/harbor (I am running Docker in LXC, so running the containers as root is fine).

The next thing to do is to set up the configuration file. In /root/harbor, you will find a harbor.yml.tmpl; copy that one to /root/harbor/harbor.yml. I made the following changes here:

  • hostname: harbor.tech-tales.blog
  • https.certificate: /path/to/my/tls/certificate.crt and https.private_key: /path/to/my/tls/private.key. Although I am quite sure that these options did not work out; I still had to set up the certificate differently later on. More on that in a moment.
  • harbor_admin_password. I won’t give you my actual password here…
    This is just the initial password. You may (and should) change it after the first login, after which this configuration value is ignored.

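Pulling those changes together, the relevant part of my harbor.yml looks roughly like this (the certificate paths and the password are placeholders; everything else stays at the template defaults):

```yaml
# harbor.yml (excerpt) - only the values changed from harbor.yml.tmpl
hostname: harbor.tech-tales.blog

https:
  port: 443
  certificate: /path/to/my/tls/certificate.crt
  private_key: /path/to/my/tls/private.key

# initial admin password; ignored after you change it in the UI
harbor_admin_password: changeme-please
```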
Now that all the configuration is done, it’s time to actually run Harbor. This is quite simple:

./install.sh --with-trivy

Note: I also installed Trivy, which can be used to run CVE scans on images I push.
Another note: automatic CVE scanning does not work in the proxy projects; you have to trigger the scans there yourself!
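Since the proxy projects need manual scans, one way is Harbor’s v2 REST API. The sketch below only builds the scan endpoint URL (scan_url is a hypothetical helper; the host and project names are from my setup, and slashes in nested repository names must be URL-encoded by the caller):

```shell
# Build the URL of Harbor's "scan artifact" endpoint:
# POST /api/v2.0/projects/{project}/repositories/{repo}/artifacts/{ref}/scan
# Slashes in nested repository names must already be URL-encoded.
scan_url() {
    host="harbor.tech-tales.blog"
    project="$1"; repo="$2"; ref="$3"
    echo "https://$host/api/v2.0/projects/$project/repositories/$repo/artifacts/$ref/scan"
}

# usage: trigger a scan as admin (credentials are placeholders)
# curl -X POST -u 'admin:***' "$(scan_url docker-proxy library%2Fdebian latest)"
```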

Setup TLS

tl;dr: Create the files server.crt and server.key in /data/secret/cert. Then, docker restart nginx

I first kind of failed at setting up TLS correctly, but I have since figured out how it works.

  • Inside the nginx container, the certificate files are read from /etc/cert/server.crt and /etc/cert/server.key. These paths are container mounts, so they are different on the host system.
  • Inspecting the nginx container reveals the relevant mount: /data/secret/cert on the host is mapped to /etc/cert in the container.
  • So take your TLS certificate (ideally just a wildcard certificate) and place server.crt and server.key in /data/secret/cert.
  • Finally, don’t forget to docker restart nginx. This is for nginx to actually load the new certificate.
  • Please do not ask me why I set the certificate and key paths in the config file. I am not sure they are actually used.
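Put together, the steps above can be sketched as a small shell function (deploy_harbor_cert is hypothetical; the /data/secret/cert default is the mount found above, and the modulus check is an extra safety net against deploying a mismatched key):

```shell
# Hypothetical helper: copy a cert/key pair into Harbor's nginx secret
# directory and restart the proxy so it picks up the new certificate.
deploy_harbor_cert() {
    cert="$1"; key="$2"; dest="${3:-/data/secret/cert}"

    # refuse to deploy a key that does not match the certificate
    cert_mod=$(openssl x509 -noout -modulus -in "$cert")
    key_mod=$(openssl rsa -noout -modulus -in "$key")
    [ "$cert_mod" = "$key_mod" ] || { echo "cert/key mismatch" >&2; return 1; }

    install -m 0644 "$cert" "$dest/server.crt"
    install -m 0600 "$key" "$dest/server.key"

    # nginx only reads the certificate at startup
    if command -v docker >/dev/null 2>&1; then
        docker restart nginx
    else
        echo "docker not found; restart the nginx container manually"
    fi
}

# usage: deploy_harbor_cert /path/to/wildcard.crt /path/to/wildcard.key
```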

Setup Harbor as a Docker Proxy

Now this was one of the things I wanted most. Harbor’s docs are quite good, but here is the gist of the setup:

  • Login as admin, go to Administration - Registries. Setup a new registry with the following settings:
    • Provider: Docker Hub
    • Name: Whatever you like, I called it “Docker Hub Registry Endpoint”
    • Endpoint URL should be set automatically
    • Leave Access ID and secret empty
    • Also, keep the “Verify Remote Cert” on.
  • Test the connection if you please, and then save.
  • Now, go to Projects. Setup a new project with the following settings:
    • Project Name: Whatever you like, I called it docker-proxy
    • Access Level: public if you like; I made mine public.
    • I kept the Project Quota Limit on -1, use whatever you like.
    • Switch the “Proxy Cache” button to “On”. From the drop-down menu, choose the “Docker Hub Registry Endpoint”.
    • Keep the bandwidth limit at -1 if your instance is low-traffic2.
    • Note: the proxy cache settings (including the bandwidth limit) cannot be changed later! If you want to change them, you have to set up a new project.
  • Finally, I want to use this as a cache, meaning that I don’t want to keep the pulled images indefinitely. For that, I defined a retention policy:
    • Go to the project, to the “Policy” tab.
    • Add a new rule with the following settings:
      • “For repositories matching” ** (so, all repositories in this project)
      • By artifact count …: I chose retain the most recently pulled # artifacts and set the count to 10. Change that to your liking.
      • Tags: Here, I used “matching” and ** again. I also checked the “untagged artifacts” part.
    • So what this gives me: for every image pulled, keep the 10 most recently pulled tags. For my use case, this is totally fine.
    • Note that setting up a schedule may also be required; I don’t remember whether I changed anything here. I set it to daily, just in case somebody asks. If this does not work for you, feel free to contact me and let me know.
  • Side note: Since the GitHub Container Registry is used more and more, I’ve set that one up too, in a similar fashion. The only difference is that in the Registry settings, I used “Github GHCR” instead of “Docker Hub”.

Now how to use this registry? There are two possibilities:

  • The usual case is that you pull some image like docker pull apache/airflow, or that a Docker Compose file contains image: apache/airflow. Replace the apache/airflow part with harbor.tech-tales.blog/docker-proxy/apache/airflow and keep the rest as is.
  • You may also pull something from the official library. So if you did docker pull debian, the “user” implicitly defined by Docker Hub is library, so this statement is equivalent to docker pull library/debian or, with our cache, docker pull harbor.tech-tales.blog/docker-proxy/library/debian.
  • Pulling specific tags works the same way.
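The rewrite rule above can be sketched as a tiny shell helper (proxy_image is hypothetical; the registry hostname and project name match my setup):

```shell
# Hypothetical helper: prefix an image reference with the proxy project,
# inserting the implicit "library/" user for official images (those
# without a slash in their name). Tags pass through unchanged.
proxy_image() {
    registry="harbor.tech-tales.blog"
    project="docker-proxy"
    image="$1"
    case "$image" in
        */*) echo "$registry/$project/$image" ;;
        *)   echo "$registry/$project/library/$image" ;;
    esac
}

# usage:
# docker pull "$(proxy_image apache/airflow)"
# docker pull "$(proxy_image debian:bookworm)"
```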

Keep the system running

I am not sure if this part is actually required, but my installation sometimes failed and did not automatically recover. Particularly after a restart of the host, there seem to be dependencies that cannot be satisfied, and then some of the containers do not come back up, although the restart: always directive in the compose file should take care of that.

I found a quite simple way around it:

  • Create a file called /root/keep-system-running.sh. Add the following contents:
    #!/bin/bash
    # Restart any container that is not in the "Running" state.
    # Requires jq to parse the docker inspect output.
    
    containers=$(docker ps -a --format "{{ .Names }}")
    
    for container in $containers
    do
        running=$(docker inspect "$container" | jq ".[0].State.Running")
        if [ "$running" = "false" ]
        then
            echo "$container is not running; starting it"
            docker start "$container"
        fi
    done
    
    
  • Make the script executable: chmod +x /root/keep-system-running.sh
  • As root user, run crontab -e and add the following line:
    */10 * * * * /root/keep-system-running.sh
    

This might not be the best solution, but it works for me.

Finished!

Have fun with your shiny new container registry!


  1. Knowing that tar commands are typically hard to remember: tar xzvf harbor-online-installer-v2.12.2.tgz ↩︎

  2. If you have a high traffic instance, you may be well-advised to use a more professional setup than what I wrote here. ↩︎