A 1 KB Docker Container

Posted by Nathan Osman on September 28, 2017

No, that’s not a typo or a joke. I have created a Docker container containing a single Unix executable with no dependencies that occupies less than 1 KB of space on disk. There are no other files included in the container — not even libc.

Here’s the proof.

Why?

Before explaining how this was accomplished, it is worth explaining why. caddy-docker (which is another tool I wrote and explain in detail here) routes incoming requests to running containers based on their labels.

I needed caddy-docker to act as a reverse proxy for a particular host and the easiest way to do that was by spinning up a container whose sole purpose was to contain two special labels. The container should not do anything until it is stopped.

That’s when I came up with an idea.

What?

I immediately began working on the application, naming it “hang” for its rather unusual purpose. Go can easily produce executables that have no dependencies, allowing the Docker container to inherit from scratch. The only downside is that Go executables tend to be massive; even very simple ones often exceed 8 MB in size.

This would never do.

I reasoned that a C application could easily be written that registered a signal handler for SIGTERM and quit when it was received. Unfortunately, this meant that I would need to use libc, which in turn meant the container would quickly become comparable in size to the Go executable. This would provide no advantage at all.

Assembly?

Yes, the quickest way to produce a tiny executable with no dependencies is to write it in assembly. I prefer Intel-style syntax so NASM was the obvious choice.

Syscalls

Once upon a time, back in the early days of the x86 architecture, a syscall looked something like this:

mov eax, 0x01
mov ebx, 0x00
int 0x80

The first line specifies which syscall to invoke — sys_exit in this case. The second line specifies the exit value (0). The third line generates an interrupt which the kernel will then process.

32-bit x86 operating systems later moved to the faster sysenter/sysexit instructions, while x86_64 introduced a new (aptly named) instruction: syscall. As in the example above, a register selects the syscall to invoke (rax on x86_64), and the first argument is passed in rdi. The example above could be rewritten in x86_64 assembly like this:

mov rax, 0x3c
mov rdi, 0x00
syscall

Note that the syscall number for sys_exit is different on x86_64.
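As a rough illustration of the same mechanism from C, the sketch below asks libc's generic syscall() wrapper to invoke a syscall purely by its number. It assumes Linux with glibc or a compatible libc. raw_getpid is a name invented for this example, and getpid was chosen only because it is harmless to call. The static assertion checks the sys_exit number quoted above when compiling for x86_64.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_* syscall numbers for the target architecture */
#include <unistd.h>       /* syscall() */

#if defined(__x86_64__)
/* On x86_64 the number of sys_exit is 0x3c (60), as in the assembly above. */
_Static_assert(SYS_exit == 0x3c, "unexpected sys_exit number");
#endif

/*
 * Hypothetical helper: invoke getpid purely by its syscall number.
 * On x86_64, syscall() places the number in rax and executes the
 * syscall instruction, much like the hand-written assembly.
 */
static long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling raw_getpid() should return the same value as the ordinary getpid() wrapper, since both end up issuing the same syscall.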

Signal Handlers

Registering a signal handler is fairly trivial in C:

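#include <signal.h>

// The handler does nothing; it only needs to exist
void handler(int param) {}

int main() {
    // Zero-initialize so sa_flags and sa_mask start out empty
    struct sigaction sa = {0};
    sa.sa_handler = handler;
    sigaction(SIGTERM, &sa, 0);
    return 0;
}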

Unfortunately, a couple of things are being hidden by the C standard library:

  • the flag SA_RESTORER is added to sa.sa_flags
  • the sa.sa_restorer member is set to a special function

We cannot directly translate the C code to assembly since the sigaction struct doesn’t correspond to the one sys_rt_sigaction expects. Here’s what the kernel struct looks like in NASM:

struc sigaction
    .sa_handler  resq 1
    .sa_flags    resq 1
    .sa_restorer resq 1
    .sa_mask     resq 1
endstruc

Each member is 8 bytes in size.
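For readers more comfortable with C, the layout can be mirrored and checked at compile time. This is only a sketch: struct kernel_sigaction and its member names are invented for this example (the sa_ prefix is avoided because glibc defines sa_handler as a macro), and it assumes a 64-bit Linux target where pointers are 8 bytes wide.

```c
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uint64_t */

/* Hypothetical C mirror of the NASM struc above (64-bit target assumed). */
struct kernel_sigaction {
    void   (*handler)(int);   /* .sa_handler  */
    uint64_t flags;           /* .sa_flags    */
    void   (*restorer)(void); /* .sa_restorer */
    uint64_t mask;            /* .sa_mask     */
};

/* Four 8-byte members, so the offsets are 0, 8, 16 and 24. */
_Static_assert(offsetof(struct kernel_sigaction, handler)  == 0,  "handler offset");
_Static_assert(offsetof(struct kernel_sigaction, flags)    == 8,  "flags offset");
_Static_assert(offsetof(struct kernel_sigaction, restorer) == 16, "restorer offset");
_Static_assert(offsetof(struct kernel_sigaction, mask)     == 24, "mask offset");
_Static_assert(sizeof(struct kernel_sigaction) == 32, "total size");
```

If any assumption about the layout were wrong on the target, compilation would fail rather than produce a struct the kernel misreads.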

Setting the Signal Handler

First, we must allocate space for the struct in the .bss section:

section .bss

    act resb sigaction_size

Note that sigaction_size is a special value the assembler creates for us — it is equal to the size of sigaction in bytes. The struct can then be initialized in the .text section like so:

section .text
global _start

_start:

    lea rax, [handler]
    mov [act + sigaction.sa_handler], rax
    mov [act + sigaction.sa_flags], dword 0x04000000  ; SA_RESTORER
    lea rax, [restorer]
    mov [act + sigaction.sa_restorer], rax

handler and restorer are labels that we’ll come to in a moment. Now we can invoke the sys_rt_sigaction syscall:

    mov rax, 0x0d  ; sys_rt_sigaction
    mov rdi, 0x0f  ; SIGTERM
    lea rsi, [act] ; pointer to the new action
    mov rdx, 0x00  ; no old action is requested
    mov r10, 0x08  ; size of the signal mask in bytes
    syscall

Handling the Signal

The next step is waiting for the SIGTERM signal to arrive. The sys_pause syscall easily accomplishes this:

    mov rax, 0x22  ; sys_pause
    syscall
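The behaviour of pause is easy to observe from C with libc doing the heavy lifting. The sketch below is an illustration rather than part of hang itself: install_noop_handler and wait_for_signal are names made up for this example, and it assumes a POSIX system. pause() blocks until a signal handler has run, then returns -1 with errno set to EINTR.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler so callers can confirm it actually ran. */
static volatile sig_atomic_t pause_handler_ran = 0;

static void pause_handler(int signo)
{
    (void)signo;
    pause_handler_ran = 1;
}

/* Hypothetical helper: register pause_handler for the given signal. */
static int install_noop_handler(int signo)
{
    struct sigaction sa = {0};
    sa.sa_handler = pause_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/*
 * Hypothetical helper: block until a handled signal arrives.
 * Returns the errno value left behind by pause(), which is
 * EINTR once a handler has run and returned.
 */
static int wait_for_signal(void)
{
    errno = 0;
    pause();
    return errno;
}
```

A program built on these helpers would call install_noop_handler(SIGTERM), then wait_for_signal(), and exit once it returns.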

Once sys_pause returns, the program can exit using the sys_exit sequence shown earlier. The handler itself is trivial and does nothing:

handler:

    ret

The restorer is fairly simple as well, though it does need to invoke the sys_rt_sigreturn syscall:

restorer:

    mov rax, 0x0f  ; sys_rt_sigreturn
    syscall
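To experiment with these pieces without writing a whole program in assembly, the same sequence can be approximated from C. The sketch below is an assumption-laden illustration, not the code used in hang: it targets x86_64 Linux only, relies on GCC or Clang for the top-level asm block (written in AT&T syntax), and every name in it (kernel_sigaction, hang_restorer, install_raw_handler, and so on) is invented for this example. It bypasses libc's sigaction wrapper and calls sys_rt_sigaction directly with its own restorer.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#if !defined(__x86_64__)
#error "this sketch assumes x86_64 Linux"
#endif

#define KSA_RESTORER 0x04000000UL  /* same value as SA_RESTORER above */

/* Hypothetical mirror of the kernel's x86_64 sigaction layout. */
struct kernel_sigaction {
    void   (*handler)(int);
    uint64_t flags;
    void   (*restorer)(void);
    uint64_t mask;
};

/* The restorer must not touch the stack, so it is written in assembly. */
void hang_restorer(void);
__asm__(
    ".text\n"
    ".globl hang_restorer\n"
    ".type hang_restorer, @function\n"
    "hang_restorer:\n"
    "    mov $15, %eax\n"   /* sys_rt_sigreturn */
    "    syscall\n"
);

static volatile sig_atomic_t raw_signal_seen = 0;

static void raw_handler(int signo)
{
    (void)signo;
    raw_signal_seen = 1;
}

/* Install raw_handler for signo via sys_rt_sigaction; returns 0 on success. */
static long install_raw_handler(int signo)
{
    struct kernel_sigaction act = {0};
    act.handler  = raw_handler;
    act.flags    = KSA_RESTORER;
    act.restorer = hang_restorer;
    /* Arguments: signal, new action, old action (none), mask size in bytes. */
    return syscall(SYS_rt_sigaction, signo, &act, NULL, sizeof(act.mask));
}
```

After install_raw_handler(SIGUSR1) succeeds, raising SIGUSR1 should run raw_handler and then resume normal execution through hang_restorer.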

Building

Two commands are required to build the application. Assuming the source file is named hang.asm, the commands are:

nasm -f elf64 hang.asm
ld -s -o hang hang.o

This produces an executable named hang — and it’s small:

$ stat hang
  File: hang
  Size: 736

Yes, that’s 736 bytes.

The Dockerfile is fairly simple, requiring only three instructions:

FROM scratch
ADD hang /usr/bin/hang
ENTRYPOINT ["/usr/bin/hang"]

Testing

Let’s see if the container works:

$ docker build -t nathanosman/hang .
$ docker run -d --name hang nathanosman/hang

At this point, the container should remain running:

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             STATUS
f1861f628ea8        nathanosman/hang    "/usr/bin/hang"     Up 3 seconds

It should also immediately stop when docker stop is run:

$ docker stop hang
hang

It works! Let’s make sure the container is only as large as the executable:

$ docker images
REPOSITORY               TAG                 CREATED             SIZE
nathanosman/hang         latest              2 minutes ago       736B

And there you have it — a really tiny container!

You can find the source code in its entirety here:

github.com/nathan-osman/hang