Kubernetes is a container orchestration framework, originally built by Google, for deploying, scaling, and managing containerized applications.
In Kubernetes, there is a feature that allows a volume to be shared between the containers of a pod. For instance, this could be used to pre-populate a database.
Since the user can provide the path to be used for this, Kubernetes has to be extra careful when handling it. Symbolic link attacks and time-of-check to time-of-use (TOCTOU) issues are quite common in this area.
A previous vulnerability involved a symbolic link. One container would create a symbolic link pointing outside of the container. Then, when another container started up and set up a volume according to this symbolic link, the mount would resolve to the host system instead.
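The symbolic link escape can be illustrated with a small, self-contained sketch. It is not the kubelet's code: the helper name `resolves_inside` and the directory names are made up for illustration, and the "host" here is just a sibling temporary directory.

```python
import os
import tempfile


def resolves_inside(volume_root: str, subpath: str) -> bool:
    """Return True if subpath, after following symlinks, stays under volume_root.

    Hypothetical helper for illustration only.
    """
    root = os.path.realpath(volume_root)
    target = os.path.realpath(os.path.join(root, subpath))
    return os.path.commonpath([root, target]) == root


with tempfile.TemporaryDirectory() as tmp:
    volume = os.path.join(tmp, "volume")    # stands in for the shared volume
    host_dir = os.path.join(tmp, "host")    # stands in for the host filesystem
    os.makedirs(os.path.join(volume, "data"))
    os.makedirs(host_dir)

    # A container plants a symlink inside the volume that points outside of it.
    os.symlink(host_dir, os.path.join(volume, "escape"))

    safe_result = resolves_inside(volume, "data")
    escape_result = resolves_inside(volume, "escape")
```

A check like this catches the symlink at the moment it runs, but it says nothing about what the path points to later, which is where the TOCTOU problem comes in.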
A fix to another vulnerability was to make sure the subpath mount location is resolved and validated to be inside of the volume. This fixed a TOCTOU issue between the verification and the use of the link.
The previous fix takes several steps to ensure that the directory being mounted is safely opened and validated. After the file is opened and validated, the kubelet uses the magic-link path under the /proc/[pid]/fd directory for all subsequent operations to ensure the file remains unchanged, which is a nice design. However, the authors found that all of the effort put into this fix was for nothing, because mount dereferences the procfs magic link by default.
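What "dereferencing the magic link" means can be seen in a rough sketch, assuming a Linux system with procfs mounted. The helper name `magic_link_target` is hypothetical, and the code only reads the link; it does not mount anything. An open file descriptor pins the object it refers to, but reading the link under /proc/self/fd turns it back into an ordinary path string, which has to be looked up by name again.

```python
import os
import tempfile


def magic_link_target(path: str):
    """Open path and return what its /proc/self/fd entry dereferences to.

    Returns None when procfs is not available (for example, on non-Linux hosts).
    Hypothetical helper for illustration only.
    """
    if not os.path.isdir("/proc/self/fd"):
        return None
    fd = os.open(path, os.O_RDONLY)
    try:
        # The fd itself is stable, but readlink hands back a plain path.
        return os.readlink(f"/proc/self/fd/{fd}")
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    expected = os.path.realpath(tmp)
    target = magic_link_target(tmp)
```

Any tool that converts the magic link into such a path before acting on it gives up the guarantee that the descriptor provided.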
So, there is a small race condition: but is it exploitable? Very much so! There is a syscall called renameat2 which, with the RENAME_EXCHANGE flag, atomically swaps two file paths. By running this in a loop, it is possible to get the verification to check one thing but the mount to use another!
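The check-versus-use gap can be sketched deterministically. This is a simplified model, not the actual exploit: instead of racing with renameat2 in a loop, the "attacker" step is invoked explicitly between the check and the use, and the swap is done with two plain os.rename calls. The names `check_then_use` and `swap` are hypothetical.

```python
import os
import tempfile


def check_then_use(volume_root: str, subpath: str, between=None) -> str:
    """Validate subpath, then resolve it a second time for use.

    `between` models the attacker's window between check and use.
    Hypothetical helper for illustration only.
    """
    root = os.path.realpath(volume_root)
    candidate = os.path.join(root, subpath)

    # Time of check: the path must resolve inside the volume.
    if os.path.commonpath([root, os.path.realpath(candidate)]) != root:
        raise PermissionError("subpath escapes the volume")

    if between is not None:
        between()

    # Time of use: the path is looked up by name again.
    return os.path.realpath(candidate)


with tempfile.TemporaryDirectory() as tmp:
    volume = os.path.join(tmp, "volume")
    host_dir = os.path.join(tmp, "host")
    os.makedirs(os.path.join(volume, "sub"))
    os.makedirs(host_dir)
    os.symlink(host_dir, os.path.join(volume, "evil"))

    def swap():
        # Replace the validated directory with the symlink (non-atomic stand-in
        # for the renameat2 exchange described in the text).
        os.rename(os.path.join(volume, "sub"), os.path.join(volume, "sub.orig"))
        os.rename(os.path.join(volume, "evil"), os.path.join(volume, "sub"))

    used_path = check_then_use(volume, "sub", between=swap)
    escaped = used_path == os.path.realpath(host_dir)
```

The check passes on the real directory, yet the path that ends up being used points at the stand-in host directory. In the real attack the window is tiny, which is why the swap is repeated in a loop until the timing lines up.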
The solution to the bug was to add the --no-canonicalize flag to the mount command. This ensures that the tool doesn't dereference the magic link.
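As a rough illustration of what such an invocation might look like, the sketch below only builds the argument list for a bind mount and does not run it. It is not the kubelet's actual code; the helper name `bind_mount_argv`, the PID, the file descriptor number, and the target path are all made up.

```python
def bind_mount_argv(source: str, target: str) -> list:
    """Build a bind-mount command line that skips path canonicalization.

    Hypothetical helper for illustration only; nothing is executed here.
    """
    return ["mount", "--no-canonicalize", "--bind", source, target]


# Made-up values: a magic-link path for an already validated fd, and a target.
argv = bind_mount_argv("/proc/1234/fd/7", "/var/lib/example/subpath-target")
```

With the flag present, mount is told to take the source path as given rather than resolving it to another path first.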
TOCTOU bugs are hard to find, as they usually only exist in complicated applications. File handling is a great place to look for these bugs, though; using files securely is unintuitive.