Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Fall of the machines: Exploiting the Qualcomm NPU (neural processing unit) kernel driver- 686

Man Yue Mo - GitHub Security Lab - Posted 4 Years Ago
  • The NPU (neural processing unit) is a co-processor on Qualcomm chips that is designed for AI and machine learning tasks. The NPU has a kernel driver that can be interacted with from user space on Samsung devices. Since this is Linux, the code is open source!
  • To interact with the driver, the file /dev/msm_npu is used. This driver has many IOCTL calls, such as allocating/unmapping a DMA buffer, loading/unloading a neural network model and several other operations. Most of the commands are synchronous, with a few being asynchronous.
  • Loaded NPU models are tracked in a statically sized global array of contexts, one for each job taking place. When calling npu_close, the client pointer is removed from the network.
  • Since this information is global, everything associated with old clients needs to be removed. By calling npu_close and an asynchronous npu_exec_network at close to the same time, the client is still used but the NPU state is never cleaned up! This leads to a use after free on a pointer in the global buffer. By replacing the freed client object with a fake object, arbitrary kernel functions can be called with one controlled parameter.
  • The next bug is very strange; it is like the code was never checked for functionality. While calling npu_exec_network_v2, stats_buf can be specified to collect some debugging information. But this never worked! Instead of passing the buffer address stored in the field, the code took the address of the field itself with an extra address-of operator (&): &kevt->reserved[0] should have been kevt->reserved[0].
  • The bug above led to the address of stats_buf being leaked rather than its contents being copied. This allowed the attacker to learn where this buffer was in memory and partially defeat KASLR. What a stupid bug that leads to another step in the chain.
  • The author noticed that an object was never being initialized and that some of its values were not guaranteed to be set either. By itself, this may not lead to any interesting bugs. However, when diving into this code further, this object was being copied back to user space, making it a good option for an information leak.
  • struct npu_kevent contained a union with four potential members. In C, a union is sized to fit its largest member. The largest member (uint8_t data[128]) is an auxiliary buffer of size 128. When the copy happens while a smaller union member is in use, such as struct msm_npu_event_execute_v2_done exec_v2_done, the rest of the data is never initialized.
  • Now, here is the best part: all of the bytes unused by the smaller member of the union will be copied over! This is because sizeof(struct msm_npu_event) is computed from the largest member of the union. So, even though the used parts of the union were initialized, the rest of the buffer was not. Damn, this is an awesome bug!
  • To bring this all together, the third vulnerability can be used to defeat KASLR and all other randomness. The second bug can be used to determine the address of stats_buf, which is important for creating a fake object. The first vulnerability, the use after free, can then be pointed at a fake object whose function pointer is called to get code execution.
  • Once code execution was achieved, the author needed to bypass control flow integrity (CFI). The goal was to call __bpf_prog_run32 with a pointer to bytecode that should be executed in the kernel. Since the parameters were not set up properly, they needed to find a function that gave control of the second parameter. Moving from parameter 1 to parameter 2 was easy because of the large number of small wrappers in Linux.
  • Overall, these bugs were difficult to spot; they were found either by code review or by accident while somebody was intentionally hunting in this driver. For me, if I see a union or global variables being shared, I'll make sure to check out this flow. Great article!