The article opens with a philosophy on the art of exploit development. In particular, it goes over the development process at Apple and how Apple tries to mitigate bugs with secure architecture design, exploit mitigations and (recently) fuzzing. However, a determined attacker (with resources) will still find bugs and exploit them. This project took Ian Beer, working alone, 6 months from start to finish.
On the reversing side of things, Ian Beer had a special copy of an iOS beta build that still had symbols in it, which made reverse engineering much easier. Mr. Beer looked through this build in a decompiler, paying particular attention to memmove operations.
While reversing, the function IO80211AWDLPeer::parseAwdlSyncTreeTLV stood out because, in the author's words, "TLVs (Type, Length, Value) are often used to give structure to data, and parsing a TLV might mean it's coming from somewhere untrusted". From this (even without knowing AWDL), he took a closer look at the code.
The code did some parsing of the data sent to it. Ian looked at the error messages for length checks (a normal place to find vulns). He noticed that the errors being reported were not fatal. This means that the error would happen but the code would just CONTINUE! This reminds me of the "There's a Hole in the Boot" bug from a while ago.
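The non-fatal length check described above can be sketched as follows. This is a toy Python model, not Apple's code: the TLV layout (1-byte type, 2-byte little-endian length), the `MAX_VALUE_LEN` limit, the type value and the function names are all assumptions made for illustration.

```python
import struct

MAX_VALUE_LEN = 8  # hypothetical limit the parser is supposed to enforce


def parse_tlv(data: bytes):
    """Split one TLV into (type, value). Assumed layout: 1-byte type, 2-byte LE length."""
    tlv_type, length = struct.unpack_from("<BH", data, 0)
    return tlv_type, data[3:3 + length]


def handle_tlv_buggy(data: bytes, log: list) -> bytes:
    """Logs an oversized value but keeps going, so the bad data is still used."""
    _, value = parse_tlv(data)
    if len(value) > MAX_VALUE_LEN:
        log.append("error: value too long")  # non-fatal: no return, no exception
    return value  # handed to the caller regardless of the check


def handle_tlv_fixed(data: bytes, log: list) -> bytes:
    """Same check, but the error path actually stops processing."""
    _, value = parse_tlv(data)
    if len(value) > MAX_VALUE_LEN:
        log.append("error: value too long")
        raise ValueError("TLV value exceeds MAX_VALUE_LEN")
    return value


# A TLV whose value is 12 bytes long, 4 more than the assumed limit.
oversized = struct.pack("<BH", 0x01, 12) + b"A" * 12
log = []
accepted = handle_tlv_buggy(oversized, log)
```

In the buggy handler the error is logged, yet all 12 bytes are still returned to the caller. The fixed variant differs only in treating the error as fatal.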
Using the information from above, the author attached a debugger on macOS (this driver exists on both macOS and iOS) and altered the payload being sent to another macOS computer. Barely altering this TLV object was enough to cause a kernel panic from a buffer overflow. And we have a bug!
As a note on reversing, the control flow graphs seem to have helped Ian quite a bit in finding this bug. Seeing code from different angles allows different problems to be found. Additionally, this is a remotely reachable service: the target is AWDL, an Apple-proprietary mesh networking protocol. Finally, the author read several papers on AWDL and built off the work of others to get this to actually work.
AWDL allows iDevices to connect to mesh networks automatically. "Chances are that if you own an Apple device you're creating or connecting to these transient mesh networks multiple times a day without even realizing it." That is crazy to think about, for an attack surface.
AWDL is built ON the WiFi layer of the radio. Most people (including myself) think about the infrastructure network of the home being connected to. However, the WiFi layer has much more than just this. The idea (behind the protocol) is that if the WiFi signal is already there going to ALL devices, why not just talk to the devices directly (peer-to-peer)? This is what AWDL does.
An interesting thing to note is that the AWDL protocol allows devices to BOTH be on the infrastructure network AND participate in the peer-to-peer communication on the same network interface. This is done by time sharing in 16ms intervals. Although this could miss frames, radio is considered to be unreliable anyway (meaning that handling this missed data is already part of the protocol).
There is quite a bit more background on the attack surface, the protocol, reversing and MUCH more. However, for the sake of conciseness, it will not be included here. Only interesting notes will be included from these points from now on.
The vulnerability discovered is a linear heap buffer overflow in the storage of MAC addresses. There is an inline array that should hold a maximum of 10 entries, but this limit is not enforced. By adding a bunch of peers to the mesh network, it is possible to corrupt many different sections of data.
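To make the overflow concrete, here is a minimal simulation of an inline array with room for 10 MAC addresses that is filled without a bounds check. Everything about the layout is an assumption for illustration: the toy "heap" is a `bytearray`, and the 8-byte field placed right after the array is hypothetical, not the real kernel structure.

```python
MAC_LEN = 6
MAX_PEERS = 10                       # capacity of the inline array, per the text
ARRAY_BYTES = MAC_LEN * MAX_PEERS    # 60 bytes reserved for MAC addresses

# Toy "heap": the MAC array followed directly by an unrelated 8-byte field.
heap = bytearray(ARRAY_BYTES) + (0x1122334455667788).to_bytes(8, "little")


def store_macs(memory: bytearray, macs: list, check_bounds: bool) -> int:
    """Copy MAC addresses into the inline array; return how many were stored."""
    stored = 0
    for i, mac in enumerate(macs):
        if check_bounds and i >= MAX_PEERS:
            break                    # the limit that the vulnerable code lacks
        offset = i * MAC_LEN
        memory[offset:offset + MAC_LEN] = mac  # same-length slice write, no resize
        stored += 1
    return stored


# Eleven peers: one more than the array can hold.
macs = [bytes([0x41 + i]) * MAC_LEN for i in range(11)]

safe_heap = bytearray(heap)
buggy_heap = bytearray(heap)
safe_count = store_macs(safe_heap, macs, check_bounds=True)
buggy_count = store_macs(buggy_heap, macs, check_bounds=False)

adjacent_before = bytes(heap[ARRAY_BYTES:])
adjacent_safe = bytes(safe_heap[ARRAY_BYTES:])
adjacent_buggy = bytes(buggy_heap[ARRAY_BYTES:])
```

With the check enabled the neighbouring field is untouched. Without it, the eleventh MAC address lands on top of the first six bytes of that field, which is the linear overflow in miniature.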
To perform the actual exploitation, libpcap was used to inject raw 802.11 (WiFi) frames. He also used a Raspberry Pi for the network data transfer. An additional note on debugging the kernel driver: both the macOS kernel debugger and DTrace were used to dynamically understand the kernel object allocation. Overall, setting up an injection framework for WiFi took a significant amount of time and understanding.
The added mitigation of Pointer Authentication Codes (PAC) made the exploitation much more difficult by cryptographically signing function pointers. With the original POC, it was possible to overwrite a vtable function pointer. However, this no longer trivially works because of PAC. PAC validates the signature AND the function prototype. To bypass PAC, either a signing gadget needs to be found, or a function with a similar prototype and a leaked vtable signature needs to be used. It should be noted that PAC is ONLY for function pointers and NOT regular pointers.
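The idea that a signature is tied to both the pointer and its prototype can be modelled in a few lines. This is only a software toy under stated assumptions: real PAC is a hardware feature, and the HMAC construction, the `KEY`, the 8-byte tag and the context constants below are all hypothetical stand-ins.

```python
import hashlib
import hmac

KEY = b"toy-per-boot-key"    # stands in for the hardware key; hypothetical
CTX_HANDLER = 0x1111         # hypothetical discriminator for one prototype
CTX_OTHER = 0x2222           # hypothetical discriminator for another prototype


def sign(pointer: int, context: int) -> tuple:
    """Return (pointer, tag); the tag binds the pointer to its context."""
    msg = pointer.to_bytes(8, "little") + context.to_bytes(8, "little")
    tag = hmac.new(KEY, msg, hashlib.sha256).digest()[:8]
    return pointer, tag


def authenticate(signed: tuple, context: int) -> int:
    """Return the raw pointer if the tag matches this context, else raise."""
    pointer, tag = signed
    _, expected = sign(pointer, context)
    if not hmac.compare_digest(tag, expected):
        raise ValueError("PAC check failed")
    return pointer


def passes(signed: tuple, context: int) -> bool:
    """Convenience wrapper: True if authentication succeeds."""
    try:
        authenticate(signed, context)
        return True
    except ValueError:
        return False


legit = sign(0x1000, CTX_HANDLER)
forged = (0x2000, legit[1])               # overwritten pointer, stale tag
other_valid = sign(0x3000, CTX_HANDLER)   # different function, same prototype

results = {
    "legit": passes(legit, CTX_HANDLER),
    "forged": passes(forged, CTX_HANDLER),
    "wrong_prototype": passes(legit, CTX_OTHER),
    "swapped_same_prototype": passes(other_valid, CTX_HANDLER),
}
```

In this model a plain overwrite fails and so does reuse under a different prototype, while substituting another validly signed pointer with the same prototype still passes. That last case mirrors the second bypass route mentioned above.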
For grooming the heap, there is a mitigation in place that returns chunks of the SAME SIZE in random order to the user. However, in this case, the ordering was not important as long as a particular type of object was being used.
For building primitives, a "safe object" needs to be found. The following is what is considered a safe object:
- arbitrary size
- unlimited allocation quantity
- allocation has no side effects
- controlled contents
- contents can be safely corrupted
- can be free'd at an arbitrary, controlled point, with no side effects
Previous iOS research gives a good idea of what such objects look like. But the remote attack surface makes finding them an entirely new ball game.
The OG leak technique did not work (relative overwrite of pointer) because of a boundary issue. However, after updating to the newest version of iOS, some new fields were added that created new hope. After a crazy amount of reversing the code, an arbitrary read primitive could be constructed in order to leak data from a buffer by overwriting a pointer.
Getting the first leak is difficult; if everything is randomized, then where do you go? Either relative overwrites or targeting places with insufficient randomness is your best bet. By allocating a BUNCH of heap data (referenced as zone) with the frames, we can create enough memory, then safely read from this section to leak arbitrary data. In particular, the author uses the PAC pointer to leak a KASLR address.
Two other bugs were accidentally discovered. The first one was a double free vulnerability for a pointer on the steering_msg_blob object. It can be triggered by DOING NOTHING. The second bug was an integer underflow on parsing of a length value.
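The second bug's class, an integer underflow on a length, can be illustrated generically. The sketch below is not the actual AWDL code: the 16-bit width and the 4-byte `HEADER_LEN` are assumptions chosen only to show the wraparound.

```python
HEADER_LEN = 4  # hypothetical fixed header size


def payload_len_u16(total_len: int) -> int:
    """Emulate `uint16_t payload = total_len - HEADER_LEN` with wraparound."""
    return (total_len - HEADER_LEN) & 0xFFFF


def payload_len_checked(total_len: int) -> int:
    """Reject lengths shorter than the header before subtracting."""
    if total_len < HEADER_LEN:
        raise ValueError("length shorter than header")
    return total_len - HEADER_LEN


normal = payload_len_u16(10)   # 6
wrapped = payload_len_u16(2)   # 65534: a tiny length becomes a huge one
```

A length of 2 is smaller than the header, so the unchecked subtraction wraps to 65534, and any later copy sized by that value would run far past the real data.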
What now? Time to create an arbitrary write primitive! By using the leak from before, a relative arbitrary add can be created in order to write to wherever we want. This primitive ADDs whatever value we specify to the current location in memory. Using memory leaks, it is possible to turn this into a FULL arbitrary write.
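The step from "arbitrary add" to "arbitrary write" is simple arithmetic: read the current value, then add the difference modulo 2^64. Below is a minimal sketch, assuming a toy 64-bit memory that exposes only a read and an add. The `ToyKernel` class and its method names are hypothetical, and the real primitive in the article has more constraints than this.

```python
MASK64 = (1 << 64) - 1


class ToyKernel:
    """Sparse 64-bit memory exposing only the two primitives from the text."""

    def __init__(self):
        self._mem = {}

    def read64(self, addr: int) -> int:
        """Arbitrary read (the leak)."""
        return self._mem.get(addr, 0)

    def add64(self, addr: int, delta: int) -> None:
        """Arbitrary add: memory[addr] += delta, wrapping at 64 bits."""
        self._mem[addr] = (self.read64(addr) + delta) & MASK64


def write64(kernel: ToyKernel, addr: int, value: int) -> None:
    """Build a full write from read + add: add the wrapped difference."""
    current = kernel.read64(addr)
    kernel.add64(addr, (value - current) & MASK64)


kernel = ToyKernel()
kernel.add64(0x1000, 0xDEADBEEF)       # pre-existing contents at the target
write64(kernel, 0x1000, 0x41414141)    # smaller than current, so delta wraps
after_write = kernel.read64(0x1000)
```

Because the addition wraps at 64 bits, the same trick works whether the desired value is larger or smaller than what is currently stored.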
However, there is an issue... if we write, the program crashes RIGHT afterwards unless a specific if statement is taken, which requires specific timing. Getting this branch taken reliably required a bunch of playing around.
With a true read-write primitive in the kernel, what is next in order to get code execution? Popping a calc was done by altering a kernel object being sent back to userspace. Once this function pointer (without PAC going to userspace) is altered, we can pop a Calc :) From here, the author goes into making this exploit FASTER by using different types of settings.
There is one final trick: getting the AWDL interface to be open. This can be done by brute forcing a contact ID (last 2 bytes of a SHA-256 hash). Once this is open, we can launch our full attack.
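Brute forcing is feasible here because a 2-byte value has only 65,536 possibilities. The sketch below enumerates them all. It is an offline illustration only: the identifier string is made up, the truncation follows the description above, and the comparison stands in for the target device reacting to a matching value rather than for anything the attacker can read directly.

```python
import hashlib


def contact_tag(identifier: str) -> bytes:
    """Last 2 bytes of SHA-256 over the identifier (truncation as described above)."""
    return hashlib.sha256(identifier.encode()).digest()[-2:]


def brute_force_tags():
    """Yield every possible 2-byte tag; there are only 2**16 of them."""
    for candidate in range(1 << 16):
        yield candidate.to_bytes(2, "big")


def attempts_until_match(target_tag: bytes) -> int:
    """Count how many candidates are tried before one matches the target."""
    for attempts, tag in enumerate(brute_force_tags(), start=1):
        if tag == target_tag:
            return attempts
    raise AssertionError("unreachable: all 65536 tags were tried")


secret_tag = contact_tag("victim@example.com")  # hypothetical contact identifier
attempts = attempts_until_match(secret_tag)
```

Whatever the contact is, the search ends within 65,536 attempts, which is why a truncated hash offers little protection against an attacker who can simply try every value.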
The rest of the article goes into turning the kernel read-write primitive into a fake application that will steal all of the data from the phone.
This is an amazing piece of engineering! To me, the takeaways are endless: from reversing to exploit dev, it is tremendous. I see that setting up a crazy tech stack for the attack and the massive amount of reversing are what make the exploit doable. In the future, it might be wise to look into harder-to-reach attack surfaces, such as radio protocols.