Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

The State of State Machines - 380

Natalie Silvanovich - Project Zero (P0) Posted 5 Years Ago
  • Different technologies have different threat models, depending on their usage and sensitivity. In most scenarios, forcing a connection is not a huge deal. On phones, however, being able to force audio or video on a call is a serious breach of privacy.
  • Many mobile apps for Real Time Communication (RTC), such as Signal, Google Duo and Facebook Messenger, are built on the WebRTC standard. Using WebRTC for calls and video works well, but these apps can still be hit by security issues. WebRTC leaves it to the app developers to track internal call state for the user, such as whether they have consented to a call.
  • In Signal, the application logic did not check that the user accepting the call was actually the callee. Because of this, an attacker could go through the normal routine for placing a call and then force the connection to be accepted on the victim's behalf. By the way, Signal is on GitHub!
  • The second issue was found in both JioChat and Mocha. Both applications tried to speed up the transition from accepting a call to being on the call. To do this, they used a non-WebRTC client to pre-establish the connection before the call was answered. At that point, the only thing missing was a candidate (a network endpoint to connect to). Once this was added, the call would connect.
  • With WebRTC, however, candidates can also be supplied as part of the initial connection setup. Because this was not the expected workflow, the client had no logic to handle it! So, a phone could be connected to automatically just by including the candidate object in the initial WebRTC call.
  • Google Duo suffered from a race condition during call startup. Upon making a connection, the application tells the state machine not to allow video/audio (since the callee has not consented to it yet). However, because this call is made asynchronously, a few video frames could be sent before the state machine was updated.
  • It took the author two weeks to figure out a way to win the race. Eventually, she found that by adding multiple candidates to the connection (multiple ways to send data) and sending many messages at once, the race could be won, leaking a few frames of data.
  • At the end of this, the author makes a point about asynchronous call issues: "However, asynchronous calls make it more difficult to model how a state machine will behave in all situations, so it is important to be cautious about adding asynchronous calls to WebRTC signalling."
  • Facebook Messenger was an entirely different beast! All of the previous apps had Java bindings (which decompile nicely), but Messenger used C++ bindings with a statically linked and stripped library. A good section of the article is just about reverse engineering the messaging app itself.
  • The author noticed that when a track connection is added, it is put into an inactive state immediately, prior to the call being answered. There are some really nice diagrams on how all of these connections work too.
  • In this state, several of the incoming message types are completely ignored, but some are still processed, such as SdpUpdate. This message is part of a feature used when a second device joins the call (for example, a phone and a laptop both connecting). Using this functionality, it is possible to force a connection!
  • Overall, state handling is hard. Memory corruption bugs are awesome and follow universal patterns; hence, they can sometimes be found with a lesser understanding of the system. Logic bugs like these require a very deep understanding of the ecosystem, but are fruitful once that understanding is in place.
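
To make the bug classes above concrete, here is a minimal sketch of a callee-side call state machine. It is not taken from any of the apps discussed; the class IncomingCall and its methods on_accept, on_candidate and send_frame are hypothetical names invented for illustration. It assumes three things: only the callee may accept a call, an early candidate is stored but never treated as consent, and the consent check happens synchronously at the moment a frame would be sent.

```python
from enum import Enum, auto


class CallState(Enum):
    RINGING = auto()
    CONNECTED = auto()
    ENDED = auto()


class IncomingCall:
    """Hypothetical callee-side call session (illustrative only)."""

    def __init__(self, caller_id: str, callee_id: str) -> None:
        self.caller_id = caller_id
        self.callee_id = callee_id
        self.state = CallState.RINGING
        self.pending_candidates: list[str] = []

    def on_accept(self, sender_id: str) -> bool:
        # Only the callee may accept, and only while the call is ringing.
        # Skipping the sender check is the Signal-style logic bug.
        if sender_id != self.callee_id:
            return False
        if self.state is not CallState.RINGING:
            return False
        self.state = CallState.CONNECTED
        return True

    def on_candidate(self, candidate: str) -> bool:
        # Candidates may arrive before the call is answered. Store them,
        # but never let their arrival change the call state.
        if self.state is CallState.ENDED:
            return False
        self.pending_candidates.append(candidate)
        return True

    def send_frame(self, frame: bytes) -> bool:
        # Consent is checked synchronously at send time, so there is no
        # window in which media flows before the state is updated.
        return self.state is CallState.CONNECTED


if __name__ == "__main__":
    call = IncomingCall(caller_id="mallory", callee_id="alice")
    call.on_candidate("hypothetical-candidate-string")
    print(call.send_frame(b"frame"))   # False: still ringing
    print(call.on_accept("mallory"))   # False: the caller cannot accept
    print(call.on_accept("alice"))     # True: the callee consents
    print(call.send_frame(b"frame"))   # True: media is now allowed
```

The design choice worth noticing is that every transition is guarded by both the current state and the identity of the sender, and that media is gated by a single check on the state itself rather than by a flag that some other asynchronous task is expected to flip later.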