The Nintendo Entertainment System (NES) was built in an era of CRT TVs, where rendering is entirely different than on LED displays. Most graphical changes happen during a blanking period, so there is an interrupt to ensure this is the case. The VBlank interrupt is a Non-Maskable Interrupt (NMI).
The console also has Interrupt Requests, or IRQs for short. Depending on the mode the game is in, the IRQ will behave differently. Additionally, the NES keeps logical blocks of code and assets in banks, where only one bank can be loaded at a time.
The NMI swaps out the PRG bank during graphics changes. By the end of the NMI, the proper banks are swapped back in. What if we could trick code into running with the improper banks loaded? This is how the vulnerability works!
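To make the bank idea concrete, here is a toy Python model. It is only a sketch: `BankedWindow`, `nmi` and the `hijacked` flag are made-up names for illustration, and the real mapper hardware is far more involved. The point is just that the same address returns different contents depending on which bank is currently selected.

```python
class BankedWindow:
    """One fixed address window with several swappable banks behind it (hypothetical model)."""

    def __init__(self, banks):
        self.banks = banks
        self.current = 0  # index of the bank currently mapped in

    def select(self, index):
        self.current = index

    def read(self, offset):
        # The same offset yields different contents depending on the mapped bank.
        return self.banks[self.current][offset]


def nmi(window, graphics_bank, hijacked=False):
    """Swap in the graphics bank, then (normally) restore the original bank.

    Returns what the CPU would fetch from offset 0 once this function is done.
    """
    saved = window.current
    window.select(graphics_bank)
    if hijacked:
        # Execution escapes before the restore: the wrong bank stays mapped.
        return window.read(0)
    window.select(saved)
    return window.read(0)


window = BankedWindow([["gameplay code"], ["graphics code"]])
normal = nmi(window, graphics_bank=1)
hijacked = nmi(window, graphics_bank=1, hijacked=True)
```

In the normal case the fetch sees the gameplay bank again; in the hijacked case it sees whatever the graphics bank happens to hold at that address.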
DPCM audio samples can corrupt controller inputs. This is because a register is shifted one too many times. Since the DMA read is asynchronous and this is a hardware issue, games must work around it. The most common fix was to simply poll the controller over and over again until the same buttons were seen twice in a row.
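The "read until two reads agree" workaround looks roughly like the sketch below. This is Python rather than 6502 assembly, `read_once` is a hypothetical stand-in for one full controller read, and `max_attempts` is my own addition so the example always terminates (as described above, the real routine just keeps polling).

```python
def read_controller_stable(read_once, max_attempts=1000):
    """Re-read the controller until two consecutive reads match.

    Returns (buttons, rereads). buttons is None if no two consecutive
    reads matched within max_attempts re-reads.
    """
    previous = read_once()
    for attempt in range(1, max_attempts + 1):
        current = read_once()
        if current == previous:
            return current, attempt
        previous = current
    return None, max_attempts


# One corrupted read (0x81) followed by clean reads of 0x80.
reads = iter([0x81, 0x80, 0x80])
buttons, rereads = read_controller_stable(lambda: next(reads))
```

Here the corrupted first read is thrown away, and the routine settles on `0x80` after two re-reads.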
So, what's the bug? By changing buttons at a rate of 8K inputs per frame, we can trick the controller polling code into being stuck forever! Paired with an interrupt, this leads to a situation where code from a bank never intended to be executed in this context will be run!
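A small simulation of why the polling loop never settles. Again, this is a hedged sketch: `polls_until_stable` is a hypothetical helper, and the budget of 8000 reads is only there to keep the example finite.

```python
import itertools


def polls_until_stable(reads, budget):
    """Return how many re-reads were needed, or None if `budget` re-reads never matched."""
    previous = next(reads)
    for count in range(1, budget + 1):
        current = next(reads)
        if current == previous:
            return count
        previous = current
    return None


# A human holds a button steady, so every read returns the same value.
human = itertools.repeat(0x80)
# The TAS flips the input between every single read, so no two reads agree.
tas = itertools.cycle([0x80, 0x00])

human_result = polls_until_stable(human, budget=8000)
tas_result = polls_until_stable(tas, budget=8000)
```

The steady input settles on the first re-read, while the flipping input burns through the whole budget without ever producing two matching reads.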
By some miracle, the code runs fine. Eventually, an RTS instruction pops a return address off the stack and execution lands at 0x0000. The NMI continues to happen every frame and records button inputs to $17, $18, $F5, $F6 and $F8. Through careful planning, the controller inputs can be used to write somewhat arbitrary assembly to execute.
$17 holds the buttons currently held on controller 1 and $18 holds the newly pressed buttons, with one bit per button. $F5, $F6 and $F8 have similar limitations to $17/$18. This limits which bytes can be used for the second byte. Additionally, left and right, as well as up and down, cannot be held at the same time, further limiting the available instructions.
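These restrictions can be sketched as a filter over candidate bytes. Two assumptions here are mine, not the article's: the bit layout (Right in bit 0 up to A in bit 7) and the rule that a newly pressed button must also appear in the held byte. The helper names are hypothetical.

```python
# Assumed bit layout (not confirmed by the write-up): Right is bit 0 ... A is bit 7.
RIGHT, LEFT, DOWN, UP, START, SELECT, B, A = (1 << bit for bit in range(8))


def held_is_possible(held):
    """Reject bytes that would need left+right or up+down held together."""
    if held & LEFT and held & RIGHT:
        return False
    if held & UP and held & DOWN:
        return False
    return True


def pair_is_possible(held, new):
    """`held` models $17 and `new` models $18.

    Assumption: every newly pressed button is also held, so `new`
    may only contain bits that are set in `held`.
    """
    return held_is_possible(held) and (new & ~held & 0xFF) == 0


usable_held_bytes = [value for value in range(256) if held_is_possible(value)]
```

Under these assumptions only 144 of the 256 possible values can appear in the held byte, and the second byte is restricted even further since it must be a subset of the first.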
With these limitations in mind, our goal is to warp to the end credits. There are 6 criteria that need to be met, and 3 of them, the ones relating to banks, are already satisfied once we start. Of the rest, the first is that the stack pointer must be larger than 0x30, the second is that the NMI mode at address $100 must be 0x20, and the third is that we need to jump to $B85A.
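The three criteria spelled out above can be written down as a simple check. This is only a sketch of those three conditions; the function and parameter names are hypothetical, and the three bank-related criteria are not modeled because they are not detailed here.

```python
CREDITS_ENTRY = 0xB85A    # jump target named above
REQUIRED_NMI_MODE = 0x20  # value expected at address $100
MIN_STACK_POINTER = 0x30  # the stack pointer must be larger than this


def credits_warp_ready(stack_pointer, nmi_mode, jump_target):
    """Check only the three non-bank criteria listed in the text."""
    return (
        stack_pointer > MIN_STACK_POINTER
        and nmi_mode == REQUIRED_NMI_MODE
        and jump_target == CREDITS_ENTRY
    )


ready = credits_warp_ready(stack_pointer=0xFA, nmi_mode=0x20, jump_target=0xB85A)
not_ready = credits_warp_ready(stack_pointer=0x10, nmi_mode=0x20, jump_target=0xB85A)
```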
Previous versions of the TAS had to work around the limitations above. However, the author found a special case: bytes 0x0-0x2 are used as scratch addresses at the end of an NMI. They happen to hold controller inputs, INCLUDING the conflicting inputs. By using this property, we have more control over these two bytes, which happens to be enough :)
The TAS is 3 frames of gameplay long. Here is what happens:
- Write JSR $9000 at the scratch address using two controllers. Using only controller inputs, put a value of 0xFA into the SP register.
- The next NMI occurs and writes our controller inputs to the stack. This time, our inputs result in JSR $0000 being executed.
- JSR $9000 is executed from the previous write after our jump occurs. Since the SP is sane, this works.
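For reference, this is how the two JSR instructions look as raw bytes. On the 6502, JSR with an absolute address is opcode 0x20 followed by the target in little-endian order, so each instruction is exactly three bytes long. The helper name `encode_jsr` is made up for this sketch.

```python
JSR_ABSOLUTE = 0x20  # 6502 opcode for JSR with an absolute address


def encode_jsr(target):
    """Encode `JSR target` as opcode, low address byte, high address byte."""
    if not 0 <= target <= 0xFFFF:
        raise ValueError("target must be a 16-bit address")
    return bytes([JSR_ABSOLUTE, target & 0xFF, target >> 8])


jsr_9000 = encode_jsr(0x9000)  # bytes 0x20, 0x00, 0x90
jsr_0000 = encode_jsr(0x0000)  # bytes 0x20, 0x00, 0x00
```

Three bytes is also exactly the size of the 0x0-0x2 scratch range, which is why a full JSR fits there.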
The video for this explains a slightly simplified version, which is what the example is based on. However, the concepts are the same. A funny change they made was using a different version of the game because the addresses are slightly different.
Overall, the article and video are amazing resources! Beating SMB3 in less than a second is hilarious and I very much enjoyed learning about this. From the vulnerability itself to making the exploit work, it's truly magic :)