Microsoft Office lets users embed videos from external sources, such as YouTube, in Word documents via the Online Videos feature. When a video is embedded in the document, Office uses a regex to check that the link comes from a trusted source.
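To make the trust check concrete, here is a minimal sketch of a regex allowlist for video links. The pattern and the function name `is_trusted_video_url` are assumptions for illustration only; the regex Office actually uses is not given here.

```python
import re

# Hypothetical allowlist pattern (an assumption, not the real Office regex).
TRUSTED_VIDEO_URL = re.compile(
    r"https://(www\.)?youtube\.com/watch\?v=[A-Za-z0-9_-]{11}"
)


def is_trusted_video_url(url: str) -> bool:
    """Return True only if the entire URL matches the allowlist pattern."""
    # fullmatch anchors both ends, so lookalike hosts such as
    # youtube.com.evil.example are rejected.
    return TRUSTED_VIDEO_URL.fullmatch(url) is not None
```

A check like this only vouches for where the video is hosted. It says nothing about the metadata that the host later returns, such as the title.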
If the link passes the check, Office makes a web request to fetch the video's title and other metadata. While processing the title, it inserts the value into an iFrame tag without any input validation.
This is a classic HTML injection vulnerability via the title within the iFrame. A double quote in the title breaks out of the iFrame attribute it is placed in, which lets an attacker add attributes of their own. The payload simply begins with " onload=..."
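The attribute breakout can be sketched in a few lines. This is an illustration under stated assumptions, not Office's code: the attribute name `title`, the helper names `build_iframe_unsafe` and `build_iframe_safe`, and the sample values are all hypothetical. The first helper interpolates the title as-is; the second escapes it with `html.escape` from the Python standard library.

```python
import html


def build_iframe_unsafe(src: str, title: str) -> str:
    # The title is inserted verbatim, so a double quote inside it
    # closes the attribute early and the rest is parsed as new attributes.
    return f'<iframe src="{src}" title="{title}"></iframe>'


def build_iframe_safe(src: str, title: str) -> str:
    # html.escape(..., quote=True) rewrites " as &quot;, so the whole
    # value stays inside the attribute it was meant for.
    return (
        f'<iframe src="{html.escape(src, quote=True)}" '
        f'title="{html.escape(title, quote=True)}"></iframe>'
    )


# Hypothetical inputs: a placeholder embed URL and a title carrying a
# harmless onload handler.
src = "https://www.youtube.com/embed/VIDEO_ID"
title = 'Demo" onload="console.log(1)'

unsafe_markup = build_iframe_unsafe(src, title)
safe_markup = build_iframe_safe(src, title)
```

In `unsafe_markup` the `onload` handler appears as a real attribute of the tag, while in `safe_markup` it remains inert text inside the title value.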
Running arbitrary JavaScript inside the iFrame is game over for Word. The injected handler can request a script from an attacker-controlled location and execute it dynamically, which leads to RCE.
The malicious JavaScript that is returned can then be used to launch arbitrary applications. The example JS is window.open("calculator://. This does require some user interaction, but it is nonetheless interesting to see XSS in such a weird context.