zkSync Lite is a zkRollup L2. The operator submits a proof attesting that a batch of transactions moves the state from the old root to the new root. The L1 does not re-execute every transaction; it only verifies the proof. In practice, this means that any bug that lets an invalid state transition satisfy the circuit becomes the on-chain truth once proven.
zkSync Lite processes each operation in chunks, and the circuit iterates over these chunks during verification. The first chunk handles state mutations, such as balances and nonces. The middle chunk handles pubdata consistency. The final chunk handles fee accounting. Because these checks live in separate places in the code, two definitions of "valid" emerged: one for mutation validity (chunk 1) and another for tx validity (signatures, timestamps, etc.).
This mismatch between the two notions of "valid" is what causes the bug. The ChangePubKey operation sets the account's L2 signing key. On the L1, the contract verifies that the pubkey change uses the nonce from the pubdata. Equality with pub_nonce is NOT checked as part of tx validity, but IS checked as part of mutation validity. When handling the fees, the circuit consulted only tx validity rather than both.
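To make the split concrete, here is a minimal Python sketch of the two predicates. It is a toy model, not the actual circuit code: the names `ChangePubKeyOp`, `account_nonce`, `signature_ok`, `tx_valid` and `mutation_valid` are hypothetical, and the assumption that mutation validity also includes the tx checks is mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangePubKeyOp:
    """Toy stand-in for a ChangePubKey operation (all field names are hypothetical)."""
    account_nonce: int  # nonce currently stored for the account
    pub_nonce: int      # nonce carried in the pubdata
    signature_ok: bool  # stand-in for the signature / timestamp checks


def tx_valid(op: ChangePubKeyOp) -> bool:
    # "tx validity": signatures, timestamps, etc.
    # Note what is missing: pub_nonce is never compared to anything here.
    return op.signature_ok


def mutation_valid(op: ChangePubKeyOp) -> bool:
    # "mutation validity": assumed here to include the tx checks
    # plus the pub_nonce equality check.
    return tx_valid(op) and op.pub_nonce == op.account_nonce


# An operation with a mismatched pubdata nonce: the two predicates disagree.
bad_op = ChangePubKeyOp(account_nonce=7, pub_nonce=8, signature_ok=True)
print(tx_valid(bad_op), mutation_valid(bad_op))
```

Any code path that consults only `tx_valid` will treat `bad_op` as a normal, successful operation even though its state mutation is rejected.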
Putting these all together, it's possible to craft a ChangePubKeyOffchain transaction that the tx validity check accepts but the mutation validity check rejects. In the fee accrual chunk, the fee accounting then credits fees without a matching debit from the user or an increase of the nonce. In practice, this could be repeated within a malicious proof to mint unlimited funds as fees. The attack does appear to be permissioned, though, since it relies on control of the prover/sequencer/operator.
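The effect of gating fee accrual on the wrong flag can be simulated in a few lines of Python. This is a sketch under stated assumptions rather than the real circuit logic: the `State` fields, `process_change_pubkey`, the fee amount, and the `patched` switch (which gates fees on both flags) are all hypothetical illustrations of the behavior described above.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Toy rollup state (hypothetical fields, not the real account tree)."""
    user_balance: int
    user_nonce: int
    collected_fees: int = 0


def process_change_pubkey(state: State, fee: int, pub_nonce: int,
                          signature_ok: bool, patched: bool = False) -> None:
    # Two notions of validity, computed separately.
    tx_valid = signature_ok
    mutation_valid = tx_valid and pub_nonce == state.user_nonce

    # First chunk: the state mutation is applied only if the mutation is valid.
    if mutation_valid:
        state.user_balance -= fee
        state.user_nonce += 1

    # Final chunk: fee accrual. The vulnerable variant looks only at tx_valid;
    # the hypothetical patched variant requires both flags.
    fee_gate = (tx_valid and mutation_valid) if patched else tx_valid
    if fee_gate:
        state.collected_fees += fee


def run(rounds: int, patched: bool) -> State:
    state = State(user_balance=100, user_nonce=7)
    for _ in range(rounds):
        # Deliberately wrong pubdata nonce: tx-valid but not mutation-valid.
        process_change_pubkey(state, fee=10, pub_nonce=state.user_nonce + 1,
                              signature_ok=True, patched=patched)
    return state


print(run(5, patched=False))
print(run(5, patched=True))
```

In the vulnerable run, the collected fees grow by 10 per round (50 after five rounds) while the user's balance and nonce never change, so the fees are backed by nothing. In the patched run, the same operations accrue no fees.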
With ZK vulnerabilities, the most common issues are missing constraints. This one was instead a control-flow issue: a mismatch in what "valid" meant across different parts of the circuit led to the vulnerability. So, the next time a complicated set of operations confuses you, maybe it confused the devs, too!