HashiCorp Vault is a widely used tool for securely storing, generating and accessing secrets such as API keys, passwords or certificates. Centralizing secrets has real security benefits, but it also makes Vault a dream target: an attacker who gets access gets everything it holds.
Vault supports MANY authentication methods, including AWS and GCP cloud environments, LDAP or RADIUS, OpenID Connect and many more. Within AWS, this looks like giving specific functions access to data via an IAM role (temporary credentials). Setup goes two ways: create the role in AWS and register the role name with Vault.
When the Lambda function executes, it authenticates to Vault with a simple API call to /v1/auth/aws/login, which returns a short-lived token for the needed secret. Even though this appears to be secure, how does the endpoint validate roles? IAM is complex, and using it outside of the AWS ecosystem is difficult to do properly.
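To make the login call concrete, here is a minimal sketch of the body a client might send to /v1/auth/aws/login. The field names follow Vault's documented AWS auth parameters as I understand them; the role name, the header values and the helper build_login_payload are hypothetical, and the headers are placeholders rather than a real SigV4 signature.

```python
import base64
import json


def build_login_payload(role, signed_headers,
                        body="Action=GetCallerIdentity&Version=2011-06-15",
                        url="https://sts.amazonaws.com/"):
    """Wrap an already-signed STS request into a Vault AWS login payload."""
    def b64(text):
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    return {
        "role": role,
        "iam_http_request_method": "POST",
        "iam_request_url": b64(url),
        "iam_request_body": b64(body),
        "iam_request_headers": b64(json.dumps(signed_headers)),
    }


# Placeholder headers (assumption): a real client would compute these with SigV4.
example_headers = {
    "Host": ["sts.amazonaws.com"],
    "Authorization": ["AWS4-HMAC-SHA256 Credential=..."],
}
payload = build_login_payload("example-role", example_headers)
print(json.dumps(payload, indent=2))
```

Note that every field describing the STS request (method, URL, body, headers) is supplied by the client, which is why the server has to be careful about what it accepts.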
The login request carries a pre-signed request to AWS STS (signed with AWS SigV4), which Vault forwards to STS to learn who the caller is. However, Vault did not verify many things about it, such as the URL path, query, POST body or any HTTP headers. Because the attacker signs whatever request they choose, the signature does nothing to constrain these fields, and this lack of validation led to a complete compromise of the authentication.
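The kind of validation that was missing can be sketched as follows. This is an illustration of the idea, not Vault's actual fix: the function name, the exact header allowlist and the error strings are all assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit

ALLOWED_HOST = "sts.amazonaws.com"
# Hypothetical allowlist; anything else (for example Accept) is rejected.
ALLOWED_HEADERS = {
    "authorization", "content-type", "content-length", "host",
    "user-agent", "x-amz-date", "x-amz-security-token",
}


def validate_forwarded_request(method, url, body, headers):
    """Return a list of problems; an empty list means the request may be forwarded."""
    problems = []
    parts = urlsplit(url)
    if method != "POST":
        problems.append("method must be POST")
    if parts.scheme != "https" or parts.netloc != ALLOWED_HOST:
        problems.append("unexpected scheme or host")
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        problems.append("path or query not allowed")
    # Exactly one Action parameter, and it must be GetCallerIdentity.
    if parse_qs(body).get("Action") != ["GetCallerIdentity"]:
        problems.append("action must be GetCallerIdentity")
    for name in headers:
        if name.lower() not in ALLOWED_HEADERS:
            problems.append(f"header not allowed: {name}")
    return problems
```

With checks like these, a request that changes the action, adds a query string or sets an Accept header would be refused before it is ever sent to STS.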
The first two ideas did not pan out: altering the request URL itself (https://sts.amazonaws.com/:foo@example.com/test) and altering the Host header bore no fruit. So, we have looked at the request; how about the response?
Vault expects XML data from STS and only checks for a status code of 200. It never verifies that the response is the one it asked for, on the assumption that STS always sends back good data. Why is this bad? It turns out that STS also supports JSON responses and not just XML! And Go's XML decoder (Go is the language Vault is written in) is lenient: it skips over non-XML data instead of rejecting it and will attempt to parse whatever XML it finds in the input!
The AssumeRoleWithWebIdentity action exchanges JWTs from an OpenID Connect (OIDC) provider for AWS IAM credentials. SubjectFromWebIdentityToken is a response element of this action that echoes back the subject of the token.
So, again, what do we control: the path, the response type and some data being passed in. By choosing the STS action and requesting a JSON response, it was possible to send STS a token from our own OIDC provider and get back a controllable field of the JWT that could contain XML: sub. Of course, this required a custom IdP in order to work (which the author had to make).
What does this allow? We can make the response from STS contain attacker-controlled XML, which Vault then parses as the caller's identity, allowing authentication as arbitrary roles! We can now read all secrets in Vault that are reachable through AWS auth. All an attacker needs to know is the role name to use.
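For contrast, here is a small sketch of what strict response handling could look like, using Python's standard XML parser as a stand-in. The function name is hypothetical and the sample body is simplified (real STS responses also carry an XML namespace, omitted here); the point is only that a strict parser refuses a JSON body that merely contains XML.

```python
import xml.etree.ElementTree as ET

EXPECTED_ROOT = "GetCallerIdentityResponse"


def parse_identity_response(body):
    """Strictly parse an STS-style XML body; return the caller ARN or None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # JSON, or XML wrapped in other text, is rejected here
    if root.tag != EXPECTED_ROOT:
        return None  # well-formed XML, but not the response we asked for
    return root.findtext("GetCallerIdentityResult/Arn")


xml_body = (
    "<GetCallerIdentityResponse><GetCallerIdentityResult>"
    "<Arn>arn:aws:iam::123456789012:role/example</Arn>"
    "</GetCallerIdentityResult></GetCallerIdentityResponse>"
)
# A JSON document that smuggles the same XML inside a string value.
json_body = '{"sub": "' + xml_body + '"}'

print(parse_identity_response(xml_body))
print(parse_identity_response(json_body))
```

A lenient decoder that skips the leading JSON text would instead happily return the ARN from the second body, which is the behavior the attack relies on.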
Now, for the GCP equivalent... The GCP auth method strictly uses JWTs throughout. JWTs have MANY ways to go wrong, especially in environments with multiple hierarchies of keys (such as Google). People often see JWTs as a silver bullet, but improper usage can be destructive to the security of the system.
GCP has a multitude of ways to sign JWTs (with valid signatures). The verifying service assumes that a validly signed token carries claims vouched for by GCP, but it is possible to use a GCP-generated key under the attacker's control to sign arbitrary compute_engine claims, effectively acting as the metadata server. Of course, this JWT must have all the other expected fields for it to be accepted as a real VM.
Boom, the GCP version: an authentication bypass by creating a spoofed JWT.
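The key-hierarchy problem can be illustrated with a toy verifier. Everything here is an assumption made for the sketch: HMAC stands in for the asymmetric signatures GCP really uses, the key IDs and secrets are invented, and the nested google / compute_engine claim layout is my reading of how instance identity tokens are structured.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64url(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims, kid, key):
    """Create a toy JWT (HMAC stand-in for RS256) with the given key ID."""
    header = _b64url(json.dumps({"alg": "HS256", "kid": kid}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_token(token, keys):
    """Return the claims if the signature matches a key in `keys`, else None."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    key = keys.get(json.loads(_unb64url(header_b64)).get("kid"))
    if key is None:
        return None
    expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected), sig_b64):
        return None
    return json.loads(_unb64url(payload_b64))


# Hypothetical key stores: keys only Google holds vs. keys of user-managed accounts.
google_keys = {"google-1": b"google-secret"}
service_account_keys = {"attacker-sa": b"attacker-secret"}

forged = sign_token(
    {"google": {"compute_engine": {"project_id": "victim-project",
                                   "instance_name": "victim-vm"}}},
    kid="attacker-sa", key=b"attacker-secret",
)

mixed_pool = {**google_keys, **service_account_keys}
print(verify_token(forged, mixed_pool))   # accepted: any "valid GCP key" is trusted
print(verify_token(forged, google_keys))  # rejected: only Google's keys may vouch for a VM
```

The signature check itself is correct in both cases; what differs is which keys are allowed to vouch for which kind of claim.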
Quote in the conclusion - "In my experience, tricky vulnerabilities like this often exist where developers have to interact with external systems and services. A strong developer might be able to reason about all security boundaries, requirements and pitfalls of their own software, but it becomes very difficult once a complex external service comes into play. Modern cloud IAM solutions are powerful and often more secure than comparable on-premise solutions, but they come with their own security pitfalls and a high implementation complexity. As more and more companies move to the big cloud providers, familiarity with these technology stacks will become a key skill for security engineers and researchers and it is safe to assume that there will be a lot of similar issues in the next few years."
To me, the cloud computing revolution sounds great, but it comes with its own risk of high complexity. There are many bugs like this out there; we just have to look!