People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!
Agents Rule of Two: A Practical Approach to AI Agent Security
The agentic vision promises to improve our lives drastically through automation. There's a problem with this, though: prompt injection. If an agent can read untrusted data, access sensitive data, and then act on it, that combination becomes a significant problem: a prompt injection could trick the agent into disclosing sensitive information. An email bot is a good example to consider in this context.
If prompt injection can't be reliably prevented, how can we secure these agents? Meta created the Agents Rule of Two: an agent should satisfy no more than two of the following properties:
An agent can process untrustworthy inputs. To me, this one is the most sus, because attacker-controlled input is exactly where the unexpected happens.
An agent can have access to sensitive systems or private data.
An agent can change state or communicate externally.
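The rule boils down to a simple capability check, which you could sketch as a config-time guard. The property names and class below are my own illustration, not an API from Meta's article:

```python
# Illustrative sketch of the Agents Rule of Two as a capability check.
# Field names are hypothetical; this is not code from Meta's post.
from dataclasses import dataclass

@dataclass
class AgentCapabilities:
    processes_untrustworthy_inputs: bool   # e.g. reads arbitrary inbound email
    accesses_sensitive_data: bool          # e.g. can see the whole mailbox
    changes_state_or_communicates: bool    # e.g. can send replies externally

def satisfies_rule_of_two(caps: AgentCapabilities) -> bool:
    """An agent design is acceptable if at most two properties hold."""
    return sum([caps.processes_untrustworthy_inputs,
                caps.accesses_sensitive_data,
                caps.changes_state_or_communicates]) <= 2

# An email bot with all three capabilities violates the rule:
email_bot = AgentCapabilities(True, True, True)
print(satisfies_rule_of_two(email_bot))  # False

# Drop any one capability (say, external sending) and it passes:
read_only_bot = AgentCapabilities(True, True, False)
print(satisfies_rule_of_two(read_only_bot))  # True
```

The point isn't the code itself but the design review it encodes: enumerate the agent's capabilities up front and refuse the configuration where all three line up.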
If an agent possesses all three, its autonomy becomes a security risk. With any two of the three, however, an external attacker can't complete the full chain needed for data exfiltration or modification. They use the email example to explain why this works. The tl;dr: if all three are required for impact, then just don't do all three ;)
This isn't the end-all, be-all for securing LLM-based applications. It's a great defense-in-depth or secure-design measure, similar to sandboxing and binary protections like NX. There are other things to consider too, such as model-level protections against prompt injection. Great article and design principles!