Resources
People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!
When you use a major AI service like ChatGPT, there is more than one model that you might be talking to. How does the service decide which model to use? More AI! According to this post, a very fast neural network, known as the router, decides which backend model handles each request. Some of the backend models are more powerful, while others are less powerful.
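To make the idea concrete, here is a minimal sketch of what routing could look like. This is purely illustrative: the model names, the keyword list, and the scoring heuristic are all made up (a real router would be a trained neural network, not a hand-written rule).

```python
def route(prompt: str) -> str:
    """Pick a backend model for this prompt (hypothetical heuristic)."""
    # A real router is a small, fast neural network; this stand-in just
    # treats longer or more technical-looking prompts as "hard" and sends
    # them to the bigger (assumed) model.
    hard_keywords = {"prove", "derive", "debug", "analyze"}
    words = prompt.lower().split()
    score = len(words) / 50 + sum(1 for w in words if w in hard_keywords)
    return "big-model" if score >= 1.0 else "small-model"

print(route("hi"))  # trivial prompt -> "small-model"
print(route("derive and prove the convergence bound for this algorithm"))
```

The security concern below follows directly from this design: the router, not the user, decides which model answers, so anything that skews its score changes which safety behavior you get.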
This creates a potential security issue when it comes to jailbreaking: what if you can trick the router into using a less powerful model? If the router can be fooled, jailbreaking becomes much, much easier.
This is more of an abuse issue than anything else. You could likely get ChatGPT to generate inappropriate content, such as recipes for bombs. Being able to downgrade jailbreaking detection is interesting!