Question 1

What is an LLM jailbreak?

Accepted Answer

An LLM jailbreak is a prompt or sequence of prompts that bypasses a model's safety alignment, making it generate content it would normally refuse, such as harmful instructions. It works by overwhelming or reframing the model's safety training, for example assigning it an unrestricted persona, wrapping the request in fiction, or flooding the context with fake examples of the model already complying. There is no universal jailbreak, so attackers try several and retry.

Question 2

What is the difference between a jailbreak and prompt injection?

Accepted Answer

Prompt injection abuses the application layer: the model cannot tell developer instructions from user data, so injected text overrides the rules, often to leak a system prompt or hijack a tool. A jailbreak targets the model's alignment training itself, trying to remove the safety layer regardless of any application. They are frequently combined, but a jailbreak is about defeating safety, while injection is about defeating the boundary between instructions and data.

Question 3

What is the DAN jailbreak?

Accepted Answer

DAN, short for Do Anything Now, is a family of persona jailbreaks that instruct the model to role play as an unrestricted alternate version of itself, complete with rules and a penalty for breaking character, before delivering the real request. The persona gives the model a frame in which refusing feels like breaking the assigned role. Vendors patch known DAN variants, so the community constantly publishes new ones.

Question 4

Are LLM jailbreaks illegal?

Accepted Answer

Jailbreaking a model you are authorized to test, in a research or bug bounty context, is legal and valuable security work. Using a jailbreak to generate genuinely harmful content, or to attack a system you have no permission to test, can break laws and platform terms. As with all offensive techniques, the legality depends entirely on authorization and intent, so stay inside an explicit scope.

Blog

Career guides

Glossary

Certifications

Comparisons

Tools

Authors

Corporate training

Hire our talent

LLM Jailbreak

Why It Matters

How It Works

How to Test for It

Prevention

How We Teach LLM Jailbreak