Hacker ‘jailbreaks’ ChatGPT for information on making homemade bombs
The hacker says he used a trick that resulted in ChatGPT ignoring its safety guardrails and producing instructions for making powerful explosives.
As per a report by TechCrunch, the AI chatbot first refused to assist with the request, saying, “Providing instructions on how to create dangerous or illegal items, such as a fertiliser bomb, goes against safety guidelines and ethical responsibilities.”
The hacker, known as Amadon, claimed that the method involved engaging ChatGPT in a science-fiction game scenario where safety restrictions don’t apply, effectively “jailbreaking” the AI. He called his findings a “social engineering hack to completely break all the guardrails around ChatGPT’s output.”
The publication has not revealed the prompts used in the jailbreak or some of ChatGPT’s responses, deeming them potentially dangerous and citing concerns about the misuse of AI. Amadon believes that once the guardrails are breached, there is no limit to the information ChatGPT can provide.
“I’ve always been intrigued by the challenge of navigating AI security. With [Chat]GPT, it feels like working through an interactive puzzle — understanding what triggers its defenses and what doesn’t,” Amadon was quoted as saying.
“It’s about weaving narratives and crafting contexts that play within the system’s rules, pushing boundaries without crossing them. The goal isn’t to hack in a conventional sense but to engage in a strategic dance with the AI, figuring out how to get the right response by understanding how it ‘thinks,’” he added.
While Amadon reported his findings to OpenAI through its bug bounty program, the company responded that model safety issues are not suitable for such a program, as they are not easily fixable bugs.