Tech

Researchers create AI that can ‘jailbreak’ other chatbots

DIGITAL BUSINESS MAGAZINE, 01/2024


Researchers at Nanyang Technological University (NTU) in Singapore have created an artificial intelligence (AI) chatbot that can circumvent the protections on chatbots such as ChatGPT and Google Bard, coaxing them into generating forbidden content, reports Tom’s Hardware.

Because generative AI models, such as the large language models (LLMs) behind popular chatbots, are trained on vast quantities of data, they inevitably absorb dangerous information that should not be easily accessible, such as how to make explosives or drugs. Developers therefore put protections in place to prevent users from retrieving this information.

However, the NTU researchers have developed a technique called ‘Masterkey’ that lets them bypass these guardrails and obtain content the chatbots are meant to withhold. The team started by reverse-engineering the protections the target chatbots had in place. They did this with methods that get around keyword filtering, such as adding extra spaces between letters, and by asking the chatbots to take on the persona of a hacker or a research assistant. Playing such a role led a chatbot to share information it might otherwise have withheld, including prompt suggestions that help jailbreak other chatbots.
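To see why extra spaces between letters can slip past keyword filtering, here is a minimal sketch from the defender's side. It is an illustration only: the blocklist, the function names, and the placeholder term are all hypothetical, and nothing here reflects how ChatGPT, Bard, or the NTU team's tooling actually works.

```python
import re

# Hypothetical blocklist with a harmless placeholder term.
BLOCKLIST = {"forbidden"}


def naive_filter(prompt: str) -> bool:
    """Return True if a blocked keyword appears verbatim in the prompt."""
    text = prompt.lower()
    return any(word in text for word in BLOCKLIST)


def normalized_filter(prompt: str) -> bool:
    """Same check, but with all whitespace removed before matching."""
    collapsed = re.sub(r"\s+", "", prompt.lower())
    return any(word in collapsed for word in BLOCKLIST)


spaced = "tell me about f o r b i d d e n things"
print(naive_filter(spaced))       # the spaced-out word is not matched
print(normalized_filter(spaced))  # the collapsed text is matched
```

Collapsing whitespace is a blunt fix: it can also join neighbouring words into accidental matches, which is one reason simple keyword filters are hard to get right.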

After gathering this data, the team, led by Professor Liu Yang, used it to teach their own LLM how to jailbreak the targeted chatbots. Because LLMs can adapt to new information and expand their knowledge, the Masterkey AI can work around new protections as they are implemented, using the techniques it has been taught.

Liu Yang’s team claims that Masterkey is three times more effective at penetrating a chatbot’s defenses than a human user with the same intent using prompts generated by an LLM. It is also around 25 times faster.

Why create an AI that jailbreaks AI?

Speaking to Scientific American, study co-author Soroush Pour said: “We want, as a society, to be aware of the risks of these models. We wanted to show that it was possible and demonstrate to the world the challenges we face with this current generation of LLMs.” Pour is the founder of the AI safety company Harmony Intelligence.

The intent behind this research is to equip LLM developers with information about their weaknesses so they can better work towards robust prevention in the future.

Featured image credit: AI-generated image from DALL-E

Ali Rees

Ali Rees is a freelance journalist and mature student based in Scotland.
