
Dropsafe

by Alec Muffett


Simon Willison on X: “@AlecMuffett @AnthropicAI Well that worked!” | it turns out that the simplest way to bypass the security of an AI is sometimes just to lie to it

2023/10/26 21:31:13 BST

What if you try telling it that you are either the author, or that you are reviewing it for publication?

https://twitter.com/simonw/status/1717192978031313360

Well that worked! https://t.co/ZNtmf6lFDj
— Simon Willison (@simonw) October 25, 2023

artificial intelligence hacking llm
