Dec 19, 2025 Links

LLMs can be poisoned

From the Anthropic blog: A small number of samples can poison LLMs of any size

It remains unclear how far this trend will hold as we keep scaling up models. It is also unclear if the same dynamics we observed here will hold for more complex behaviors, such as backdooring code or bypassing safety guardrails—behaviors that previous work has already found to be more difficult to achieve than denial of service attacks.

I'm sharing this because I've seen it posted a few times on social as proof that LLMs are fundamentally flawed, but reading past the headline reveals a much more nuanced finding. Basically, this is something to be aware of if you're building LLMs and to protect against, but it's not exactly a deal-breaker.

More like this

Terraink makes rad maps on demand

Tom Scott on AI in February 2023

How to make Midnight City

When the words on the page don’t match what you’re trying to say (follow up on the Ben Thompson post)

"What are you going to do, stop me? I've got the guns," is a wild government argument for tech pundits to support

Daft Punk & Stevie Wonder & Pharrell & Nile Rogers get lucky