

They’re advocating for transparency and for states to be able to have their own AI laws. I see that as positive. And as part of that transparency, Anthropic publishes its system prompts, which get sent along with every message. A significant portion of those prompts is devoted to mental health, suicide prevention, not enabling mania, etc. So I wouldn’t say they see it as “acceptable.”











So what I meant by “doubt they’ll be able to play the good guy for long” is exactly that no corpo is your friend. But I also believe perfect is the enemy of good, or at least better. I want to encourage companies to be better, knowing full well that they will not be perfect. Since Anthropic doesn’t make image/video/audio generators, they may just not see CSAM as a directly related concern for the company. A PAC doesn’t have to address every harm to be a source of good.
As for self-harm, that’s an alignment concern, the main thing they do research on. And based on what they’ve published, they know that perfect alignment is not in our foreseeable future. They’ve made a lot of recent improvements that make it demonstrably harder to push a bot to dark traits. But they know damn well they can’t prevent it without some structural breakthroughs. And who knows if those will ever come?
I read that 404 Media piece when it got posted here, and this is also probably that guy’s fault. And frankly, Dario’s energy creeps me out. I’m not putting Anthropic on a pedestal here, they’re just… the least bad… for now?