Safety & Ethics News

143 curated articles in Safety & Ethics.

Anthropic details Fable 5's cyber safeguards and jailbreak severity framework

July 2, 2026

Anthropic outlined the blocking criteria for its cyber classifiers and a draft jailbreak severity framework.

July 1, 2026

Anthropic will redeploy Claude Fable 5 from July 1 after export controls are lifted, with updated cybersecurity safeguards and a new jailbreak framework.

June 18, 2026

OpenAI released an AI Control Roadmap outlining a defense-in-depth system for managing advanced AI agents.

June 16, 2026

June 12, 2026

Google combats scammers through security measures, lawsuits, and partnerships with law enforcement and industry groups.

June 10, 2026

Progress toward recursive self-improvement and its implications were discussed.

June 8, 2026

June 5, 2026

OpenAI announced enhanced security features for a limited group of users who require them.

June 4, 2026

Anthropic urged a global pause in advanced AI development to prevent potential loss of human control.

May 28, 2026