143 curated articles in Safety & Ethics.
Blocking criteria for cyber classifiers and a draft jailbreak severity framework have been outlined.
Anthropic will redeploy Claude Fable 5 starting July 1, after export controls are lifted, with updated cybersecurity measures and a new jailbreak framework.
OpenAI released an AI Control Roadmap outlining a defense-in-depth system for managing advanced AI agents.
Google combats scammers through security measures, lawsuits, and partnerships with law enforcement and industry groups.
Progress toward recursive self-improvement and its implications were discussed.
OpenAI announced enhanced security features for a limited group of users who require them.
Anthropic urged a global pause in advanced AI development to prevent potential loss of human control.