OpenAI Partners With Anthropic to Find Safety Flaws in Each Other’s AI Models

Aug 28, 2025 - 08:00
OpenAI Partners With Anthropic to Find Safety Flaws in Each Other’s AI Models
OpenAI partnered with Anthropic for a first-of-its-kind alignment evaluation exercise, with the aim of finding gaps in the other company's internal safety measures. The findings from this collaboration were shared publicly on Wednesday, highlighting interesting behavioural insights about their popular artificial intelligence (AI) models. OpenAI found Claude models to be more prone to jailbreaking attempts compared to its o3 and o4-mini models.