Safety
A “diff” tool for AI: Finding behavioral differences in new models
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.