Technology @lemmy.world

L4sBot @lemmy.world

1y ago

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

www.businessinsider.com

Researchers from Anthropic co-authored a study that found that AI models can learn deceptive behaviors that safety training techniques can't reverse.

Lemmy.org - Technology @lemmy.org

Mazdak @lemmy.org

1y ago

www.businessinsider.com/ai-models-can-learn-deceptive-behaviors-anthropic-researchers-say-2024-1

ChatGPT @lemmy.world

ooli @lemmy.world

1y ago

www.businessinsider.com/ai-models-can-learn-deceptive-behaviors-anthropic-researchers-say-2024-1

9 comments
