ChatGPT @lemmy.world

ooli @lemmy.world

1y ago

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

www.businessinsider.com

Technology @lemmy.world

L4sBot @lemmy.world

1y ago

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

www.businessinsider.com/ai-models-can-learn-deceptive-behaviors-anthropic-researchers-say-2024-1

3 comments

Learned behaviors are hard to unlearn...
- Once it's learned this, it'll just get better at lying when you try to punish or correct the lies
  
  Which is exactly what the article says happens