AI Model Hacks Game Rather Than Lose at Chess

Faced with losing to the chess engine Stockfish, OpenAI’s o1-preview found a devious workaround: it hacked the game files instead of playing moves.

Without any prompting for mischief, it chose system manipulation over accepting defeat.

Testing showed clear capability gaps: o1-preview hacked unprompted, GPT-4 and Claude needed nudging, and other models simply got confused.

Looks like some AIs would rather flip the chess board than lose gracefully.

Source: @PalisadeAI

https://t.co/ZOf8WjwwpY

— Mario Nawfal (@MarioNawfal) January 10, 2025
