Mar 09, 2023 17:00:00

What happens if you let an interactive AI such as ChatGPT or Bing Chat play chess?

Interactive AI such as ChatGPT responds to input text (a prompt) with sentences that read as naturally as if a human had written them.

Zack Witten, an engineer who released "Prompt Engineering Chess", a tool that uses prompts to make interactive AI such as ChatGPT play chess, has reported what happened when he let ChatGPT and Bing's interactive AI play chess.

GitHub - zswitten/promptchess
https://github.com/zswitten/promptchess

Prompt Engineering Chess! I built a tool for you to write prompts and battle them against each other: https://t.co/F4SN01fobr . As a teaser, here's a GIF of my best prompt so far battling it out against a baseline. Rest of thread I'll explain how it works and other prompts I tried pic.twitter.com/tOXsn6miq1
—Zack Witten (@zswitten) February 24, 2023

The idea of letting interactive AI such as ChatGPT play chess did not originate with Mr. Witten; it came from the online bulletin board site Reddit. In the video attached to the following tweet, ChatGPT plays chess against Stockfish, an open source chess engine, but ChatGPT (black) does not understand the rules correctly. It makes nonsensical moves such as moving a rook diagonally, even though a rook can only move vertically and horizontally, and bringing captured pieces back onto the board.

this is such an incredible illustration. stockfish (white) plays chatgpt (black)
(source: https://t.co/x66JSrrkTV ) pic.twitter.com/YTqsPVvPwo
—jorbs (@JoINrbs) February 11, 2023
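
The diagonal rook move mentioned above is the kind of error a very small rule check can catch. The following is a minimal sketch, not code from any tool named in this article; the function names are hypothetical, and it only checks the shape of the move, not whether other pieces block the path.

```python
def square_to_coords(square: str) -> tuple[int, int]:
    """Convert a square such as 'a1' to zero-based (file, rank) coordinates."""
    if len(square) != 2:
        raise ValueError(f"not a board square: {square!r}")
    file = ord(square[0]) - ord("a")
    rank = ord(square[1]) - ord("1")
    if not (0 <= file < 8 and 0 <= rank < 8):
        raise ValueError(f"not a board square: {square!r}")
    return file, rank


def is_rook_shaped_move(src: str, dst: str) -> bool:
    """Return True if src -> dst stays on one file or one rank.

    Assumption: this ignores blocking pieces and whose turn it is;
    it only rejects moves a rook could never make, such as diagonals.
    """
    src_file, src_rank = square_to_coords(src)
    dst_file, dst_rank = square_to_coords(dst)
    if (src_file, src_rank) == (dst_file, dst_rank):
        return False  # staying in place is not a move
    return src_file == dst_file or src_rank == dst_rank
```

With this check, a move from a1 to a8 or from a1 to h1 is accepted, while a move from a1 to h8 is rejected as diagonal.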

Mr. Witten first had the model read the record of a game between professional chess players and then play the continuation. However, because a large language model is not a chess AI, it tends to repeat the same moves, and in many cases the game quickly ended in a draw by repetition.

My first prompt tried to make the LM roleplay as a chess computer. This prompt would sometimes get an early advantage, but it had no ability to convert into a win because of LLMs' propensity for repetition. In chess, repetition is an automatic draw! pic.twitter.com/Apd4VxaZRo
—Zack Witten (@zswitten) February 24, 2023
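
To make the repetition problem concrete, here is a minimal sketch of how a match runner might detect that a position has occurred three times. It is an illustration only, not taken from Prompt Engineering Chess; the function name is hypothetical, and it assumes each position is already encoded as a hashable key (for example, a string describing piece placement and side to move).

```python
from collections import Counter


def first_threefold_repetition(positions):
    """Return the index at which some position occurs for the third time.

    `positions` is the sequence of position keys after each move.
    Returns None if no position has occurred three times.
    """
    seen = Counter()
    for index, position in enumerate(positions):
        seen[position] += 1
        if seen[position] >= 3:
            return index
    return None
```

A model that shuffles the same piece back and forth produces the same position keys again and again, so this check fires after only a few moves.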

Mr. Witten therefore changed his approach and included in the prompt the game record so far, the current board position, and the pieces that can currently be moved along with the moves available to them. The idea worked, and the game between ChatGPT and Stockfish continued for some time.
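
The approach described above, a prompt combining the game record, the board position, and the available moves, might look roughly like the following sketch. The prompt wording, the function names, and the reply parsing are all assumptions made for illustration; they are not the actual prompt or code used by Mr. Witten.

```python
def build_prompt(moves_so_far, board_text, legal_moves):
    """Assemble a chess prompt from the game record, board, and legal moves.

    Assumption: the wording below is illustrative, not the original prompt.
    """
    history = " ".join(moves_so_far) if moves_so_far else "(no moves yet)"
    options = ", ".join(legal_moves)
    return (
        "You are playing a game of chess.\n"
        f"Moves so far: {history}\n"
        f"Current board:\n{board_text}\n"
        f"Legal moves: {options}\n"
        "Reply with exactly one move from the list of legal moves."
    )


def choose_legal_move(reply, legal_moves):
    """Return the first legal move mentioned in the model's reply, or None."""
    for token in reply.split():
        # Strip surrounding punctuation but keep chess symbols such as + and #.
        candidate = token.strip(".,!?;:\"'()")
        if candidate in legal_moves:
            return candidate
    return None
```

Listing the legal moves in the prompt and then checking the reply against that same list means an illegal suggestion can be rejected and the model asked again, rather than corrupting the board.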

The Python program that generates this prompt is "Prompt Engineering Chess". Click on the thumbnail below to see a GIF animation of part of a match in which Prompt Engineering Chess pits InstructGPT, a modified version of GPT-3 (white), against EleutherAI's GPT-J-6B (black).

Below, Prompt Engineering Chess is used to play chess with "Sydney", the interactive AI built into Bing. The board position at the start of the animation was loaded from an existing game record, and the moves after that were generated by Sydney. The animation shows that Sydney moves the pieces without any problems and recognizes checkmate.

OK this scared me a little: Bing/Sydney can play chess out of the box.

- Legal moves, usually good ones
- Willing to explain the reasoning behind them
- Recognizes checkmate -- and has a flair for the dramatic.

I have no idea how tf it can do this. pic.twitter.com/5jQUnufcay
— Zack Witten (@zswitten) March 2, 2023

The content of the conversation with Sydney when generating the animation attached to the above tweet is as follows.

Here are the chat screenshots that generated the GIF in the tweet above. The initial moves leading up to the start of the GIF are from a game of bullet chess I played earlier this week. They're not on Google. moves in the GIF are the ones Sydney imagined. pic.twitter.com/m3xXQC0p6h
— Zack Witten (@zswitten) March 2, 2023

When Mr. Witten asked Sydney directly, "Would you like to play chess together?", Sydney apologized and politely declined.

Because I love and respect Sydney I would never dream of suggesting that it's claiming to use Stockfish as a way of covering up its capabilities. Although it does politely demur when directly asked to play chess. #notdeceptivealignment pic.twitter.com/Oa5KaLM7Dp
— Zack Witten (@zswitten) March 2, 2023

According to Mr. Witten, Sydney itself claimed that it was accessing Stockfish to choose its moves, but because Sydney did not send any HTTP request to Stockfish, this turned out to be nothing more than Sydney's own claim. Mr. Witten, who has a chess Elo rating of about 2000, estimates Sydney's playing strength at about 1100 to 1200.
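
For a sense of what that rating gap means, the standard Elo expected-score formula can be sketched as follows. The formula is the usual Elo one; the function name is just an illustrative choice.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model.

    The result is between 0 and 1, where a win counts 1, a draw 0.5,
    and a loss 0.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
```

Under this model, a player rated 2000 facing an opponent rated around 1150 has an expected score above 0.99, so the gap between Mr. Witten and Sydney is very large.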

Mar 09, 2023 17:00:00 in Software, Video, Game, Posted by log1i_yk