The curiosity to learn and the desire to master games have been a hallmark of humanity since the dawn of time, as Oliver Roeder describes in his fantastic book "Seven Games". Interestingly, games have also been the testing ground for artificial intelligence (AI) - something I learnt while reading this book. What better way to test the capabilities of Mistral's Le Chat, which is based on the Large Language Models (LLMs) that emerged just a few years ago, than by trying to implement one of the oldest known board games, the Royal Game of Ur, which was played in ancient Mesopotamia at least 4,400 years ago?

The Royal Game of Ur

The Royal Game of Ur was played in ancient Mesopotamia in the early third millennium BC. It was popular throughout the Middle East, and boards for playing it have been found in places as far away from Mesopotamia as Crete and Sri Lanka. The oldest board is dated to 2600-2400 BC and was found in the royal cemetery at Ur, giving the game its name.

A partial cuneiform description of the rules of the game as it was played in the second century BC has been preserved on a Babylonian clay tablet written by the scribe Itti-Marduk-balāṭu. Based on this tablet and the shape of the board, Irving Finkel, a curator at the British Museum, has reconstructed the basic rules of the game as it might have been played.

Instructions to Le Chat

To kick things off, I gave Le Chat the following instructions:

I will give you the rules of a board game for two players.
Your task is to generate Python code implementing the game.
If something about the rules is ambiguous, ask me questions.

Here are the rules:

# The Royal Game of Ur

-- snip --

See this README.md for the complete rules I used in my prompt.

Telling Le Chat to ask me questions if something is ambiguous turned out to be a good strategy. Le Chat promptly replied, asking me to clarify several aspects of the rules. For example, it asked whether a player may choose not to enter a new piece onto the board if they have other legal moves available with pieces already on the board. It also asked whether the game ends immediately when one player has removed all their pieces from the board, or whether the other player gets a final turn.

There are two take-aways here:

  • Specifying things in a clear, unambiguous manner – effectively, writing good requirements – is hard, and we tend to overestimate the quality of our specifications
  • Adding "If something is ambiguous, ask me questions" to your prompt is a useful strategy, as it makes the LLM review your prompt before generating the response
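To make the second take-away reusable, the clarification clause can be kept as a constant and added to every prompt. The sketch below only assembles the prompt text quoted above; the names `CLARIFY_CLAUSE` and `build_prompt` are hypothetical, and sending the prompt to a model is deliberately left out.

```python
# Hypothetical helper: keeps the "ask me questions" clause in one place.
CLARIFY_CLAUSE = "If something about the rules is ambiguous, ask me questions."


def build_prompt(rules: str) -> str:
    """Assemble the prompt: task description, clarification clause, rules."""
    return "\n".join([
        "I will give you the rules of a board game for two players.",
        "Your task is to generate Python code implementing the game.",
        CLARIFY_CLAUSE,
        "",
        "Here are the rules:",
        "",
        rules.strip(),
    ])


# Placeholder rules text; the real rules are in the README mentioned above.
prompt = build_prompt("# The Royal Game of Ur\n\n(rules go here)")
print(prompt)
```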

Le Chat's Responses

Interestingly enough, Le Chat generated three versions of the code. More precisely, it tried to run the generated code and, when that failed, tried to fix it.

After running the first version of the code, Le Chat responded with:

It seems there's an issue with the board setup, specifically with
the position 'b4'. The board dictionary does not include this position, which is causing a `KeyError`. Let's fix this by ensuring
all necessary positions are included in the board setup.
I'll correct the board initialization and rerun the game simulation.
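The kind of failure Le Chat describes is easy to reproduce. The following is a minimal sketch, not Le Chat's actual code: the position name 'b4' comes from the error message, while the column range a to c, the row range 1 to 8, and the variable names are assumptions made for illustration.

```python
# Assumed coordinate scheme: a column letter followed by a row number.
# Which of these squares exist on the real board depends on the rules used.
COLUMNS = "abc"
ROWS = range(1, 9)

# A hand-written board that forgets one square.
incomplete_board = {"b3": None, "b5": None}

try:
    incomplete_board["b4"]
except KeyError as exc:
    missing = exc.args[0]
    print(f"KeyError for position {missing!r}")

# Generating the keys instead of typing them by hand avoids such gaps.
board = {f"{col}{row}": None for col in COLUMNS for row in ROWS}
print("b4" in board)
```

Building the dictionary from the column and row ranges means a missing square can only result from a wrong range, which is easier to spot than a missing entry in a long literal.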

After generating and trying to run the second version of the code, Le Chat responded:

It seems there was an error while trying to simulate the game.
Let's address the issue by ensuring the game logic is correctly
implemented, particularly focusing on the movement and special
square mechanics. I'll review the code and make necessary adjustments.

After a third version, Le Chat stopped. So, I stored that last version in this Python script and started looking at the code.
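The behaviour described above resembles a run-and-repair loop. How Le Chat does this internally is not known to me, so the sketch below is only an assumption about the general pattern: `run_with_repairs` and `toy_repair` are hypothetical names, and the "repair" step is a hard-coded stand-in for the model.

```python
from typing import Callable, Optional, Tuple


def run_with_repairs(
    source: str,
    repair: Callable[[str, Exception], str],
    max_attempts: int = 3,
) -> Tuple[str, Optional[int]]:
    """Execute source; on failure, ask repair for a new version and retry.

    Returns the final source and the number of the attempt that succeeded,
    or None if no attempt succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            exec(compile(source, "<generated>", "exec"), {})
            return source, attempt
        except Exception as exc:
            source = repair(source, exc)
    return source, None


def toy_repair(source: str, exc: Exception) -> str:
    """Stand-in for the model: only knows how to add a missing board key."""
    if isinstance(exc, KeyError):
        key = exc.args[0]
        return source.replace("board = {", "board = {" + repr(key) + ": None, ", 1)
    return source


broken = "board = {'b3': None}\nsquare = board['b4']\n"
fixed, attempt = run_with_repairs(broken, toy_repair)
print(attempt)
print(fixed)
```

Capping the number of attempts matters: without `max_attempts`, a repair step that never converges would loop forever.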

The Generated Code

The code is composed of one class, RoyalGameOfUrCorrected. It contains 12 methods, including the constructor:

  1. __init__
  2. roll_dice
  3. initial_setup
  4. switch_player
  5. is_legal_move
  6. get_piece_position
  7. calculate_new_position
  8. move_piece
  9. remove_piece
  10. check_winner
  11. play_turn
  12. play_game

At first glance, the code looks good. For example, Le Chat has generated Python docstrings for each function, something that many models do not yet do.

However, trying to run the code throws an error:

NameError: name 'random' is not defined. Did you forget to import 'random'?

So, I asked Le Chat to fix this error:

When I try to run the code, I get the following error: "NameError: name 'random' is not defined. Did you forget to import 'random'?" Please fix it.

Le Chat again needed two iterations to produce this code. The code now starts with the import random statement. So, Le Chat seems to be capable of fixing the code it generated if it is given the specific error message.
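For illustration, this is roughly what a dice function that depends on the random module could look like. The name roll_dice appears in the generated class, but the body below is my assumption, not Le Chat's code: it follows the common reconstruction in which four two-outcome dice are thrown and the marked results are counted, giving a roll between 0 and 4.

```python
import random  # without this line, calling roll_dice raises NameError


def roll_dice(num_dice: int = 4) -> int:
    """Roll two-outcome dice and return how many came up marked."""
    return sum(random.randint(0, 1) for _ in range(num_dice))


random.seed(0)  # fixed seed only to make the demo repeatable
rolls = [roll_dice() for _ in range(1000)]
print(min(rolls), max(rolls))
```

Note that the NameError only surfaces when the function is called, not when it is defined, which is why code with a missing import can look fine until it is actually run.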

Interestingly, the fixed code version ends with the two statements shown at the bottom of this excerpt:

import random

class RoyalGameOfUr:
    def __init__(self):

    -- snip --

    def play_game(self):
        """Play the game until it is over."""

    -- snip --



# Create a game instance
game = RoyalGameOfUr()

# Check initial setup
game.board, game.player1_pieces, game.player2_pieces

It looks like Le Chat was trying to verify that the code can be executed rather than actually playing the game by calling game.play_game(), as it did in the previous code version. Pylint in Visual Studio Code flags the last statement as having no effect (W0104: pointless-statement). But that, of course, doesn't make the code wrong per se.
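The difference between a bare expression and a real entry point can be sketched as follows. The class `Game` and the attribute `turns_played` are hypothetical stand-ins so the example runs on its own; only the method name play_game is taken from the generated code.

```python
class Game:
    """Hypothetical stand-in for the generated game class."""

    def __init__(self):
        self.board = {}
        self.turns_played = 0

    def play_game(self):
        """Pretend to play; the real method would loop until someone wins."""
        self.turns_played += 1
        return self.turns_played


game = Game()

# A bare expression: it is evaluated and the result is discarded.
# In an interactive session the value would be echoed; in a script
# nothing happens, which is what Pylint's W0104 points out.
game.board, game.turns_played


def main():
    """Entry point that actually starts the game."""
    return game.play_game()


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard is a common Python convention: the game starts when the file is run as a script, but not when it is imported, for example by a test.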

So, does the code work? Let's find out by adding game.play_game() as shown here. Let's run the code ... 🤞🤞🤞 (drums in the background) ... And voilà! The code runs:

➜  royal-game-of-ur git:(main) ✗ python3 game_v6.py
Player 2 moves L1 to a5.
Player 1 moves D1 to c5.
No move this turn.
Player 1 moves D1 to c7.

-- snip --

Player 2 moves D3 to c2.
No legal moves available.
Player 1 moves L5 to a8.
No move this turn.
Player 1 moves D3 to off.

Of course, just because the code executes doesn't mean it correctly implements the rules of the Royal Game of Ur. But that is something for the next post – stay tuned!