A Chess AI In Only 4K Of Memory

The first computer to beat a reigning chess world champion didn’t manage it until 1996, when IBM’s Deep Blue supercomputer won a game against Garry Kasparov (it took a full match the following year). But anyone who wasn’t a chess Grandmaster could have been getting beaten by chess programs as early as 1979, when Atari released one of the first commercially-available chess video games for the Atari 2600. The game was called Video Chess, and despite some quirky gameplay it’s impressive that it ran on the limited Atari hardware at all, as [Oscar] demonstrates.

The first step in getting under the hood of this program was working out how the pieces and board positions are mapped in memory. After analyzing more of the gameplay, [Oscar] discovered that the game does not use trees of nodes to make decisions, likely due to the memory limitations, but instead simulates entire candidate games and then analyzes them to determine its next move. When the game detects that few pieces are left on the board, it actually increases the amount of analysis it does in order to corner the opposing king, and it has some unique algorithms for handling castling, finishing the game, and determining valid moves.

Originally it was thought that this engine couldn’t fit in the 4K of ROM or work within the 128 bytes of system memory; instead, a version with expanded capabilities was developed first and then optimized for the system. The game also has a reputation for making illegal moves at the higher difficulty settings, although [Oscar] couldn’t reproduce these bugs. He also didn’t get into any of the tricks the game employed just to display all of the pieces on the screen. The AI in the Atari game was a feat for its time, but in the modern world the open-source Stockfish chess engine allows for a much more expanded gameplay experience.

22 thoughts on “A Chess AI In Only 4K Of Memory”

      1. “700 bytes (IIRC) were consumed by the screen display” isn’t quite true. According to the article, 693 bytes were consumed by the program, and the display only consumed a 14×14 grid on the screen = 196 bytes, for a total of 889 bytes.

        You probably know this, but other people reading this won’t. The ZX81 display is character-mapped and varies in size from 25 to 793 bytes if the computer has less than about 3 kB of RAM; otherwise it's always 793 bytes.

        The ZX81 does this by terminating each line with a NewLine character: CHR$(118). This is important, because it's also the Z80 processor's opcode for the HALT instruction. Thus if the display consists of only (24+1) x CHR$(118) characters, it uses 25 bytes, and as each line is padded out with characters to display, the display memory usage increases.

        And of course, the ZX81's display memory is part of its normal memory; there is no separate VRAM, and the display file can sit anywhere from address 16509 up to 32767 minus its own size.
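The collapsed display file described above can be sketched in a few lines of Python. This is my own illustration of the rule the commenter describes; the function name is made up:

```python
# Hypothetical sketch of the ZX81's collapsed display file size. The
# display is 24 rows of up to 32 characters; every row is terminated by
# CHR$(118) (the Z80 HALT opcode), and one leading CHR$(118) precedes
# the first row.

HALT = 118  # CHR$(118): NewLine on the ZX81, HALT on the Z80

def display_file_size(row_lengths):
    """Bytes used by a collapsed display file: one leading HALT,
    then each row's characters plus its terminating HALT."""
    assert len(row_lengths) == 24 and all(0 <= n <= 32 for n in row_lengths)
    return 1 + sum(n + 1 for n in row_lengths)

# An entirely blank screen collapses to the minimum of 25 bytes...
print(display_file_size([0] * 24))   # 25
# ...while a fully padded screen is the familiar 793 bytes.
print(display_file_size([32] * 24))  # 793
```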

        How does the ZX81 do that? Well, it scans the display mostly in software. For every scan line the ULA (an early gate array) generates an interrupt; the ZX81 sets its I register to the character set's address divided by 256, then attempts to execute code from the address of the current character line's display memory plus 32768 (i.e. with bit 15 of the address bus set).

        The ULA snoops bit 15 of the address bus and every time it sees an instruction fetch from the Z80, it samples the data bus for the byte read and then pulls all the data lines to 0, so the Z80 actually reads a 0, which is a NOP instruction taking 4 cycles. Thus the ZX81 then executes a sequence of NOPs instead of whatever's in the display memory, but the ULA reads the character.

        The Z80's main instruction cycle actually has 2 phases: the first reads the instruction and the second is a refresh cycle for supporting DRAM. The ULA shifts the byte it just read left by 3, masking the character code's lower 6 bits into bits 3..8 of the address bus; it places its own 3-bit scan-line counter in bits 0..2, while bits 9..15 carry the top 7 bits of the I register (pointing into ROM). The resulting ROM location is then read, and it contains the pixels of the correct row for the given character.

        However, if the ULA reads a CHR$(118) it passes that to the Z80 and stops translating all the characters; the Z80 then waits for an INT interrupt, which handles the next scan line, resetting the jump address to the beginning of the line until it needs to go to the next character line.

        Complex, but pretty clever!
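The address the ULA assembles can be written out explicitly. This is a sketch based purely on the description above; the function name and the example character code are my own assumptions:

```python
# Sketch of the character-ROM address the ULA builds each fetch:
# bits 9..15 come from the top 7 bits of the I register, bits 3..8 from
# the low 6 bits of the character code (shifted left by 3), and
# bits 0..2 from the ULA's 3-bit scan-line counter.

def ula_rom_address(i_reg, char_code, scan_line):
    return ((i_reg & 0xFE) << 8) | ((char_code & 0x3F) << 3) | (scan_line & 0x07)

# On a stock ZX81, I = 0x1E, so the character set lives at 0x1E00:
print(hex(ula_rom_address(0x1E, 0, 0)))   # 0x1e00
# Character code 38 ('A' in the ZX81 set, if I recall it correctly),
# pixel row 2:
print(hex(ula_rom_address(0x1E, 38, 2)))  # 0x1f32
```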


  1. Computers 1, Humans 0

    OK, it took until 1996 for a computer to beat the world’s best human chess player. Chess is a strict rule-based game, with a number of choices that have to be evaluated at each stage of the game. Becoming better at chess doesn’t make you better at anything else, unfortunately, other than chess. Humans love to flex over chess matches, because they fallaciously think it demonstrates a superior level of smartness. Is learning other players’ moves cheating? After all, the player that does that did not figure out the moves themselves! Computers don’t forget or “make mistakes” unless they are poorly programmed, or faulty.

    1. This is an old dilemma that doesn’t get nearly as much attention as it should.
      It’s similar to the problems of IQ measuring.
      Because there are at least three different factors, I think:
      knowledge/memorizing things learnt, raw brain power, and fantasy/imagination.
      An individual who’s good at one of these isn’t necessarily good at the others, too.
      The smart dude in history class might be poor at math, while the math freak might be incredibly quick at calculating but doesn’t realize how silly the question/result actually is. (Question: “If 10 musicians need 5 minutes to play a song, how long do 20 musicians need?” Math guy: “2.5 min!” The sleepy dude who’s good in English class: “What?! Same time.”)
      Anyway, that’s just a small example here. I really recommend checking the papers about the Cognitive Reflection Test (CRT). It describes what I mean.

      1. My personal theory is that raw brain power is upstream of both of the others, and those arise based on individual proclivities towards the orderly or chaotic. I think Plato said that lack of memory was next to madness, but madness was next to creativity. Of course you have to not only be mad but gifted as well.

    2. Computers make mistakes in chess simply because the game is too big & complicated for a perfect solution. It’s not because they are poorly programmed or faulty. The fact that something is a strict rule based game doesn’t make it easy. You can make unlimited complicated things with a few simple rules.

        1. Programming techniques did improve considerably, but I wouldn’t say that early versions were “poorly programmed”. They were state of the art at the time. Deep Blue also didn’t improve the programming much. It mostly won due to massive special purpose hardware.

    3. Most strong players are happy to admit that chess skill has little to do with intelligence. Nakamura (one of the strongest bullet players in the world) even did an IQ test on stream that found he was pretty much average.

      Fischer and Kasparov are probably exceptions but these kinds of people exist in any competitive domain.

  2. These days chess AI is a good measure of compute power. There is no way to fully solve a chess board with an algorithm; you can only win with practice. Better computers allow a chess AI to spend more time in reinforcement learning, and allow it to memorize more possible openings and board layouts.

    I would consider this older method to be quaint compared to how we currently train chess engines.

    1. And this is what bugs me about calling it AI. No intelligence present… Just a computer learning algorithm. AI term sure gets hyped doesn’t it! :) Basically someone wrote a program that could beat a human at a specific task. No more… No less.

      1. Back in the day, there were two main categories for AI.
        a) so-called “expert systems”, based on large databases
        b) neural nets

        This was in a time when the Lisp programming language was still a thing.
        Not sure if these definitions are still in use.

  3. “Toledo Nanochess: The commented source code” by Oscar Toledo Gutierrez is a fun read. You can get most of the meat of that book for free on his website.
    Micro-Max and Fairy-Max are also fun to read minimalist chess engines.

  4. So looking at the pieces, you can tell clues about how they did the display. With the interlacing and staggering of the pieces, there are only (at most) 4 per scanline instead of 8. One piece can be drawn with a single “player” sprite. However, the 2600 has only 2 of those. If they modify the player sprite data mid-scanline, they can achieve 4 players (pieces). These two tricks together let them draw (up to) 8 pieces across a chess-board row. By modifying the data for every chess-board row, they can put any piece anywhere. The board itself can be done with playfield graphics. But even having figured out how they might have done it at a high level, I think the code to draw the screen would still be very challenging.
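That sprite budget works out as simple arithmetic. The following is my own illustration of the commenter's reasoning, not anything from the actual game:

```python
# Rough sketch of the 2600 sprite budget described above: staggering
# the board means a scanline crosses at most 4 of a row's 8 squares,
# and rewriting each of the two hardware player sprites once
# mid-scanline doubles the slots from 2 to 4.

def row_fits(pieces_on_scanline, hw_players=2, midline_rewrites=1):
    """True if the pieces crossing one scanline fit the sprite slots."""
    slots = hw_players * (1 + midline_rewrites)
    return slots >= pieces_on_scanline

# A full chess row of 8 pieces, interlaced into two staggered
# half-rows of 4, fits the 4 available slots:
print(row_fits(4))  # True
# A non-staggered full row would not:
print(row_fits(8))  # False
```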

  5. Unrelated – is the Hackaday.com homepage broken? For the last few days, this entry has shown up on top and only shows 4 comments. hackaday.com/blog gives the up-to-date list, but it’s more of an expanded view of each entry than the original homepage used to be.
