Demoing an 8088

The demoscene usually revolves around the Commodore 64, and when you compare the C64 hardware to other computers of a similar vintage, it’s easy to see why. There’s a complete three-voice synthesizer on a chip, hardware sprites, a ton of video pages, and an astounding sixteen colors, most of which look good. You’re not going to find many demos for the Apple II, because the graphics and sound are terrible. You’re also not going to find many demos for an original IBM PC from 1981, because for thirty years, the graphics and audio have been terrible.

8088 MPH by [Hornet], [CRTC], and [DESire], the winner of the recent 2015 Revision demo compo, just turned conventional wisdom on its head. It ran on a 4.77 MHz 8088 CPU, the same one found in the original IBM PC. Graphics were provided via composite output by a particular IBM CGA card, and sound was a PC speaker beeper, beeping sixty times a second. Here’s a capture of the video.

The demo is so extreme that it will not run on any emulator. While the initial development happened on modern machines with DOSBox, finishing the demo needed to happen on an IBM 5160, which is equivalent to the 5150 but much easier to find.

Despite the meager hardware and a CPU that reads a single byte in four cycles, effectively making this a 1.19 MHz CPU, the team produced all the usual demoscene visuals. There are moire patterns, bobbing text, rotated and scaled bitmaps, and an astonishing 1024-color mode that’s an amazing abuse of 80×25 text mode with NTSC colorburst turned on.
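The 1.19 MHz figure is plain clock arithmetic, and it can be sketched in a few lines. The 14.31818 MHz master crystal below is not mentioned in the article; it is the well-known PC motherboard crystal, from which both the CPU clock and the NTSC colorburst frequency are divided down. The function name is made up for illustration.

```python
# Clock arithmetic for the original IBM PC.
# A sketch for illustration, not taken from the demo's source code.

CRYSTAL_HZ = 14_318_180          # master crystal (assumption: not stated in the article)
CPU_HZ = CRYSTAL_HZ / 3          # 8088 clock, about 4.77 MHz
COLORBURST_HZ = CRYSTAL_HZ / 4   # NTSC color subcarrier, about 3.58 MHz
CYCLES_PER_BYTE = 4              # bus cycles per byte read, as the article states


def effective_byte_rate(cpu_hz=CPU_HZ, cycles_per_byte=CYCLES_PER_BYTE):
    """Upper bound on how many bytes per second the CPU can take off the bus."""
    return cpu_hz / cycles_per_byte


if __name__ == "__main__":
    print(f"CPU clock:       {CPU_HZ / 1e6:.2f} MHz")
    print(f"Colorburst:      {COLORBURST_HZ / 1e6:.2f} MHz")
    print(f"Byte fetch rate: {effective_byte_rate() / 1e6:.2f} million bytes/s")
```

Since the CPU and the color signal come off the same crystal, CPU cycles and color phase stay in a fixed relationship, which is presumably part of why cycle-exact tricks on this hardware are possible at all.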

Below you can find a video of the demo, and another video of the audience reaction at the Revision compo.


If demos are your thing, there will be a compo at the LayerOne Demoparty in Monrovia, CA on May 23rd. There are categories for every piece of hardware you can imagine.

40 thoughts on “Demoing an 8088”

  1. The first PC I ever owned was based around the lightning-fast 8088 chip. It had a whopping 256K of RAM, a 10 MB hard drive, and Hercules Monochrome graphics. For the last 25 years, I have known that the Hercules monochrome was better than CGA. I may have just been proved wrong.
    Kudos to the hackers involved in this project.

    1. What’s most amazing is that nobody did this, what, 30 years ago. It’s using pure IBM PC 5150 technology AIUI. There were hackers back then, and people who understood NTSC, as well as most of the quirks of the CGA card, which isn’t exactly complicated. Of course it’s the subtleties that matter, but we had brains and even demos back then. IBM might’ve paid the guys millions for it! It would have made world news!

  2. I was fortunate enough to have an IBM PC in my house in 1984, followed by an IBM PC/XT. The Technical Reference Manual and various other sources mentioned that the CGA could do a low-resolution (160×100) graphics mode with 16 colors, but the BIOS didn’t support it. There were a few games that used it, the only one I remember was Qix. For a long time I wondered how that mysterious graphics mode worked and I only figured it out a few years ago.

    Now I’m going to have to figure out how these guys did those 256 and 1024 color modes by “manipulating the NTSC color burst”. Thanks guys! :-)

    Seriously, this is very impressive! They even report that they had to reprogram the DMA channel that refreshes the DRAM so they could get the timing right for some of the effects.

    Color me impressed!

    ===Jac

    1. Fun fact/bug:
      On composite output, in 80-column mode, the CGA draws the NTSC color burst too early. Setting the overscan color ([3D9] & 15) changes what the monitor thinks is the colorburst, and thus rotates the received colors via NTSC
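The `([3D9] & 15)` notation above means the low four bits of the CGA register at port 3D9h. As a rough sketch of just the bit handling (Python cannot touch I/O ports, so nothing here writes to hardware, and the helper names are made up for illustration):

```python
# Bit handling for the CGA register at port 0x3D9.
# Sketch only: this computes register values, it does not perform port I/O.

CGA_COLOR_SELECT_PORT = 0x3D9   # the register written as [3D9] in the comment above
OVERSCAN_MASK = 0x0F            # "& 15": the low four bits hold the overscan color


def overscan_of(register_value: int) -> int:
    """Extract the overscan (border) color, 0-15, from a register value."""
    return register_value & OVERSCAN_MASK


def with_overscan(register_value: int, color: int) -> int:
    """Return register_value with only its overscan color replaced."""
    if not 0 <= color <= 15:
        raise ValueError("overscan color must be in 0..15")
    return (register_value & ~OVERSCAN_MASK & 0xFF) | color


if __name__ == "__main__":
    value = with_overscan(0x30, 6)
    print(hex(value), overscan_of(value))
```

On real hardware the resulting byte would be sent to the port with an `out` instruction; cycling through the sixteen overscan values is what would step through the hue rotations described above.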

      1. That good old, crappy old, CGA card!

        The hack *I* thought was impressive is a few years old (from the 1980s in fact). You set the card to 80-column text mode. By messing with the timing chip, you could set the text rows to be only 2 pixels high. By filling each character cell with the vertical half-bar, you had 160 blocks, with background colour setting one half, ink colour the other half. The result is 160×100 with the full 16 colours. Could CGA use all 16 colours for the background? I think it was an option, sacrificing hardware flash, something like that.

        And you didn’t have to use the half-bar, you could use the top 2 lines of any ASCII character you liked. So there was a lot of variation. Didn’t get used much, only example I have is of a Galaxian clone. Would’ve been impressive as pants tho back in the day, especially if you were used to that atrocious, awful, eye-bleedin cyan and bloody magenta that so many games used. At least red / green / yellow / blue was OK, even if there was no black.
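For the curious, the pixel addressing in that 160×100 trick can be sketched against a plain byte buffer standing in for video memory. This assumes the usual text-mode layout of one character byte followed by one attribute byte per cell, with the attribute's high nibble as background and low nibble as ink, and it assumes code page 437 character 0xDE (the right-hand half-bar) as the fill character. The CRTC reprogramming itself is not shown, and the function names are made up.

```python
# 160x100, 16-colour "pixels" built from 80-column text cells.
# Sketch only: a bytearray stands in for CGA video memory.

COLUMNS = 80        # character cells per row in 80-column text mode
ROWS = 100          # rows available once each cell is only 2 scanlines high
HALF_BAR = 0xDE     # CP437 right half-bar: ink on the right, background on the left


def make_screen() -> bytearray:
    """Create the buffer with every cell's character set to the half-bar."""
    screen = bytearray(COLUMNS * ROWS * 2)
    screen[0::2] = bytes([HALF_BAR]) * (COLUMNS * ROWS)
    return screen


def _attr_index(x: int, y: int) -> int:
    """Index of the attribute byte of the cell holding pixel (x, y)."""
    if not (0 <= x < COLUMNS * 2 and 0 <= y < ROWS):
        raise ValueError("pixel out of range")
    return (y * COLUMNS + x // 2) * 2 + 1


def set_pixel(screen: bytearray, x: int, y: int, colour: int) -> None:
    """Set one of the 160x100 pixels to a colour 0-15."""
    if not 0 <= colour <= 15:
        raise ValueError("colour must be in 0..15")
    i = _attr_index(x, y)
    if x % 2 == 0:      # left half of the cell shows the background colour
        screen[i] = (screen[i] & 0x0F) | (colour << 4)
    else:               # right half of the cell shows the ink colour
        screen[i] = (screen[i] & 0xF0) | colour


def get_pixel(screen: bytearray, x: int, y: int) -> int:
    """Read back the colour of pixel (x, y)."""
    attr = screen[_attr_index(x, y)]
    return attr >> 4 if x % 2 == 0 else attr & 0x0F
```

Using all 16 values in the high nibble relies on blink being switched off, which matches the "sacrificing hardware flash" remark above.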

        Even better than that, is the hack for the Sinclair ZX81, giving bitmap graphics using different scanlines of the character set. You didn’t have full bitmap access, but there were enough combinations to give some pretty good games, particularly when Clive Sinclair himself said it wasn’t possible!

        But THIS, 1024 colours out of 4, or 16, is incredible.

  3. I’m always amazed what people could do with such (comparatively) primitive equipment with just some brains and hard work, kinda makes you ashamed we live in a world where people make a doorbell with an arduino.

    1. It’s proof that it’s not what you have that counts, but what you know. I really don’t think that people making doorbells out of an Arduino are a bad thing. Every time someone learns something, humanity is better as a whole. This something may look trivial, sometimes useless, but nobody should look down on someone who has the will to learn more, on any level, age or complexity.
      I’m sure there will be some angry people to say that even this demo is useless. Poor words from poor spirits, I feel sorry for those people.
      I praise the easy, the simple, the straightforward: the more people have access to knowledge, the better this world is.

      I may sound pompous, but this demo is a work of art that reminded me that anything is possible, when there’s a will.

      Curiosity is not a crime.

    1. Honestly I think that polygon demo was the most impressive I’ve seen. I remember how hard it was for me to just get the abstract line screen saver to run on my 286 in QBasic, and that had a whole 12 MHz, and 16 if you pressed the turbo button.

    2. Not quite Crysis, and definitely not a PC – but there’s a glorious nutter getting a Quake / Quake 2 renderer working on a 16MHz Atari Falcon. Integer-only, perspective-correct lightmapped texture mapping – using the Falcon’s intended-for-audio programmable DSP as a fancy SPU-style coprocessor.

      (He’s already got a fully-working Doom port running on this forgotten Atari machine. A bit different from previous ports, in that it’s intended for the computer as originally released, without any enhancements or accelerator boards.)

  4. In the local robot club, someone won the line following competition for the first few years straight with just op-amps and no MCU. The ‘Zen’ of programming and even good engineering is being lost on the younger generation with their copious MIPS and ever-flowing petabytes. Necessity is the mother of invention and only constraints drive innovation. Kudos to the authors of 8088MPH for great work, for keeping the demo scene alive and well, and hopefully for inspiring others to do the same.

    1. Nah, there will always be “bad” engineers and good ones. The good engineers just aren’t wasting their time with op-amps and TTL like they used to. A lot of people are making the most of their “MIPS and petabytes”, because it’s what’s there and available, and fast to work with. Of course sometimes things are best served by the most parsimonious approach, but in general using a chainsaw to pluck a weed still works. This is usually only a problem when someone doesn’t realize that they’re being inefficient.

      There are very few if any legitimate cases where you need to be programming things “the old way” with tons of memory management anymore. It’s just an evolution. In any case, the constraints of yesterday won’t really drive the innovation of tomorrow.

      1. My point about Zen is there are a lot of “old” tricks and techniques still useful in a modern context that are being lost with the younger generation because of lack of necessity. For example there are millions of model year 12-15 vehicles driving around roads across the world today with a sprite compiler I wrote in 1987 for a parallax scroller game – now being used to do the vehicle startup/welcome animations. It runs at 3x efficiency vs a green programmer’s implementation of a JPEG flip-book (no hardware codecs on the platform). Where will that green programmer ever be exposed to the technique of code generation for pixels in the land of GPUs in every SoC? He’s not a “bad” engineer. He’s just one that will never have the exposure to a different set of incredibly small constraints like I did that forced creative solutions. Vintage computing is a great learning asset!

        1. Those constraints are useless in a modern context. A GPU imposes new constraints that can be incredibly “small” as well. There was little opportunity to work within parallelism constraints in the past and if you worked on modern hardware trying to make GPU shaders do heavy lifting you’d find that the constraints are numerous, and tons of forced creative solutions are necessary to get the most out of the tools you’re using. The fact remains that no one really needs a creative solution until they do, and programmers / engineers today are no less equipped. They have the advantage of being able to do much more with more, rather than more with less. This is the constant cycle and what you’re saying would be similar to a statement like people have to mess around with stacked electrical plates before they use a battery. Of course all knowledge is useful, but efficient sprite rendering is largely a solved problem and if taking overpowered hardware to do a simple job is affordable and easy, there’s really no reason to do it any other way. To do advanced tasks things need to be automated, and what were once important optimizations can eventually become micro-optimizations that aren’t worth the time. There are ever bigger fish to fry.

          1. Increased capacity breeds both increased capability and increased inefficiency. I’m setting up an old 466Mhz Slot 1 Celeron with 448 meg of PC66 SDRAM for a CNC control system. With Windows 95B it is damn freaky fast. Boots and reboots much faster than any of my current desktops and laptops with Windows 7. After I defrag it with Norton, it’ll be even faster.

            I know people with significantly faster PCs running Windows 8.1, and they boot even slower than this Win7 box I’m using right now.

            If a brand new, multi-core, multi-ghz computer running the latest OS can’t even boot up as fast as a 466Mhz box from 1999, what kind of “progress” is that? Current operating systems and much of the application software have become so bloated that their actual efficiency has fallen behind what was available over 15 years ago.

            Windows 95 was no paragon of code efficiency, but compared to today’s systems, its authors still had to work hard to make it run in the RAM and storage space available. With today’s hardware where a 1TB drive is rapidly becoming low end and 4 gig of RAM is the new 512 meg, many programmers just don’t work as hard as their forebears had to in order to wring all the performance they could out of their code.

          2. Win95 wasn’t really designed to run on 466 MHz celerons. It was designed for 75-120 MHz Pentiums with 32-64 MB of RAM. By the time the 466 Celeron slot processor came out, it was already the year 2000 and Win95 was obsoleted by Windows 2000 and Windows 98 SE.

            If you run software on a machine that is 10 times faster than the stuff it was designed for, it’s going to be pretty snappy. Your comparison scaled up to date would be like decking a modern 8-core machine with 16 GB of RAM and SSD drives all the way. Boots up in 10 seconds flat. Well, surprise surprise.

          3. “4 gig of RAM is the new 512 meg”, haha! Try buying your 3rd and 4th MB, for 100 quid, for the joy of being able to squeeze DOOM into it. Just! In a tiny window, at half-res, on a 386SX-40. And being more amazed than you’ve ever been in your life!

            And that’s after the roomy, disk-drive having grown-up computer that was the Atari ST. And the 128K Spectrum (with SOUND CHIP! and built-in tape player) to replace your old 48K model.

            And I’m hardly an old geezer round here. 512MB! You could store every disk I ever had for the ST in 512MB, and that could store a dozen Spectrum games on one of its floppies!

  5. This is a trip, impressive hacks.
    The “Demo Scene” is new to me, so the sheer extent of the knowledge/experience going into these hacks goes over my head without an explanation… They did a good job of explaining it.
    Half-tempted to put together that old 8088 mobo and cards in my piles of circuit-boards… well, maybe 1/4th tempted.

    1. Unless you have a genuine IBM system with a genuine IBM CGA card (and not a system with e.g. a cloned BIOS) you probably won’t be able to run the demo anyway :)

      1. I don’t reckon this demo uses much of the BIOS code. The graphics chip itself is probably what matters. And IBM CGA used a standard Motorola one, whose name I can’t remember, that lots of earlier 8-bit computers had used. Not even a really good one, either.

        With that, and an NTSC monitor (or composite-in to a TV), it might just work. I’d give it a go if I had the bits. How hard to plug a few cards together? And remember all those DIP switch settings and whatnot. After that have a few games of Commander Keen. But nothing else, cos commercial games were shit on early PCs.

      2. You indeed need a genuine IBM CGA card (old style, with Motorola 6845, not a clone such as the Hitachi HD6845) to view the demo 100% as intended.
        Other cards may give wrong colours, have wrong timing, lose sync during certain parts, or even cause the end part to crash. We know that at least ATi Small Wonder and Paradise PVC4 do not work correctly.

        On the motherboard-side, things seem a bit more positive. IBM only used off-the-shelf parts, namely the Intel 82xx chipset family. At least some clones also use the exact same parts. I have found that a Philips P3105 clone, when running at 4.77 MHz (it also has a turbo mode of 8 MHz), and an IBM CGA card installed, will run the demo perfectly.
        I have also found that a Commodore PC20-III uses a single-chip Faraday FE2010 solution, whose timings are not compatible with the Intel chipset. It crashes on the end part, and even with an original IBM CGA card, some parts were not displayed correctly.
        So, there are some clones that can run the demo as well as an IBM system, but certainly not all 8088 machines at 4.77 MHz.

    2. Look up demos by Farb Rausch aka farbrausch. Literal translation from German is Color Rush. An early one, and still good, is “the product”. From a 64K executable it generates around a gigabyte of 3D models, textures and sound in realtime. Some of their demos won’t run on newer versions of Windows or DirectX than they were written for.

      One thing to watch for is many demos get false positive hits from antivirus programs, and those companies refuse to whitelist them or examine them to learn from the demos how to reduce false positives.

  6. “effectively making this a 1.19 MHz CPU”

    Not exactly. The 8088 did work slower than the 8086 but you can’t assume all other architectures could read a byte every cycle.

    1. The main competitor is the 6502 though (Apple II, Atari, VIC-20/C64 etc), which, when clocked at ~1 MHz, indeed only takes 1 cycle per byte (as per the comparison with the C64 in the intro of the demo). The Z80 needs 3 to 4 cycles per byte, but it runs at ~3.5 MHz in most cases… so effectively all machines have roughly the same memory speed, in the absolute sense. It’s all an early case of the MHz myth.
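As a rough check on that claim, here is the same division done for each CPU, using only the clock speeds and cycles-per-byte figures quoted in this thread. The Z80 row takes 3.5 cycles as a midpoint of the "3 to 4" range, which is an assumption made purely for illustration.

```python
# Back-of-the-envelope memory rates from the figures quoted in this thread.
# Sketch only: real throughput depends on the instruction mix and on the machine.

CPUS = {
    # name: (clock in Hz, bus cycles per byte)
    "6502": (1.0e6, 1),
    "Z80": (3.5e6, 3.5),     # assumed midpoint of "3 to 4 cycles per byte"
    "8088": (4.77e6, 4),
}


def bytes_per_second(clock_hz: float, cycles_per_byte: float) -> float:
    """Peak bytes per second the CPU can take off the data bus."""
    return clock_hz / cycles_per_byte


if __name__ == "__main__":
    for name, (clock_hz, cycles) in CPUS.items():
        rate = bytes_per_second(clock_hz, cycles)
        print(f"{name:>5}: {clock_hz / 1e6:.2f} MHz / {cycles} = {rate / 1e6:.2f} million bytes/s")
```

All three land within about 20% of each other, which is the point being made about the MHz myth.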

      1. Doesn’t the 6502 leave the bus alone every other cycle? It’s meant to allow other things (CPUs, or graphics chips) to access the RAM easily.

        Indeed the Acorn Electron, in a, well, confusing, application of ingenuity, to save money, used the new 64Kx1 chips as its RAM. 4 of them, 64Kx4, with an ASIC buffering and converting them into 32Kx8. It limited the highest-res graphics it could use to less than the computer it was a cheap knockdown of, the BBC Micro. By the same manufacturer, it was a licensed cheap effort.

        For however much 4 of 4164 DRAM chips must’ve cost, I think it’s a stupid idea myself.

        1. What we compared was nothing more than the number of cycles it takes for the CPU to take a byte off the data bus.
          We are not talking about a complete instruction that reads a byte.
          The 6502 indeed has ‘idle’ bus cycles, which it uses to decode and execute instructions. These idle cycles can be and are used by other hardware (such as the graphics chip in Apple II or C64 for example).

          But the 8088 has similar issues. Many instructions take far longer to execute than just fetching from memory. Another difference between 8088 and 6502 is that the 8088 has a larger, more complex instruction set, where most instructions are 2 or more bytes long. The CPU was actually designed as a 16-bit CPU, the 8086, where you’d have twice as much memory bandwidth.
          Putting the CPU on an 8-bit bus (the 8088) severely hampers performance.
          Effectively the CPU performance does not differ that much between 6502, 6800, Z80 and 8088 derivatives in the early 80s.

          The 8086 was much faster, as was the 68000, but both required a 16-bit motherboard, which made these systems far more expensive. The 68000 became popular later, as costs came down, and powered the Mac, Amiga and Atari ST, to name a few.
          The 8086 never became popular because it was replaced by the 80286 by the time 16-bit became mainstream. The 80286 was a huge leap forward in performance, especially with mul and div operations. 8086/8088 still used microcode emulation, where the 286 had much faster dedicated circuitry.

          1. Interesting… I didn’t realize that about the 8088 and the 8086… From your description it doesn’t seem plausible, but I swear I remember *thinking* I was “upgrading” a system by replacing its 8086 with an 8088. Though, this description suggests that was merely a wonky-memory. Or maybe I had an 8086 mobo and pulled that to replace it with a 8088 mobo…? Either way, it was the era of 486’s, I was just dabbling and thought the higher number meant it was better. Clearly a rookie mistake :)

          2. You can’t upgrade an 8088 CPU to an 8086, because they require a different chipset, and the pins have different meanings (the 8088 is compatible with the 8085). So you’d have to replace the whole motherboard.
            A common CPU upgrade back then was to upgrade an 8088 to a NEC V20, or upgrade an 8086 to a NEC V30.
            Intel also offered InBoard solutions, which were ISA cards that also plugged into your CPU socket. This way you could even upgrade your PC to a 386 (and then replace the 386 on the InBoard with a 486-upgrade chip :)).
