[Sebastian] had a tricky problem to solve. Competitors in a Tetris tournament needed to stream video of their Game Boy screens, but no solution readily existed. For reasons of fairness, emulators were right out, and no modifications could be made to the Game Boys, either. Thus, [Sebastian] created the GB Interceptor, a Game Boy capture cartridge.
Thanks to the design of the Game Boy, there’s plenty of access to useful signals via the cartridge port itself. [Sebastian] realized that a non-invasive capture device could be built to sit in-between the Game Boy and a cart, and send video to a computer. Unfortunately, there’s no direct access to the video RAM via this port, but [Sebastian] figured out a nifty workaround.
The build uses a Raspberry Pi Pico. The chip’s two cores emulate the Game Boy’s CPU and Picture Processing Unit, respectively. Doing this while keeping pace with the real Game Boy required overclocking the Pico to 225 MHz. The system works by capturing data from the cartridge’s memory bus and following along with the instructions being run by the Game Boy. By doing this, the Pico is able to populate its own copy of the video RAM. It then sends this out over USB, where it can be displayed and streamed online as desired.
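To make the idea of a shadow video RAM concrete, here is a minimal sketch in Python. It is not the Interceptor's firmware: the `ShadowCpu` class and its `run` method are hypothetical names, only five Game Boy opcodes are handled, and the input is assumed to be the stream of bytes the CPU fetches from the cartridge ROM. The point is only that following the instruction stream is enough to know what ends up in VRAM, even though VRAM itself is never visible on the cartridge port.

```python
VRAM_START, VRAM_END = 0x8000, 0x9FFF  # Game Boy video RAM address range


class ShadowCpu:
    """Hypothetical follower that tracks A and HL and mirrors VRAM writes."""

    def __init__(self):
        self.a = 0
        self.hl = 0
        self.vram = bytearray(VRAM_END - VRAM_START + 1)

    def _store(self, addr, value):
        # Only writes that land in VRAM matter for the video output.
        if VRAM_START <= addr <= VRAM_END:
            self.vram[addr - VRAM_START] = value

    def run(self, rom_bytes):
        """Interpret bytes in the order they were fetched from the cartridge."""
        stream = iter(rom_bytes)
        for opcode in stream:
            if opcode == 0x00:        # NOP
                pass
            elif opcode == 0x3E:      # LD A, d8
                self.a = next(stream)
            elif opcode == 0x21:      # LD HL, d16 (little endian operand)
                lo = next(stream)
                hi = next(stream)
                self.hl = (hi << 8) | lo
            elif opcode == 0x77:      # LD (HL), A
                self._store(self.hl, self.a)
            elif opcode == 0x22:      # LD (HL+), A
                self._store(self.hl, self.a)
                self.hl = (self.hl + 1) & 0xFFFF
            else:
                raise NotImplementedError(f"opcode {opcode:#04x} not in this sketch")


cpu = ShadowCpu()
# LD HL,0x8000 ; LD A,0xAB ; LD (HL+),A ; LD A,0xCD ; LD (HL),A
cpu.run(bytes([0x21, 0x00, 0x80, 0x3E, 0xAB, 0x22, 0x3E, 0xCD, 0x77]))
```

After this run, the first two bytes of the shadow VRAM hold the values the real Game Boy would have written. A real follower also has to handle branches, interrupts, and values that come from sources it cannot observe.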
There are some edge-case limitations, but for its intended purpose, the system works great. Currently, the hardware is usable on Linux and Windows, though it does require some fiddling in the latter case. Files are on GitHub for those eager to build their own. If you simply want to dump carts rather than stream from your Game Boy, we can help there, too. Video after the break.
This is Awesome!
> There are some edge-case limitations
I looked a bit into it, and I wouldn’t call it “edge-case limitations”. I’m willing to bet that it only works on really simple games like Tetris. It doesn’t know the state of some of the important rendering registers, and it doesn’t know the exact timing of rendering. So mid-frame tricks and STAT/BGP tricks are totally missed.
An amazing feat, and great for capturing Tetris tournaments. But not an all-round capture card. For that, people use a Super Game Boy + SNES or a Game Boy Player + GameCube.
Of course, nobody said it was anything other than a solution for a very specific use case. I do disagree that it can only work with “simple” titles – many titles would require specific handling, but then so does Tetris.
Yep, brilliant idea, but damn, “emulating” the GB to mimic what’s going on by looking into the address and memory lines is truly brilliant…
Let me try to convince you otherwise with a few examples from the embedded video:
V-Rally at 07:17 (pretty much every line of that screen relies on proper sync to the PPU registers)
Several games at 13:43
The Zelda logo at 06:00 (this is actually part of an explanation of this topic with side-by-side comparison that started at 5:17)
The trick here is that the Interceptor detects the vsync interrupt and tight loops that exit when a specific value in the LY register is reached. It cannot see the LY register, but it can see the instruction to read the register and a comparison to a specific value. These types of loops are common for mid-frame tricks and the most efficient variant is properly detected (for example in Donkey Kong Land, which does not use the vsync interrupt). Both methods allow the Interceptor to sync up its emulated PPU and therefore have a pretty accurate internal version of the associated registers.
Edge cases are only games that do not use the vsync interrupt and have an unusual style to wait for the LY or STAT register. I have not seen any so far, probably because (to my knowledge) a LY comparison that does not take a conditional jump when the corresponding line is reached has the fewest clock cycles wasted upon reaching the target line. But I am waiting for game-specific issues on GitHub before I implement support for more exotic variants.
(And I am grateful for a chance to talk about these specifics here. That would have been way too detailed for the video.)
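As a rough illustration of the first sync method, the sketch below resets an emulated line counter whenever the CPU enters the VBlank interrupt handler. The vector address 0x0040 and the 154 lines of 456 clock cycles per frame are Game Boy facts. Everything else is an assumption made for illustration: the names `EmulatedPpuClock` and `follow`, the input format (fetch address plus cycles since the previous fetch), and the choice to ignore the few cycles the interrupt dispatch itself takes. How the actual firmware detects the interrupt is not described above.

```python
VBLANK_VECTOR = 0x0040      # address of the Game Boy VBlank interrupt handler
FIRST_VBLANK_LINE = 144     # LY value at the start of vertical blanking
LINES_PER_FRAME = 154
CYCLES_PER_LINE = 456


class EmulatedPpuClock:
    """Hypothetical free-running line counter for the emulated PPU."""

    def __init__(self):
        self.cycles = 0

    def tick(self, cycles):
        frame = LINES_PER_FRAME * CYCLES_PER_LINE
        self.cycles = (self.cycles + cycles) % frame

    @property
    def ly(self):
        return self.cycles // CYCLES_PER_LINE

    def resync_to_vblank(self):
        # Simplification: pretend the handler starts exactly at line 144.
        self.cycles = FIRST_VBLANK_LINE * CYCLES_PER_LINE


def follow(clock, fetches):
    """fetches: iterable of (address, cycles_since_previous_fetch)."""
    resyncs = 0
    previous = None
    for address, cycles in fetches:
        clock.tick(cycles)
        # A fetch at the vector that was not reached by falling through
        # from the preceding byte is treated as an interrupt entry.
        if address == VBLANK_VECTOR and previous != VBLANK_VECTOR - 1:
            clock.resync_to_vblank()
            resyncs += 1
        previous = address
    return resyncs


clock = EmulatedPpuClock()
clock.tick(1000)                      # emulated PPU has drifted to some line
ly_before = clock.ly
count = follow(clock, [(0x0200, 4), (0x0040, 4)])
ly_after = clock.ly
```

In this run the emulated counter sits at line 2 before the interrupt is seen and is pulled to line 144 afterwards, which is the kind of periodic correction the comment describes.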
As the creator of the GB Interceptor, let me explain how it works around this problem:
The contents of the PPU-related registers are mostly relevant for the correct timing of mid-frame changes and indeed, the Interceptor cannot see them. However, the Interceptor can synchronize its emulated PPU to the real one if it detects certain events. At the moment there are two ways it can do that:
– If the vsync interrupt is triggered. The Interceptor can detect this and periodically adjust the timing of its emulated PPU.
– If the game waits for a specific display line in a tight loop. More specifically, the Interceptor looks for a read of the LY register, followed by a compare instruction and a conditional jump back to the read out. This is one of the most common non-interrupt methods to synchronize VRAM access and when the conditional jump is not taken, the Interceptor knows that the real PPU is at the line corresponding to the compare instruction.
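The second method can be sketched as a pattern match over the bytes fetched from the cartridge. The example below assumes one common encoding of such a loop: `LDH A,(0x44)` (bytes F0 44, a read of LY at 0xFF44), `CP n` (FE n), and `JR NZ` back to the read (20 followed by a signed offset). The function name `find_ly_syncs` and the trace format, a list of (address, byte) ROM fetches, are hypothetical, and other loop encodings are simply not recognized by this sketch.

```python
def signed8(byte):
    """Interpret a byte as a two's complement signed offset."""
    return byte - 256 if byte >= 128 else byte


def find_ly_syncs(fetches):
    """Return the LY values implied by exits from 'wait for line' loops.

    fetches: list of (address, byte) pairs as seen on the cartridge bus.
    """
    syncs = []
    for i in range(len(fetches) - 6):
        window = fetches[i:i + 6]
        base = window[0][0]
        addresses = [addr for addr, _ in window]
        data = [byte for _, byte in window]
        if addresses != list(range(base, base + 6)):
            continue                      # not six consecutive bytes
        if data[0] != 0xF0 or data[1] != 0x44:
            continue                      # not LDH A,(0x44): read of LY
        if data[2] != 0xFE or data[4] != 0x20:
            continue                      # not CP n followed by JR NZ
        if base + 6 + signed8(data[5]) != base:
            continue                      # jump does not target the read
        # If execution continues right after the JR, the jump was not
        # taken, so the real PPU has just reached line n.
        if fetches[i + 6][0] == base + 6:
            syncs.append(data[3])
    return syncs


loop = [(0x150, 0xF0), (0x151, 0x44), (0x152, 0xFE),
        (0x153, 0x90), (0x154, 0x20), (0x155, 0xFA)]
# Two passes through the loop, then execution falls through to 0x156.
trace = loop + loop + [(0x156, 0x00)]
lines = find_ly_syncs(trace)
```

Here the first pass jumps back and yields nothing, while the second pass falls through and reports line 0x90 (144). A byte-level match like this can misfire on operand bytes that happen to look like opcodes, which a follower that decodes instructions properly would avoid.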
From there, the emulated LY register is very close to the real one. I would not claim 100% accuracy, but the remaining difference can probably be optimized if a game shows glitches here.
The main problem (the edge cases) are games that do not use the vsync interrupt (not too many left here) and also use more obscure code to control VRAM access (not too common because, as far as I know, the tight loop with a conditional jump back wastes the fewest cycles once the right line is reached). And for those edge cases I will hopefully receive an issue on GitHub to implement a few more sync methods.
Ah… Finally a place to explain those little details that were too much for both the video and even the blog post :)
(I hope there won’t be multiple comments from me now. Wrote something similar two hours ago, but that never showed up.)
Interesting comment, but I guess theoretically we could also emulate the rest of the system, like a simple emulation of the LCD controller, to provide this missing information?
This approach of a running emulation in parallel of the hardware is still really cool. It makes me think of hardware-in-the-loop simulations, or even the very hype concept of “digital twin” that is everywhere nowadays.
It’s cool, but where it gets impressive is when you notice that the only inputs for the digital twin are the cartridge I/O lines, with no access to anything else… emulating without knowledge of, for example, key presses…
That’s indeed why I find it really cool ;)
The thing is, it’s not really emulating in a traditional sense, but running an interpreter of the CPU instructions to keep a simulated model of the system in sync with the physical one, and therefore getting the video information.
Regarding the key presses, IIRC the video says it’s not implemented. But I think that since you have access to the address and data buses, you could simply keep track of when the CPU writes and reads to the right register (the P1 register), and get the info on the fly.
Heck, having access to these buses and being fast enough could even make you emulate a link cable connected to the gameboy maybe. So much fun to have!
Not the same at all in how they work but reminds me a bit of TASBot. Could it just capture literal inputs and then sync everything else up through an emulator? Sort of like what it is doing now in a way?
https://tasvideos.org/TASBot
https://tasvideos.org/4156S