The Nintendo Switch CPU Exposed

Ever wonder what’s inside a Nintendo Switch? Well, the chip is an Nvidia Tegra X1. Peel back a layer, though, and you’ll find four ARM CPU cores inside, specifically Cortex A57 cores, which take up about two square millimeters of space on the die. The whole cluster, including some cache memory, takes up just over 13 square millimeters. In a recent post, [ClamChowder] takes us inside the Switch’s Cortex A57.

Interestingly, the X1 also has four A53 cores, which are more power efficient, but according to the post, Nintendo doesn’t use them. The 4 GB of DRAM is LPDDR4 memory with a theoretical bandwidth of 25.6 GB/s.
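
Where does a number like 25.6 GB/s come from? Peak DRAM bandwidth is just the transfer rate multiplied by the bus width. Here is a minimal sketch of that arithmetic, assuming a 64-bit bus running at 3200 MT/s. Neither figure appears above; they are the commonly quoted LPDDR4 configuration for the X1, so treat them as assumptions. The function name is ours.

```c
#include <stdint.h>

/* Theoretical peak DRAM bandwidth in decimal MB/s.
 * transfers_mt_s: transfer rate in megatransfers per second (assumed 3200)
 * bus_width_bits: total memory bus width in bits (assumed 64) */
uint32_t peak_bandwidth_mb_s(uint32_t transfers_mt_s, uint32_t bus_width_bits)
{
    /* Each transfer moves bus_width_bits / 8 bytes. */
    return transfers_mt_s * (bus_width_bits / 8u);
}
```

With those assumed inputs, `peak_bandwidth_mb_s(3200, 64)` works out to 25600 MB/s, which matches the 25.6 GB/s quoted. Real-world throughput will be lower, which is why the figure is called theoretical.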

The post details the out-of-order execution and branch prediction used to improve performance. We can’t help but marvel that in our lifetime, we’ve seen computers go from giant, expensive machines to the point where a game console has 8 CPU cores and advanced things like out-of-order execution. Still, [ClamChowder] makes the point that the Switch’s processor is anemic by today’s standards, and can’t even compare with an outdated desktop CPU.

Want to program the ARM in assembly language? We can help you get started. You can even do it on a breadboard, though the LPC1114 is a pretty far cry from what even the Switch is packing under the hood.

A handheld computer made on a piece of prototyping board running a Tetris clone

Tetris Clone Uses 1000 Lines Of Code, And Nothing Else

If you’re programming on a modern computer, you typically make use of lots of work done by other people. There are operating systems to abstract away the complexities of modern hardware, standard libraries to implement common tasks, and tons of third-party libraries that prevent you from having to reinvent the wheel all the time: you’re definitely not the first one trying to draw graphics onto a screen or store data in a file.

But if it’s the wheels you’re most interested in, then there’s nothing wrong with inventing new ones now and then. [Michal Zalewski], for instance, has made a beautiful Tetris clone in just 1000 lines of C, without using anyone else’s code.

The purpose of this exercise is to show that it’s possible to make a game with graphics comparable to those seen on modern, complex computing systems, without relying on operating systems or third-party libraries. The hardware consists of not much more than an ARM Cortex-M7 MCU, a 240×320 LCD screen and a few buttons soldered onto a piece of prototyping board, all powered by a set of AAA batteries.

The software is similarly spartan: just pure C code running directly on the CPU core. Graphic elements, some generated by AI and others hand-drawn, are stored in memory as plain bitmaps. They are manipulated by 150 lines of code that shuffle sprites around the display at a speed high enough to generate smooth motion. Game mechanics take up about 250 lines, while sound consists of simple square-wave chiptunes written in just 50 lines of code.
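
To give a flavor of what shuffling sprites around means at this level, here is a minimal sketch of a bitmap blit with edge clipping. It is not [Michal]’s code: the in-RAM framebuffer, the 16-bit pixel format, and the name `blit_sprite` are all our assumptions for illustration. Only the 240×320 resolution comes from the project.

```c
#include <stdint.h>

#define LCD_W 240   /* display width in pixels, from the project description */
#define LCD_H 320   /* display height in pixels */

/* Copy a sw x sh sprite into the framebuffer with its top-left corner at
 * (x, y). Pixels that would land outside the screen are skipped.
 * Assumes one 16-bit value per pixel, stored row by row. */
void blit_sprite(uint16_t *fb, const uint16_t *sprite,
                 int sw, int sh, int x, int y)
{
    for (int row = 0; row < sh; row++) {
        int dy = y + row;
        if (dy < 0 || dy >= LCD_H)
            continue;               /* whole row is off-screen */
        for (int col = 0; col < sw; col++) {
            int dx = x + col;
            if (dx < 0 || dx >= LCD_W)
                continue;           /* this pixel is off-screen */
            fb[dy * LCD_W + dx] = sprite[row * sw + col];
        }
    }
}
```

A real implementation might well stream pixels straight to the LCD controller instead of keeping a full framebuffer in RAM, but the inner loop would look much the same.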

[Michal]’s code is very well documented, and his blog post gives even more details about all the problems he had to solve. One example is the length of keypresses: when do you interpret a keypress as a single “press”, and when does it become “press and hold”? Apparently, waiting 250 ms after the first press and 100 ms after subsequent ones does the trick. [Michal] is a bit of an expert on bare-bones game programming by now: he has previously pushed several 8-bit micros to their very limits. Third-party libraries can make your programming life a lot easier, but it’s good to reflect on the dangers of relying too much on other people’s code.
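
The press-versus-hold timing is easy to sketch as a tiny state machine. The 250 ms and 100 ms delays come from the post; everything else here, including the names `key_state` and `key_poll` and the idea of polling with a millisecond tick, is our own assumption rather than [Michal]’s implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIRST_REPEAT_MS 250u  /* delay before a held key starts repeating */
#define NEXT_REPEAT_MS  100u  /* delay between subsequent repeats */

struct key_state {
    bool     held;          /* was the key down on the previous poll? */
    uint32_t next_fire_ms;  /* tick at which the next repeat is due */
};

/* Call regularly with the raw button level and a millisecond tick count.
 * Returns true whenever the game should act on the key. */
bool key_poll(struct key_state *k, bool pressed, uint32_t now_ms)
{
    if (!pressed) {
        k->held = false;
        return false;
    }
    if (!k->held) {
        /* Fresh press: act immediately, then wait the long delay. */
        k->held = true;
        k->next_fire_ms = now_ms + FIRST_REPEAT_MS;
        return true;
    }
    /* Signed difference keeps the comparison correct if the tick wraps. */
    if ((int32_t)(now_ms - k->next_fire_ms) >= 0) {
        k->next_fire_ms = now_ms + NEXT_REPEAT_MS;
        return true;
    }
    return false;
}
```

Holding a key from tick 0 would fire at 0, 250, 350, 450 and so on, until the key is released. Debouncing is left out here and would need handling separately.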

Emulating X86 On Apple’s AARCH64 X64 Emulator

You might know [Evan Martin] as the developer of retrowin32. It’s a Windows and x86 emulator designed to run on a Mac or on the web. He’s recently been exploring how to run 32-bit x86 binaries on the AArch64 (aka ARM64) architecture.

[Evan] realized that Apple’s ARM-based Macs feature a high-quality x86 emulator, used via the Rosetta binary translation system. It only supports 64-bit x86-64 binaries, also known as x64, and thus he had initially discounted it for running older 32-bit x86 software. However, as it turns out, x64 features a special compatibility mode for running 32-bit code. [Evan] was able to leverage this to run 32-bit Windows executables rather neatly via the high-performance Rosetta emulator.

To run a 32-bit executable on a 64-bit processor in this way, one creates a 64-bit program that is tasked with loading the 32-bit executable. It’s a little fussy, involving some tricks to handle memory management between the 32-bit code and the 64-bit wrapper, and how to interface with the OS, but [Evan] explains deftly how it’s all done.

[Evan] notes that this hack may not work forever, especially if Apple changes or deprecates Rosetta’s remaining x86-64 emulation in the future. Regardless, Apple’s “Game Porting Toolkit” relies on similar techniques used by Wine. If you find yourself dancing across platforms, you might learn some nifty tricks from [Evan]’s example!

DIY Metal Detector

If you want to get rich by hunting with a metal detector, you might want to consider how much you invested in the hardware to start with. Finding a tin can with a $200 detector might not make economic sense. But building a metal detector yourself doesn’t have to be hard, as [Mirko] shows in a recent post. His STM32-based pulse induction metal detector looks good and works well, as you can see in the video below.

[Mirko] reports that the device can detect a coin at 30 cm and a large metal object at more than 80 cm. The project uses the Arduino IDE and a Blue Pill STM32 module. The project looks good with an LED module and a rotary encoder to set sensitivity.

Picture of the miniJen structure on a presentation desk

A Jenkins Demo Stand For Modern Times

Once you’re working on large-scale software projects, automation is a lifesaver, and Jenkins is a strong player in open-source automation – be it software builds, automated testing or deploying onto your servers. Naturally, it’s historically been developed with x86 infrastructure in mind, and let’s be fair, x86 is getting old. [poddingue], a hacker and a Jenkins contributor, demonstrates that Jenkins keeps up with the times, with a hardware demo stand called miniJen, that has Jenkins run on three non-x86 architectures – armv8 (aarch64), armv7l and RISC-V.

There are four SBCs of different architectures involved in this, three acting as Jenkins agents executing tasks, and one acting as a controller, all powered with a big desktop PSU from Pine64. The controller has a somewhat beefier CPU for a reason – at FOSDEM, we’ve seen it drive a separate display with a Jenkins dashboard. It’s very much a complete demo for its purpose, and definitely an eyecatcher for FOSDEM attendees passing by the desk! As a bonus, there’s also a fascinating blog post about how [poddingue] got to running Jenkins on RISC-V in particular.

Even software demonstrations get better with hardware, and this stood out no doubt! Looking to build a similar demo, or wondering how it came together? [poddingue] has blog posts on the demo’s structure, a repo with OpenSCAD files, and a trove of videos demonstrating the planning, design and setup process. As it goes with continuous integrations, we’ve generally seen hackers and Jenkins collide when it comes to build failure alerts, from rotating warning lights to stack lights to a Christmas tree; however, we’ve also seen a hacker use it to keep their firmware size under control between code changes. And, if you’re wondering what continuous integration holds for you, here’s our hacker-oriented deep dive.

New Product: The Raspberry Pi Debug Probe

It’s fair to say that among the new product launches we see all the time, anything new from the folks at Raspberry Pi claims our attention. It’s not that their signature Linux single-board computers (SBCs) are necessarily the best or the fastest hardware on paper, but that they’re the ones with meaningful decade-plus support. Add to that their RP2040 microcontroller and its associated Pico boards, and they’re the one to watch.

Today we’ve got news of a new Pi, not a general purpose computer, but useful nevertheless. The Raspberry Pi Debug Probe is a small RP2040-based board that provides a SWD interface for debugging any ARM microcontroller as well as a more generic USB to UART interface.

The article sums up nicely what this board does: it’s for bare metal ARM coders, and it uses ARM’s built-in debugging infrastructure. Away from Hackaday, we’ve seen friends use the RP2040 for exactly this job, as it was one of the few readily available chips during the shortage, so it’s extremely convenient to have it available as a ready-made product.

So if you’re a high level programmer it’s not essential, but if you’re really getting down to the nuts-and-bolts of an ARM microcontroller then you’ll want one of these. Of course, it’s by no means the first SWD interface we’ve seen; here’s one using an ESP32.

Nucleo-F429ZI development board with STM32F429 microcontroller

Epic Guide To Bare-Metal STM32 Programming

[Sergey Lyubka] put together this epic guide for bare-metal microcontroller programming. While the general concepts should be applicable to most any microcontroller, [Sergey]’s examples specifically relate to the Nucleo-F429ZI development board featuring the ARM-based STM32F429 microcontroller.

In the realm of computer systems, bare-metal programming most often refers to programming the processor without an intervening operating system. This generally applies to programming BIOS, hardware drivers, communication drivers, elements of the operating system, and so forth. Even in the world of embedded programming, where things are generally quite low-level (close to the metal), we’ve grown accustomed to a good amount of hardware abstraction. For example, we often start projects already standing on the shoulders of various libraries, boot loaders, and integrated development tools.

When we forego these abstractions and program directly on the microprocessor or microcontroller, we’re working on the bare metal. [Sergey] aptly defines this as programming the microcontroller “using just a compiler and a datasheet, nothing else.” His guide starts at the very foundation by examining the processor’s memory map and registers including locations for memory mapped I/O pins and other peripherals.

The guide walks us through writing up a minimal firmware program from boot vector to blinking an LED connected to an I/O pin. The demonstration continues with setup and use of necessary tools such as the compiler, linker, and flasher. We move on to increasingly advanced topics like timers, interrupts, UART output, debuggers, and even configuring an embedded web server to expose a complete device dashboard.
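
To make the idea of memory-mapped I/O a bit more concrete, here is a minimal sketch of how a GPIO register block can be described in plain C. This is our own illustration rather than a quote from [Sergey]’s guide. The register order and the `0x40020000` base address follow our reading of the STM32F4 reference manual and should be checked against the datasheet, and the helper names are hypothetical. The functions take a pointer to the register block, so they can also be exercised against an ordinary struct in RAM on a PC.

```c
#include <stdint.h>

/* Layout of one GPIO port's registers on STM32F4-family parts
 * (assumed from the reference manual; verify for your chip). */
struct gpio {
    volatile uint32_t MODER, OTYPER, OSPEEDR, PUPDR, IDR, ODR, BSRR, LCKR,
                      AFR[2];
};

#define GPIO_BASE   0x40020000u  /* assumed address of port A */
#define GPIO_STRIDE 0x400u       /* assumed spacing between ports */
/* On real hardware, port B would then be:
 *   struct gpio *gpiob = (struct gpio *)(GPIO_BASE + 1 * GPIO_STRIDE); */

enum { GPIO_MODE_INPUT = 0, GPIO_MODE_OUTPUT = 1 };

/* Each pin owns two bits in MODER: clear them, then write the new mode. */
void gpio_set_mode(struct gpio *g, unsigned pin, unsigned mode)
{
    g->MODER &= ~(3u << (pin * 2));
    g->MODER |= (mode & 3u) << (pin * 2);
}

/* BSRR sets a pin via its lower 16 bits and clears it via the upper 16. */
void gpio_write(struct gpio *g, unsigned pin, int level)
{
    g->BSRR = (1u << pin) << (level ? 0 : 16);
}
```

Blinking an LED then comes down to calling `gpio_set_mode()` once and toggling `gpio_write()` in a delay loop. On the actual chip, the port’s clock also has to be switched on in the RCC peripheral first, and a startup file and linker script are needed before any of this runs.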

While initially more time-consuming, working close to the metal provides a good deal of additional insight into, and control over, hardware operations. For even more on the subject, you may like our STM32 Bootcamp series on bare-metal STM32 programming.