OpenGL In 500 Lines (Sort Of…)

How difficult is OpenGL? How difficult can it be if you can build a basic renderer in 500 lines of code? That’s what [Dmitry] did as part of a series of tiny applications. The renderer is part of a course, and the line limit is there so that students can realistically write their own rendering software. [Dmitry] feels that you can’t write efficient code for things like OpenGL without first understanding how they work.

For educational purposes, the system uses few external dependencies. Students get a class that can work with TGA format files and a way to set the color of one pixel. The rest of the renderer is up to the student, guided by nine lessons ranging from Bresenham’s algorithm to ambient occlusion. One of the last lessons switches gears to OpenGL so you can see how it all applies.
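
For a sense of where the course starts, here’s a minimal sketch of Bresenham-style line drawing. This is our own illustration, not [Dmitry]’s code, and set_pixel() is a hypothetical stand-in for the single pixel-setting primitive the students receive:

```cpp
#include <cstdio>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for the course's "set one pixel" primitive;
// the real class writes into a TGA image instead of printing.
void set_pixel(int x, int y) { std::printf("(%d, %d)\n", x, y); }

// Bresenham-style line drawing, the subject of the first lesson.
void draw_line(int x0, int y0, int x1, int y1) {
    bool steep = std::abs(y1 - y0) > std::abs(x1 - x0);
    if (steep) {                      // walk along y for steep lines
        std::swap(x0, y0);
        std::swap(x1, y1);
    }
    if (x0 > x1) {                    // always draw left to right
        std::swap(x0, x1);
        std::swap(y0, y1);
    }
    int dx = x1 - x0, dy = std::abs(y1 - y0);
    int error = dx / 2;
    int y = y0, ystep = (y0 < y1) ? 1 : -1;
    for (int x = x0; x <= x1; x++) {
        if (steep) set_pixel(y, x); else set_pixel(x, y);
        error -= dy;
        if (error < 0) {              // error spilled over: step in y
            y += ystep;
            error += dx;
        }
    }
}

int main() { draw_line(0, 0, 10, 4); }
```

The whole trick is that the error term replaces floating-point slope math with integer arithmetic, which is why the algorithm suits tiny renderers so well.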

Continue reading “OpenGL In 500 Lines (Sort Of…)”

WebGPU… Better Than WebGL?

As browsers become more like operating systems, we are seeing deeper features built into them. For example, you can now write a form of assembly language for the browser. Sophisticated graphics have been possible using WebGL since around 2011, but some people find it hard to use. [Surma] was one of those people and tried WebGPU, a new method for doing the same thing that is only just surfacing.

[Surma] liked it better and shares a lot of information in the post, which, oddly, doesn’t use WebGPU for graphics very much. Instead, the post focuses on using GPU cores for fast computation, something else you can do with WebGPU. If your goal is to draw on the screen, though, you’ll need the basics, and the post links to a site with examples of doing just that.

Continue reading “WebGPU… Better Than WebGL?”

Modern CPUs Are Smarter Than You Might Realize

When it comes to programming, most of us write code at a level of abstraction that could just as well target a computer from the 1960s. Input comes in, you process it, and you produce output. Sure, a call to strcpy might work better on a modern CPU than on an older one, but your basic algorithms are the same. But what if there were ways to structure your programs so that they work better on modern hardware? That’s the question a pre-print book from [Sergey Slotin] answers.

As a simple example, consider the effects of branching on pipelining. Nearly all modern computers pipeline. That is, one instruction is fetching data while an older instruction is computing something and an even older instruction is storing its results. The problem arises when an instruction is already partially executed by the time you discover that an earlier instruction branched to another part of your code. Now the pipeline has to be backed out, and performance suffers while it refills: anything that has already taken effect must be rolled back, and everything else in flight gets discarded.

That’s bad for performance. Because of this, modern CPUs try to predict whether a branch will be taken and then speculatively fill the pipeline for the predicted case. You can help, though: structure your code so the branch pattern is easier to predict, or even, with some compilers, explicitly tell the compiler whether a branch is likely to be taken.
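
As a concrete illustration (our sketch, not an example taken from the book), GCC and Clang both accept __builtin_expect for exactly this, and C++20 added the [[likely]]/[[unlikely]] attributes:

```cpp
#include <cstdio>

// Macros wrapping GCC/Clang's __builtin_expect; C++20 code could mark
// the branch with the [[likely]]/[[unlikely]] attributes instead.
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

// Sum only the non-negative elements, hinting that negatives are rare.
long sum_non_negative(const int *a, int n) {
    long s = 0;
    for (int i = 0; i < n; i++) {
        if (likely(a[i] >= 0))  // hint: this branch is almost always taken
            s += a[i];
    }
    return s;
}

int main() {
    int data[] = {3, 1, -4, 1, 5, 9, 2, 6};
    std::printf("%ld\n", sum_non_negative(data, 8));  // prints 27
}
```

The hint doesn’t change what the code computes; it only nudges the compiler to lay out the hot path so the predictor and the instruction cache both win.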

As you might expect, techniques like this depend on your CPU, and you’ll need to benchmark to show what’s really going on. The text is full of execution-time graphs and analysis of the generated x86 assembly to explain the results. Even an algorithm you’d consider pretty good, binary search for example, suffers on modern architectures, and you can improve its performance with some tricks. Interestingly, the tricks pay off under GCC but make no difference under Clang. Again, you have to measure these things.
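
To give a flavor of the tricks involved, here is a sketch in the spirit of the book’s branchless lower-bound example (not a copy of it), where the conditional pointer bump can compile to a cmov instead of a hard-to-predict branch:

```cpp
#include <cstdio>

// A branchless lower-bound search over a sorted array (assumes n >= 1).
// The comparison result is used arithmetically, so the compiler can emit
// a conditional move (cmov) instead of a branch that the predictor would
// get wrong about half the time on random queries.
const int *lower_bound_branchless(const int *a, int n, int x) {
    const int *base = a;
    int len = n;
    while (len > 1) {
        int half = len / 2;
        base += (base[half - 1] < x) * half;  // no if: ideally a cmov
        len -= half;
    }
    // First element not less than x; if every element is < x, this
    // returns the last element (unlike std::lower_bound's end pointer).
    return base;
}

int main() {
    int a[] = {1, 3, 5, 7, 9, 11};
    const int *p = lower_bound_branchless(a, 6, 6);
    std::printf("%d\n", *p);  // prints 7, the first element >= 6
}
```

Whether this beats the textbook version is exactly the kind of claim the book insists you benchmark on your own hardware.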

Probably 90% of us will never need the kind of optimization you’ll find in this book. But it is a marvelous book if you enjoy solving puzzles and analyzing complex details. Of course, if you need to squeeze those extra microseconds out of a loop, or you are writing a library where performance is important, this might be just the book you are looking for. Although it doesn’t cover many different CPUs, the ideas and techniques apply to most modern architectures; you’ll just have to do the work of adapting them if you use a different CPU.

We’ve looked at pieces of this sort of thing before. Pipelining, for example. Sometimes, though, optimizing your algorithm isn’t as effective as just changing it for a better one.

An Interview With Reinhard Keil

Over on the Embedded FM podcast, [Chris] and [Elecia] just released their interview with [Reinhard Keil] of compiler fame. [Reinhard] recounts the story of Keil’s growth and how it was eventually absorbed into Arm back in 2005. He and his brother Günter founded the company, operating as Keil Software in the Americas and Keil Elektronik in Europe. They initially made hardware products, but as the company grew, they became dissatisfied with the quality, and even the existence, of professional firmware development tools of the day. Their focus gradually shifted to making CP/M- and PC-based development environments, and in 1988 they introduced the first C compiler designed for the 8051 from the ground up.

Love it or hate it, the Arm Keil suite of the µVision IDE and the MDK/Cx51 compilers has been around a long time and has been used by embedded developers in many industries. Although a free, restricted-use version is available, the license fees keep most folks from getting very enthusiastic about it. Pricing aside, the µVision IDE has its critics: [Jay Carlson], who used every IDE under the sun a few years ago in his review of sub-one-dollar microcontrollers, opined that it was nothing more than a free editor you get with C51 or MDK-ARM. On the other hand, even [Jay] concedes that every chip he tested was officially supported by Keil and worked out of the box. Another thing that matters to some users is being able to produce consistent binaries from old projects. That isn’t important for your one-off MQTT hot tub thermometer. But if you need to recompile firmware for a fifteen-year-old railroad signaling system that has multiple certifications and regulatory approvals, using the original compiler and library versions is a huge help.

[Reinhard] goes on to discuss various tools and systems being developed at Arm by his team, such as improvements and additions to the CMSIS suite, the transition of the online Mbed compiler to the new Keil Studio Cloud, and an Arm hardware virtualization tool for cloud-based CI verification. Lest you think everything at Arm is proprietary and expensive, he points out that Arm is a major contributor to the GCC project and the CMSIS components are open source. Even if you aren’t interested in Arm/Keil tools, do check out the interview — it’s quite interesting and touches on several topics of general interest to all firmware developers. Or if you prefer, read the interview when the transcript is completed.

Ray Tracing On A Modern TI Graphing Calculator

Something being impractical isn’t any reason not to do it, which is why just about anything with a CPU in it can run Doom by now. By the same logic, there is of course a way to do ray tracing of 3D scenes on a modern-day TI-84 Plus CE graphing calculator. This is excellent news for anyone who has one of these calculators, along with a lot of time to spare, perhaps during boring classes.

As [TheScienceElf] demonstrates in a video, also embedded after the break, it’s not quite the real-time experience one would expect from an Nvidia RTX 30-series GPU. Although the eZ80-based CPU in the calculator is significantly more efficient than the Z80 found in many 1980s home computers, the demo scene at standard resolution takes about 12 minutes to render, as also noted on the GitHub project page.

Perhaps the most interesting part of this project is its use of the Clang-based C and C++ toolchain for the TI-84 Plus CE, which gives easy access to the calculator’s hardware and related features, including graphics, file I/O, fonts, keypad input, and more. Even if rendering the next Pixar-level movie isn’t the most productive use imaginable for a calculator, this project and the CE toolchain make it all too easy to tinker with these $150 devices.
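
To show the kind of math at the heart of any ray tracer, here is a toolchain-agnostic sketch (our illustration, not [TheScienceElf]’s actual code) of testing a ray against a sphere; a real tracer on the eZ80 would likely favor fixed-point math over floats for speed:

```cpp
#include <cmath>
#include <cstdio>

// Solves |origin + t*dir - center|^2 = radius^2 for t, the core
// intersection test in a ray tracer.
struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

// Returns the nearest positive hit distance t, or -1.0f on a miss.
// Assumes dir is normalized, so the quadratic's leading coefficient is 1.
float raySphere(Vec3 origin, Vec3 dir, Vec3 center, float radius) {
    Vec3 oc = sub(origin, center);
    float b = 2.0f * dot(oc, dir);
    float c = dot(oc, oc) - radius * radius;
    float disc = b * b - 4.0f * c;        // discriminant of the quadratic
    if (disc < 0.0f) return -1.0f;        // ray misses the sphere
    float t = (-b - std::sqrt(disc)) * 0.5f;
    return (t > 0.0f) ? t : -1.0f;
}

int main() {
    // Ray from the origin along +z toward a unit sphere centered at z = 5.
    float t = raySphere({0, 0, 0}, {0, 0, 1}, {0, 0, 5}, 1.0f);
    std::printf("hit at t = %f\n", t);    // prints 4, the near surface
}
```

Run that test per pixel against every object in the scene and you can see why twelve minutes per frame on a calculator is actually rather respectable.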

It would also offer a nice change of pace from writing Snake in TI-BASIC, a BASIC dialect in which [TheScienceElf] has, incidentally, also written a ray tracer.

(Thanks to [poiuyt] for the tip)

Continue reading “Ray Tracing On A Modern TI Graphing Calculator”

Commodore 64 Monitor Traces I/O Calls, Eases Debugging

Developing for the Commodore 64 can be a rewarding retrocomputing experience, and thanks to [Dave Van Wagner], things are easier with his C64 IO_Monitor project, which opens the door to logging and tracing Kernal I/O calls for closer inspection. That’s not a typo, by the way. Kernal is what handles the C64’s low-level OS routines. Amusingly, as the story goes, it did in fact originate as a misspelling of kernel, but the name stuck.

What [Dave]’s program does is trace and log all input and output calls going through the Kernal, which covers just about any function one might imagine. Things like keyboard input, screen output, and disk or tape I/O are all dutifully counted and logged, allowing one to really peek under the hood when doing any kind of development work. This kind of tool has turned out to be pretty handy, given [Dave]’s penchant for porting Commodore emulators to a variety of (sometimes unusual) platforms.

Interested in giving it a spin? Head to the project’s GitHub repository for all the necessary files as well as some usage details, and enjoy making debugging and development a little less opaque than it otherwise would be.

Hello (Many Quantum) World(s)

Historically, the first program you write for a new computer language is “Hello World,” or, if you are in Texas, “Howdy World.” But with quantum computing on the horizon, you need something better. Like “Hello Many Worlds.” [IonQ] proposes what that looks like and then writes it in seven different quantum languages in a post you should check out.

Here’s the description of the simple program:

The basic quantum program we’ll write is simple. It creates a fully-entangled state between two qubits, and then measures this state. This state is sometimes called a Bell State, or Bell Pair, after physicist John Stewart Bell.

The measurement results for this program should give us 0 for both qubits or 1 for both qubits, in equal amounts. When running these, we’ll be able to tell that we’re running on real hardware because that’s not always what we get! These errors are what currently limit quantum computers, but the first steps to overcome this with quantum error correction have already begun.
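
For the curious, the arithmetic behind that fifty-fifty split is short (our summary, not part of [IonQ]’s post). A Hadamard gate puts the first qubit into an equal superposition, and a CNOT then copies that choice onto the second qubit:

```latex
(H \otimes I)\,|00\rangle = \frac{|0\rangle + |1\rangle}{\sqrt{2}} \otimes |0\rangle
\quad\xrightarrow{\ \text{CNOT}\ }\quad
\frac{|00\rangle + |11\rangle}{\sqrt{2}}
```

Measuring that state yields 00 or 11, each with probability (1/√2)² = 1/2, which is exactly the equal mix of results the description calls for; anything else you see is hardware error.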

Continue reading “Hello (Many Quantum) World(s)”