AVX-512: When The Bits Really Count

For the majority of workloads, fiddling with assembly instructions isn’t worth it. The added complexity and code obfuscation generally outweigh the relatively modest gains. Compilers have become quite fantastic at generating code and processors are just so much faster, so it is hard to get a meaningful speedup by tweaking a small section of code. That changes when you introduce SIMD instructions and need to decode lots of bitsets fast. Intel’s fancy AVX-512 SIMD instructions can offer some meaningful performance gains with relatively little custom assembly.

Like many software engineers, [Daniel Lemire] had many bitsets (a range of ints/enums encoded into a binary number, each bit corresponding to a different integer or enum). Rather than checking whether one specific flag is present (a bitwise AND), [Daniel] wanted to know all the flags set in a given bitset. The easiest way would be to iterate through all of them like so:

while (word != 0) {
  result[i] = trailingzeroes(word);
  word = word & (word - 1);
  i++;
}
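
As a minimal sketch, the loop above can be wrapped into a self-contained function. The name decode_bitset_scalar is hypothetical, and __builtin_ctzll (the GCC/Clang count-trailing-zeros builtin) is assumed as a stand-in for trailingzeroes; other compilers would need their own equivalent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the original code.
 * Writes the index of every set bit in `word` to `out`, lowest bit first,
 * and returns how many indices were written. `out` needs room for 64. */
size_t decode_bitset_scalar(uint64_t word, uint32_t *out) {
    size_t count = 0;
    while (word != 0) {
        /* Assumption: GCC/Clang builtin. It is undefined for 0,
         * which the loop condition rules out. */
        out[count++] = (uint32_t)__builtin_ctzll(word);
        /* Clear the lowest set bit so the next pass finds the next one. */
        word &= word - 1;
    }
    return count;
}
```

The loop runs once per set bit, so the number of iterations depends on the data, which is what makes the exit branch hard to predict.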

The naive version of this loop is very likely to suffer branch mispredictions, because the number of iterations depends on how many bits are set. Either you or the compiler would speed it up by unrolling the loop. However, the AVX-512 instruction set on the latest Intel processors has some instructions made for just this kind of thing. The instruction is vpcompressd, and Intel provides a handy and memorable C/C++ intrinsic for it called _mm512_mask_compressstoreu_epi32.

The function generates an array of integers and you can use the infamous popcnt instruction to get the number of ones. Some early benchmark testing shows the AVX-512 version uses 45% fewer cycles. You might be wondering, doesn’t the processor downclock when wide 512-bit registers are used? Yes. But even with the downclocking, the SIMD version is still 33% faster. The code is up on GitHub if you want to try it yourself.
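
A rough sketch of how the compress-store approach might look follows. It is not [Daniel]’s published code: the function name decode_bitset_compress and the processing in four 16-bit chunks are assumptions for illustration. The intrinsic path is only compiled when the compiler defines __AVX512F__ (for example with -mavx512f on GCC or Clang); otherwise a scalar loop models what the compress-store does, so the sketch still builds on other machines. __builtin_popcount is assumed as the GCC/Clang route to popcnt.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: writes the indices of the set bits of `word` to
 * `out` (room for 64 entries) and returns how many were written. */
size_t decode_bitset_compress(uint64_t word, uint32_t *out) {
    size_t count = 0;
#if defined(__AVX512F__)
    /* Lane i starts out holding the value i; each chunk adds 16. */
    __m512i indices = _mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7,
                                        8, 9, 10, 11, 12, 13, 14, 15);
    const __m512i step = _mm512_set1_epi32(16);
    for (int chunk = 0; chunk < 4; chunk++) {
        /* The next 16 bits of the word select which lanes get stored. */
        __mmask16 mask = (__mmask16)(word >> (16 * chunk));
        /* vpcompressd: store only the selected lanes, packed together. */
        _mm512_mask_compressstoreu_epi32(out + count, mask, indices);
        /* The population count says how far the output advanced. */
        count += (size_t)__builtin_popcount((unsigned)mask);
        indices = _mm512_add_epi32(indices, step);
    }
#else
    /* Scalar model of the same compress-store, one lane at a time. */
    for (uint32_t chunk = 0; chunk < 4; chunk++) {
        uint32_t mask = (uint32_t)(word >> (16 * chunk)) & 0xFFFFu;
        for (uint32_t lane = 0; lane < 16; lane++) {
            if (mask & (1u << lane)) {
                out[count++] = 16 * chunk + lane;
            }
        }
    }
#endif
    return count;
}
```

In the intrinsic path the loop always runs exactly four times regardless of the data, which is where the branch-prediction advantage over the bit-by-bit loop would come from.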

SGX Deprecation Prevents PC Playback Of 4K Blu-ray Discs

This week Techspot reported that DRM-laden Ultra HD Blu-ray Discs won’t play anymore on computers using the latest Intel Core processors. You may have skimmed right past it, but the table on page 51 of the latest 12th Generation Intel Core Processor data sheet (184-page PDF) informs us that the Intel Software Guard Extensions (SGX) have been deprecated. These extensions are required for DRM processing on these discs, hence the problem. The SGX extensions were introduced with the sixth generation of Intel Core Skylake processors in 2015, the same year as Ultra HD Blu-ray, aka 4K Blu-ray. But there have been numerous vulnerabilities discovered in the intervening years. Intel is not alone here: AMD has had similar issues, as we wrote about in October.

This problem only applies to 4K Blu-ray discs with DRM. Presumably any 4K discs without DRM will still play, and of course you can still play the DRM discs on older Intel processors. Do you have a collection of DRM 4K Blu-ray discs, and if so, do you play them via your computer or a stand-alone player?

Peering Into The Murky Depths Of Alder Lake

The winds of change are in the air for CPUs. Intel has long lorded over the computing world, and they remain a force to contend with, but many challengers gather at their gates. AMD, ARM, IBM, and other X86 designs sense a moment of weakness. In response, Intel released their Alder Lake platform with high-performance and high-efficiency cores, known as Golden Cove and Gracemont, respectively. [Clamchowder] and [cheese] have written up as many details as they were able to suss out about Gracemont.

ARM has offered a heterogeneous multi-core design (big.LITTLE) for several years, mixing high-power, high-performance cores with smaller, low-power cores. This allows the scheduler to make tradeoffs between power and performance. Typically the smaller cores in an ARM design are simpler in-order processors, having more in common with a microcontroller than with a full-scale desktop core. Many people have drawn the obvious comparison between ARM’s approach and Intel’s new offerings, as Gracemont is based on Intel’s old Atom core, a low-power, single-issue, in-order processor.

RealSense No Longer Makes Sense For Intel

We love depth-sensing cameras and every neat hack they enabled, but this technological novelty has yet to break through to high volume commercial success. So it was sad but not surprising when CRN reported that Intel has decided to wind down their RealSense product line.

As of this writing, one of the better confirmations for this report can be found on the RealSense SDK GitHub repository README. The good news is that core depth-sensing RealSense products will continue business as usual for the foreseeable future, balanced by the bad news that some interesting offshoots (facial authentication, motion tracking) will be declared “End of Life” immediately and phased out over the next six months.

This information tells us that while those living out on the bleeding edge will have to scramble, there is no immediate crisis for everyone else, whether they be researchers, hobbyists, or product planners. But this also means there will be no future RealSense cameras, kicking off many “What’s Next?” discussions in various communities, like this thread on ROS (Robot Operating System) Discourse.

Three popular alternatives offer distinctly different tradeoffs. The “Been Around The Block” name is Occipital, with their more expensive Structure Pro sensor. The “Old Name, New Face” option is Microsoft Azure Kinect, the latest non-gaming-focused successor to the gaming peripheral that started it all. And let’s not forget OAK-D as the “New Kid On The Block” that started with a crowdfunding campaign and is building a user community by doing things like holding contests. Each of these will appeal to a different niche, and we’ll keep our eyes open in the future. Let’s see if any of them find the success that eluded the original Kinect, Google’s Tango, and now Intel’s RealSense.

[via Engadget]

Installing Linux Like It’s 1989

A common example of the sheer amount of computing power available to almost anyone today is comparing a smartphone to the Apollo guidance computer. This classic computer was the first to use integrated circuits so it’s fairly obvious that most modern technology would be orders of magnitude more powerful, but we don’t need to go back to the 1960s to see this disparity. Simply going back to 1989 and getting a Compaq laptop from that era running again, while using a Raspberry Pi Zero to help it along, illustrates this point well enough.

[befinitiv] was able to get a Raspberry Pi installed inside of the original computer case, and didn’t simply connect the original keyboard and display and then call it a completed build. The original 286 processor is connected to the Pi with a serial link, so both devices can communicate with each other. Booting up the computer into DOS and running a small piece of software drops the computer into a terminal emulator connected to Linux running on the Raspberry Pi. The terminal can be exited and the computer will return to its original DOS setup. This also helps to bypass the floppy disk drive for transferring files to the 286, since files can be retrieved wirelessly on the Pi and then sent to the 286.

This is quite an interesting mashup of new and old technology, and with the Pi being around two orders of magnitude more powerful than the 286 and wedged into vacant space inside the original case, [befinitiv] points out that this amalgamation of computers is “borderline useful”. It’s certainly an upgrade for the Compaq, and for others attempting to get ancient hardware on the internet, don’t forget that you can always use hardware like this to access Hackaday’s retro site.


Will We Soon Be Running Linux On SiFive Cores Made By Intel?

There’s an understandably high level of interest in RISC-V processors among our community, but while we’ve devoured the various microcontroller offerings containing the open-source core, it’s fair to say we’re still waiting on the promise of more capable hardware for anything like an affordable price. This could however change, as the last week or so has seen a flurry of interest surrounding SiFive, the fabless semiconductor company that has pioneered RISC-V technology. Amid speculation of a $2 billion buyout offer from the chip giant Intel, it has been revealed that the company best known for the x86 line of processors has licensed the SiFive portfolio for its 7nm process. This includes their latest and fastest P550 64-bit core, bringing forward the prospect of readily available high-power RISC-V computing. Your GNU/Linux box could soon have a processor implementing an open-source ISA, without compromising too much on speed and, we hope, price.

All this sounds pretty rosy, but there is of course a downer for open-source hardware enthusiasts. These chips may rely on some open-source technologies, but sadly they will not themselves be open-source chips as there will be plenty of proprietary IP contained within them. We can thus only hope that Intel see fit to provide the same level of Linux support for them as they do for their x86 ranges, and we’re not left in the same situation with respect to ongoing support as we are with so many other chips. Meanwhile it’s worth remembering that SiFive are not the only player in the world of RISC-V cores, so it’s likely that competitors to the P550 and its stablemates will not be far behind.

If you’d like a more in-depth explanation of the true open-source nature of a RISC-V chip, we’ve featured something on that theme before.

Header image: Gareth Halfacree, CC BY-SA 2.0.

Where Are All The Cheap X86 Single Board PCs?

If we were to think of a retrocomputer, the chances are we might have something from the classic 8-bit days or maybe a game console spring to mind. It’s almost a shock to see mundane desktop PCs of the DOS and Pentium era join them, but those machines now form an important way to play DOS and Windows 95 games which are unsuited to more modern operating systems. For those who wish to play the games on appropriate hardware without a grubby beige mini-tower and a huge CRT monitor, there’s even the option to buy one of these machines new: in the form of a much more svelte Pentium-based PC104 industrial PC.
