The U.S. Department of Energy’s National Nuclear Security Administration (NNSA) and its three national labs this week announced they have reached an agreement for an open-source Fortran front-end for Higher Performance Computing (HPC). The agreement is with IBM? Microsoft? Google? Nope, the agreement is with NVIDIA, a company known for making graphics cards for gamers.
The heart of a graphics card is the graphics processor unit (GPU) which is an extremely powerful computing engine. It’s actually got more raw horsepower than the computer CPU, although not as much as many claim. A number of years ago NVIDIA branched into providing compiler toolsets for their GPUs. The obvious goal is to drive sales. NVIDIA will use as a starting point their existing Fortran compiler and integrate it with the existing LLVM compiler infrastructure. That Fortran, it just keeps chugging along.
You can try out GPU programming on your Raspberry Pi. Yup! Even it has one, a Broadcom. Just follow the directions from Raspberry Pi Playground. You’re going to get your hands dirty with assembly language so this is not for the faint hearted. One of the big challenges with GPUs is exchanging data with them which gets into DMA processing. You could also take a look at [Pete Warden’s] work on using the Pi’s GPU.
Still wondering about the performance of CPU vs GPU? Here’s Adam Savage taking a look…
Today, Nvidia announced their latest platform for advanced technology in autonomous machines. They’re calling it the Jetson TX1, and it puts modern GPU hardware in a small and power efficient module. Why would anyone want GPUs in an embedded format? It’s not about frames per second; instead, Nvidia is focusing on high performance computing tasks – specifically computer vision and classification – in a platform that uses under 10 Watts.
For the last several years, tiny credit card sized ARM computers have flooded the market. While these Raspberry Pis, BeagleBones, and router-based dev boards are great for running Linux, they’re not exactly very powerful. x86 boards also exist, but again, these are lowly Atoms and other Intel embedded processors. These aren’t the boards you want for computationally heavy tasks. There simply aren’t many options out there for high performance computing on low-power hardware.
Tiny ARM computers the size of a credit card have served us all well for general computing tasks, and this leads to the obvious question – what is the purpose of putting so much horsepower on such a small board. The answer, at least according to Nvidia, is drones, autonomous vehicles, and image classification.
Image classification is one of the most computationally intense tasks out there, but for autonomous robots, there’s no other way to tell the difference between a cyclist and a mailbox. To do this on an embedded platform, you either need to bring a powerful general purpose CPU that sucks down 60 or so Watts, or build a smaller, more efficient GPU-based solution that sips a meager 10 Watts.
If you want video support on your project, you might start from a device like a Raspberry Pi that comes with it built in. [Kevinhub88] doesn’t accept such compromises, so he and his Black Mesa Labs have come up with a whole new way to add video support to devices like the Arduino and other cheap controllers. This project is called Mesa-Video, and it can add digital video at a resolution of up to 800 by 600 pixels to any device that has a single serial output.
The video is created by an FT813, a low cost GPU from FTDI that offers a surprising amount of video oomph from a cheap, low power chip (he has demoed it running from a lemon battery), meaning that he is hoping to be able to sell the Mesa-Video for under $50.
UPDATE: [KevinHub88] let us know that he didn’t actually power the device from a lemon battery, as you would need a lot of lemons to make 50mA at 5V. Apologies for any confusion!
However, Mesa-Video is just the beginning. [Kevinhub88] wanted to get around the problem of stacking shields on Arduinos: add more than one and you get problems. He wanted to create an interface that would be simpler, faster and more open, so he created the Mesa-Bus. This effectively wraps SPI and I2C traffic together over a simple, fast serial connection that doesn’t require much decoding. This means that you can send power and bi-directional data over a handful of wires, and still connect multiple devices at once, swapping them out as required. You could, for instance, do your development work on a PC talking to the prototype devices over Mesa-Bus, them swap the PC out for an Arduino when you have got the first version working in your dev environment. Is the Arduino not cutting it? Because Mesa-Bus is cross-platform and open source, it is easy to swap the Arduino for a Raspberry Pi without having to change your other devices. And, because all the data is going over a simple serial connection in plain text, it is easy to debug.
It’s an ambitious project, and [Kevinhub88] has a way to go: he is currently working on getting his first prototype Mesa-Bus devices up and running, and finalizing the design of the Mesa-Video. But it is an impressive start and we’ll be keeping a close eye on this work. Hopefully he can avoid that head crab problem as well because those things are as itchy as hell.
[Radical Brad] has played around with FPGAs, video signals, and already has a few astonishing projects of bitbanged VGA on his resume. Now he’s gone insane. He’s documenting a build over on the 6502.org forums of a computer with Amiga-quality graphics built out of nothing but a 65C02, a few SRAM chips, and a whole pile of logic chips.
The design goals for this project are to build a video game system with circa 1980 parts and graphics a decade ahead of its time. The video output is VGA, with 400×300 resolution, in glorious eight-bit color. The only chips in this project more complex than a shift register are a single 65c02 and a few (modern) 15ns SRAMs. it’s not a build that would have been possible in the early 80s, but the only thing preventing that would be the slow RAM chips of the era.
So far, [Radical] has built a GPU entirely out of 74-series logic that reads a portion of RAM and translates that to XY positions, colors, pixels, and VGA signals. There’s support for alpha channels and multiple sprites. The plan is to add sound hardware with support for four independent digital channels and 1 Megabyte of sample memory. It’s an amazingly ambitious project, and becomes even more impressive when you realize he’s doing all of this on solderless breadboards.
[Brad] will keep updating the thread on 6502.org until he’s done or dies trying. So far, it’s looking promising. He already has a bunch of Boing balls bouncing around a display. You can check out a video of that below.
The Raspberry Pi has a port for a camera connector, allowing it to capture 1080p video and stream it to a network without having to deal with the craziness of webcams and the improbability of capturing 1080p video over USB. The Raspberry Pi compute module is a little more advanced; it breaks out two camera connectors, theoretically giving the Raspberry Pi stereo vision and depth mapping. [David Barker] put a compute module and two cameras together making this build a reality.
The use of stereo vision for computer vision and robotics research has been around much longer than other methods of depth mapping like a repurposed Kinect, but so far the hardware to do this has been a little hard to come by. You need two cameras, obviously, but the software techniques are well understood in the relevant literature.
[David] connected two cameras to a Pi compute module and implemented three different versions of the software techniques: one in Python and NumPy, running on an 3GHz x86 box, a version in C, running on x86 and the Pi’s ARM core, and another in assembler for the VideoCore on the Pi. Assembly is the way to go here – on the x86 platform, Python could do the parallax computations in 63 seconds, and C could manage it in 56 milliseconds. On the Pi, C took 1 second, and the VideoCore took 90 milliseconds. This translates to a frame rate of about 12FPS on the Pi, more than enough for some very, very interesting robotics work.
There are some better pictures of what this setup can do over on the Raspi blog. We couldn’t find a link to the software that made this possible, so if anyone has a link, drop it in the comments.
Nearly a year ago, an extremely interesting project hit Kickstarter: an open source GPU, written for an FPGA. For reasons that are obvious in retrospect, the GPL-GPU Kickstarter was not funded, but that doesn’t mean these developers don’t believe in what they’re doing. The first version of this open source graphics processor has now been released, giving anyone with an interest a look at what a late-90s era GPU looks like on the inside, If you’re cool enough, there’s also enough supporting documentation to build your own.
A quick note for the PC Master Race: this thing might run Quake eventually. It’s not a powerhouse. That said, [Bunnie] had a hard time finding an open source GPU for the Novena laptop, and the drivers for the VideoCore IV in the Raspi have only recently been open sourced. A completely open GPU simply doesn’t exist, and short of a few very, very limited thesis projects there hasn’t been anything like this before.
Right now, the GPL-GPU has 3D graphics acceleration working with VGA on a PCI bus. The plan is to update this late-90s setup to interfaces that make a little more sense, and add DVI and HDMI output. Not bad for a failed Kickstarter, right?
Hold on to your hats, because this is a good one. It’s a tale of disregarding the laws of physics, cancelled crowdfunding campaigns, and a menagerie of blogs who take press releases at face value.
Meet Silent Power (Google translation). It’s a remarkably small and fairly powerful miniature gaming computer being put together by a team in Germany. The specs are pretty good for a completely custom computer: an i7 4785T, GTX 760, 8GB of RAM and a 500GB SSD. Not a terrible machine for something that will eventually sell for about $930 USD, but what really puts this project in the limelight is the innovative cooling system and small size. The entire machine is only 16x10x7 cm, accented with a very interesting “copper foam” heat sink on top. Sounds pretty cool, huh? It does, until you start to think about the implementation a bit. Then it’s a descent into madness and a dark pit of despair.
There are a lot of things that are completely wrong with this project, and in true Hackaday fashion, we’re going to tear this one apart, figuring out why this project will never exist.