Turtles All The Way Down, 40 Propeller MCU Skyscraper

October 8, 2010

Why bother interconnecting 40 Propeller microcontrollers, one on top of the other? For the power that comes from parallel processing, of course! [Humanoido] put the setup together for a total of 1280 ports, 640 counters, and more, all running at 6.4 billion instructions per second for the low low price of $300-500 by our count. The “skyscraper” even comes complete with software and schematics, promising developers the ability to expand or adapt it for any venture. Why would we need such a setup in the first place? For any of the following: vision tracking/modification, artificial intelligence, advanced robotic control, or more.
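
For the curious, the headline totals can be reproduced from per-chip figures. The sketch below is only a back-of-the-envelope check, not anything taken from [Humanoido]'s software. It assumes the commonly quoted Propeller numbers (8 cogs, 32 I/O pins, 2 counters per cog, an 80 MHz clock, and 4 clocks per typical instruction), and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope totals for a stack of Propeller chips.
# All per-chip figures below are assumptions based on commonly quoted
# Propeller specs, not values taken from the project's documentation.
CHIPS = 40
COGS_PER_CHIP = 8            # each Propeller has 8 cores ("cogs")
IO_PINS_PER_CHIP = 32
COUNTERS_PER_COG = 2
CLOCK_HZ = 80_000_000        # 80 MHz system clock
CLOCKS_PER_INSTRUCTION = 4   # typical instruction; some take longer


def skyscraper_totals(chips=CHIPS):
    """Return aggregate resources for a stack of `chips` Propellers."""
    cogs = chips * COGS_PER_CHIP
    mips_per_cog = CLOCK_HZ // CLOCKS_PER_INSTRUCTION // 1_000_000
    return {
        "cogs": cogs,
        "io_ports": chips * IO_PINS_PER_CHIP,
        "counters": cogs * COUNTERS_PER_COG,
        # Peak figure: every cog busy running assembly, no communication stalls.
        "peak_gips": cogs * mips_per_cog / 1000,
    }


if __name__ == "__main__":
    for name, value in skyscraper_totals().items():
        print(f"{name}: {value}")
```

Note that the instruction rate is a theoretical peak with every cog running assembly flat out; it says nothing about how quickly the chips can exchange data with each other.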

Related: [Humanoido] loves putting MCUs together; check out one of his other creations, the Basic Stamp supercomputer.

[Thanks Logan996]

29 thoughts on “Turtles All The Way Down, 40 Propeller MCU Skyscraper”

osgeld says:

October 8, 2010 at 6:51 am

Okay?

Matt R says:

October 8, 2010 at 7:03 am

Are we missing a video or any links?

Yann Vernier says:

October 8, 2010 at 7:03 am

Except it’s a poor design for *any* of those uses. But I guess that comes with the territory when Parallax is your only hammer. Try a pile of XMOS XK-1 for something usable?

dragonfli says:

October 8, 2010 at 7:21 am

Your alt-text is showing~

delta says:

October 8, 2010 at 7:30 am

He seems to have tons of projects like that, and he never shows a real use or application for them (except for having several BASIC Stamps beep in sync), or did I miss something?

Alex Rossie says:

October 8, 2010 at 7:47 am

Love it

MV says:

October 8, 2010 at 8:17 am

Hear ya, Delta.. My hoax-bells started ringing on seeing the video of a previous project mentioned above. I always like to not-judge-too-quickly (although i sometimes do) so I tried finding anything that would prove authenticity. All I found was more bells going off. On the most recent page he has a link to the schematics. But it never was meant to be a link; “..”
I wouldn’t buy one before demanding (and getting) concrete proof

MV says:

October 8, 2010 at 8:18 am

“..” = {font = “blue}..{/font}
designed not to work

Brad Hein says:

October 8, 2010 at 8:36 am

This is so cool!

lwatcdr says:

October 8, 2010 at 8:39 am

Actually it kind of reminds me of the Connection Machine super computer.
http://en.wikipedia.org/wiki/Connection_Machine
Make it a hypercube with fast interconnects and it could be kind of interesting if you are interested in really fast integer performance.

Leigh says:

October 8, 2010 at 9:21 am

The interconnect and routing was the valuable part of the CM1; the processors themselves weren’t that interesting.

peter says:

October 8, 2010 at 9:56 am

The interconnect is the really tricky part. A few years ago, as a for-fun project to learn Verilog, I implemented something like the CM1 direct from Danny Hillis’ specifications in his dissertation, in a Xilinx FPGA, but I didn’t finish the interconnect. I added in a picoblaze processor as a serial console to access the processors, but always thought it’d be neat to feed it over ethernet or USB. I was able to fit about 256 of the 1-bit processors with about 50% utilization on a spartan 3e starter kit, but intuitively, since the processors in Hillis’ dissertation are essentially 1-bit ALUs with some addressing, I felt as though the density could be vastly improved and much of the gate count was likely going into the memory addressing and block ram.

i love to see folks build these microcontroller clusters, and it’s even more impressive when they build some simple software demonstration to go along with them. i’m not sure that it has any particularly useful applications other than as an academic pursuit, but that’s more than enough! :)

Slanesch says:

October 8, 2010 at 9:58 am

Got to agree with Leigh on this one. 1 bit at a time isn’t all that great. I’m so glad things got cleaned up in the later versions.

Slanesch says:

October 8, 2010 at 10:00 am

@ peter
That is pretty epic man! could i see the schematic?

Micah says:

October 8, 2010 at 10:40 am

So, it’s just slightly faster than one of these: http://www.gumstix.com/store/catalog/product_info.php?products_id=227

which is just slightly smaller than 1/4 the size of one of the boards he’s using.

jeditalian says:

October 8, 2010 at 10:43 am

this reminds me of the time i interconnected 42 pentium cpus and ok bullshit but: 6.4 billion instructions per second
what is that like Gips or Bips?
and how does this differ from 6.4ghz cuz no comprende señor

Leithoa says:

October 8, 2010 at 11:12 am

The real question is how many FLOPS does this thing put out?

@Jeditalian
6.4 billion instructions per second is 6.4GIPS.
Hertz are cycles per second, as you’re probably aware. In computers only a certain number of operations can be executed in any one cycle of the clock, dictated by their bit rate (32 or 64 for personal computers). Newer desktops operate in 64 bits at their clock speed, e.g. 3.6GHz 64-bit.
Now each segment of code has a size associated with it, so moving data from one section of the RAM to another may require 4 bits to accomplish it. So you can do this operation 16 (64/4) times per cycle. Since your CPU goes at 3.6 billion hertz, you can move that data around 57.6 billion times per second. Moving it counts as one instruction, so if all the computer did was that move operation it would operate at 57.6GIPS.

lwatcdr says:

October 8, 2010 at 12:00 pm

@Leigh @peter Maybe you could use the COGs for IO and the interconnects and use the same hypercube design that the CM-1 used.
Just for fun, mind you. Not really a good system.
PS peter, wow on the CM1 on an FPGA. Well, part of one anyway. Very cool. Right up there with the CRAY-1 on an FPGA.

cantido says:

October 8, 2010 at 5:55 pm

@leigthoa

Wtf is a bitrate?? The number of instructions per clock is determined by design… i.e. pipeline design, availability of execution units…

@article

Makes no sense. Has zero real world use.

humdum says:

October 8, 2010 at 7:23 pm

I’d bet he compensates for the lack of coding (or any real) skills by making these seemingly complicated designs with no real life use. Sad :(

peter says:

October 8, 2010 at 8:39 pm

@Slanesch: by schematic do you mean the Verilog code? I just used a stock Spartan 3E starter kit (although I had hoped of one day making a board full of FPGAs for some fun connection-machine-like and other experiments).

if you’re a verilog and xilinx person (the control processor is a picoblaze), feel free to e-mail me. the code is a few years old and it’s somewhere in the middle of development (by a person learning verilog), and i’m in the last few weeks of writing my own dissertation so i can’t devote any time to it/cleaning it up, but i might be able to just zip it up and send it off for purely academic interest?

Leithoa says:

October 8, 2010 at 9:16 pm

@cantido
I’m assuming that comment was directed to me, so:
Bit rate probably isn’t an accurate description, but what I was referring to was the chip architecture. From a simple perspective, personal computers are either 32 or 64-bit machines depending on age. So again, in a simple model, every clock cycle they can execute 32 or 64 bits worth of instructions, 4 or 8 instructions respectively. This model is ignoring hyper-threading, pipelines, memory tricks and other speed enhancing techniques.

Anonymouse says:

October 8, 2010 at 11:51 pm

No. That is bullshit. x64 gives you twice as many general-purpose registers (and each holding 64 bits instead of 32), twice as many SIMD registers, and a lot more address space.

Read the wikipedia article.

http://en.wikipedia.org/wiki/X86-64

M4CGYV3R says:

October 9, 2010 at 1:26 pm

@Anonymouse:
Did you read the datasheet on those prop chips? Each of them has 8 cores with simultaneous instruction processing. This thing can move a buttload of data.

It can only run at a max of 80MHz, but it can process more than 53 times the instructions per cycle of the highest-end Intel i7 6-core monster.

Even allowing for 4-opcode-per-cycle SIMD operations on all 6 cores, you still perform more than twice as many operations on this prop tower.

walt says:

October 10, 2010 at 10:31 am

Anon

The individual processors on the Prop are actually not that powerful by themselves. Due to their design they top out at around 5 MIPS at 80 MHz if you are running a C program. If you code an assembly program (which has to be less than 2k total size) you can achieve 20 MIPS performance. A $6 ARM basically beats the Prop like a cheap drum.

If you want to play around with real parallel processing check out the Xmos processors.

Anon says:

October 10, 2010 at 2:51 pm

Leon, Is that you again????

G2 says:

October 10, 2010 at 3:02 pm

He called it UltraSpark o_0 … seriously??? I wonder if he’ll call his OS Solariss…
Interesting – but I would be more impressed if he had built it with some actual purpose or task in mind…

Prime Time News says:

October 31, 2010 at 11:13 pm

1 Propeller, 8 Cogs, 200MIPS, 32 I/O pins, HUB 32K, Cogs 16K, 200MHZ clock, low cost, programs easy, loads of new features, tons of support, good choice. cheers.. ptn

steve says:

December 28, 2010 at 9:51 am

With friends like Humanoido to make it look stupid, the Propeller chip does not need enemies.
