Distributed computing in JavaScript


We’ve heard about the idea of using browsers as distributed computing nodes for a couple of years now. Only recently, with the race toward faster JavaScript engines in browsers like Chrome, has the idea started to seem useful. [Antimatter15] did a proof-of-concept JavaScript implementation for reversing hashes. Plura Processing uses a Java applet to do distributed processing. Today, [Ilya Grigorik] posted an example using MapReduce in JavaScript. Google’s MapReduce is designed to support large dataset processing across computing clusters. It’s well suited to situations where computing nodes could go offline at random (e.g. a browser navigates away from your site). He included a JavaScript snippet and a job server in Ruby. It will be interesting to see if someone comes up with a good use for this; you still need to convince people to keep your page open in the browser, though. We’re just saying: try to act surprised when you realize Hack a Day is inexplicably making your processor spike…
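
For readers who have not met MapReduce before, here is a minimal single-process sketch of the idea, assuming a word-count job. It is not [Ilya Grigorik]’s code: the names `mapChunk`, `reduceCounts`, and `runJob` are made up for illustration, and the "worker went offline" check is simulated by a callback rather than by real browsers talking to a job server.

```javascript
// Map step: what one node (a browser tab, in the article's scenario)
// would do with its chunk of the data. Returns partial word counts.
function mapChunk(text) {
  const counts = new Map();
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (word) counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Reduce step: what the job server would do to merge partial results.
function reduceCounts(partials) {
  const total = new Map();
  for (const partial of partials) {
    for (const [word, n] of partial) {
      total.set(word, (total.get(word) || 0) + n);
    }
  }
  return total;
}

// Hypothetical scheduler. workerFails(task) simulates a node that
// navigates away before returning its result; the chunk is simply
// put back in the queue for another node to pick up.
// Note: this loops forever if workerFails never returns false.
function runJob(chunks, workerFails) {
  const queue = chunks.map((text, id) => ({ id, text, attempts: 0 }));
  const partials = [];
  while (queue.length > 0) {
    const task = queue.shift();
    task.attempts += 1;
    if (workerFails(task)) {
      queue.push(task); // lost node: retry the chunk later
      continue;
    }
    partials.push(mapChunk(task.text));
  }
  return reduceCounts(partials);
}

// Usage: chunk 0 is "lost" on its first attempt and retried.
const result = runJob(
  ["a b a", "b c"],
  (task) => task.id === 0 && task.attempts === 1
);
console.log([...result.entries()]);
```

Because each chunk is independent, losing a node only costs the one chunk it was holding, which is why the model tolerates visitors closing the page.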

[via Slashdot]

Comments

  1. Maybe instead of selling advertising, websites could start selling processor cycles on their visitors’ computers.

    I run adblock because hell if most ads aren’t just annoying eyesores that I wouldn’t click on anyway. But I’d certainly be willing to let sites snag some of my background cycles for their render farms, computer simulations, etc.

  2. cyrozap says:

    Now to make it so that people can’t damage your computer with this.

  3. silic0re says:

    while this might be an interesting conceptual demonstration, java is generally very slow when compared to other high-level languages like c/c++ — and certainly when executed in the confines of a browser. it doesn’t look like they have any performance benchmarks in their article, but my guess is that the java running in a browser is probably anywhere from 10 to 1000x slower than similar code written in C and executed in a shell. (it’s been a while since I’ve compared java/c speeds, so maybe this has increased a little in the last few years, but i’d imagine the gap is still quite large).

    this would essentially mean that you’d have to have anywhere between 10 to 1000 machines running a java client to do the work of a single machine written using a high-performance computing language and libraries. that being said, i’m certain there are ways you could get more performance out of java — maybe by implementing a special set of libraries that do certain computations particularly efficiently and natively on a machine — but it’s still likely a very large source of inefficiency in the current model.

    still, a neat idea! i especially like the whole ‘let me compute a tiny bit for you using idle cycles in a very controlled way instead of showing me tons of ads’ idea a commenter above just suggested. i suppose in any case, even if the efficiency was only 1%, that’s potentially a lot of cycles that were otherwise unused.

  4. Angus says:

    This might gain a few more users than a traditional distributed computing client, by avoiding the need for users to install additional software.

    However, I think it’s likely the significant slowdown caused by running everything as Javascript instead of native CPU/GPU code will outweigh the advantages.

    @silic0re: This is written in Javascript, which is a completely different language from Java. You’re probably still right about the speed, though.

  5. RomanSB says:

    Never!

  6. Paul says:

    @silic0re & angus

    The hackaday writer touched on this, but did not really go into detail. While it is probably still slower than C++ the latest generation of browsers are including new javascript engines that take the javascript and compile it into a binary object. While this slows down the initial load, after that it goes light years faster.

    And the benefit here is that rather than having people download a client, all they need to do now is navigate to a website to join the computing cloud.

    Also, just thinking about it a completely useless but interesting sidenote is that this will run on anything that can run javascript, which is far more devices than can run any other distributed computing tool.

  7. angus says:

    Yes, modern javascript engines compile to native code and are fast. But for something computationally intensive I think there’d still be a significant slowdown compared to C.

    The main reason I think this is that Javascript is a dynamically-typed language, which means that the interpreter has to do lots of checking at runtime to know what type of an object it’s dealing with, or do sophisticated analysis at load-time to predict the type. (Contrast with C in which you tell the compiler the types of your variables).

    I don’t know the extent to which modern browsers can do this analysis. I’m guessing that the speed difference between Javascript and C is significant enough to make this system unable to compete effectively with other distributed computing platforms.

    I also think a conventional application which runs in the background and starts automatically is likely to contribute a lot more CPU time than a web page which the user has to deliberately load and leave open.

  8. The speed difference is absolutely irrelevant. We are not talking about substituting this for natively run code, we are talking about reaching a different demographic. If the code is not run this way that doesn’t mean it will be run in C, it means that it will not be run at all.

    It is like arguing that there is no point for the Salvation Army to stand on the sidewalk ringing their bells because it would be much more efficient if people just mailed in checks.

  9. cppchriscpp says:

    There’s no doubt that JavaScript would be slower than C or C++ or something else compiled, but that is irrelevant here. The issue that most distributed computing projects face is a lack of computers actually computing. This approach would trade off some serious power behind each user; however, it could theoretically increase the number of users exponentially.

    Imagine if for example, someplace like Facebook or Addicting Games implemented something like this behind the scenes. (Disregard any legal issues/ToS issues/ etc for now.) While none of the users would actually contribute enough to notice a slowdown of the site, each and every user would provide some small amount of processing power. This by itself would not be much, but now compare the demographic of users on that site to the demographic of users running any major distributed computing client. I think you will find an exponentially higher number of users behind the website.

    This is definitely an interesting idea, and I can see it taking off if it is approached in the right way.

  10. squeakyneb says:

    I like this idea. I’m gonna go look for source code to see if I can rewrite this in Python. Pointless, yes, and I realize that a standard browser won’t run Python, but I’m going to try to rebuild the distribution portion of this. Just for the hell of it :D I’m also into networking with Python.

  11. Fowl says:

    If only *all* of the major browsers supported web workers and did so in a way that they were scheduled properly so they don’t make browsing take a back seat to computing the universe.

  12. bWare says:

    Maybe this is what Google’s Native client is about?

  13. Timothy says:

    It’s amazing to see the direction that JavaScript is going.

  14. SOOPERGOOMAN187 says:

    I think if you embedded a small applet in all pages on the net then that would help to smooth the processing requirements out. You said to expect weird spike to the processor but if implemented the right way that would never happen. It would only need to share a small portion from each computer connected. Eventually with all pages exhibiting this it would be a continuous sharing of resources, if that makes sense to anyone else but me.

  15. Mycroftxxx says:

    Uhm, how about loading enough work units to consume 10 seconds of time on a “standard” pc that would have to be computed to post a comment on a blog?

    It seems likely that there’d be some side-benefit to forcing spammers to develop more efficient javascript engines.
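
The "compute before you can post" idea can be sketched as a small puzzle the browser must solve. This sketch swaps in a hashcash-style search for the useful work units the comment proposes, and it uses FNV-1a (over UTF-16 code units) purely as a stand-in hash: it is not cryptographically secure, so a real deployment would need a proper hash. The names `solve` and `verify`, the challenge string, and the difficulty value are all assumptions for illustration.

```javascript
// FNV-1a, 32-bit. Stand-in hash only: NOT cryptographically secure.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

// Client side: search for a nonce whose hash has `bits` leading zero bits.
// Each extra bit roughly doubles the expected number of attempts.
// Returns null if nothing is found within maxTries.
function solve(challenge, bits, maxTries = 1e6) {
  const target = 2 ** (32 - bits);
  for (let nonce = 0; nonce < maxTries; nonce++) {
    if (fnv1a(challenge + ":" + nonce) < target) return nonce;
  }
  return null;
}

// Server side: checking a submitted nonce costs a single hash.
function verify(challenge, bits, nonce) {
  return fnv1a(challenge + ":" + nonce) < 2 ** (32 - bits);
}

// Usage: the server issues a per-comment challenge, the browser solves it,
// and the nonce is submitted along with the comment.
const challenge = "comment-form-42"; // hypothetical challenge string
const nonce = solve(challenge, 8);
console.log(nonce, verify(challenge, 8, nonce));
```

The asymmetry is the point: the poster pays many hashes, the blog pays one. Tuning the difficulty to "10 seconds on a standard PC" would take measurement on real hardware.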

  16. DarkFader says:

    No worries about security and exploitability.
    And if it calls a trusted math library in ActiveX form, all the better!

  17. TehDooMCat says:

    Aww damn, I got beaten to it :P

    I was working on a distributed computing in javascript concept myself. All mine did at the time I forgot about it is generate custom md5 rainbow tables, with hashes 6 to 20 characters long… of no real use when it comes to cracking passwords. It -did- work though, and it didn’t use up too much processing power.

    Methinks the way to make it fully invisible to the user and not hog their system is, if it’s implemented using setInterval, start at an interval of 0 between iterations and increase the interval ’til it’s higher than the average time it takes to run the function.

    “Uhm, how about loading enough work units to consume 10 seconds of time on a “standard” pc that would have to be computed to post a comment on a blog?”

    I like this idea and might just implement it on the registration form of the forums I’m developing ;)
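
The throttling idea in comment 17 can be sketched roughly as follows, assuming a synchronous `workUnit` function (a hypothetical name). It swaps `setInterval` for chained `setTimeout` calls, since the delay has to change between iterations, and it grows the pause by a fixed 1 ms step, which is also an assumption.

```javascript
// Grow the pause until it is strictly higher than the average run time
// of one work unit, then hold it there.
function nextDelay(delayMs, avgRunMs, stepMs = 1) {
  return delayMs <= avgRunMs ? delayMs + stepMs : delayMs;
}

// Run `iterations` work units, leaving the page idle between them.
// onDone (optional) receives the delay that was in effect at the end.
function runThrottled(workUnit, iterations, onDone) {
  let delayMs = 0;    // start with no pause between iterations
  let totalRunMs = 0; // total time spent inside workUnit so far
  let done = 0;

  function tick() {
    const start = Date.now();
    workUnit(done);
    totalRunMs += Date.now() - start;
    done += 1;

    if (done >= iterations) {
      if (onDone) onDone(delayMs);
      return;
    }
    delayMs = nextDelay(delayMs, totalRunMs / done);
    setTimeout(tick, delayMs);
  }

  setTimeout(tick, delayMs);
}

// Usage: 20 small units of busy work, then report the final delay.
runThrottled(
  () => { let x = 0; for (let i = 0; i < 1e5; i++) x += i; },
  20,
  (finalDelay) => console.log("settled delay (ms):", finalDelay)
);
```

Once the pause exceeds the average run time, the script spends less than half of wall-clock time computing, which is one way to keep the page responsive.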

  18. James says:

    There’s an interesting moral(?) question to this though – if sites could sell processor time on their visitors’ computers for someone else’s computational needs, then unless they make this very clear from the outset they are effectively stealing both bandwidth and processor performance from the end user. It’s analogous to a viral folding client.

  19. jroelofs says:

    i really don’t see this being all that useful, that is unless the computation being done takes significantly more time than a couple of ms. for most problems, it would be faster to compute locally on one decently fast machine. if the problem is not embarrassingly parallel or even if there is a lot of data to move around there will be too much overhead. for this to work you’d need problems that are very simple to specify and whose answers are also really simple (read: brute-force crypto & genetic algorithms).

  20. jroelofs says:

    nonetheless, very fascinating research.

  21. cali says:

    this was actually implemented in the svn testing version of beef (bindshell.net) about 3 years ago. Proof of concept was good but unfeasible as GPUs and standard CPUs are a lot faster. It was estimated you’d need 2 million+ hosts to get any benefit, and then you’re limited by connectivity. If a host goes down you lose their solution.

    Nice try though

  22. Tom says:

    As a proof of concept I implemented motion detection in JavaScript. This is actually something more practically useful… A PDF with the details is available at: http://mjpg-streamer.wiki.sourceforge.net/space/showimage/Distributed+Computing+and+Image+Processing+in+JavaScript.pdf

  23. Kory says:

    http://seniorproject.korykirk.com

    I created a distributed implementation of Pi in hexadecimal using javascript (before I ever saw this); it can be found above.

  24. Ashley says:

    That is glad news to hear. I am wondering whether it will really work.

  25. Guy says:

    Think what could be achieved if Google or Facebook would ask their users to contribute some of their browser power for the benefit of some data processing projects.

    There are many academic projects that could use this IMO, mainly in the field of bioinformatics.
