Try Image Classification Running In Your Browser, Thanks To WebGPU

When something does zero-shot image classification, that means it’s able to make judgments about the contents of an image without the user needing to train the system beforehand on what to look for. Watch it in action with this online demo, which uses WebGPU to implement CLIP (Contrastive Language–Image Pre-training) running in one’s browser, using the input from an attached camera.

By giving the program some natural language visual concept labels (such as ‘person’ or ‘cat’) that fit a hypothetical template for the image content, the system will output — in real-time — its judgement on the appropriateness of such labels to what the camera sees. Again, all of this runs locally.

It’s maybe a little bit unintuitive, but what’s happening in the demo is that the system is deciding which of the user-provided labels (“a photo of a cat” vs “a photo of a bald man”, for example) is most appropriate to what the camera sees. The more a particular label is judged a good fit for the image, the higher the number beside it.

This kind of process benefits greatly from shoveling the hard parts of the computation onto compatible graphics cards, which is exactly what WebGPU provides by allowing the browser access to a local GPU. WebGPU is relatively recent, but we’ve already seen it used to run LLMs (Large Language Models) directly in the browser.

Wondering what makes GPUs so very useful for AI-type applications? It’s all about their ability to work with enormous amounts of data very quickly.

WebGPU… Better Than WebGL?

As the browser becomes more like an operating system, we are seeing more deep features being built into them. For example, you can now do a form of assembly language for the browser. Sophisticated graphics have been around using WebGL since around 2011, but some people find it hard to use. [Surma] was one of those people and tried a new method that is just surfacing to do the same thing: WebGPU.

[Surma] liked it better and shares a lot of information in the post and — oddly — the post doesn’t use WebGPU for graphics very much. Instead, the post focuses on using GPU cores for fast computation, something else you can do with WebGPU. If your goal is to draw on the screen, though, you need to know the basics and the post links to a site with examples of doing this.

Continue reading “WebGPU… Better Than WebGL?”