Writing An Optimizing Tensor Compiler From Scratch

Not everyone sets out to write their own optimizing compiler from scratch, but some roll into it as a project's scope keeps creeping outward. People like [Michael Moroz], who wrote up a long and detailed article on the why and how. Specifically, a ‘small library’ involving a few matrix operations for a Unity-based project turned into a static optimizing tensor compiler, called TensorFrost, with a Python front-end and a shader-like syntax, all of which is available on GitHub.

The Python-based front-end implements low-level NumPy-like operations, with development still ongoing. As for why Yet Another Tensor Library had to be developed: most existing libraries are heavily focused on machine learning tasks and scale poorly outside them, dynamic control flow is hard to express in them, and custom kernels have to be written in e.g. CUDA.
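To give a flavor of the kind of low-level, NumPy-like tensor operations such a front-end exposes, here is a small sketch in plain NumPy (illustrative only; this is not TensorFrost's actual API): an explicit finite-difference step of 2D heat diffusion, the sort of grid kernel that would traditionally live in shader code.

```python
# Illustrative NumPy sketch, not TensorFrost's API: a simulation kernel
# built from element-wise tensor operations.
import numpy as np

def heat_step(u, alpha=0.1):
    """One explicit finite-difference step of 2D heat diffusion."""
    # Discrete Laplacian via shifted copies of the grid (wrap-around edges)
    lap = (np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
         + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1) - 4.0 * u)
    return u + alpha * lap

u = np.zeros((64, 64))
u[32, 32] = 1.0          # a single hot cell
for _ in range(100):     # host-side control flow driving the kernel
    u = heat_step(u)
print(u.sum())           # total heat is conserved by this scheme
```

A tensor compiler's job would be to fuse those five shifted reads and the arithmetic into a single GPU kernel instead of materializing each intermediate array.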

Above all, [Michael] wanted to use a high-level language instead of pure shader code, and to have something that can output graphical data in real-time. Taking the gamble, and leaning on LLVM for some parts, there is now a functional implementation, albeit with a lot of work still ahead.

2 thoughts on “Writing An Optimizing Tensor Compiler From Scratch”

  1. In a time when we’re threatened with statements like “in the future, 100% of your game pixels will be generated” and all effort will be poured into silicon with large NPUs instead, my a priori questions are:
    – can we run shader code on NPUs (assuming they’re not limited to INT8 ops)?
    – what good are NPUs for general purpose computing?

    Since the headline here doesn’t say “Nice NPU You Got There, Would Be A Shame If Someone Turned It Back Into A Good Old Graphics Card”, I take it the matter is a lot more complex. After a quick glance, it seems like the machine learning libraries canonically used to drive NPUs don’t even provide the means to implement control flow or graphics interaction:

    “control flow can be very prevalent, but unfortunately it’s quite inconvenient to express in these libraries, if even possible”

    “The lack of a native way to output graphical data from these libraries is even more annoying when you remember that GPUs are called Graphics Processing Units, not Tensor Processing Units. And they have all the required hardware to work with and output graphics.
    PS. Taichi actually does have a way to do this! It has integration with GLFW and ImGUI.”

    (also: if this tensor compiler is what I think it is, it’s definitely a hack!)
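    The quoted point about control flow can be made concrete with a hedged sketch in plain NumPy (hypothetical function name; no particular library's API is implied): a loop whose trip count depends on the data itself, which is trivial on the host but awkward to express in a static tensor graph.

    ```python
    # Illustrative only: data-dependent control flow that static tensor
    # graphs struggle with, but host-side Python handles trivially.
    import numpy as np

    def normalize_until(x, tol=1e-6, max_iter=100):
        """Repeatedly halve x until its largest magnitude is <= 1 + tol.

        The number of iterations depends on the values in x, so the loop
        cannot be unrolled to a fixed length ahead of time.
        """
        steps = 0
        while np.max(np.abs(x)) > 1.0 + tol and steps < max_iter:
            x = x / 2.0
            steps += 1
        return x, steps

    x, steps = normalize_until(np.array([8.0, 2.0, 0.5]))
    # 8 -> 4 -> 2 -> 1: three data-dependent halvings
    ```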
