The best part about the term “Artificial Intelligence” is that nobody can really tell you what exactly it means. The main reason for this stems from the term “intelligence”, with definitions ranging from the ability to practice logical reasoning to the ability to perform cognitive tasks or dream up symphonies. When it comes to human intelligence, properties such as self-awareness, complex cognitive feats, and the ability to plan and motivate oneself are generally considered to be defining features. But frankly, what is and isn’t “intelligence” is open to debate.
What isn’t open to debate is that AI is a marketing goldmine. The vagueness has allowed marketing departments around the world to go all AI-happy, declaring that their product is AI-enabled and insisting that their speech assistant responds ‘intelligently’ to one’s queries. One might begin to believe that we’re on the cusp of a fantastic future inhabited by androids and strong AIs attending to our every whim.
In this article we’ll be looking at the reality behind these claims and ponder humanity’s progress towards becoming a Type I civilization. But this is Hackaday, so we’re also going to dig into the guts of some AI chips, including the Kendryte K210 and see how the hardware of today fits into our Glorious Future.
Introducing the K210
The Kendryte K210 System-on-Chip is the AI-on-the-edge chip du jour, pairing a dual-core, 64-bit RISC-V processor with the usual slew of peripherals. As for the target market of this chip, Kendryte summarizes it as:
Kendryte in Chinese means researching intelligence. The main application field of
this chip is in the field of Internet of Things. The chip provides AI solutions to
add intelligence to this.
Of relevance there is the KPU processor which the K210 datasheet describes as:
KPU is a general-purpose neural network processor with built-in convolution,
batch normalization, activation, and pooling operations. It can detect faces or
objects in real time.
In the same PDF we can find more detailed information about the KPU’s feature set:
- Targeted towards convolutional neural networks (CNNs).
- Supports CNN kernels of dimensions 1×1 or 3×3.
- Supports any kind of activation function.
This all gives us a clear hint that what we’re dealing with here is a bit of silicon that is aimed at speeding up the processing of convolutional neural networks (CNNs), which are commonly used in areas involving machine vision. In a nutshell, convolutions are applied as filters, enhancing some features like edges, lines, and points, and then these features are used as inputs for deeper layers in a neural network. Machine vision, and more specifically object and face recognition, is probably what this chip will be good at.
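To make the “convolutions as filters” idea concrete, here’s a minimal sketch in plain Python: a 3×3 vertical-edge kernel slid across a toy grayscale image. The kernel values and image are made up for illustration; in a real CNN the kernels are learned during training rather than hand-picked.

```python
def convolve2d(image, kernel):
    """Slide a small kernel across a 2D image, producing a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Weighted sum of the image patch under the kernel
            out[y][x] = sum(image[y + ky][x + kx] * kernel[ky][kx]
                            for ky in range(kh) for kx in range(kw))
    return out

# Vertical-edge kernel: responds where brightness changes left-to-right
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]

# Toy image: dark on the left, bright on the right
image = [[0, 0, 10, 10],
         [0, 0, 10, 10],
         [0, 0, 10, 10],
         [0, 0, 10, 10]]

feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # [[-30, -30], [-30, -30]]
```

Every output pixel responds strongly (here, −30) because the dark-to-bright edge runs through every 3×3 window of this tiny image; a flat region would produce zeros. Hardware like the KPU does exactly these multiply-accumulate loops, just massively in parallel.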
So where’s the intelligence in this contraption?
Making Machines See with CNNs
The point of CNNs is to add something akin to a retina to a computer system, much like how other algorithms like Recurrent Neural Networks (RNNs) along with Hidden Markov Models (HMMs) are used for natural language recognition. In essence it adds something akin to the senses and (depending on the type) part of a cortex associated with that sense. (If a CNN is doing what a human retina does, and a CNN is an artificial intelligence, do I actually have “smart retinas”?)
Much like a retina, a CNN used for machine vision is used to reduce the amount of raw image information. With the K210 SoC the ‘KPU’ peripheral is then used to offload CPU-intensive operations to dedicated hardware in order to speed up the processing. This is essentially the same as using a video processor to accelerate similarly parallel tasks with general-purpose computing on graphics processing units (GPGPU) processing, as made popular by CUDA and OpenCL.
A CNN consists of a number of consecutive elements: generally an input layer, a number of convolution layers, each followed by its own pooling layer, and finally a fully connected layer which works much like a classical artificial neural network (ANN). Each convolution layer applies so-called kernels: small grids, usually 2×2 or 3×3 in size, that are slid across the input. Such a kernel functions essentially the same way as a CUDA or OpenCL kernel, in that it applies the same instruction(s) to many data instances (SIMD).
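The pooling layer that follows each convolution layer can be sketched the same way. Here is an illustrative 2×2 max pooling in plain Python, applied to a made-up 4×4 feature map: it keeps only the strongest response in each block, halving the map in both dimensions.

```python
def max_pool(feature_map, size=2):
    """Max pooling: keep the strongest activation in each size-by-size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            out[y][x] = max(feature_map[y * size + dy][x * size + dx]
                            for dy in range(size) for dx in range(size))
    return out

# A made-up 4x4 feature map, reduced to 2x2
fm = [[1, 3, 0, 2],
      [4, 2, 1, 0],
      [0, 1, 5, 6],
      [2, 3, 7, 8]]
pooled = max_pool(fm)
print(pooled)  # [[4, 2], [3, 8]]
```

This data reduction is what lets the deeper layers work with a compact summary of the image rather than every raw pixel.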
The essential goal of a CNN is to reduce the raw data density (red, green and blue channels, for RGB data), so that the fully connected layer ends up with just the essential data, such as the rough outlines or shape of an object, that can then be classified by a trained neural network like a feedforward neural network. This network would output a probabilistic result based on its training data. More complicated implementations would use a fully connected layer that also has feedback to improve its classification.
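That final classification stage can be sketched as a fully connected layer followed by a softmax, which turns the weighted sums into probabilities. The features, weights, and the two hypothetical classes below are made up for illustration; a real network learns its weights from training data.

```python
import math

def dense_softmax(features, weights, biases):
    """Fully connected layer: weighted sums of the flattened features,
    squashed into probabilities with softmax."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical pooled features and made-up weights for two classes
# (say, "cat" and "not cat")
features = [0.9, 0.1, 0.4, 0.0]
weights = [[1.5, -0.5, 0.8, 0.2],    # class 0
           [-1.0, 0.5, -0.3, 0.1]]   # class 1
biases = [0.0, 0.0]

probs = dense_softmax(features, weights, biases)
print(probs)  # the two probabilities sum to 1
```

The output is exactly the “probabilistic result” described above: each class gets a score between 0 and 1, and the scores sum to one.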
It’s Scalars and Vectors All the Way Down
So the K210 is essentially a vector processor. It’s optimized for the particular math of taking convolutions and building weighted sums. And it does this very fast. It’s like a GPU without the graphics hardware.
Or conversely, a GPU is like an AI accelerator. Even a low-end graphics card or the built-in GPU in an older laptop is also an ‘AI engine’ that is many times more powerful than the ones found in a K210, an NVIDIA Jetson system, and similar embedded systems targeting ‘machine learning’ or ‘machine vision’. With some knowledge of algebra and a GPGPU framework (or using GLSL shaders if you’re hardcore) you too can be using your GPU for all of those ‘AI’ applications. Or doing materials science, or whatever. It’s just math.
This Isn’t the Future Yet?
Unfortunately, the reality painted for us by marketing departments is quite unlike the reality outside of those corporate walls. Even though decades of research have given us new ways to process information and categorize input faster than ever before, we do not have little brains embedded in the hardware we buy.
What we do have is a wonderful application of algebra and vector processors, the latter of which have become more powerful and affordable than ever, largely courtesy of GPU-fueled development. Because of graphics cards, vector processing capability has expanded rapidly, in some ways outpacing CPU development. For scientific, medical, and many other fields this has been an enormous boon. Maybe vector processors will one day underlie the first artificial intellects, as they’d likely be called by then, but for now you can at least have your car tell you whether it thinks it saw a cat or a small child.
Have you put AI co-processors to good use? Let us know in the comments.