Asking machines to make music by themselves is kind of a strange notion. They’re machines, after all. They don’t feel happy or hurt, and as far as we know, they don’t long for the affections of other machines. Humans like to think of music as being a strictly human thing, a passionate undertaking so nuanced and emotion-based that a machine could never begin to understand the feeling that goes into the process of making music, or even the simple enjoyment of it.
The idea of humans and machines having a jam session together is even stranger. But oddly enough, the principles of the jam session may be exactly what machines need to begin to understand musical expression. As Sara Adkins explains in her enlightening 2019 Hackaday Superconference talk, Creating with the Machine, humans and machines have a lot to learn from each other.
To a human musician, a machine’s speed and accuracy are enviable. So is its ability to make instant transitions between notes and chords. Humans are slow to learn these transitions and have to practice going back and forth repeatedly to build muscle memory. If the machine were capable, it would likely envy the human in terms of passionate performance and musical expression.
In a recent study, a team of researchers at MIT is programming self-driving cars to identify the social personalities of other drivers, in an effort to predict their future actions and drive more safely.
It’s already evident that autonomous vehicles lack social awareness. They regard the drivers around them as obstacles rather than human beings, which can hinder their ability to identify motivations and intentions, potential signifiers of future actions. Because of this, self-driving cars often cause bottlenecks at four-way stops and other intersections, perhaps explaining why the majority of their traffic accidents involve getting rear-ended by impatient drivers.
The research taps into social value orientation, a concept from social psychology that places a person on a scale from selfish (“egoistic”) to altruistic and cooperative (“prosocial”). The system uses this classification to create real-time driving trajectories for other cars based on a small snippet of their motion. For instance, cars that merge more often are deemed more competitive than other cars.
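To make the idea concrete, here is a minimal sketch of how social value orientation is often expressed in the psychology literature: as an angle that trades off a driver's own reward against the reward of someone else. The function names, the reward values, and the angle cutoffs below are illustrative assumptions, not the MIT team's implementation.

```python
import math


def svo_utility(reward_self, reward_other, svo_angle_deg):
    """Weigh a driver's own reward against another driver's reward.

    An angle of 0 degrees counts only the driver's own reward;
    90 degrees counts only the other driver's reward.
    """
    phi = math.radians(svo_angle_deg)
    return math.cos(phi) * reward_self + math.sin(phi) * reward_other


def classify_svo(svo_angle_deg):
    """Map an SVO angle to a coarse label (cutoffs are illustrative)."""
    if svo_angle_deg < 22.5:
        return "egoistic"
    if svo_angle_deg < 67.5:
        return "prosocial"
    return "altruistic"
```

Under these assumptions, a driver at 45 degrees values a smooth merge for the other car as much as for themselves, while a driver near 0 degrees only cares about their own progress.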
When testing the algorithms on tasks involving merging lanes and making unprotected left turns, the behavioral predictions were shown to improve by 25%. In a left-turn simulation, the car was able to wait until the approaching car had a more prosocial driver.
Even outside of self-driving cars, the research could help human drivers predict the actions of other drivers around them.
Given how accurately Moore’s Law has tracked the development of integrated circuits over the years, one would think that the present day is no different from past decades in terms of computer architecture design. However, during the 2017 ACM Turing Award acceptance speech, John L. Hennessy and David A. Patterson described the present as the “golden age of computer architecture”.
Compared to the early days of MS-DOS, when designing user- and kernel-space interactions was still an experiment in the works, it certainly feels like we’re no longer in the infancy of the field. Yet, as the pressure mounts for companies to acquire more computational resources for running expensive machine learning algorithms on massive swaths of data, smart computer architecture design may be just what the industry needs.
Moore’s Law predicts the doubling of transistors in an IC, but it doesn’t predict the path that IC design will take. When that observation was made in 1965, it was difficult or even impossible to envision where we are today, with tools and processes so closely linked and widely available that the way we conceive processor design is itself multiplying.
Watching a sport can be a bit odd if you aren’t familiar with it. Most Americans, for example, would think a cricket match looked funny because they don’t know the rules. If you were not familiar with baseball, you might wonder why one of the coaches was waving his hands around, touching his nose, his ears, and his hat seemingly at random. Those in the know, however, understand that this is a secret signal to the player. The coach might be telling the player to steal a base or bunt. The other team tries to decode the signals, but if you don’t know the code, that is notoriously difficult. Unless you have the machine learning phone app you can see in the video below.
If you are not a baseball fan, it works like this. The coach will do a number of things: perhaps touch his cap, then his nose, brush his left forearm, and touch his lips. However, the code is often as simple as knowing one attention signal and one action signal. For example, the coach might tell you that if they touch their nose and then their lips, you should steal. Touching their nose and then their ear is a bunt. Touching their nose and then the bill of their cap is something else. Anything they do that doesn’t start with touching their nose means nothing at all. If the signal is this easy, you really don’t even need machine learning to decode it. But if it were more complicated (say, the gesture that occurs third after they touch their nose, unless they also kick dirt, at which point it means nothing), it would be much harder for a human to figure out.
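The simple attention-plus-action scheme described above can be sketched in a few lines. This is only an illustration of the rule as stated: the gesture names, the function name, and the mapping of gestures to plays are assumptions for the example, and it does not show how the app in the video works.

```python
def decode_sign(gestures, indicator="nose", actions=None):
    """Return the play signalled by the gesture right after the indicator.

    `gestures` is the ordered list of things the coach did. Only an
    indicator immediately followed by a known action gesture counts;
    everything else is treated as decoy and yields None.
    """
    if actions is None:
        # Hypothetical code book taken from the example in the text.
        actions = {"lips": "steal", "ear": "bunt"}
    for current, following in zip(gestures, gestures[1:]):
        if current == indicator and following in actions:
            return actions[following]
    return None


play = decode_sign(["cap", "nose", "lips", "forearm"])
```

With a fixed code book like this, a lookup is all it takes. The harder variants, such as "third gesture after the indicator unless a wipe-off sign appears", are where learning the rule from labeled examples starts to pay off.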
Ecclesiastes 1:9 reads “What has been will be again, what has been done will be done again; there is nothing new under the sun.” Or in other words, 5G is mostly marketing nonsense, like 4G, 3G, and 2G were before it. Let’s not forget LTE, 4G LTE, Advance 4G, and Edge.
Technically, 5G means that providers could, if they wanted to, install some EHF antennas; the same kind we’ve been using forever to do point-to-point microwave internet in cities. These frequencies are too lazy to pass through a wall, so we’d have to install these antennas in a grid at ground level. The promised result is that we’ll all get slightly lower latency tiered internet connections that won’t live up to the hype at all. From a customer perspective, about the only thing it will do is let us hit the 8Gb ceiling twice as fast on our “unlimited” plans before they throttle us. It might be nice on a laptop, but it would be a historically ridiculous assumption that Verizon is going to let us tether devices to their shiny new network without charging us a million Yen for the privilege.
So, what’s the deal? From a practical standpoint we’ve already maxed out what a phone needs. For example, here’s a dirty secret of the phone world: you can’t tell the difference between 1080p and 720p video on a tiny screen. I know of more than one company where the 1080p on their app really means 640 or 720 displayed on the device and 1080p is recorded on the cloud somewhere for download. Not a single user has noticed or complained. Oh, maybe if you’re looking hard you can feel that one picture is sharper than the other, but past that what are you doing? Likewise, what’s the point of 60fps 8k video on a phone? Or even a laptop for that matter?
Are we really going to max out a mobile webpage? Since our device’s ability to present information exceeds our ability to process it, is there a theoretical maximum to the size of an app? Even if we had Gbit internet to every phone in the world, from a user standpoint it would be a marginal improvement at best. Unless you’re a professional mobile game player (is that a thing yet?) latency is meaningless to you. The buffer buffs the experience until it shines.
So why should we care about billion dollar corporations racing to have the best network for sending low resolution advertising gifs to our distracto cubes? Because 5G is for robots.
Nothing spoils your mood quite like your windscreen wipers not feeling it when the beat drops. Every major car manufacturer is focused on trying to build the electric self-driving vehicle for the masses, yet ignoring this very real problem. Well, [Ian Charnas] is taking charge, and has successfully slaved his car’s wipers to the beat of its stereo.
Starting with the basics, [Ian] first needed to control the speed of the wiper motor. This was done using a custom power supply adapted from another project. The brain of the system is a Raspberry Pi 3B+ which runs a phase-locked loop algorithm to sync the music and the motor. Detecting the beat turned out to be the most difficult part of the project, and from the research [Ian] did, there is no standard solution. He ended up settling on “madmom”, a Python audio and music signal processing library, which runs a neural net to detect the beat in real time. The Raspi sends the required PWM and Enable signals to an Arduino over serial, which in turn controls the power supply. The entire system was neatly integrated in the car, with a switch in the dash that connects the motor to the new power supply on demand, to allow the wipers to still be used normally (and safely).
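For a feel of what the phase-locking step does, here is a simplified simulation. It is not [Ian]'s code: it assumes the beat tracker already supplies a tempo and a beat phase, treats the wiper as an ideal oscillator, and uses a made-up proportional gain. Phases are measured in cycles, so 0.5 means half a beat or half a wipe.

```python
def wrap_phase(phase):
    """Wrap a phase difference in cycles into the range [-0.5, 0.5)."""
    return (phase + 0.5) % 1.0 - 0.5


def pll_step(wiper_phase, beat_phase, beat_hz, dt, gain=0.5):
    """Advance the wiper by one time step.

    The commanded wiper frequency is the beat frequency plus a
    correction proportional to the phase error, so the wiper speeds up
    when it lags the beat and slows down when it leads.
    Returns (new_wiper_phase, commanded_hz).
    """
    error = wrap_phase(beat_phase - wiper_phase)
    commanded_hz = beat_hz + gain * error
    new_phase = (wiper_phase + commanded_hz * dt) % 1.0
    return new_phase, commanded_hz


def simulate(beat_hz=2.0, dt=0.01, steps=2000, start_phase=0.3):
    """Run the loop against a steady beat; return (phase error, final Hz)."""
    wiper_phase, beat_phase, commanded_hz = start_phase, 0.0, beat_hz
    for _ in range(steps):
        wiper_phase, commanded_hz = pll_step(wiper_phase, beat_phase, beat_hz, dt)
        beat_phase = (beat_phase + beat_hz * dt) % 1.0
    return wrap_phase(beat_phase - wiper_phase), commanded_hz


final_error, final_hz = simulate()
```

In this toy model the phase error shrinks by a constant factor every step, so the wiper drifts into step with the beat and then holds the tempo. On real hardware the commanded frequency would still have to be translated into a PWM value for the motor supply, and the wiper position would have to be measured rather than assumed.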
[Esther Rietmann] and colleagues built a Telepresence Robot to allow work-from-home teammates to have a virtual but physical presence in the office. A telepresence robot is like a tablet mounted on a Roomba, providing motion capability in addition to an audio/video connection. Built during a 48 hour hackathon, it is a bit crude under the hood and misses out on some features, such as a bidirectional video feed. But overall, it pretty much does what is expected from such a device.
The main structure is built from cheap aluminium profiles and sheets. A Raspberry Pi is at the heart of the electronics hardware, with a servo mounted Pi-camera and speaker-microphone pair taking care of video and audio. The two DC motors are driven by H-bridges controlled from the Pi and an idle swivel caster is attached as the third wheel. The whole thing is powered by a power bank. The one important thing missing is an HDMI display which can show a video feed from the remote laptop camera. That may have been due to time constraints, but this feature should not be too difficult to add as a future upgrade. It’s important for both sides to be able to see each other.
The software is built around the WebRTC protocol, with the WebRTC Extension from UV4L doing most of the heavy lifting. The UV4L Streaming Server not only provides its own built-in set of web applications and services, but also embeds a general-purpose web server on another port, allowing the user to run and deploy their own custom web apps. This allowed [Esther Rietmann]’s team to build a basic but functional front-end to transmit data from the remote interface for controlling the robot. The remote computer runs a Python control script as a system service to control the drive motors and camera servo.
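The core of a control script for a two-motor robot like this is usually a small mixing step that turns "go forward" and "turn" commands into per-motor outputs. The sketch below shows one common way to do that. It is not taken from the team's repository, and the function names and the percent duty cycle convention are assumptions.

```python
def mix_drive(throttle, turn):
    """Convert throttle and turn commands in [-1, 1] to (left, right).

    Positive turn steers right by speeding up the left wheel. The pair
    is scaled down together if either value would exceed full power.
    """
    left = throttle + turn
    right = throttle - turn
    scale = max(1.0, abs(left), abs(right))
    return left / scale, right / scale


def to_h_bridge(command):
    """Split a signed motor command into (direction, duty cycle percent)."""
    direction = "forward" if command >= 0 else "reverse"
    return direction, round(abs(command) * 100)


left, right = mix_drive(0.0, 1.0)  # spin in place to the right
```

Each H-bridge then only needs a direction signal and a PWM duty cycle, which is what `to_h_bridge` produces for each wheel.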
The team also played with adding basic object, gesture and action recognition features. This was done using PoseNet, a machine learning model for real-time human pose estimation in the browser with TensorFlow.js, which let them demonstrate some pose detection capability. This could be useful as a “follow me” feature for the robot.
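As a rough idea of how a “follow me” mode could use pose output, the sketch below steers toward a detected nose keypoint. It assumes keypoints arrive as dictionaries with `part`, `score` and `position` fields, which mirrors the shape PoseNet reports in the browser. The function name, the confidence threshold, and the idea of forwarding keypoints to a Python script are all assumptions, not something the team has described.

```python
def follow_turn(keypoints, frame_width, min_score=0.5, part="nose"):
    """Return a turn command in [-1, 1] that steers toward a person.

    0 means the keypoint is centered in the frame, negative means it is
    to the left, positive to the right. If the part is not detected
    with enough confidence, the robot is told not to turn.
    """
    center = frame_width / 2
    for keypoint in keypoints:
        if keypoint["part"] == part and keypoint["score"] >= min_score:
            return (keypoint["position"]["x"] - center) / center
    return 0.0


pose = [{"part": "nose", "score": 0.9, "position": {"x": 480, "y": 100}}]
turn = follow_turn(pose, frame_width=640)
```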
Another missing feature, which most commercial telepresence robots have, is a sensor suite for collision avoidance, object detection and awareness, such as micro switches, IR / ultrasonic detectors, time of flight cameras or LiDARs. It would be relatively easy to add one or several sensors to the robot.
If you’d like to build one for yourself, check out their code repository on GitHub and the videos below.
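With a single forward-facing distance sensor, basic collision avoidance can be as simple as capping the forward speed as obstacles get closer. The sketch below assumes the distance has already been read in meters. The function name and the two thresholds are made-up values for illustration.

```python
def limit_speed(requested_speed, distance_m, stop_m=0.3, slow_m=1.0):
    """Scale down forward speed as an obstacle ahead gets closer.

    Beyond `slow_m` the request passes through unchanged, inside
    `stop_m` forward motion is blocked, and in between the speed is
    reduced linearly. Reverse and stop requests are never altered,
    since the assumed sensor only looks forward.
    """
    if requested_speed <= 0:
        return requested_speed
    if distance_m <= stop_m:
        return 0.0
    if distance_m >= slow_m:
        return requested_speed
    return requested_speed * (distance_m - stop_m) / (slow_m - stop_m)


safe_speed = limit_speed(1.0, distance_m=0.2)  # obstacle too close
```

A filter like this would sit between the remote operator's command and the motor outputs, so the robot stays drivable but refuses to ram a desk.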