Artificial intelligence (AI) is undergoing somewhat of a renaissance in the last few years. There’s been plenty of research into neural networks and other technologies, often based around teaching an AI system to achieve certain goals or targets. However, this method of training is fraught with danger, because just like in the movies – the computer doesn’t always play fair.
It’s often very much a case of the AI doing exactly what it’s told, rather than exactly what you intended. Like a devious child who will gladly go to bed in the literal sense, but will not actually sleep, this can cause unexpected, and often quite hilarious results. [Victoria] has created a master list of scholarly references regarding exactly this.
The list spans a wide range of cases. There’s the amusing evolutionary algorithm designed to create creatures capable of high-speed movement, which merely spawned very tall creatures that generated these speeds by falling over. More worryingly, there’s the AI trained to identify toxic and edible mushrooms, which simply picked up on the fact that it was presented with the two types in alternating order. This ended up being an unreliable model in the real world. Similarly, the model designed to assess malignancy of skin cancers determined that lesions photographed with rulers for scale were more likely to be cancerous.
[Victoria] refers to this as “specification gaming”. One can draw parallels to classic sci-fi stories around the “Laws of Robotics”, where robots take such laws to their literal extremes, often causing great harm in the process. It’s an interesting discussion of the difficulty in training artificially intelligent systems to achieve their set goals without undesirable side effects.
We’ve seen plenty of work in this area before – like this use of evolutionary algorithms in circuit design.