An Algorithm For De-Biasing AI Systems

A fundamental truth about AI systems is that training the system with biased data creates biased results. This can be especially dangerous when the systems are being used to predict crime or select sentences for criminals, since they can hinge on unrelated traits such as race or gender to make determinations.

A group of researchers from the Massachusetts Institute of Technology (MIT) CSAIL is working on a solution to “de-bias” data by resampling it to be more balanced. The paper published by PhD students [Alexander Amini] and [Ava Soleimany] describes an algorithm that can learn a specific task – such as facial recognition – as well as the structure of the training data, which allows it to identify and minimize any hidden biases.

Testing showed that the algorithm minimized “categorical bias” by over 60% compared against other widely cited facial detection models, all while maintaining the same precision of detection. This figure was maintained when the team evaluated a facial-image dataset from the Algorithmic Justice League, a spin-off group from the MIT Media Lab.

The team says that their algorithm would be particularly relevant for large datasets that can’t easily be vetted by a human, and can potentially rectify algorithms used in security, law enforcement, and other domains beyond facial detection.

Software: It Is All In The Details

Who’s the better programmer? The guy that knows 10 different languages, or someone who knows just one? It depends. Programming is akin to math, or perhaps it is that we treat some topics differently than others which leads to misconceptions about what makes a good programmer, mathematician, or engineer. We submit that to be a great programmer is less about the languages you know and more about the algorithms and data structures you understand. If you know how to solve the problem, mapping it to a particular computer language should be almost an afterthought. While there are many places that you can learn those things, there is a lot more focus on how to write the languages,  C++ or Java or Python or whatever. We were excited, then, to see [Jeff Erickson] is publishing his algorithms book distilled from teaching at the University of Illinois, Urbana-Champaign for a number of years. The best part? You can read the preprint version online now and it will remain online even after the book goes to print.

When you were in school, you probably learned math in two ways: there was the mechanics (4×4=16) and then there were the word problems (Johnny has 10 candy bars and eats 4, how many are left?). Word problems are usually the bane of the student’s existence, yet they are much more realistic. Your boss has (probably) never come in your office and asked you what 147 divided by 12 is. If she did, you could hand her a calculator. The real value comes in being able to synthesize the right math for the right problem and — if you are lucky — gaining intuition about it (doubling the price will only increase profit by 10%). Software is pretty much the same, for example no one rushes into your cubicle and says “Quick! We need a for loop written!” You get a hazy set of requirements if you are lucky, and you then need to map that into something that computers can do. For that reason, we’ve always been more of a fan of learning about algorithms and data structures rather than specific language features.

Continue reading “Software: It Is All In The Details”

Algorithms For Visual Learners

Computer programming is a lot like chess. It is fairly simple to teach people the moves. But knowing how the pieces move isn’t the reason you can win. You have to understand how the pieces work together. It is easy to learn the mechanics of a for loop or a Java interface. But what makes programs work are algorithms. There are many books and classes dedicated to algorithms, but if you are a visual learner, you might be interested in a site that shows visualizations of algorithms called VisuAlgo.

The site is from [Dr. Steven Halim] and is meant for students at the National University of Singapore, but it is available “free of charge for Computer Science community on earth.” We suspect if any astronauts or cosmonauts wanted to use it in space, they’d be OK with that, too.

The animations and commentary take you through algorithms ranging from the common — sorting and linked lists — to the obscure — Steiner and Fenwick trees. Each animation frame has some commentary, so it isn’t just pretty pictures. The site is available in many languages, too.

Many of the animations allow you to set up problems and execute them using a C-like pseudo language. When it executes, you can watch the execution pointer and a box comments on the current operation. For example, in the linked list unit, you can create a random doubly linked list and then search it for a particular value. Not only can you see the code, but the graphical representation of the list will update as the code runs.

The site allows you to register for free to get additional features, but we didn’t and it was still a great read about many different data structures. Also, a few of the commentary slides require you to show you are actually a computer science professor — we assume there’s some copyright issue involved because it is only a few.

This site is a great example of how many free educational resources are out there on the web. It isn’t just computer science either. MITx — or more generally, edX — has some great hardware classes and many other topics