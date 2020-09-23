Social media and online services are now such huge parts of daily life that our entire world is being shaped by algorithms. Arcane in their workings, they are responsible for the content we see and the adverts we’re shown. Just as importantly, they decide what is hidden from view.
Important: Much of this post discusses the performance of a live website algorithm. Some of the links in this post may not perform as reported if viewed at a later date.
Recently, [Colin Madland] posted some screenshots of a Zoom meeting to Twitter, pointing out how Zoom’s background detection algorithm had improperly erased the head of a colleague with darker skin. In doing so, [Colin] noticed a strange effect — although the screenshot he submitted shows both of their faces, Twitter would always crop the image to show just his light-skinned face, no matter the image orientation. The Twitter community raced to explore the problem, and the fallout was swift.
Intentions != Results
Twitter users began to iterate on the problem, testing over and over again with different images. Stock photo models were used, as well as newsreaders, and images of Lenny and Carl from The Simpsons. In the majority of cases, Twitter’s algorithm cropped images to focus on the lighter-skinned face in a photo. In perhaps the most ridiculous example, the algorithm cropped to a black comedian pretending to be white over a normal image of the same comedian.
Many experiments were undertaken, controlling for factors such as differing backgrounds, clothing, or image sharpness. Regardless, the effect persisted, leading Twitter to speak officially on the issue. A spokesperson for the company stated “Our team did test for bias before shipping the model and did not find evidence of racial or gender bias in our testing. But it’s clear from these examples that we’ve got more analysis to do. We’ll continue to share what we learn, what actions we take, and will open source our analysis so others can review and replicate.”
There’s little evidence to suggest that such a bias was purposely coded into the cropping algorithm; certainly, Twitter doesn’t publicly mention any such intent in their blog post on the technology back in 2018. Regardless of this fact, the problem does exist, with negative consequences for those impacted. While a simple image crop may not sound like much, it has the effect of reducing the visibility of affected people and excluding them from online spaces. The problem has been highlighted before, too. In this set of images of a group of AI commentators from January of 2019, the Twitter image crop focused on men’s faces, and women’s chests. The dual standard is particularly damaging in professional contexts, where women and people of color may find themselves seemingly objectified, or cut out entirely, thanks to the machinations of a mysterious algorithm.
Former employees, like [Ferenc Huszár], have also spoken on the issue, particularly about the testing process the product went through prior to launch. His comments suggest that testing was done to explore this issue, with regard to bias on race and gender. Similarly, [Zehan Wang], currently an engineering lead for Twitter, has stated that these issues were investigated as far back as 2017 without any major bias found.
It’s a difficult problem to parse, as the algorithm is, for all intents and purposes, a black box. Twitter users are obviously unable to observe the source code that governs the algorithm’s behaviour, and thus testing on the live site is the only viable way for anyone outside of the company to research the issue. Much of this has been done ad-hoc, with selection bias likely playing a role. Those looking for a problem will be sure to find one, and are more likely to ignore evidence that counters this assumption.
Efforts are being made to investigate the issue more scientifically, using many studio-shot sample images to attempt to find a bias. However, even these efforts have come under criticism, namely that a source image set designed for machine learning and shot in perfect studio lighting against a white background is not representative of the real images that users post to Twitter.
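One way to guard against the selection bias described above is to decide on the set of test images in advance, post every pair in both orientations, and then check whether the tally of crops departs from an even split. The Python below is a minimal sketch of that idea, not a method used by Twitter or by any of the researchers mentioned. The function names, the layout labels, and the trial data are all hypothetical, and the example tallies are invented purely to show the calculation.

```python
from collections import Counter
from math import comb


def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed count k, under the null hypothesis that each crop
    keeps the lighter-skinned face with probability p.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


def summarize(trials):
    """Summarize (layout, kept) pairs from a pre-registered set of posts.

    layout: hypothetical label such as "lighter_on_top" or "darker_on_top"
    kept:   "lighter" or "darker", whichever face survived the crop
    """
    trials = list(trials)
    kept = Counter(face for _, face in trials)
    layouts = Counter(layout for layout, _ in trials)
    n = len(trials)
    k = kept["lighter"]
    return {
        "n": n,
        "lighter_kept": k,
        "layouts": dict(layouts),  # should be balanced if every pair was flipped
        "p_value": two_sided_binomial_p(k, n),
    }


if __name__ == "__main__":
    # Invented data for illustration only: 20 posts, each pair used twice
    # with the positions swapped. These are NOT real measurements.
    made_up = (
        [("lighter_on_top", "lighter")] * 9
        + [("lighter_on_top", "darker")] * 1
        + [("darker_on_top", "lighter")] * 8
        + [("darker_on_top", "darker")] * 2
    )
    print(summarize(made_up))
```

A small p-value here only says the split is unlikely to be chance given the chosen images; it says nothing about why the crop behaves that way, and it is only as representative as the image set behind it.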
Twitter’s algorithm isn’t the first technology to be accused of racial bias; from soap dispensers to healthcare, these problems have been seen before. Fundamentally though, if Twitter is to solve the problem to anyone’s satisfaction, more work is needed. A much wider battery of tests, featuring a broad sampling of real-world images, needs to be undertaken, and the methodology and results shared with the public. Anything less than this, and it’s unlikely that Twitter will be able to convince the wider userbase that its software isn’t failing minorities. Given that there are gains to be made in understanding machine learning systems, we expect research will continue at a rapid pace to solve the issue.
I still find it a little creepy that Android phones do facial reconstruction when you try to crop an image and there’s no way in the settings to turn off the functionality.
[ ^^^ And the auto correct is still horrendous. ]
Software glitch/error… yes. Problem, yes… Racist, give me a break. It’s an inadvertent software glitch due to the differences in human beings, light reflection and detection. I am so sick of every dang thing being chalked up to racism… but only when it affects black folks.
+1
I feel like this shows a misunderstanding of what people mean by “racism” when they talk about issues like this. Something can still have “racist” outcomes even if it’s not designed to be racist (bias in training datasets is a big one).
Is it a big deal that Twitter’s cropping system might not detect black faces? Probably not, in the grand scheme of things. But can it be called “racist”? Well, the system itself is obviously not intentionally racist, but in the outcomes of the system it shows a preference for white faces. This is what people mean when they call it “racist” or say that it shows racial bias: in its outcomes, it does. When people talk about this in relation to people’s views they call it unconscious bias. In the case of an AI I’m not sure you can use that term because the thing isn’t conscious, but it’s the same deal.
I agree completely. It’s highly unlikely that a programmer sat down and intentionally trained the algorithm to prefer images of one race over another, and we’ve seen this kind of bias in other ML applications based on the datasets they use. But engineers are responsible for the results of this, even if this specific feature is inadvertent.
I think we are whitewashing the term a bit here, no? We are calling some accidental data processing side effect the same thing as someone flying the Nazi flag while burning someone at the cross while wearing a white hood. Emmm, maybe these things are not the same and we should reserve different language for different things.
Bias in training datasets is a thing, but let’s not assume even the formation of the dataset came from racism. What if a Chinese company collected the original dataset? Well, there are not a lot of people with dark skin complexions or African bone structure in China, so unsurprisingly the dataset will be biased to identify the differences among Chinese individuals. This was simply a byproduct of convenience, nothing evil, and is it reasonable to force all companies to adhere to all possible variances of people in their training data? What if the dataset has a bias to a very specific subculture found only in the forests of Brazil? It’s not exactly reasonable to expect them to capture this data point, so you know, shit happens.
The Own-Race Bias for Face Recognition is a very real thing. It stems from nothing more than the fact that as a baby I looked mostly at my parents and relatives, who are of the same ethnicity as me, so I am better at picking out differences among individuals of that ethnicity. This is true of nearly everyone (excluding adoptions, which are a very interesting case study). It’s well known that Africans, Asians, Europeans, and Native Americans have very different defining facial features, so I learned to rely on specific traits which simply don’t help when looking at other ethnicities. This is a bias, but it is not a malicious one, and only my actions from this can be malicious. Twitter can have this accidental bias which no reasonable training dataset would have picked up. This does not make them evil. Refusing to address the problem would, but that is clearly not what’s happening.
How exactly is accidental racism not racism?
But it _is_ racist. It’s preferring one race over another. That’s the definition of racism.
“Racist” isn’t some monolithic identity, it’s anything that favors or degrades one race over another. Any action, any policy, any statement, any process.
To call this “not racist” either demonstrates that you do not understand what racism really is, or says some very unpleasant things about your beliefs.
AI downfalls like these are going to have very interesting and controversial implications in fields like autonomous driving.
That is a scary outlook.
Sensor fusion where not all one’s eggs are in the same basket.
Can someone explain what exactly I’m supposed to be seeing here? All I’m seeing is a bunch of duplicate images where the left or right is higher than the other image…
Yeah. Less than perfect images were chosen for this article.
The gimmick is you make a very long image, with two people. Like a strip of individual photos. B/c the aspect ratio is too long, it has to crop. The issue is the resulting image.
The images chosen here are all “afters” with no “befores”, which hides the issue… Hold on.
Before:
After:
And then you run it with top/bottom switched and find the same result. And then you run it with one white guy in the middle of 10 black guys, and you get the same result.
I have no idea if the “positive” results displayed on Twitter were cherry-picked or not, so I’m a little hesitant to draw conclusions. But these surely do demonstrate the effect.
If the above is legit, makes you wonder about the training dataset and lack of testing/quality control. As already mentioned, software does not exhibit “racism”, but that does not keep social media from commenting.
How classy of you to use Lenny and Carl for illustration <3
The test images, each showing two people of different skin colour, are bigger than the area Twitter uses to display them.
So there is an algorithm that crops the images.
This algorithm could just use the center of the image or could focus on something ‘interesting’.
And this is where people with lighter skin are preferred, and so we are talking about some kind of racism.
The reason could simply be that lighter skin colors have more contrast, so that faces are easier to identify.
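The difference between the two cropping strategies in this comment can be illustrated with a toy example. The Python sketch below is not Twitter’s model. It assumes a hypothetical tall greyscale strip stored as a list of pixel rows, and it uses per-row contrast (brightest minus darkest pixel) as a crude stand-in for whatever an “interesting” score might be. All names and pixel values are made up for illustration.

```python
def center_crop_start(height: int, window: int) -> int:
    """First row of a crop window placed in the middle of the image."""
    if not 0 < window <= height:
        raise ValueError("window must be between 1 and the image height")
    return (height - window) // 2


def row_contrast(row) -> int:
    """Crude 'interest' score: brightest minus darkest pixel in the row."""
    return max(row) - min(row)


def salient_crop_start(pixels, window: int) -> int:
    """First row of the window whose rows have the highest total contrast.

    Uses a sliding sum, so each row is scored once. Ties go to the
    earliest window.
    """
    if not 0 < window <= len(pixels):
        raise ValueError("window must be between 1 and the image height")
    scores = [row_contrast(row) for row in pixels]
    current = sum(scores[:window])
    best_start, best_sum = 0, current
    for start in range(1, len(scores) - window + 1):
        # Slide down one row: add the new bottom row, drop the old top row.
        current += scores[start + window - 1] - scores[start - 1]
        if current > best_sum:
            best_start, best_sum = start, current
    return best_start


def build_strip():
    """Hypothetical 12-row strip: flat background with two small patches."""
    background = [128, 128, 128, 128]
    low_contrast_patch = [120, 140, 120, 140]
    high_contrast_patch = [60, 200, 60, 200]
    rows = [background] * 12
    rows[1] = rows[2] = low_contrast_patch
    rows[9] = rows[10] = high_contrast_patch
    return rows


if __name__ == "__main__":
    strip = build_strip()
    window = 4
    print("center crop starts at row", center_crop_start(len(strip), window))
    print("contrast crop starts at row", salient_crop_start(strip, window))
    # Flip the strip top to bottom, as the Twitter testers did with their photos.
    print("flipped: contrast crop starts at row", salient_crop_start(strip[::-1], window))
```

In this toy, the center crop lands on empty background either way, while the contrast-driven crop follows the higher-contrast patch wherever it is placed. Whether contrast actually explains the behaviour seen on Twitter is the commenter’s hypothesis, not something this sketch can confirm.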
“Regardless of this fact, the problem does exist, with negative consequences for those impacted. While a simple image crop may not sound like much, it has the effect of reducing the visibility of affected people and excluding them from online spaces. ”
Flip side of the coin. Less likely to fall into the surveillance state trap. Sure we want to fix this?
