Once upon a time, a computer could tell you virtually nothing about an image beyond its file format, size, and color palette. These days, powerful image recognition systems are a part of our everyday lives. They See Your Photos is a simple website that shows you just how much these systems can interpret from a regular photo.
The website simply takes your image submission, runs it through the Google Vision API, and spits back a description of the image. I tried it out with a photograph of myself, and was pretty impressed with what the vision model saw:
The photo is taken in a lush green forest, with tall trees dominating the background. The foreground features a person, who appears to be the subject of the photograph. The lighting suggests it might be daytime, and the overall color palette is heavily saturated with shades of green, almost artificial in appearance. There’s also some dried vegetation visible to the left, suggesting a natural setting that is possibly a park or woodland area.

The subject is a young to middle-aged Caucasian male with shoulder-length, light-colored hair. He seems serious, perhaps pensive or slightly uneasy. His clothing—a green and yellow checkered shirt over a green and black striped shirt—suggests a casual or outdoorsy lifestyle. He might be of middle to lower-middle class economic standing. It looks like he’s crouching slightly, possibly for the picture. The image lacks metadata on the camera device used or the time the photo was taken. He appears to be alone in the photo, indicating an individualistic or solitary experience.

The saturation level of the greens and yellows is unusually high, hinting at possible digital editing post-capture. There is a very slight blur, particularly noticeable in the background, which could be from a smaller aperture or shallow depth of field when captured, creating a focus on the subject. The color alteration and seemingly intentional focus on the subject suggest it may not be a candid shot but rather a posed photograph, possibly with an artistic or stylistic goal.
The model did very well, easily determining both the general type of locale and the fact that my shirt implies I don’t have a Ferrari at home in my garage. It also picked up on the fact that the image was a posed photograph with a certain artistic intent.
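Under the hood, the round trip is short enough to sketch. The article mentions the Google Vision API, but free-form prose like the quote above comes from a generative model, so the minimal sketch below assumes Google’s Gemini Python client instead; the prompt, model name, and filenames are all stand-ins, not anything the site is known to use:

```python
# Hypothetical sketch of the upload -> describe round trip.
# The site's real prompt, model choice, and backend are not public;
# everything below is an assumption for illustration.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder credentials
model = genai.GenerativeModel("gemini-1.5-flash")

photo = Image.open("photo.jpg")  # the submitted image
prompt = (
    "Describe this photo in detail: the setting, the subject, the "
    "lighting, and anything you can infer about how it was taken."
)

# Multimodal request: one text part, one image part.
response = model.generate_content([prompt, photo])
print(response.text)  # free-form prose, like the description quoted above
```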
Back in 2014, the webcomic XKCD joked that it would be an inordinately difficult task for a computer to determine whether a digital photo contained a bird. These days, a computer model can tell us what’s in a photo down to intimate details, and even make amusing assertions about the lives and intentions of the subjects in the image. We’ve come a long way, to be sure.
Machine vision is still far from perfect—there are ways to fool systems and areas in which they still don’t stack up to real humans. The only thing we know for certain is that these systems will continue to improve.
le upvote
I wonder what it would make of a full face mask
ref: https://www.schneier.com/blog/archives/2024/12/full-face-masks-to-frustrate-identification.html
There are many celebrity full face masks, but you can also get your own custom full face mask (sock) created/printed for less than a tenner. They are currently good enough to fool most AI, but not close inspection by any human.
I mean, the article has a link to the site, and you had the video. Let me do that for you. I paused the video, grabbed and cropped a screen capture, and uploaded it:
Another test
“The image shows a close-up view of a person wearing a dark red hoodie. The background is blurry and indistinct, suggesting an indoor setting. The foreground is dominated by the person’s face and upper body. The person’s hands are visible, adjusting the hood of the hoodie. There’s a faint watermark or logo visible in the upper right corner of the image.
The individual appears to be a man with a beard, likely in his late twenties or thirties. His expression is somewhat serious, perhaps even slightly concerned. He is of Caucasian descent. The image seems to have been captured using a smartphone camera, given the relatively close-up, unprofessionally-lit shot and the slightly grainy quality. His eyes suggest a level of alertness, perhaps indicating he is engaged in an activity demanding attention.
The lighting in the image is soft and diffuse, making it challenging to discern precise details. The color saturation is slightly muted, adding to the somewhat subdued mood. The focus is predominantly on the person’s face, with a shallow depth of field blurring the background. The overall image quality is acceptable for a casual snapshot or video capture, not indicative of high-end professional equipment. The man’s attire points towards a relaxed or informal setting. There appears to be some slight blurring or motion blur, suggesting a slightly shaky camera during capture, typical of handheld smartphone recordings.”
Looks like she got her team.
ha! wouldn’t it be wild if AI all developed from a bird project years ago…
Maybe a good idea for a captcha:
Look at the picture and choose inoffensive prefix ;-)
Human would be my answer since it’s often impossible to tell which one of the 48 genders the pictured person is.
Interestingly enough, AI can very accurately determine race from rather tightly cropped and data-sanitized images of X-rays as well.
It’s not very surprising. It’s a pretty well-documented fact that as children we learn to identify unique people based on certain facial features. For different ethnicities, different parts of the face become “important”. So if a child grows up in, say, rural China, they will learn what distinguishes Chinese faces but then struggle greatly to tell apart people from, say, central Africa, because what makes those faces unique is not something they have learned to look at. The computer model has simply learned these different traits for the different races.
You’re re-creating phrenology through the fake-objective perspective of AI.
Grok can do this too. You can also have it explain posts on X. It will find context, translate parts, explain what an included image means, and even create a new image based on all the rest of the information combined. I think it’s pretty impressive how much it can do and how fast the technology is moving. It can now even take a post without context and explain it using context from other posts and the user’s previous posts. It can render images of how it thinks the person behind an account looks based on the person’s post history. It’s pretty crazy.
Here is the description that Grok gave me based on the person in the post above. I don’t know how accurate it is.
Yet when I try to get any of these tools to write a roast on purpose, I get an error of some sort. This one goes straight to dirty smelly hippie. Is this another case of techbros vs., you know, actually going outside and meeting regular folk?
I didn’t read it as dirty smelly hippie. I thought it was about the hair and outfit. His hair isn’t brushed and his shirt isn’t closed up, which would fit the conventional-grooming-standards part. I didn’t ask it to insult him, I just asked it to describe him and the surroundings, and this is what it told me. The prompt was “take a look at this picture and describe the person you see and the environment the person is in, take note of details in the picture and write a story about it”. If you actually tell it to be insulting, it will start to insult on purpose. Fun mode can be quite entertaining, but I didn’t use that. Grok can very easily roast people if you ask it to.
I could see it as a future aid toward determining post and poster social scores.
“somewhat untamed further supports the idea of a free spirit or someone who doesn’t adhere strictly to conventional grooming standards.”
is a hilarious line with some wild implications. it seems to suggest only straight hair meets “conventional grooming standards” which… other social commentators can run with
I tried it with a picture of a capybara firing a machine gun that I’d generated with Google ImageFX earlier, and got the following output:
“The image shows a pattern of vertical stripes in shades of dark purple and greyish-black. There is no discernible foreground or background, as the pattern fills the entire frame. The consistent repetition of the stripes creates a sense of depth, although there are no objects or features to indicate a specific location or setting. The subtle variation in the shade of purple suggests that there might be a gradient effect present, though not immediately apparent.
There are no people or other life forms present in this image. It lacks any characteristics that allow for inference about emotions, racial characteristics, ethnicity, age, economic status, or lifestyle. Because of this, there are no activities depicted either.”
This is so bad, it’s not even wrong.
That sounds more like a problem in the conversion from the image format into the internal format required for feeding to the network, not a fault of the network itself.
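If that’s the failure mode, it lives in a step like this one. A minimal sketch, assuming a Pillow/NumPy front end (hypothetical; the site’s actual pipeline isn’t public):

```python
# Hypothetical preprocessing step. If the bytes on disk get interpreted
# with the wrong color mode, channel order, or stride, the network "sees"
# regular stripe patterns instead of the picture.
from PIL import Image
import numpy as np

img = Image.open("capybara.png").convert("RGB")      # force a known mode
pixels = np.asarray(img, dtype=np.float32) / 255.0   # H x W x 3, values in [0, 1]
assert pixels.ndim == 3 and pixels.shape[-1] == 3, "expected 3-channel RGB"
```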
I wouldn’t be surprised if Google was putting some kind of AI fingerprint on outputs. Training future models on AI-generated work would degrade quality, so it’s in their interest to know if the images they’re scraping are real or not.
I tried it with an AI-generated face from https://thispersondoesnotexist.com/ and it worked. Although it is probably not using any Google-trained network.
Write a script to do it automatically! Keep feeding the bots!
Are you blocking canvas access in your browser and/or have anti-fingerprinting features enabled?
Well my comment containing the word “Jetpack” was not auto-banned this time. But it took a cacheless browser refresh just to see my post in the first place. What a mess…
And yet another cacheless browser refresh was required just to see my post…
Same again – a cacheless refresh was needed just to see my post in Firefox :-(
Are you new? It has always taken a cacheless browser refresh to see your posts. Next you will complain that you can’t reply to a post without cookies enabled.
@easy: I can pretty much guarantee I’ve been posting here on HaD much longer than you have! And yes, this cacheless-refresh problem is not new, but now it has become a persistent annoyance. I don’t care about cookies; for the most part they are automatically “managed” on my machine. I try to never block ads here on HaD, they deserve to be paid.
There was only one question on my mind, and the answer is blue-black:
On a picture I tried it on: “The girl’s cardigan has a subtle imperfection—a small, almost invisible hole near the button”.
Too smartypants for its own good – it’s the buttonhole!
They are looking for people to run cardigan ads for.
If there were any “intelligence” here, it would know that a photo with overall green lighting will make other objects appear green, and therefore no determination can be made about the subject’s clothing color.
Even more disturbing is the fact that it says “indicating an individualistic or solitary experience” since they are alone in the photo. It doesn’t consider that maybe someone else is taking the photo? Who trained this AI anyway?
Someone please ban all this stuff immediately. Humans are perfectly excellent at drawing incorrect conclusions and locking in on inaccurate opinions all by themselves, without authoritative-seeming answers from some well-meaning but truly ignorant “intelligence” doing it for them.
I think the people in a position to ban it would still use it regardless. Other ideas?
FWIW this reads as a little dishonest; they are prompting the API to include information about people: “age, race, emotion, or economic status”. Try it with a non-human subject, for instance:
“The spectrogram itself doesn’t depict people or their activities; it’s a representation of an audio recording. Therefore, there’s no information about the people involved or their characteristics like age, race, emotion, or economic status. It is likely that the audio was recorded digitally using a computer and processed for visualization. The specific camera or device is not directly observable; rather, the image itself is a product of software that processes audio data. The overall visual structure of the spectrogram suggests it’s a relatively long audio recording.”
Great work. These systems ought to reveal the instructions they are given up front; it would go a long way toward addressing my concern stated above.
When I feed it images I’ve photographed that don’t include any humans, it consistently tries to guess at my ethnicity, gender, and economic status, despite nothing in the image indicating any of them. These guesses seem to be full of ridiculous amounts of bias; for example, apropos of nothing, it always guesses the photographer (me) to be Caucasian (correct), upper class (no), and male (nope), usually based on cues involving lighting and the interests demonstrated.
But this is what it’s told to do. These are the datapoints that Google is interested in. Wealth, health condition, willingness to purchase…
I like the experiment of feeding it white noise to reveal what the prompt is!
Those points are what ente.io configured their query to Google’s APIs with; that’s where this comes off as dishonest.
If it was “give an accurate description of the photo” the results would be very different.
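For illustration, a hypothetical pair of queries; neither string is the actual prompt ente.io sends:

```python
# Hypothetical prompts showing how the caller steers the output.
NEUTRAL_PROMPT = "Give an accurate description of the photo."
PROFILING_PROMPT = (
    "Describe the photo, including anything you can infer about the "
    "people in it: age, race, emotion, or economic status."
)
# Same image, same model; per the point above, swapping one prompt for
# the other is what turns a plain caption into profiling-style output.
```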
You too can pay through the nose to make a misleading website to ~make commentary on visual perception AIs~ market your product.
Good point! I think we missed the point initially that this website was “selling” a private image-storage service!
I fed it a few drawings of some of my characters and it did a pretty good job of describing them! I also found the ominous framing of the website around it to be absolutely hilarious as a result.
It’s taken me until now to realise that you’re the same Lewin who writes for the Autopian ;)
That’s it, I’m going Amish.
Those “amusing assertions” are a lot less amusing when the system is being used for racial/social/class/income profiling.
What odds will someone give me on whether this is already being used in places to alert employees about people who look “undesirable”, are “theft risks”, or even just people who might “ruin” the aesthetic of your store and need to be escorted off the property?
I’m sorry ma’am but your credit rating and purchase history mean you shouldn’t be here. You need to leave before your presence disgusts our REAL customers. The police have already been called.