See What ‘They’ See In Your Photos

Once upon a time, a computer could tell you virtually nothing about an image beyond its file format, size, and color palette. These days, powerful image recognition systems are a part of our everyday lives. They See Your Photos is a simple website that shows you just how much these systems can interpret from a regular photo.

The website simply takes your image submission, runs it through the Google Vision API, and spits back out a description of the image. I tried it out with a photograph of myself, and was pretty impressed with what the vision model saw:

The photo is taken in a lush green forest, with tall trees dominating the background. The foreground features a person, who appears to be the subject of the photograph. The lighting suggests it might be daytime, and the overall color palette is heavily saturated with shades of green, almost artificial in appearance. There’s also some dried vegetation visible to the left, suggesting a natural setting that is possibly a park or woodland area.
The subject is a young to middle-aged Caucasian male with shoulder-length, light-colored hair. He seems serious, perhaps pensive or slightly uneasy. His clothing —a green and yellow checkered shirt over a green and black striped shirt—suggests a casual or outdoorsy lifestyle. He might be of middle to lower-middle class economic standing. It looks like he’s crouching slightly, possibly for the picture. The image lacks metadata on the camera device used or the time the photo was taken. He appears to be alone in the photo, indicating an individualistic or solitary experience.
The saturation level of the greens and yellows is unusually high, hinting at possible digital editing post-capture. There is a very slight blur, particularly noticeable in the background which could be from a smaller aperture or shallow depth of field when captured, creating a focus on the subject. The color alteration and seemingly intentional focus on the subject suggest it may not be a candid shot but rather a posed photograph, possibly with an artistic or stylistic goal.
The system doesn’t get satire or memes, though.

The model did very well, easily determining both the general type of locale and the fact that my shirt implies I don’t have a Ferrari at home in my garage. It also picked up on the fact that it was a posed photograph with a certain artistic intent.
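For the curious, here is a rough sketch of what handing an image to the Google Vision API can look like, using the public REST endpoint `images:annotate` with plain label detection. This is not how They See Your Photos does it; the site's actual calls and prompts aren't described here, and label detection returns short labels with confidence scores rather than the paragraph-style prose shown above. The function names `build_annotate_request` and `annotate` are made up for illustration, and the API key is assumed to be one you supply yourself.

```python
import base64
import json
import urllib.request

# Public REST endpoint for the Cloud Vision API.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes: bytes, max_labels: int = 10) -> dict:
    """Build the JSON body for a label-detection request (hypothetical helper)."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_labels}],
            }
        ]
    }


def annotate(image_bytes: bytes, api_key: str) -> list[tuple[str, float]]:
    """Send the image and return (label, score) pairs. Needs network and a valid key."""
    body = json.dumps(build_annotate_request(image_bytes)).encode("utf-8")
    req = urllib.request.Request(
        f"{VISION_ENDPOINT}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # labelAnnotations is absent when nothing was detected.
    labels = result["responses"][0].get("labelAnnotations", [])
    return [(label["description"], label["score"]) for label in labels]
```

Calling `annotate(open("me.jpg", "rb").read(), api_key)` with your own photo and key would return a list of label and score pairs; anything resembling the essays above presumably takes a more capable model or extra processing on top.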

Back in 2014, the webcomic XKCD stated that it would be an inordinately difficult task for a computer to determine if a digital photo contained a bird. These days, a computer model can tell us what’s in a photo down to the intimate details, and even make amusing assertions as to the lives of the subjects in the image and their intentions. We’ve come a long way, to be sure.

Machine vision is still far from perfect—there are ways to fool systems and areas in which they still don’t stack up to real humans. The only thing we know for certain is that these systems will continue to improve.

27 thoughts on “See What ‘They’ See In Your Photos”

    1. There are many celebrity full face masks, but you can also get your own custom full face mask (sock) created/printed for less than a tenner. They are currently good enough to fool most AI, but not close inspection by any human.

    2. I mean, the article has a link to the site, you had the video. Let me do that for you. I paused the video, grabbed and cropped a screen capture, and uploaded it:

      The image shows a person wearing a mask of another person's face. The foreground is dominated by the mask, which covers the person's entire head and upper body. The background is a simple white tiled wall, suggesting a bathroom or a similar indoor setting. The lighting in the room seems artificial, possibly from an overhead light fixture. The tiles on the wall appear to be slightly misaligned, indicating a potential flaw in the construction or installation of the wall.
      
      The person wearing the mask appears to be of a middle-aged, and potentially male, given the hands shown. The person's emotional expression is not visible due to the mask; however, their actions of carefully adjusting the mask might suggest they're either trying to perfect its fit or perhaps playing around with it for amusement. Their ethnicity and economic status cannot be determined from the image alone. The photo appears to be taken using a smartphone or similar handheld device, likely in the recent past given the quality of the image. There is a very subtle reflection visible on the mask, hinting that the room may have more mirrors or reflective surfaces.
      
      The level of detail in the mask itself is quite striking; the printed face is remarkably realistic. The way the mask seems to blend into the shape of the wearer's head shows a good fit. The overall appearance suggests this is a custom-made mask or a high quality commercially produced one. The slight fuzziness of the mask's edges when compared to the person's hands and surrounding areas might indicate the mask is made of a flexible fabric, rather than a rigid material. The casual setting and the playful nature of the action in the photograph also indicate a sense of intimacy in the image.
      
  1. Grok can do this too. You can also let it explain posts on X. It will find context, translate parts, explain what the included image means, and even create a new image based on all the rest of the information combined. I think it’s pretty impressive how much it can do and how fast the technology is going. It can now even take a post without context and explain it with the context based on other posts and previous posts by the user. It can render images of how it thinks the person behind an account looks based on the person’s post history. It’s pretty crazy.

    Here is the description that Grok gave me based on the person in the post above. I don’t know how accurate it is.

    The individual in the image appears to be a man based on the visible features like the jawline and hair. His attire, a plaid shirt over a striped shirt, is typical of casual, possibly rural or outdoorsy fashion, suggesting he might be someone who spends considerable time outdoors or enjoys a rugged lifestyle. This could be indicative of interests in hiking, camping, or other nature-related activities.
    
    His hair, long and wavy, might suggest a preference for non-conformity or an artistic flair, as long hair on men can sometimes be associated with creative or alternative lifestyles. The style of his hair being somewhat untamed further supports the idea of a free spirit or someone who doesn't adhere strictly to conventional grooming standards.
    
    Given the green filter or the natural setting's reflection on the image, he might be someone who has a deep connection with nature. This could range from being an environmentalist, a photographer who specializes in nature, or simply someone who finds solace or inspiration in the wilderness.
    
    His neutral expression combined with the direct gaze might hint at a personality that is introspective or thoughtful. He could be the type of person who enjoys deep conversations, perhaps about philosophy, nature, or life's bigger questions. This expression might also suggest a moment of contemplation or focus, indicating he might be someone who values personal space and time to think.
    
    The setting and his attire might also imply he's not overly concerned with high fashion or urban trends, preferring comfort and functionality in his clothes. This could point towards a lifestyle that values experience and adventure over material appearance, potentially someone who is more about doing than showing.
    
    In summary, this man could be described as a nature lover, possibly with a creative or artistic bent, who values introspection and authenticity over societal norms of appearance. He might be the kind of person who finds joy in the simple things in life, like the beauty of the forest, and who might have a job or hobbies that allow him to be close to nature or express his creative side.
    
    1. someone who doesn’t adhere strictly to conventional grooming standards.
      who values introspection and authenticity over societal norms of appearance.

      Yet when I try to get any of the tools to write a roast on purpose, I get an error of some sort. This one goes straight to dirty smelly hippie. Is this another case of techbros vs. you know, actually going outside and meeting regular folk?

      1. I didn’t read it like dirty smelly hippie. I thought it was about the hair and outfit. His hair isn’t brushed and his shirt isn’t closed up, which would fit the conventional grooming standards part. I didn’t ask it to insult him, I just asked to describe him and the surroundings and this is what it told me. The prompt was “take a look at this picture and describe the person you see and the environment the person is in, take note of details in the picture and write a story about it”. If you actually tell it to be insulting it will start to insult on purpose. Fun mode can be quite entertaining, but I didn’t use that. Grok can very easily roast people if you ask it to.

  2. I tried it with a picture of a Capybara firing a machine gun I’d generated with Google ImageFX earlier, and got the following output:

    “The image shows a pattern of vertical stripes in shades of dark purple and greyish-black. There is no discernible foreground or background, as the pattern fills the entire frame. The consistent repetition of the stripes creates a sense of depth, although there are no objects or features to indicate a specific location or setting. The subtle variation in the shade of purple suggests that there might be a gradient effect present, though not immediately apparent.

    There are no people or other life forms present in this image. It lacks any characteristics that allow for inference about emotions, racial characteristics, ethnicity, age, economic status, or lifestyle. Because of this, there are no activities depicted either.”

    This is so bad, it’s not even wrong.

    1. I wouldn’t be surprised if Google was putting some kind of AI fingerprint on outputs. Training future models on AI-generated work would degrade quality, so it’s in their interest to know if the images they’re scraping are real or not.

  3. Never mind what THEY see in my comment – I cannot even get the useless WP + Jetpack comment engine here on Hackaday to recognize that I am making a new comment in the first place! 00PS, WILL PUTTING THE WORD “JETPACK” IN MY COMMENT GET IT AUTO-BANNED?

        1. @easy: I can pretty much guarantee I’ve been posting here on HaD much longer than you have! And yes, this cacheless refresh problem is not new, but now it has become a persistent annoyance. I don’t care about cookies, for the most part they are automatically “managed” on my machine. I try to never block ads here on HaD, they deserve to be paid.

  4. there was only one question on my mind and the answer is blue-black:

    The image shows a blue and black dress displayed on a mannequin. The dress has horizontal stripes of lace and a matching bolero jacket. The background is blurry but appears to be a clothing store, with other items subtly visible. The foreground is dominated by the dress itself, with a focus on the texture and pattern of the fabric. The lighting is somewhat dim, typical of indoor retail spaces. The slight sheen on the fabric suggests a satin or similar material.

    The photograph seems to be taken by a customer casually capturing the dress in a store. There is no specific information about the photographer or the time the picture was taken. However, judging by the clothing, it seems to be a moderately priced outfit in a standard retail store, suggesting that the customer might be of middle-class economic status and likely interested in fashion. The photograph does not provide any details regarding the people involved, but it’s reasonable to assume that the photographer is an individual who is interested in the dress or the store’s fashion offerings. The way the dress is presented on the mannequin hints that it might be a new collection, or a dress that is being heavily promoted.

    Upon closer inspection, one can observe the fine details of the lacework on the dress. The way the lace is interwoven with the fabric is quite intricate. The stripes appear to be a consistent width throughout the dress. The color balance in the image suggests a slightly cool tone, which enhances the appearance of the blue and black hues. The photo appears to have been taken with a common cell phone camera. The image resolution isn’t very high, indicating it wasn’t captured by a professional camera.

  5. On a picture I tried it on: “The girl’s cardigan has a subtle imperfection—a small, almost invisible hole near the button”.

    Too smartypants for its own good – it’s the buttonhole!

  6. Someone please ban all this stuff immediately. Humans are perfectly excellent at drawing incorrect conclusions and locking in on inaccurate opinions themselves without authoritative seeming answers from some well meaning but truly ignorant “intelligence” doing it for them.
