Flat Camera Uses No Lens

Early cameras and modern cameras work pretty much the same way: a lens (or a pinhole acting as a lens) focuses an image onto a sensor. Sensor, in this case, is a broad term; it could just as well be a piece of film, which, after all, does sense light via a chemical reaction. Lenses and sensors have gotten better (or at least different), but the basic design has remained the same since the Chinese described the camera obscura around 400 BC (and the Greeks not long after that).

Of course, the lens/sensor arrangement works well, but it does limit how thin you can make a camera. Cell phone cameras are pretty skinny, but there are applications where even they are too thick. That’s why researchers at Rice University are working on a new concept design for a flat camera that uses no lens. You can see a video about the new type of camera below.

The idea is simple: take a conventional sensor and place a mask over it that has a grid-like arrangement of apertures. The resulting image doesn’t match what you would see, but it provides enough information that a computer can reconstruct the picture.
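
If you think of each sensor reading as a known linear mixture of scene pixels, the reconstruction step is just solving a big linear system. Here's a minimal toy sketch of that idea in Python/NumPy, with a random mixing matrix standing in for the mask's calibrated transfer function (the real system is two-dimensional and far larger, but the measure-a-known-mixture-then-invert-it idea is the same):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1D "scene" standing in for image pixels.
n = 64
scene = np.zeros(n)
scene[10:20] = 1.0
scene[40] = 3.0

# Stand-in for the mask: every sensor reading is a known linear
# mixture of scene pixels (the real transfer matrix is calibrated).
m = 96
A = rng.standard_normal((m, n))
readings = A @ scene + 0.01 * rng.standard_normal(m)  # sensor noise

# Reconstruct with Tikhonov-regularized least squares:
#   x_hat = argmin ||A x - y||^2 + lam ||x||^2
lam = 1e-2
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ readings)

print("max reconstruction error:", np.abs(x_hat - scene).max())
```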

At Hackaday, we’re no strangers to homebrew camera builds (including one built around an Arduino and a single pixel). However, none of those had the promise of being super thin.

51 thoughts on “Flat Camera Uses No Lens”

  1. That’s pretty neat. When I was making my astrocam and pointing the raw sensor around my room, you could see “ghosts”, and I wondered if you could statistically tease out the image if you knew the exact characteristics of the sensor’s pixels.

    Now all we need is the industrial printable batteries and color e-ink displays!

    1. You could apply Wiener deconvolution to these “ghostly” pictures. With uncompressed data and some parameter tuning, it is possible to obtain enough clarity to recognize objects in the scene.
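
      For the curious, here's a minimal Wiener deconvolution sketch in NumPy, assuming a known blur kernel (PSF); the constant K stands in for the noise-to-signal ratio you would normally tune:

      ```python
      import numpy as np

      def wiener_deconvolve(blurred, psf, K=1e-3):
          """Frequency-domain Wiener filter with a known PSF.

          K approximates the noise-to-signal power ratio; in practice
          you tune it until the result stops ringing.
          """
          H = np.fft.fft2(psf, s=blurred.shape)   # transfer function of the blur
          W = np.conj(H) / (np.abs(H) ** 2 + K)   # the Wiener filter itself
          return np.real(np.fft.ifft2(W * np.fft.fft2(blurred)))

      # Usage: blur a random test image with a 5x5 box PSF, then restore it.
      rng = np.random.default_rng(1)
      image = rng.random((64, 64))
      psf = np.ones((5, 5)) / 25.0
      blurred = np.real(np.fft.ifft2(np.fft.fft2(psf, s=image.shape) * np.fft.fft2(image)))
      restored = wiener_deconvolve(blurred, psf, K=1e-6)  # noise-free toy, so K can be tiny
      print("restoration error:", np.abs(restored - image).max())
      ```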

      1. It’s way cooler than Wiener decon. It exploits sparse solutions via convex optimization. For more info check out:

        http://statweb.stanford.edu/~donoho
        http://statweb.stanford.edu/~candes
        http://web.ece.rice.edu/richb/

        You can actually solve problems that were generally considered impossible.

        Under very specific constraints, solving Ax=y given y and A will find the unique solution, provided x is sparse. The proof is hard, but the implementation is particularly easy. The same procedure applies to predicting what movies you might like. Most of the proofs involve the randomness of A as a constraint.

        FWIW Wiener is the L2 solution. Donoho is the L1 solution.

        1. But that assumes A is sparse. Seeing as how A is a matrix mapping the output of a particular pixel to, let’s say, light received by the array from a particular direction (e.g., corresponding to an ‘image’ pixel), and how each pixel of your average CCD or CMOS array is generally identically sensitive across angles to all the other pixels, well… I’m going to guess that A is not sparse at all.
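
          A toy numerical illustration of the L1-vs-L2 point (and, regarding the reply above: it's the scene x that has to be sparse in some basis, not A, which is typically dense and random). ISTA below is a bare-bones stand-in for the fancier convex solvers:

          ```python
          import numpy as np

          rng = np.random.default_rng(2)
          n, m, k = 200, 80, 8              # unknowns, measurements, nonzeros

          # Sparse ground truth and a dense random measurement matrix A.
          x_true = np.zeros(n)
          x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
          A = rng.standard_normal((m, n)) / np.sqrt(m)
          y = A @ x_true

          # L2 answer: minimum-norm least squares.
          x_l2 = np.linalg.pinv(A) @ y

          # L1 answer via ISTA: gradient step, then soft-threshold.
          lam = 0.05
          step = 1.0 / np.linalg.norm(A, 2) ** 2
          x_l1 = np.zeros(n)
          for _ in range(3000):
              x_l1 = x_l1 - step * (A.T @ (A @ x_l1 - y))
              x_l1 = np.sign(x_l1) * np.maximum(np.abs(x_l1) - step * lam, 0.0)

          print("L2 error:", np.linalg.norm(x_l2 - x_true))
          print("L1 error:", np.linalg.norm(x_l1 - x_true))
          ```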

    2. What you were seeing there was the directionality of a silicon photocell, basically the same effect that viewing angles have on an LCD monitor.

      Each cell records a different amplitude for the same point light source based on the slight angular difference, which causes vignetting in photos that the camera software has to correct for. Without a lens, and without the sensor firmware trying to compensate for the effect, you could reconstruct the average location of a light source or an object.
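
      A toy sketch of that idea, assuming (purely for illustration) that each bare pixel has a cosine-like angular falloff peaking in a slightly different direction; from the recorded amplitudes you can fit back the direction of a single point source:

      ```python
      import numpy as np

      rng = np.random.default_rng(3)

      # Hypothetical per-pixel angular responses: cosine falloff, each pixel
      # peaking at a slightly different angle (degrees).
      peak_angles = np.linspace(-20, 20, 101)

      def response(source_angle):
          return np.clip(np.cos(np.radians(source_angle - peak_angles)), 0, None)

      # Simulate readings from a point source at an unknown angle.
      true_angle = 7.3
      readings = response(true_angle) + 0.02 * rng.standard_normal(peak_angles.size)

      # Brute-force fit: which candidate angle best explains the readings?
      candidates = np.linspace(-30, 30, 601)
      errors = [np.sum((response(a) - readings) ** 2) for a in candidates]
      print("estimated source angle:", candidates[int(np.argmin(errors))])
      ```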

    1. There’s a flat mask in front of the camera that diffracts the light somewhat and causes different pixels on the sensor to see the same points in the scenery slightly differently. You know the properties of the mask and you can “back-project” the information to figure out what the sensor is seeing.

      Another way to look at it is to say that the mask is an incredibly thin lens – it just doesn’t project the image in a conventional way.

        1. No, not a Fresnel lens at all, no lens at all, and Dax isn’t right either; there’s no diffraction happening. Basically they created a “grid of pinhole cameras” pointed in slightly different directions. It’s a grid of a huge number of tiny crappy pinhole cameras plus an imperial shitton of mathematics.

          1. “Because an Imperial Shitton is 5/4 of a Standard one…so, a lot”

            As opposed to the Metric Shitton, which is always divisible by ten and only used in Canada by the Trailer Park Boys.

    2. So in theory, if you had enough pixels, enough processing, and enough colour resolution, you could extrapolate a three-dimensional image.
      You could, say, divide the image sensor into four and compare the four images of the same scene, each taken from a slightly different angle.
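
      In stereo terms, comparing the sub-images is a disparity estimate. A toy sketch, assuming you already have two sub-views of the same scene line with a small horizontal offset between them:

      ```python
      import numpy as np

      rng = np.random.default_rng(4)

      # Toy "left" and "right" sub-views: the same 1D scene line, offset by 3 pixels.
      scene = rng.random(256)
      left = scene[10:200]
      right = scene[13:203]

      # Estimate the offset by testing candidate disparities (SSD matching).
      def ssd(shift):
          return np.sum((left[20:-20] - np.roll(right, shift)[20:-20]) ** 2)

      best = min(range(-10, 11), key=ssd)
      print("estimated disparity:", best)  # bigger disparity = closer object, given a baseline
      ```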

      1. Check out Lytro — it’s a light-field camera that does basically that. It uses a grid of tiny lenses and a very-high-resolution sensor to take “pictures” that can then be re-focused or even have the viewpoint shifted slightly.
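
        The refocusing trick is usually described as shift-and-add: shift each sub-aperture image in proportion to its (u, v) position behind the lens grid, then average. A rough sketch, assuming a 4D light-field array is already in hand:

        ```python
        import numpy as np

        def refocus(lightfield, alpha):
            """Shift-and-add refocusing of a light field L[u, v, y, x].

            alpha picks the synthetic focal plane: each sub-aperture image is
            shifted in proportion to its (u, v) offset from the center, then
            everything is averaged.
            """
            U, V, H, W = lightfield.shape
            out = np.zeros((H, W))
            for u in range(U):
                for v in range(V):
                    dy = int(round(alpha * (u - U // 2)))
                    dx = int(round(alpha * (v - V // 2)))
                    out += np.roll(lightfield[u, v], (dy, dx), axis=(0, 1))
            return out / (U * V)

        # Usage: a random stand-in light field, refocused at two different depths.
        lf = np.random.default_rng(5).random((5, 5, 64, 64))
        near = refocus(lf, alpha=1.0)
        far = refocus(lf, alpha=-1.0)
        print(near.shape, far.shape)
        ```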

    1. Most people associate coded apertures with those hexagonal arrays and non-visible light. The FlatCam folks’ filter looks exactly like a Modified Uniformly Redundant Array, offset a bit.

    1. Not exactly; zone plates still have a focal length. A better description of the technology would be: an imaging system that uses a virtual lens, derived mathematically from the effect of a coded mask over the sensing array.

      Imagine a camera that recorded the Fourier transform of an image and you had to do the inverse FFT to get the image, except this one operates spatially and not in the frequency domain.
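
      That analogy in (trivial) code: the sensor records an invertible transform of the scene rather than the scene itself, and developing the picture means applying the inverse transform. With a coded mask, the transform is spatial and only approximately invertible, which is why the real reconstruction uses regularized solvers:

      ```python
      import numpy as np

      scene = np.random.default_rng(6).random((32, 32))

      # "Record" an invertible transform of the scene instead of the scene itself.
      recorded = np.fft.fft2(scene)

      # "Develop" the picture by applying the inverse transform.
      recovered = np.real(np.fft.ifft2(recorded))
      print("max error:", np.abs(recovered - scene).max())
      ```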

  2. Nitpicking correction: there is diffraction happening, but maybe it doesn’t need to be taken into account.
    Given that modern cameras usually have a metric tonne of pixels, there might be some advantages to eliminating the lens: you don’t need to focus, for one.

  3. Let me see if I’ve got this right:
    The camera has no lens, so it cannot be a ton of lenses. The camera works by measuring light that has undergone diffraction. That’s what the mask is for – to diffract the light in a controlled way.

    Since the mask’s diffraction pattern is known, math can be used to undo it and reconstruct the image. I imagine color contrast plays an important role.

    The images did have a weird depth of field.

  4. The simplest way to get insight into how a coded aperture works is to consider the application of turning low-FPS footage into high-FPS footage. This is done by vibrating a cut-out of something that looks like a 2D bar code at high speed in front of a normal video camera. With one sweep of the cut-out per frame, you get an out-of-focus, motion-blurred shadow encoded in each pixel. This information can be used to reconstruct everything else that got motion-blurred in the frame.
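
    The same idea in one dimension: a coded exposure turns motion blur into a well-conditioned convolution that can be inverted, while a plain open shutter wipes out whole frequency bands. A minimal sketch, assuming a known binary shutter code and (for simplicity) circular motion:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    # 1D "scene" moving past the sensor during one exposure.
    scene = rng.random(256)

    # Shutter codes over 32 time slices: plain open shutter vs. a binary code.
    box_code = np.ones(32)
    flutter_code = rng.integers(0, 2, 32).astype(float)

    def expose(code):
        # Each time slice adds a shifted copy of the scene, weighted by the code.
        return sum(c * np.roll(scene, t) for t, c in enumerate(code))

    def deblur(blurred, code):
        # Invert the (circular) convolution in the frequency domain.
        H = np.fft.fft(code, n=blurred.size)
        return np.real(np.fft.ifft(np.fft.fft(blurred) / (H + 1e-8)))

    for name, code in [("box", box_code), ("coded", flutter_code)]:
        rec = deblur(expose(code), code)
        print(name, "max error:", np.abs(rec - scene).max())
    ```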

      1. It looks like a Fourier or wavelet mask. I’m sure I have seen this somewhere in the last ten years. Perhaps related to a camera at Stanford that could look through a hedge – like moving your head back and forth to build an image through a lattice. I don’t see how it works up against the pixels unless maybe parts cover fractions of a pixel.

  5. If the metric shit-tonne of math involved in this interests you, I dug up a paper from Berkeley: http://accelconf.web.cern.ch/AccelConf/ibic2013/talks/weal1_talk.pdf

    It’s also got pictures of different masks that work, like the zone-plate/Fresnel that most photographers are aware of, modified uniformly redundant arrays (MURA), no-two-holes-touching MURA, and more. Good paper if you want to recreate it yourself and equations with ⊗ don’t make your head spin. Me and tensor products, though, ugh.

    1. Okay, so ⊗ of sets, fine. G(x,y)⊗H(x,y) just starts to make my head go a bit wobbly. A(x,y)⊗Ã(x,y)=δ and I’m out. I’m sure it has a rational meaning that could be explained in a few sentences that the maths are just shorthand for, but I’ll take the word problem thank you.
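
      A word-problem reading of A(x,y)⊗Ã(x,y)=δ: the mask pattern, cross-correlated with its matching decoding pattern, gives a single spike on a flat background, so correlating the recorded shadowgram with the decoding pattern undoes the mask. A rough numerical check, using one textbook MURA recipe (quadratic residues mod a small prime); the details may differ from the masks in that paper:

      ```python
      import numpy as np

      p = 13                                   # prime size of the mask
      qr = {(k * k) % p for k in range(1, p)}  # quadratic residues mod p

      # Textbook MURA mask A (0 = opaque, 1 = open) and decoding array G (+/-1).
      A = np.zeros((p, p))
      for i in range(1, p):
          A[i, 0] = 1
          for j in range(1, p):
              ci = 1 if i in qr else -1
              cj = 1 if j in qr else -1
              A[i, j] = 1 if ci * cj == 1 else 0
      G = 2 * A - 1
      G[0, 0] = 1

      # Periodic cross-correlation of A with G: should be one spike on a flat background.
      corr = np.array([[np.sum(A * np.roll(G, (-dy, -dx), axis=(0, 1)))
                        for dx in range(p)] for dy in range(p)])
      print("peak:", corr[0, 0], "off-peak values:", np.unique(np.delete(corr.ravel(), 0)))
      ```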

    1. Looks like there’s some pics in the video. But yeah, it may just be a lab curiosity:

      “Rice’s hand-built prototypes use off-the-shelf sensors and produce 512-by-512 images in seconds, but the researchers expect that resolution will improve as more advanced manufacturing techniques and reconstruction algorithms are developed.”

      Have been alive long enough to have seen the equivalent of this statement many times before, and how it usually plays out over time. Rephrased more realistically:

      “It works, but the image quality sucks. Seriously, there’s a huge penalty for doing this. We have no idea how to make it work better. Assuming standard camera technology advances far enough without being limited by inviolable physical or mathematical laws, it could work better; possibly one day even producing an image that would be considered useful today. But compared to images produced by those future competing technologies, it will still suck, so it may never find use beyond a few niche applications.”

  6. Quick spelling correction, in the first paragraph, “but the basic design has remainded the same since”, it should be “remained”. Also, the last sentence of the last paragraph is either missing a period, or some of the text got cut off. I’ll go with the latter, since that’s a rather awkward way to end a paragraph and the article as a whole. Great article otherwise!

  7. Isn’t the concept of light field photography that you use a tiny lens for each picture? And doesn’t that have the same effect? So does that mean this is not new since you can already make sensors very flat?

    Related link: https://en.wikipedia.org/wiki/Microlens

    Quote: “Wafer-level optics (WLO) enables the design and manufacture of miniaturized optics at the wafer level using advanced semiconductor-like techniques. The end product is cost effective, miniaturized optics that enable the reduced form factor of camera modules for mobile devices”

    And on a side note: that first guy in the video has the speech mannerisms of Richard Feynman, which is odd to see.

  8. Interesting technology, but the results look pretty ordinary so far. They probably need to move to a custom chip with a processor under each pixel, and a lot more pixels. Then again, with the speed things progress these days, we could see consumer products based on the idea within five years.
