Early cameras and modern cameras work pretty much the same way. A lens (or a pinhole acting as a lens) focuses an image onto a sensor. Of course, sensor, in this case, is a broad term and could include a piece of film, which, after all, does sense light via a chemical reaction. Sure, lenses and sensors get better or, at least, different, but the basic design has remained the same since the Chinese built the camera obscura around 400 BC (and the Greeks not long after that).
Of course, the lens/sensor arrangement works well, but it does limit how thin you can make a camera. Cell phone cameras are pretty skinny, but there are applications where even they are too thick. That’s why researchers at Rice University are working on a new concept design for a flat camera that uses no lens. You can see a video about the new type of camera below.
The idea is simple: take a conventional sensor and place a mask over it that has a grid-like arrangement of apertures. The resulting image doesn’t match what you would see, but it provides enough information that a computer can reconstruct the picture.
At Hackaday, we’re no strangers to homebrew camera builds (including one built around an Arduino and a single pixel). However, none of those had the promise of being super thin.
51 thoughts on “Flat Camera Uses No Lens”
That’s pretty neat. When I was making my astrocam and pointing the raw sensor around my room, you could see “ghosts”, and I wondered if you could statistically tease out the image if you knew the exact characteristics of the sensor’s pixels.
Now all we need is the industrial printable batteries and color e-ink displays!
You could apply Wiener deconvolution to these “ghostly” pictures. With uncompressed data and some parameter tuning, it is possible to obtain enough clarity to recognize objects in the scene.
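For anyone who wants to try it, here’s a minimal 1D sketch of Wiener deconvolution in NumPy. The blur kernel, spike train, and SNR are all made-up toy values, not anything from the article:

```python
import numpy as np

def wiener_deconvolve(y, h, snr=100.0):
    """Recover x from y = h * x (circular convolution) plus noise.

    Wiener filter in the frequency domain:
        X = Y * conj(H) / (|H|^2 + 1/SNR)
    """
    n = len(y)
    H = np.fft.fft(h, n)
    Y = np.fft.fft(y)
    X = Y * np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.real(np.fft.ifft(X))

# Toy example: blur a spike train with a box kernel, then deconvolve.
x = np.zeros(64)
x[[10, 30, 45]] = 1.0
h = np.zeros(64)
h[:5] = 0.2                     # 5-tap box blur (circular)
y = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))
x_hat = wiener_deconvolve(y, h, snr=1e6)
```

With real sensor data you’d drop the SNR way down; the regularization term is what keeps noise from blowing up at frequencies where the blur kernel is weak.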
“Wiener deconvolution” is right up there with “herringbone wang tiles” on my list of favorite tech terms.
It’s way cooler than Wiener decon. It exploits sparse solutions via convex optimization. For more info check out:
You can actually solve problems that were generally considered impossible.
Under very specific constraints, solving Ax=y given y and A will find the unique solution, provided x is sparse. The proof is hard, but the implementation is particularly easy. The same procedure applies to predicting what movies you might like. Most of the proofs involve the randomness of A as a constraint.
FWIW Wiener is the L2 solution. Donoho is the L1 solution.
But that assumes A is sparse. Seeing as how A is a matrix mapping the output of a particular pixel to, let’s say, light received by the array from a particular direction (e.g., corresponding to an ‘image’ pixel), and how each pixel of your average CCD or CMOS array is generally identically sensitive across angles to all the other pixels, well… I’m going to guess that A is not sparse at all.
It does NOT presume A is sparse. In general A is quite dense and the system of equations is underdetermined. Donoho’s 2004 papers give the key details. Also look at Candes’ papers from the same time period.
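For the curious, here’s a minimal sketch of the L1 idea using ISTA (iterative shrinkage-thresholding), one standard solver for this kind of problem. The sizes, seed, and lambda are arbitrary toy choices, and this isn’t necessarily the solver the FlatCam team uses:

```python
import numpy as np

def ista(A, y, lam=0.01, n_iters=2000):
    """Minimize 0.5*||Ax - y||^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        g = x - A.T @ (A @ x - y) / L         # gradient step on the quadratic part
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)   # dense, random, underdetermined
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -1.0, 0.5]             # sparse unknown
y = A @ x_true                                      # only 40 equations, 100 unknowns
x_hat = ista(A, y)
```

Note that A here is dense and random, exactly as described above; it’s x that must be sparse for the recovery to work.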
for the single pixel version.
NB the mathematical concept is very similar to the Coded Aperture Mask, but the implementation is different (L1 vs L2).
What you were seeing there was the directionality property of a silicon photocell – basically the same thing as viewing angles are to an LCD monitor.
Each cell records a different amplitude for the same point light source based on the slight angular difference, which causes vignetting in photos and the camera software has to deal with that. Without a lens, and without the sensor firmware trying to compensate for the effect, you could reconstruct the average location of a light source or an object.
How does the mask work? Is it similar to how Coded-Aperture Masks work on a telescope?
So you have no physical lens, but from a large number of data points, you mathematically simulate a lens?
There’s a flat mask in front of the camera that diffracts the light somewhat and causes different pixels on the sensor to see the same points in the scenery slightly differently. You know the properties of the mask and you can “back-project” the information to figure out what the sensor is seeing.
Another way to look at it is to say that the mask is an incredibly thin lens; it just doesn’t project the image in a conventional way.
So a bit like a Fresnel lens?
No, not a Fresnel lens at all; there’s no lens at all, and Dax isn’t right either: there’s no diffraction happening. Basically they created a “grid of pinhole cameras” pointed in slightly different directions. It’s a grid of a huge number of tiny crappy pinhole cameras plus an imperial shitton of mathematics.
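The “crappy pinholes plus math” idea can be faked in a few lines. This toy uses a known random binary mixing matrix as a stand-in for the mask’s calibrated response, and inverts it with plain least squares (the L2 flavor) rather than an L1 solver:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16 * 16                          # flattened 16x16 toy scene

x = np.zeros(n)
x[100:110] = 1.0                     # a small bright streak in the scene

# Each sensor pixel sees the scene through its own pattern of mask openings;
# model that as a row of a known 0/1 matrix (a stand-in for real calibration).
A = (rng.random((n, n)) < 0.5).astype(float)

y = A @ x                                        # raw readout: nothing like the scene
x_hat = np.linalg.lstsq(A, y, rcond=None)[0]     # "the math": invert the known mixing
```

The raw readout y is an unrecognizable jumble, yet because the mixing matrix is known, the scene comes right back out.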
Because an Imperial Shitton is 5/4 of a Standard one…so, a lot.
Seems like it shares some concepts with a light field camera.
“Because an Imperial Shitton is 5/4 of a Standard one…so, a lot”
As opposed to the Metric Shitton, which is always divisible by ten and only used in Canada by the Trailer Park Boys.
The metric shitton is used in the entire world except for the US… (no, the entire world is not “the US and Canada”)
So in theory if you had enough pixels and enough processing and enough colour resolution, you could extrapolate a three dimensional image.
You could say divide the image sensor into four and compare the four images of the same scene, each taken from a slightly different angle.
Check out Lytro — it’s a light-field camera that does basically that. It uses a grid of tiny lenses and a very-high-resolution sensor to take “pictures” that can then be re-focused or even have the viewpoint shifted slightly.
Coded aperture masks aren’t new – ESA’s been flying one for a decade and a half http://sci.esa.int/integral/19990-spi-coded-mask/
Most people associate coded apertures with those hexagonal arrays and non-visible light. The FlatCam folks’ filter looks exactly like a Modified Uniformly Redundant Array, offset a bit.
Not exactly; zone plates still have a focal length. A better description of the technology would be: an imaging system that uses a virtual lens, derived mathematically from the effects of a coded mask over the sensing array.
Imagine a camera that recorded the fourier transform of an image and you had to do the inverse FFT to get the image, except this one operates spatially and not in the frequency domain.
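That analogy in executable form (toy random “scene”, pure NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
scene = rng.random((32, 32))

# What this hypothetical camera would store: complex Fourier coefficients,
# unrecognizable as an image...
recorded = np.fft.fft2(scene)

# ...and one inverse transform restores the scene exactly.
scene_back = np.real(np.fft.ifft2(recorded))
```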
Nitpicking correction: there is diffraction happening, but maybe it doesn’t need to be taken into account.
Given that modern cameras usually have a metric tonne of pixels there might be some advantages to eliminating the lens- you don’t need to focus for one.
metric tonne? I thought we’d decided it was a metric shitton?
Let me see if I’ve got this right:
The camera has no lens, so it certainly can’t be a tonne of lenses. The camera works by measuring light that has undergone diffraction; that’s what the mask is for, to diffract the light in a controlled way.
Since the mask’s effect on the light is known, math is used to undo that effect and reconstruct the image. I imagine color contrast plays an important role.
The images did have a weird depth of field.
Digital Brute Force To Make Pics
The simplest way to get insight into how a coded aperture works is to consider the application of turning low-FPS footage into high-FPS footage. This is done by vibrating a cut-out of something that looks like a 2D bar code at high speed in front of a normal video camera. With one sweep of the cut-out per frame, you get an out-of-focus, motion-blurred shadow encoded in each pixel. This information can be used to reconstruct everything else that got motion-blurred in the frame.
Maybe for some object recognition tasks you don’t even need to reconstruct the image.
The mask looks to be about 10 cm in front of the sensor, so the flat part of this is what? It can be folded flat when not used, perhaps, like a pinhole? I need to see their longer video.
The actual mask sits directly on the sensor, but then it needs to be very tiny. Moving the mask away from the sensor allows it to be large enough to be cut and adjusted by hand while testing.
It looks like a Fourier or wavelet mask. I’m sure I have seen this somewhere in the last ten years. Perhaps related to a camera at Stanford that could look through a hedge, like moving your head back and forth to build an image through a lattice. I don’t see how it works up against the pixels, unless maybe parts cover fractions of a pixel.
If the metric shit-tonne of math involved in this interests you, I dug up a paper from Berkeley: http://accelconf.web.cern.ch/AccelConf/ibic2013/talks/weal1_talk.pdf
It’s also got pictures of different masks that work; like the zone-plate/fresnel that most photographers are aware of, modified uniform random arrays (MURA), no-two-hole-touching MURA, and more. Good paper if you want to recreate it yourself and equations with ⊗ don’t make your head spin. Me and tensor products, though, ugh.
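That A(x,y)⊗Ã(x,y)=δ line is less scary in code. A 1D MURA built from quadratic residues mod a prime p with p % 4 == 1, correlated against its decoding pattern, gives a spike at shift zero and exactly zero at every other shift (following the Gottesman-Fenimore construction; 1D here for brevity):

```python
import numpy as np

p = 13                                        # prime with p % 4 == 1
qr = {(i * i) % p for i in range(1, p)}       # quadratic residues mod p

# Aperture A: open where the index is a quadratic residue (closed at 0).
A = np.array([1.0 if i in qr else 0.0 for i in range(p)])

# Decoding pattern G: +1 where A is open, -1 where closed, except G[0] = +1.
G = np.where(A == 1.0, 1.0, -1.0)
G[0] = 1.0

# Periodic cross-correlation A ⊗ G: (p-1)/2 at shift 0, zero everywhere else.
corr = np.array([np.sum(A * np.roll(G, -k)) for k in range(p)])
```

That delta-shaped correlation is the whole point: correlating the recorded shadowgram with G undoes the mask in one step.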
Okay, so ⊗ of sets, fine. G(x,y)⊗H(x,y) just starts to make my head go a bit wobbly. A(x,y)⊗Ã(x,y)=δ and I’m out. I’m sure it has a rational meaning that could be explained in a few sentences that the maths are just shorthand for, but I’ll take the word problem thank you.
To repeat an age old comment – pics or it didn’t happen. I wish HAD would run articles on real events and not lab curiosities.
Looks like there’s some pics in the video. But yeah, it may just be a lab curiosity:
“Rice’s hand-built prototypes use off-the-shelf sensors and produce 512-by-512 images in seconds, but the researchers expect that resolution will improve as more advanced manufacturing techniques and reconstruction algorithms are developed.”
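The Rice paper describes a separable mask, so the measurement factors as Y ≈ PhiL @ X @ PhiR.T, which is what makes reconstruction at that resolution tractable. A toy version with random stand-in matrices (in the real device PhiL and PhiR come from calibrating the actual mask, and the sizes are made up here):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 32                                   # toy scene size, nothing like 512

X = np.zeros((n, n))
X[10:20, 12:18] = 1.0                    # a bright rectangle as the scene

# Separable system: one matrix mixes rows, one mixes columns. In the real
# device these would be measured by calibration; random stand-ins here.
PhiL = rng.standard_normal((n, n)) / np.sqrt(n)
PhiR = rng.standard_normal((n, n)) / np.sqrt(n)
Y = PhiL @ X @ PhiR.T                    # the lensless "sensor image"

# Undo each side with least squares: first the rows, then the columns.
Z = np.linalg.lstsq(PhiL, Y, rcond=None)[0]          # Z is approx. X @ PhiR.T
X_hat = np.linalg.lstsq(PhiR, Z.T, rcond=None)[0].T
```

Separability is the trick: two n-by-n solves instead of one (n^2)-by-(n^2) solve.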
I’ve been alive long enough to have seen the equivalent of this statement many times before, and to know how it usually plays out over time. Rephrased more realistically:
“It works, but the image quality sucks. Seriously, there’s a huge penalty for doing this. We have no idea how to make it work better. Assuming standard camera technology advances far enough without being limited by inviolable physical or mathematical laws, it could work better; possibly one day even producing an image that would be considered useful today. But compared to images produced by those future competing technologies, it will still suck, so it may never find use beyond a few niche applications.”
Quick spelling correction, in the first paragraph, “but the basic design has remainded the same since”, it should be “remained”. Also, the last sentence of the last paragraph is either missing a period, or some of the text got cut off. I’ll go with the latter, since that’s a rather awkward way to end a paragraph and the article as a whole. Great article otherwise!
Really? How interesting…
Isn’t the concept of light field photography that you use a tiny lens for each picture? And doesn’t that have the same effect? So does that mean this is not new since you can already make sensors very flat?
Related link: https://en.wikipedia.org/wiki/Microlens
Quote: “Wafer-level optics (WLO) enables the design and manufacture of miniaturized optics at the wafer level using advanced semiconductor-like techniques. The end product is cost effective, miniaturized optics that enable the reduced form factor of camera modules for mobile devices”
And on a side note: that first guy in the video has the speech mannerisms of Richard Feynman. Odd to see.
Looks like they fine-tuned their Compressive Sensing experiments (explained here: http://dsp.rice.edu/cscamera ), getting rid of any lenses. By pseudo-randomly masking the light, one can reconstruct the most likely input signal using a Compressive Sensing “L1” algorithm (as opposed to Least Squares, “L2”).
Fingerprint sensors just became cameras…
It would be a concern, were it not that anything with those sensors already has cameras in it, or cameras pointed at it from various angles.
Interesting technology, but the results look pretty ordinary so far. They probably need to move to a custom chip with a processor under each pixel, and a lot more pixels. Then again with the speed things progress these days we could see consumer products within 5 years based on the idea.
> we could see consumer products within 5 years based on the idea
but, we didn’t :(
Pretty awesome stuff. It won’t be applicable for regular imaging applications due to limited angular resolution. Their paper is a joy to read!
If you had a cylindrical sensor (i.e., with the entire surface acting as a flat camera) that was about the width of a human head, you might be able to reconstruct a seamless 3-D view of the surroundings.
spherical flat camera, that’d be rad but i wonder if the tech can do that, or if flatness is a prerequisite for it working
Have a look at Pelican Imaging and their amazing PiCam:
As far as I can tell, that’s *it*, that is *the* solution for cameras for phones, and probably for auto-drive cars, too. I don’t know why Sony or Samsung hasn’t acquired them for $500m.
Their website is a little odd, though, which always raises suspicions.
You forgot the Arabs in your introduction.
Almost seems similar to synthetic aperture radar technology…
FlatCam still uses Mozi’s pinhole camera principle.