This sounds like it would make things easier: I could be sloppy and not worry about focus or aperture when taking the photo, and wait until I edit the photo to decide. But it's a bit more complicated than that. First, I'll discuss what it means for something to be "in focus" in a photo, then lead into what a "light field" is and how I think refocusing works.
First, a disclaimer: I don't know the real tech specs of the Lytro Illum beyond what they share on their website. I've only read their manual and watched a bunch of their tutorials on Vimeo. I haven't read Ren Ng's 2006 Stanford dissertation. I've looked over the Levoy/Hanrahan (Stanford) SIGGRAPH 1996 paper (pdf). (I think Hanrahan was Ng's advisor.) I might oversimplify something, or get a "16" cross-wired here or there... but I'll try to cite my sources.
What does it mean to be in-focus?
Before getting into the Illum, it may be useful to discuss what focus really means.
|Figure 1: Construction lines showing the blue point would
focus on the sensor and the red point would focus behind
Consider two points at different distances from the camera: a blue point and a red point. Approximate the camera with just a lens and an image sensor. In Figure 1, the lens is set to focus on the (farther) blue point.
|Figure 2: The blue dot will show up in-focus on the image sensor. The
red dot will be defocused. With the wider aperture it will be more defocused.
Also, we can see that a narrower aperture lets in fewer light rays, which land on a narrower section of the image plane. The red dot is still not completely in focus, but it is less defocused than with a wide aperture (it has a smaller "circle of confusion").
To the right of the image sensor, I approximated what you might see with a narrow and a wide aperture. In the wide aperture, the light from the red dot is spread so wide that it is barely even visible, but the blue dot is still sharp.
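The geometry in Figures 1 and 2 can be sketched with the thin-lens equation. This is a toy model of my own, not anything from Lytro; the focal length, distances, and aperture diameters below are made-up numbers for illustration:

```python
def image_distance(f, o):
    """Thin-lens equation: 1/f = 1/o + 1/i, solved for the image distance i."""
    return 1.0 / (1.0 / f - 1.0 / o)

def circle_of_confusion(f, aperture, focus_dist, subject_dist):
    """Diameter of the blur disc for a point at subject_dist when the
    sensor is positioned to focus at focus_dist (all lengths in mm)."""
    i_sensor = image_distance(f, focus_dist)     # where the sensor sits
    i_subject = image_distance(f, subject_dist)  # where the subject focuses
    # Similar triangles: the blur disc scales with the aperture diameter.
    return aperture * abs(i_subject - i_sensor) / i_subject

# Lens focused on the blue point (5 m away); red point at 2 m.
wide = circle_of_confusion(f=50, aperture=25.0, focus_dist=5000, subject_dist=2000)
narrow = circle_of_confusion(f=50, aperture=6.25, focus_dist=5000, subject_dist=2000)
print(wide, narrow)  # the narrower aperture gives a smaller circle of confusion
```

Stopping the aperture down by 4x shrinks the circle of confusion by the same factor, which is why the narrow-aperture red dot in Figure 2 is less smeared.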
|Figure 3: Move lens forward to focus on red dot. The blue dot would
focus in front of the sensor so it would be defocused on the sensor.
So far, I've been describing everything as objects outside the camera projecting light onto the image sensor.
|Figure 4: Consider a point on the image sensor and gather the rays
from the full aperture.
In Figure 4, consider another point on the image sensor. In a normal camera, the light recorded at this pixel is the total of all light through the aperture, shown as the red shaded region (this is the wide aperture).
These rays continue out into the world until they hit something. Only a very small portion of these will hit our old red dot, so the red dot makes only a small contribution to the value at this pixel. If there were something at the convergence point, it would make up most of the value at this pixel.
There could be many objects in the world contributing to the value at this pixel. I first presented being in-focus as light leaving a point in the world in different directions and all coming together at a point on the image sensor, with blur/defocus arising when light from an object at a different depth is spread out across a larger area of the sensor. Now, I suggest another way to think of it, starting from the image sensor: if all the light arriving at a pixel is coming from the same source, the point will be in-focus. If the pixel is getting tiny contributions from lots of different objects, it will be out-of-focus.
When viewing a photograph from a normal camera, every pixel contains the contributions of light passing through the whole aperture, but you cannot break down which objects contributed to the value at that pixel (or how far away they were).
The Light Field and Refocusing
|Figure 5: Lytro uses a microlens array to record a section
of rays instead of the full aperture.
Lytro adds a microlens array in front of the image sensor (Figure 5). Under each microlens, several sensor pixels separately record light arriving from different directions. (This is just to illustrate the idea. I don't know if the sections overlap, or if it tries to divide the rays evenly or bias toward the center...)
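One way to picture what the microlens array buys you: think of the recording as a 4D array, indexed by position on the sensor and direction through the aperture. A tiny sketch (the 4x4 spatial and 3x3 angular resolutions here are made up for illustration, not the Illum's real numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
# L[y, x, v, u]: radiance of the ray arriving at spatial sample (x, y)
# from angular sample (u, v) across the aperture.
L = rng.random((4, 4, 3, 3))

# A normal camera pixel integrates over the whole aperture -- summing
# away the angular axes loses the per-direction information for good.
conventional = L.sum(axis=(2, 3))
print(conventional.shape)  # (4, 4): just an ordinary image

# The light-field camera keeps the (u, v) samples separate, so we can
# still ask which direction each bit of light came from.
print(L[0, 0])  # the 3x3 angular samples behind one spatial location
```

The conventional image is always recoverable from the light field (just sum), but not the other way around.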
|Figure 6: Trace rays that are gathered at the image sensor.
First, pretend that the lens was focused on the red dot (Figure 6). From the point on the image sensor, gather the rays that would land at our pixel from the virtual lens at the red position (shown as the three dashed red lines). To the left of the red-focused lens are the rays coming into the camera that we're interested in.
|Figure 7: Trace rays back through the actual lens,
going back through light field.
The blue rays are the rays that were actually recorded when we took the photo. (If you're following along in the Levoy/Hanrahan paper, I'm only at Section 2 - visualizing the plenoptic function / light slab.)
To simulate the lens focused on the red dot, for our pixel of interest we use the rays along the solid red lines. That is the basic idea behind refocusing.
We also can see that it doesn't always work. In Figure 7, there is a blue ray that's pretty close to the top red ray, but as we go further down, the available (blue) ray samples aren't quite lined up with the red rays. For these, we have to find nearby rays and interpolate, so the result will not be as accurate.
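This ray-gathering picture is essentially shift-and-add refocusing: take the sub-aperture images, shift each one by an amount proportional to its position in the aperture, and average. Here's a minimal sketch of that idea, with integer-pixel shifts standing in for the ray interpolation described above (a real implementation interpolates fractional shifts):

```python
import numpy as np

def refocus(subaperture, alpha):
    """Average sub-aperture images after shifting each one.

    subaperture[v, u] is the image seen through aperture position (u, v);
    alpha sets the virtual focal plane (alpha = 0 keeps the captured focus).
    """
    V, U, H, W = subaperture.shape
    out = np.zeros((H, W))
    for v in range(V):
        for u in range(U):
            dy = int(round(alpha * (v - V // 2)))
            dx = int(round(alpha * (u - U // 2)))
            # np.roll stands in for a proper interpolating shift
            out += np.roll(subaperture[v, u], (dy, dx), axis=(0, 1))
    return out / (V * U)

lf = np.random.default_rng(1).random((3, 3, 8, 8))  # toy 3x3 angular, 8x8 spatial
img = refocus(lf, alpha=1.0)
```

With alpha = 0 nothing shifts and you get back the image as focused at capture time; sweeping alpha sweeps the virtual focal plane.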
Also, once we have recorded the light field, we don't need to know anything about the scene in front of the camera.
The Lytro Implementation of the Light Field
We can't have an infinite number of ray directions. In practical terms, the more samples we acquire, the more memory and disk storage space we will need.
According to the Lytro Illum Technical Specifications, the Illum has a 40 MegaRay sensor. Their software produces a 2450x1634 (4 Megapixel) image from one of their files. So a quick division says they average 10 rays per pixel, spread over two dimensions. (My diagrams above have only been one-dimensional.)
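The division works out like this:

```python
rays = 40_000_000           # "40 MegaRay" sensor, per the spec sheet
width, height = 2450, 1634  # output image from their software
pixels = width * height     # about 4.0 million pixels
print(rays / pixels)        # roughly 10 rays per output pixel
```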
10 rays/pixel is not a very large number. Their software actually creates an "image stack" of 7 images, so I think that's the number of planes you can really focus on. When I've looked at the individual images in the stack, it looks like they're dividing it up some other way though because I don't see much difference in what is in focus.
In the Lytro Illum User Manual or the Lytro Support description of "refocusable range", see the section on "Depth Composition Features / The refocusable range". If I'm interpreting their diagram and their article about sharpness correctly, it looks like there are really 2 real depths that you can refocus on with maximum sharpness - the near peak, and the far peak. The diagram seems to indicate you can refocus in a little wider range, but that it won't be as sharp. Something I thought was interesting was that in the sharpness article, they say the primary (+0) focus is actually a "low resolution point."
Resolution and Sensor Size
To put the technical specification into context, compare the resolution to my other cameras. I'm getting most of this from Wikipedia. (For reference, I think the Galaxy s3 camera is comparable to an iPhone 5.*)
|Canon 30D
|3504x2336 (8.2 Megapixels)
|CMOS 22.5x15 mm
|Canon 5D Mark 2
|5616x3744 (21 Megapixels)
|CMOS 36x24 mm
|Samsung Galaxy s3
|3264x2448 (8 Megapixels)
|CMOS 8.47 x ? mm
|Lytro Illum
|2450x1634 (4 Megapixels)
|CMOS 6.4 x 4.8 mm
I'm a little disappointed in these numbers.
I like to have a little breathing space in the resolution for editing (cropping, straightening/rotating). Also, images are just sharper if I take a higher-resolution image and resize it down. I was starting to feel the limitations (resolution, sensor size, ISO) of the 30D when I upgraded to the 5Dm2. Both the resolution and the sensor size of the Illum are smaller than those of my 30D, which is 8 years older!
I'm not a huge Megapixels guy, but I think the sensor size is important - and the Illum sensor is even smaller than my camera phone! Intuitively, I think a bigger sensor lets you have more space per pixel to gather light, and would have better performance in low light conditions - higher clean ISO.
40 MegaRays is a blessing and a curse. In a normal camera, the number of samples is roughly the number of pixels (I'm going to ignore R, G, B and Bayer patterns for the purposes of this discussion). In the Illum, the samples are the rays. As discussed above, this is what enables the refocusing capability, and for that, more samples (rays) are better. But you have to fit those samples on the sensor, and so they are squishing almost twice the samples of the 5Dm2 into a space with less than 1/5 the linear dimensions (1/28 the area).
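To put numbers on the "squishing," compare samples per square millimeter between the two sensors, using the dimensions from the table above:

```python
# Sensor dimensions in mm, from the spec table above
area_5d2 = 36 * 24      # Canon 5D Mark 2: 864 mm^2
area_illum = 6.4 * 4.8  # Lytro Illum: 30.72 mm^2
print(area_5d2 / area_illum)  # about 28x the area

density_5d2 = 21_000_000 / area_5d2      # pixel samples per mm^2
density_illum = 40_000_000 / area_illum  # ray samples per mm^2
print(density_illum / density_5d2)       # roughly 54x denser sampling
```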
The Illum sensor is tiny, even smaller than my camera phone's, and in practice, I did find that the low light performance was disappointing. The camera claims a top ISO of 3200, and I found even 1600 was noisy. Since the resolution was already pretty low, there wasn't a lot of room to clean it up through resizing. I don't think the low light performance of their sensor justified the tiny sensor size. (Speculation) I think a noisy image comes back to bite me when their software generates the depth map, but that discussion will be a later entry.
A Noisy Image
The camera advertises a top ISO of 3200. With Canon cameras, a top ISO of 3200 usually means that I can shoot at ISO 1600 and still be pretty clean. The above shot was taken at ISO 1600, but it still has a lot of noise (most visibly, the magenta noise in the dark areas).
There are a few things to focus on:
- curtains and lights in the back
There are serious problems in the depth map. Darker means closer and brighter means farther.
- Look at the dancers themselves. The heads and bodies are approximately the same distance away, and the costumes register at a closer (darker) depth. The skin should be at the same depth, but instead it's a brighter tone, meaning it is registering at the same depth as the curtains behind them.
- There is not a distinct depth/tone for the feet. The feet should be closer/darker in the depth map than anything else. But instead, the depth map says that they are the same depth as the curtains behind the dancers.
- The edges of the depth map are splotchy and don't define the real depth edges of the objects in the scene.
However, it has its share of problems too.
- What is the blob on the floor underneath the dancers? Only part of the floor registered as close, but it should go all the way across the frame.
- There is not a lot of separation between the rear couple and the back wall. Granted, they are close to the back wall. But ideally, we should see a smooth gradient on the floor going all the way back, but instead, there are two artificial blobs.
- On the closer couple, the number on his back is the only thing that was identified as being on a separate plane.
- The dark part of the depth map above his head makes no sense. Everything above his head is on the back wall, and should be brighter.
Parting Thoughts
There are a lot of cool ideas that went into the Lytro Illum camera and its ability to refocus. In this entry, I've tried to discuss how it works in terms of the technology and hardware, and what limitations I've experienced.
In future entries, I will discuss the user interface of the camera (probably will be a shorter one), and the processing software (probably will be a couple longer ones).
Other links: (Other people know way more about this than I do.)
- (video) Paul Debevec discussing light fields and reflectance fields
- (video) Someone from Lytro discusses their cameras. This is followed by a round table discussion; a lot of it is really about the huge amounts of data involved with a light field.
- Lytro Blog: What is a Light Field? - Very good explanation of a light field, and they put way more work into their diagrams of the micro-lens array than I'm willing to.
- Lytro Illum Technical Specifications
- Lytro Illum User Manual
- Lytro highlights these Academic Papers
- LightField Forum (Actually, I haven't studied this too much myself, but after a quick skim, I see they have better diagrams than I do, and it looks like they're on top of latest news and specs on these cameras.)
* The iPhone 5 and the Galaxy s3 came out around the same time, and both use Sony BSI cameras. iPhone 5 has a Sony Exmor R IMX135. The ifixit.com teardown shows the Galaxy s3 has something in the Sony BSI family, but they say it is not the same as the IMX 145 or IMX 105, so I don't know what it is.