Inspired by Wheatstone’s original stereoscope and augmenting it with modern factored light field synthesis, we present a new near-eye display technology that supports focus cues. These cues are critical for mitigating the visual discomfort experienced in commercially available head-mounted displays and for providing comfortable, long-term immersive experiences.
The Light Field Stereoscope | SIGGRAPH 2015
1. The Light Field Stereoscope: Immersive Computer Graphics via Factored Near-Eye Light Field Displays with Focus Cues
Fu-Chung Huang (Stanford University; now at NVIDIA Research), Kevin Chen (Stanford University), Gordon Wetzstein (Stanford University)
3. Top View: Vergence & Accommodation Match!
[Figure: left eye and right eye; vergence (rotation) and accommodation (focus); real-world parallax over the pupil]
40. Multiplicative Two-layer Modulation
Input: 4D light field l for each eye; unknowns are the two layer images t1 (front) and t2 (rear).
Reconstruction: l̃ = (ϕ1 t1) ∘ (ϕ2 t2)
minimize over {t1, t2}:  ‖β l − (ϕ1 t1) ∘ (ϕ2 t2)‖²   subject to  0 ≤ t1, t2 ≤ 1
Multiplicative update for layer t1:  t1 ← t1 ∘ [ϕ1ᵀ (β l ∘ (ϕ2 t2))] / [ϕ1ᵀ (l̃ ∘ (ϕ2 t2)) + ε]
[Wetzstein et al. 2012]
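As a concrete illustration of the update rule above, here is a minimal NumPy sketch of the two-layer factorization loop; the actual system runs on the GPU, and the explicit matrix form of ϕ1, ϕ2 and all variable names here are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def factorize_two_layer(l_target, Phi1, Phi2, n1, n2,
                        beta=1.0, iters=5, eps=1e-7):
    """Rank-1 multiplicative factorization of a target light field.

    l_target   : (num_rays,) vectorized 4D light field for one eye
    Phi1, Phi2 : (num_rays, n1) and (num_rays, n2) ray-to-pixel matrices
    Returns front/rear layer images t1, t2 with values in [0, 1].
    """
    t1 = np.random.rand(n1)  # front panel, random initialization
    t2 = np.random.rand(n2)  # rear panel, random initialization

    for _ in range(iters):
        # current reconstruction: elementwise product of the two modulations
        recon = (Phi1 @ t1) * (Phi2 @ t2)

        # multiplicative update for the front layer t1
        num = Phi1.T @ (beta * l_target * (Phi2 @ t2))
        den = Phi1.T @ (recon * (Phi2 @ t2)) + eps
        t1 = np.clip(t1 * num / den, 0.0, 1.0)

        # symmetric update for the rear layer t2
        recon = (Phi1 @ t1) * (Phi2 @ t2)
        num = Phi2.T @ (beta * l_target * (Phi1 @ t1))
        den = Phi2.T @ (recon * (Phi1 @ t1)) + eps
        t2 = np.clip(t2 * num / den, 0.0, 1.0)

    return t1, t2
```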
41. Multiplicative Two-layer Modulation
Input: light field visible to the eye. Output: 2 layers (t1, t2) for each eye.
[Figure: left eye front panel (t1) and left eye rear panel (t2)]
42. Multiplicative Two-layer Modulation
[Figure: front, mid, and rear objects rendered on the front panel (t1) and rear panel (t2), shown at front, mid, and rear focus]
43. Multilayer Displays
Akeley et al. [2004], Love et al. [2009], Narain et al. [2015]
Optical overlay; temporal multiplexing
51. Content Generation
• 5x5x2 stereo light field: 25 views × 640x800 × 2 eyes
• GPU factorization
• 3–5 iterations per frame
• 5–10 ms per iteration
[Figure: factored layers for the left eye and right eye]
61. Technical Details in the Paper
[Plot: number of views vs. front panel PPI (100–500), marking the good parameter space, the region limited by diffraction, and the minimum views needed to support accommodation]
[Figures: diffraction analysis; light field distortion (centers offset by IPD/2 · M1 and IPD/2 · M2); asymmetric image formation]
65. [Figure: top view and side view of the light field, with centers offset by IPD/2 · M1 and IPD/2 · M2]
Editor's Notes
Depth of field is a powerful tool for telling stories,
but it is also important for the eye to tell differences in depth, even with a single eye.
However, this focus cue is missing in current-generation head-mounted displays.
In this talk, we will show you how to enable it for a comfortable visual experience.
In the real world, objects emit a light field with parallax over the pupil that contains enough depth information,
so our eyes not only verge to the target, which is a rotation, but also accommodate, or focus, to it.
In this case, the two actions are always matched.
In current VR head-mounted displays,
there is only one display plane and no parallax,
so the image always looks flat and the eye cannot accommodate to the true depth.
The mismatch between vergence and accommodation can sometimes cause discomfort, eye strain, and even nausea.
In this work, we build a VR head-mounted display that is capable of emitting a light field,
allowing our eyes to truly focus.
The prototype is also inexpensive and easy to build.
We put all the resources and instructions online,
and we think this is going to change the experience of future VR.
For current-generation VR headsets, we all know that consumer VR is rising
and will be coming to us pretty soon.
While the majority of the advertising is all about gaming,
you can also immerse yourself in real-world events or places you have never been.
It can also be used in collaborative work,
in education,
in helping people treat post-traumatic stress disorder,
and in medical training or even remote surgery like the Da Vinci project,
where doctors can spend hours in surgery and you really want them to be comfortable with that.
All these applications sound exciting,
but there is still a catch; it is not perfect yet.
Here is a safety warning from one of the recent VR devices.
It lists some symptoms that do not sound very pleasant,
and there are many causes. But some of the symptoms, like
eye strain, blurred or double vision, nausea, discomfort, or fatigue,
are related to the vergence-accommodation conflict.
So why do we have these in HMDs?
In real life, when looking at an object,
the eyes focus at some distance
And the eyes also verge or rotate so the two retinal images match.
In this case, the vergence and the accommodation are coupled together
When the object gets closer,
The eyes accommodate more and also converge more,
so that the two cues are still coupled.
In an immersive head-mounted display, we show two tiny objects on the panel near the eyes.
(click) After the magnification, the objects appear to be in the right spot.
(click) When the object moves farther away,
we increase the separation, or disparity,
and the eyes rotate away, or diverge, from each other.
But since the eyes are still focusing at the original depth,
this separation decouples vergence from accommodation.
(click) And since our brain is so used to the coupling, the artificial separation leads to all kinds of discomfort and problems.
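A small worked example of this conflict, using assumed numbers (an IPD of 64 mm and a fixed virtual image at 1.5 m; these values are illustrative, not from the talk):

```python
import math

IPD_MM = 64.0  # assumed interpupillary distance (illustrative)

def vergence_deg(dist_m):
    """Vergence angle (degrees) when fixating an object at dist_m meters."""
    return 2.0 * math.degrees(math.atan((IPD_MM / 2.0) / (dist_m * 1000.0)))

def accommodation_diopters(dist_m):
    """Accommodation demand in diopters for an object at dist_m meters."""
    return 1.0 / dist_m

# Real world: both cues are driven by the same 0.5 m distance.
print(vergence_deg(0.5), accommodation_diopters(0.5))   # ~7.3 deg, 2.0 D

# Conventional HMD: disparity drives vergence as if the object were at 0.5 m,
# but the fixed virtual image (assume 1.5 m) pins accommodation at ~0.67 D.
print(vergence_deg(0.5), accommodation_diopters(1.5))   # ~7.3 deg, ~0.67 D
```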
Why do we even care about this problem? This diagram shows the importance of different depth cues at different distance ranges.
(click) When objects are far away, we usually use relative size and aerial perspective to determine relative depth.
(click) When things are close to us, we use motion parallax and stereopsis to discriminate depth, and most HMDs support cues up to this point.
(click) However, vergence and accommodation play the key role when things get really close, like within arm's length.
And this is the range in which we use our hands to manipulate objects.
If we can solve this problem, we can allow for more comfortable and useful interaction in future VR.
This is our solution to support accommodation.
In addition to the traditional head-mount components,
(click) we only add a second panel and a spacer, which does not affect the design,
and we leave the rest to computation.
Here is an example of what the eyes can see.
You can focus on the foreground, leaving other places out of focus,
or you can focus on the background and leave the foreground out of focus.
Note that we can actually provide a continuous 3D space for the eye to focus in, not just the two panel depths.
All of this new focusing capability works with the naked eye.
So before we jump into how this works, let me briefly review the history.
The very first idea of making 3D dates to around 1838, and the stereoscope was a big hit at that time.
Then it took more than a hundred years to get the first computer-assisted head-mounted display.
Fast-forward to now: consumer VR is exploding and is all around us.
However, looking forward, there are still a lot of challenges,
and in this work, we focus on improving the visual experience by providing accommodation cues.
There has been some work along this line using deformable mirrors, varifocal lenses, additive multilayers, or integral imaging,
but the form factor or resolution is not really satisfying.
Learning from our prior Tensor Display research,
we try to address the vergence-accommodation problem using multiplicative multilayers.
To enable a virtual experience similar to the real world,
we need to understand the light field visible to the eye.
The important message here is that
objects at different depths produce different visible light fields,
which contain enough parallax for even a single eye to focus differently.
To replicate this light field using a display,
prior work needed many multiplicative layers running at high speed to allow for a wide viewing angle.
But the situation is different here!
The head is relatively stationary with respect to the head mount, and the eye box only spans a small angle to the display.
If we compare the two cases:
a traditional TV cares a lot about brightness;
the display is shared, so in general the content is high rank, meaning you need multiple layers and temporal multiplexing across multiple frames.
In a VR head mount, the eye can adapt to the reduced brightness;
and since the eye is relatively fixed to the display, the experience is kind of personal,
so the content is low rank, allowing us to implement it with two panels and no temporal multiplexing.
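One way to make the rank argument concrete (a sketch in the slides' notation, not a statement from the talk): arrange the light field for one eye as a matrix L whose rows index front-panel pixels and whose columns index rear-panel pixels crossed by each ray. A single pair of static layers reconstructs L̃[i, j] = t1[i] · t2[j], i.e. the outer product t1 t2ᵀ, which has rank 1; temporal multiplexing over T frames raises the achievable rank to T. Because the eye box of a head mount spans only a small angle, the target light field is close to low rank, so one rank-1 factorization per eye per frame is already a good fit, whereas a shared, wide-angle display needs a higher-rank approximation and therefore more layers or more time-multiplexed frames.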
This is one of our prototypes, with two LCD panels running at 60 Hz that emit a light field,
so the eye has the correct focus cues to form a comfortable 3D perception.
And here is a focusing example in which the eye can freely refocus.
(mention the no-eye-tracking advantage: the depth of field is created entirely by the eye…)
The next question is: how do we generate content on the two display panels that gives us the visible light field?
To allow for such an experience, we first generate all possible views visible to the eye.
This is our input.
For the multiplicative two-layer modulation,
we know that each light ray is the product of the pixels it passes through on the two panels.
And we can define each view as a set of rays entering the eye at a different location.
So this is the central view,
and this is the rightmost view.
I also want to mention that this is the parallax of the light field over the pupil that allows us to focus naturally.
A complete description of the light field can be expressed as a matrix equation involving the two panels,
(click) and this leads to an inverse matrix factorization problem that we know how to solve efficiently from our prior work.
For the details, please refer to our paper.
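To make the ray-to-pixel mapping explicit, here is a small flatland (1D) sketch of how one ray picks up the product of two panel pixels; the geometry, parameter names, and nearest-pixel lookup are simplifying assumptions rather than the paper's exact formulation.

```python
import numpy as np

def ray_value(u, x_front, t1, t2, d_front, d_rear, pitch1, pitch2):
    """Evaluate one light field ray of a two-layer multiplicative display.

    u        : horizontal pupil position where the ray enters the eye (mm)
    x_front  : where the ray crosses the front panel plane (mm)
    d_front, d_rear : distances from the pupil to the two panels (mm)
    t1, t2   : 1D layer transmittance arrays (front, rear), values in [0, 1]
    pitch1, pitch2  : pixel pitches of the two panels (mm)
    """
    # extend the ray from the pupil through the front panel to the rear panel
    x_rear = u + (x_front - u) * (d_rear / d_front)

    # nearest-pixel lookup on each panel (panels assumed to start at x = 0)
    i = int(np.clip(round(x_front / pitch1), 0, len(t1) - 1))
    j = int(np.clip(round(x_rear / pitch2), 0, len(t2) - 1))

    # multiplicative modulation: the ray is attenuated by both panels
    return t1[i] * t2[j]
```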
Here is the factorization found using our algorithm.
Objects are mostly assigned to their nearest plane,
(click) but objects in between the planes are distributed across both layers in some unusual patterns,
so that you can focus not only on objects at the panel planes,
where you get a sharp focus without a problem,
(click) but also on objects in between the planes, where you can still get a reasonable focus.
There are many different ways to approximate a light field, with different degrees of freedom.
You may wonder how they differ from ours.
Here is an example of a dark object in the front and a bright object in the back.
Their edges are just touching each other as seen by the eye.
When looking from the right side of the eye,
there is some separation between the two due to parallax.
But when you look from the left side of the eye,
if you are using additive multilayers,
the bright background shines through the front occluder.
This is because you can only add light, leaving an incorrect light field.
Here is a real-time rendering example showing the incorrect light field.
(click) Using a multiplicative method, we can actually have the front panel block the shine-through light,
giving the correct light field.
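A tiny numeric sketch of this occlusion argument (the values are illustrative): consider the one ray that should be blocked by the dark foreground object.

```python
# One ray that should be blocked: it passes through a dark front occluder
# and would otherwise hit a bright rear object.
front_occluder = 0.0  # front-layer pixel value along this ray
bright_rear = 1.0     # rear-layer pixel value along this ray

additive_ray = front_occluder + bright_rear        # 1.0 -> background shines through
multiplicative_ray = front_occluder * bright_rear  # 0.0 -> correct occlusion

print(additive_ray, multiplicative_ray)
```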
So let me give some implementation details and show you some results.
We built our prototype from Adafruit's design; you can find the info on their website.
We bought two aspheric lenses from eBay; they are about $10 each.
Our initial design used a 5-inch Toshiba panel, but its diffraction behavior was not good enough,
so we switched to a 7-inch Chimei panel, which is also cheaper, about $35 each.
The universal boards are about $30 each.
You can find them all on eBay.
This is the latest prototype we built at NVIDIA, and we are also showing it at our ETECH booth.
Please come try it out.
We render a 5x5x2 light field using OpenGL for real-time rendering, or POV-Ray as an offline renderer.
We can also pull light fields from a light field camera.
(click) We factorize the content on the GPU using CUDA.
(click) We implemented the three algorithms described in Cascaded Displays; they all behave the same and converge in around 3 to 5 iterations.
Each iteration solves 25 views, with each image at 640x800, for 2 eyes,
and each iteration takes between 5 and 10 ms, depending on the card.
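Putting those numbers together as a back-of-the-envelope estimate (not a figure reported in the paper):

```python
views, width, height, eyes = 25, 640, 800, 2
samples_per_iteration = views * width * height * eyes  # 25,600,000 light field samples

iters_lo, iters_hi = 3, 5      # iterations per frame
ms_lo, ms_hi = 5, 10           # milliseconds per iteration
frame_budget_ms = (iters_lo * ms_lo, iters_hi * ms_hi)  # roughly 15-50 ms per frame

print(samples_per_iteration, frame_budget_ms)
```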
And here are the results.
In a traditional head mount, everything looks sharp and in focus, but it also looks kind of flat.
In the light field head mount, we can focus on the foreground, leaving the background out of focus,
or look at the background, leaving the foreground defocused.
In the traditional head mount, when the eyes diverge and also decrease their accommodation,
the image will also look blurred; you will experience this at our ETECH booth.
This is actually very disturbing when there is too much high frequency in the scene or foreground.
Finally, as we mentioned earlier, enabling interaction with the hands is important for future VR,
and we can focus on the foreground right hand
or the background left hand.
And this is where we think people are going to find light field VR interesting, by designing real-world experiences.
We used a translation stage to shoot these two images,
but you can now do this with your own eye instead of with computer software.
Please also refer to the paper for the diffraction analysis, light field distortion, and image formation,
which we do not have time to cover here.
There are some limitations to our solution.
First is the reduction in brightness, since we stack light-attenuating LCD panels.
Fortunately, human adaptation to brightness is extremely good for the VR experience, though not for AR.
Another issue is latency: currently we render a total of 50 views each frame, and that is a big performance hit to the rendering pipeline.
We believe next-generation engines incorporating shading reuse will solve the problem.
Finally, diffraction is the biggest challenge for future HMDs, and here we are at the limit of geometric optics.
Incorporating insights from human vision and perception into the hardware and computation
can enable a better visual experience,
like the vision-correcting display from last year.