Stereoscopic 3D: Generation Methods and Display Technologies for Industry and Home Entertainment

Raymond Phan

Ph.D. Candidate
Multimedia and Distributed Computing (MDC) Research Laboratory
Department of Electrical and Computer Engineering
Ryerson University
Human Computer Interaction Guest Lecture
Thursday, March 8th, 2012

Outline of Presentation
• Introduction
– Stereoscopy / 3D Vision
• What is 3D all about??
• Depth and Disparity
• Some methods on generating 3D content
– Conversion from 2D imagery / video to 3D
• Cut and Paste Technique
• Depth Based Image Rendering – Recover Depth Maps
– Automated Methods – Using motion, focus cues, perspective
– Semi-Automated Methods


Outline of Presentation – (2)
– Acquiring 3D content directly
• Stereo Rigs, Multi-camera Setup
• 3D cameras
• Displaying 3D Content
– Anaglyphs (very retro)
– 3D Theatres with polarized glasses
• RealD (most popular), IMAX
– Shutter glass technology
• nVidia 3D Vision, XpanD 3D, DLP projection systems,
DLP TVs

Outline of Presentation – (3)
– Interference Filter Technology
• Based on projecting colours of different wavelengths to
each eye → Dolby 3D, Panavision 3D
– Autostereoscopic Systems
• Technology without the use of glasses
– Parallax Barriers, Lenticular Arrays
– Single view vs. Multi-view systems
• Seen in the Nintendo 3DS, Fujifilm FinePix Real 3D
cameras, etc.
• Applications
• Conclusions

Introduction
• So… what is stereoscopy / 3D vision?
– Creating the illusion of depth in an image or video
– Take images on flat displays, and make it “look real”


Introduction – (2)
• Need to know some basic things first:
– An object seen by the left eye appears horizontally shifted
relative to where the right eye sees it → disparity
– The greater/smaller this shift, the closer/farther the
object → depth
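
The link between disparity and depth can be made concrete with the standard parallel-camera relation Z = f · B / d. This is only a sketch: the function name, the focal length f (in pixels) and the baseline B (in metres) are assumptions for illustration and do not come from the lecture.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (metres) of a point from its horizontal disparity (pixels).

    Assumes two parallel pinhole cameras. focal_px and baseline_m are
    illustrative parameters, not values taken from the slides.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Larger disparity -> closer object, smaller disparity -> farther object.
near = depth_from_disparity(40.0, focal_px=800.0, baseline_m=0.065)
far = depth_from_disparity(4.0, focal_px=800.0, baseline_m=0.065)
```

With these made-up numbers, a 40-pixel disparity lands at about 1.3 m and a 4-pixel disparity at about 13 m.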


Generating 3D: 2D – 3D – (1)
• 1st Method: Cut and Paste Technique
– Used in IMAX’s 3D
DMR process

– 35 mm frames → High-res. digital → Left-eye frames
– Right-eye frames → Objects in the left frames are manually
shifted horizontally to create each new frame

– Remember disparity (close/far)! The closer the object,
the larger the shift needs to be
– Main Disadvantage: Very time consuming!
• Currently done on a frame-by-frame basis
• Due to this, only ~10 minutes of 35 mm video is converted
to 3D → Takes ~1 month to complete the whole process
• Our MDC Project with IMAX: Goal → Perform 2D to 3D
movie conversion faster
• Use a semi-automatic process to extract objects, and do
this every 10, or 20 frames or so
• In between frames, “guess” the best estimate of where the
objects are
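
As a rough illustration of "guessing" where an object is between the hand-processed frames, the sketch below linearly interpolates an object's position between two keyframes. Linear motion, the function name and the data layout are all assumptions; the actual MDC/IMAX method is not described in these slides.

```python
def interpolate_position(key_a, key_b, frame):
    """Linearly interpolate an object's (x, y) position between keyframes.

    key_a and key_b are (frame_index, (x, y)) pairs where the object was
    extracted by the user-guided step. Linear motion is an assumption
    made for illustration only.
    """
    fa, (xa, ya) = key_a
    fb, (xb, yb) = key_b
    if fa == fb or not fa <= frame <= fb:
        raise ValueError("frame must lie between two distinct keyframes")
    t = (frame - fa) / (fb - fa)
    return (xa + t * (xb - xa), ya + t * (yb - ya))


# Object marked at frame 0 and frame 10; estimate where it is at frame 5.
guess = interpolate_position((0, (100.0, 50.0)), (10, (120.0, 60.0)), 5)
```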

• 2nd Method: Depth-Based Image Rendering (DBIR), more commonly
called depth-image-based rendering (DIBR)
– 3D Content → One 2D image + depth map
– Depth Map: Image containing the depth of each pixel
• Closer Pixels == Light values, Farther Pixels == Dark values
– Orig. Image → Left View. Right view → Use depth
map (d(x,y)) to calculate the shifted pixel from the left view
– Equation to generate the view:
Right(x,y) = Left(x + d(x,y), y)
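
The view-generation equation above can be sketched in a few lines. This is a minimal version under stated assumptions: images are plain lists of rows, d(x,y) is already an integer pixel shift, and source columns that fall outside the image are clamped to the border (real systems fill such holes more carefully). The function name is hypothetical.

```python
def render_right_view(left, disparity):
    """Build the right view with Right(x, y) = Left(x + d(x, y), y).

    left and disparity are lists of rows of equal size; disparity holds
    integer pixel shifts. Out-of-range source columns are clamped to the
    image border, which is an assumption of this sketch.
    """
    height, width = len(left), len(left[0])
    right = []
    for y in range(height):
        row = []
        for x in range(width):
            src = min(max(x + disparity[y][x], 0), width - 1)
            row.append(left[y][src])
        right.append(row)
    return right


left = [[10, 20, 30, 40]]
shift = [[1, 1, 0, 0]]
right = render_right_view(left, shift)
```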

• Commonly known as 2D to 3D Conversion
– Goal of 2D to 3D Conversion: Use an image and
determine what the best depth map is
– We use this depth map for conversion
– Use original single view image / frame as the left
view, and the depth map to create the right view
• There are two main methods to do this:
– Automated Methods → Automatically examine
features in an image or frame and infer depth


– Semi-Automated Methods → User-guided
• Mark certain areas of the image / frame with what you
think the depths should be at these locations
• Algorithm determines the rest of the depths
• Question: How do we know for sure that
we’re marking the proper depths?
– It has been shown that as long as you mark depths in a
perceptually consistent way, the perceived depth is good
• Automated Methods:
– Popular Methods: Motion, Focus and Perspective

• Motion: Main Principle
– Objects that are closer
move faster
– Objects that are far
move slower
• Find motion vectors
– Find how much a pixel moves from one frame to
the next → Calculate displacement vector
– Larger vector == Closer depth and vice-versa
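
A minimal sketch of the "larger vector == closer" rule, assuming the motion vectors have already been estimated. It maps each vector's magnitude to a 0-255 value using the light-is-close convention for depth maps; the linear scaling and the function name are assumptions.

```python
import math


def depth_from_motion(vectors):
    """Map per-pixel motion vectors (dx, dy) to 0-255 depth values.

    Larger motion magnitude -> lighter value (closer). The linear scaling
    to 0-255 is an assumption of this sketch.
    """
    mags = [[math.hypot(dx, dy) for dx, dy in row] for row in vectors]
    peak = max(max(row) for row in mags)
    if peak == 0:
        # No motion anywhere: nothing can be inferred, return a flat map.
        return [[0 for _ in row] for row in mags]
    return [[round(255 * m / peak) for m in row] for row in mags]


vectors = [[(0, 2), (0, 0)],
           [(6, 8), (0, 0)]]
depth = depth_from_motion(vectors)
```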

• Potential Problems:
– Sometimes, far away objects move just as fast too
– Motion estimation (calculating motion vectors)
can be subject to error (e.g. with very fast motion)
– If the image / frame is noisy, the noise will corrupt
the measurements
• Depth from focus: Main Principle
– Take multiple pictures of the same scene
– Each is taken with different camera parameters

• We basically change the focal length of the
camera
– Focal length: Distance from the lens’s optical centre to
the image plane when focused at infinity
– Crudely, we can change the focal length by
adjusting the zoom of the lens
• Afterwards, we find the amount of blur of an object
– Here, sharper surfaces are closer, and
farther objects are blurrier

• We find a correlation between the depth, and
the amount of blur over the surfaces
– Finding multiple images at different focal lengths
is a must!
• Problems:
– Needs more than one image of the same shot
• May not have such info
– The math is quite involved
– Method rarely used now!
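
For reference, the focus principle above can be sketched as follows: measure local sharpness in every shot and record, per pixel, which shot is sharpest. The focus measure (an absolute 4-neighbour Laplacian), the function names and the idea of using the winning shot index as a stand-in for depth are assumptions, not the specific method from the lecture.

```python
def sharpness(img, x, y):
    """Absolute 4-neighbour Laplacian at (x, y): a simple focus measure."""
    h, w = len(img), len(img[0])
    total = 0
    for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
        nx = min(max(nx, 0), w - 1)  # clamp neighbours at the border
        ny = min(max(ny, 0), h - 1)
        total += img[ny][nx] - img[y][x]
    return abs(total)


def sharpest_index_map(stack):
    """For each pixel, return the index of the shot where it is sharpest."""
    h, w = len(stack[0]), len(stack[0][0])
    result = []
    for y in range(h):
        row = []
        for x in range(w):
            scores = [sharpness(img, x, y) for img in stack]
            row.append(scores.index(max(scores)))
        result.append(row)
    return result


# Two shots of the same 1x4 scene: shot 0 is sharp on the left,
# shot 1 is sharp on the right.
stack = [[[0, 100, 50, 50]],
         [[50, 50, 100, 0]]]
index_map = sharpest_index_map(stack)
```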

• Depth from perspective: Main Principle
– We use parallel lines and vanishing points in an
image or frame to give us a sense of depth
– Examples: Railroad Tracks, Tunnels, Roadsides
– These entities give us a sense of depth where they
appear to converge at a single point
– This single point would be the farthest point in the
image and the farthest depth
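
One simple way to turn a vanishing point into a depth map is to make depth values grow with distance from that point, so the vanishing point itself is the darkest (farthest) pixel. This is a sketch under assumptions: the vanishing point is given rather than detected, and the radial linear ramp and function name are illustrative choices.

```python
import math


def depth_from_vanishing_point(width, height, vp):
    """Return a 0-255 depth map that is 0 (dark, far) at the vanishing point.

    vp is the (x, y) vanishing point, assumed to be known already. Values
    grow linearly with distance from vp; this ramp is an assumption.
    """
    vx, vy = vp
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    farthest = max(math.hypot(cx - vx, cy - vy) for cx, cy in corners)
    if farthest == 0:
        # Degenerate single-pixel image: everything is at the vanishing point.
        return [[0 for _ in range(width)] for _ in range(height)]
    return [[round(255 * math.hypot(x - vx, y - vy) / farthest)
             for x in range(width)]
            for y in range(height)]


ramp = depth_from_vanishing_point(6, 1, vp=(0, 0))
```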


• Problems:
– Only a subset of images / frames fall into this category
– Can only deal with outdoor scenes, or with scenes that
have perspective within them
• Not all images belong here!

• Semi-automatic methods:
– Mark some areas in an image / frame on what you
think the best depth should be
– Use this info to determine the rest of the depths
– This is the area that I am focusing on right now
• We can consider this as a case of multiple
object image segmentation
– Each “object” is a user-marked depth
– We decompose the rest of the image into
different objects  i.e. different depths
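
To give a feel for "use this info to determine the rest of the depths", the sketch below spreads a few user-marked depths over the image by repeated neighbour averaging while keeping the marked pixels fixed. It is a deliberately simplified stand-in: it ignores image colour and edges entirely, and it is not the segmentation method referred to in the lecture. All names are hypothetical.

```python
def propagate_depths(marks, width, height, iterations=200):
    """Fill a depth map from sparse user marks by neighbour averaging.

    marks maps (x, y) -> depth for user-labelled pixels, which stay fixed.
    Unmarked pixels start at 0.0 and are repeatedly replaced by the mean
    of their in-bounds 4-neighbours (an assumption of this sketch).
    """
    depth = [[float(marks.get((x, y), 0.0)) for x in range(width)]
             for y in range(height)]
    for _ in range(iterations):
        new = [row[:] for row in depth]
        for y in range(height):
            for x in range(width):
                if (x, y) in marks:
                    continue  # user-marked depths are never changed
                nbrs = [depth[ny][nx]
                        for nx, ny in ((x - 1, y), (x + 1, y),
                                       (x, y - 1), (x, y + 1))
                        if 0 <= nx < width and 0 <= ny < height]
                if nbrs:
                    new[y][x] = sum(nbrs) / len(nbrs)
        depth = new
    return depth


# Two marks on a 5x1 strip: far (0) on the left, near (200) on the right.
filled = propagate_depths({(0, 0): 0, (4, 0): 200}, width=5, height=1)
```

On this strip the unmarked pixels settle into an even ramp between the two marked values.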


• This method allows the user to fully control
the depth perception and experience
• Potential Problem:
– Takes more time because of user interaction, and
computational complexity increases

• Another way to generate depth maps:
– Specialized hardware
• Example: ZCam → Measures depth using infra-red light
bounced off objects and read in by a camera sensor

– Problem: Hardware is expensive!

Direct 3D Acquisition – (1)
• Can directly acquire 3D information:
– Grabbing both left and right eye images / video
• 1st Method: Stereo Rigs
– Tripod with 2 cameras, separated by eye distance

– Drawbacks:
• Need 2 cameras! Synchronization!
• Difficult to separate cameras by eye distance

• We can also use multi-camera stereo rigs
– Each pair of cameras is positioned at a different
point to capture the same scene

– Each viewpoint captures the objects in a different
way so that we can assemble all these together to
view a 3D object without glasses (more later)


• Example: MERL 3DTV system (w/o glasses)

– 16 cameras and projectors for 16 viewpoints
– Depending on where you stand, you see a
different viewpoint → Just like in real-life!

• 2nd Method: 3D Cameras
– Specialized cameras specifically designed to take left
and right eye images


• Non Digital 3D Cameras take left and right
images on two separate rolls of film
• Digital 3D Cameras (e.g. Fujifilm’s W1) take
left and right images and generate two
separate image files
• IMAX and specialized 3D video cameras
operate in the same way
– Two separate rolls of film
– For IMAX, the cameras are large as the film is
larger. Why? For higher resolution

Displaying 3D Content – (1)
• Left & Right eye images are created
– How do we display these so we can perceive 3D?
– Many technologies exist to display 3D imagery and
video
• Let’s start off with the most basic one: Anaglyphs
– Left & right images are each filtered with a separate colour filter
– Example: If you had a red colour filter, you determine
how much red a pixel has and that’s the output
– Each colour filter is chromatically different
• One filter cannot have any similarity in colour to the other

• When one side is filtered with one colour, you
must choose the other filter to be a
contrasting colour
– How do we choose? Trichromacy theory states
that all colours are made up of Red, Green & Blue
– We basically choose the colour filters from this set
• Examples:
– Red and Cyan (Green + Blue) Filters
– Red and Green Filters
– Red and Blue Filters, etc.

• After you filter each image separately, you
superimpose the results onto one image
• To view the images, you use anaglyph glasses,
where each side is of the same filter you used
– i.e. if you used Red for the left, and Cyan for the
right, we use anaglyph glasses that are of the
same order
– Here, the image with the red filter goes to the left
eye, and the cyan image goes to the right eye

• As such, because we’re seeing two separate
images for two eyes, we thus perceive 3D
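
The filter-and-superimpose steps for a red/cyan anaglyph can be sketched as below: keep the red channel of the left image and the green and blue channels of the right image. Pixels as (r, g, b) tuples and the function name are assumptions of this sketch.

```python
def red_cyan_anaglyph(left, right):
    """Combine two RGB images into one red/cyan anaglyph.

    left and right are lists of rows of (r, g, b) tuples of equal size.
    Red comes from the left view, green and blue (cyan) from the right.
    """
    return [[(l[0], r[1], r[2]) for l, r in zip(left_row, right_row)]
            for left_row, right_row in zip(left, right)]


left = [[(200, 10, 20)]]
right = [[(50, 60, 70)]]
anaglyph = red_cyan_anaglyph(left, right)
```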


• Advantages:
– Great for viewing without 3D technology
– Anaglyph glasses are pretty cheap
• Problems:
– Range of colours can be limited, as the
predominant colours in the images are of the
colour filters you applied
– Doesn’t work well if the range of colours in the
image is limited


• 2nd method: 3D films in theatres with polarized
glasses
– 2 projectors → Left & Right video projected
simultaneously on the theatre screen
– Views filtered with orthogonal polarizing filters
– Viewers wear low-cost polarized eyeglasses
– Each lens is orthogonally polarized with the other


• What’s polarization!?
– Light can be viewed as a propagating wave
– Polarization determines the orientation of a wave’s
oscillations
– When passed through a polarizing filter, only the part of
the light oscillating along the filter’s axis gets through,
as if it were forced through a slit
– Consequence – Not all light passes through
– Left view passed through a horizontal polarized filter
– Right view passed through a vertical polarized filter

– Both views are shown simultaneously on a silver
perforated screen to preserve polarization
– Glasses → Left lens has a horizontal filter,
right lens has a vertical filter
– Left blocks right view, and
right blocks left view!
• Drawbacks:
– Need to keep your head level
– Tilting your head causes the left and right views to
bleed into each other
– Image is darker, as only some of the light gets through the filters
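
Why head tilt causes bleed-through can be estimated with Malus's law, which gives the fraction of linearly polarized light passing a polarizer rotated by an angle θ as cos²θ. The sketch assumes ideal linear polarizers; the function names are hypothetical.

```python
import math


def transmitted_fraction(angle_deg):
    """Malus's law: fraction of linearly polarized light that passes an
    ideal polarizer rotated angle_deg away from the light's polarization."""
    return math.cos(math.radians(angle_deg)) ** 2


def leakage(tilt_deg):
    """Fraction of the other eye's view leaking through one lens.

    With the head level, the unwanted view is at 90 degrees to the lens
    axis; tilting the head by tilt_deg reduces that angle.
    """
    return transmitted_fraction(90.0 - tilt_deg)


level = leakage(0.0)     # head level: essentially no leakage
tilted = leakage(45.0)   # 45-degree tilt: half of the wrong view gets in
```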

• There is a way to combat “head level” issue
– Circular Polarization → Used in RealD technology
– IMAX used the former (linear) method → They have since changed
– RealD is used in standard 3D theatres
– IMAX has the bigger screen, and better sound!
• What is circular polarization?
– We change the way the
wave propagates in a
circular motion


• Each lens of the 3D projector continuously
changes polarization direction
• 3D glasses: Circularly polarized liquid crystal
that automatically adjusts its polarization
– How is this possible?
– One lens is circularly polarizing clockwise, while
the other is polarizing counter-clockwise
– One lens is designed to filter clockwise images,
and counter-clockwise images for the other
– Each lens receives correct corresponding image

• 3rd Method: Using shutter glasses
– Most popular in current 3DTVs on the market
– Also used in DLP Projection Systems
• Shutter glasses principle:
– Lenses are usually made of LCDs
– Used to separate the left and right views
– Lenses contain liquid crystals that block or pass
light in sync with an IR sensor connected to the display
– Voltages are applied to the lenses so that one eye
blocks light, but the other one allows it through

– Alternate this shutting off in sync with the image
displayed on the screen to show 3D, via IR sensor
– TV / monitor displays the left image, right lens is
blocked → Allows left eye image to be seen
– Afterwards, we do the same for the right image, with the
left lens blocked → Allows right eye image to be seen
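
The alternation described above amounts to a simple schedule: even display frames carry the left image with the right lens blocked, odd frames carry the right image with the left lens blocked. The sketch below only models that bookkeeping; the function names and the 120 Hz figure used in the example are assumptions.

```python
def shutter_schedule(num_frames):
    """Return (displayed_image, open_lens, blocked_lens) per display frame."""
    schedule = []
    for i in range(num_frames):
        if i % 2 == 0:
            schedule.append(("left", "left", "right"))
        else:
            schedule.append(("right", "right", "left"))
    return schedule


def per_eye_rate(display_hz):
    """Each eye only sees every other frame, so it gets half the rate."""
    return display_hz / 2


schedule = shutter_schedule(4)
rate = per_eye_rate(120)  # 120 Hz is an assumed display refresh rate
```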


• Is used in nVidia 3D Vision Kit & XpanD 3D
– XpanD 3D: Company that markets shutter glass 3D
technology to homes and theatres
• Currently more than 1000 theatres with shutter glass tech.
– nVidia 3D Vision: Kit for an nVidia video card
• IR sensor connected to video card to control views
• Only works with a compatible 3D monitor
• Advantages:
– No silver screen needed, and no need to keep your head level
• Disadvantages: Shutter glasses are expensive!
– Need to replace batteries, high maintenance

• DLP 3DTV technology further explained
– DLP: Digital Light Processing
– Backbone: Digital Micromirror Device
• Tiny mirrors direct light
• Device can have over 1 million mirrors!
– Each micromirror is either ON or OFF
• ON → Reflect light out towards screen
• OFF → Do not reflect out towards screen
(absorb it instead)
– Each mirror in the DLP 3DTV is
controlled by a pixel in the image
to display to the screen


• For DLP 3DTVs, mirrors == diamond configuration
– One mirror displays two pixels of input data: How!?
• Each mirror shows one pixel, then does a half-pixel shift
downwards and shows the other pixel immediately below
• Done at twice the normal frame rate, so you can’t see the change

• Wait! Aren’t we losing 50% of the data?

– No! The half-pixel shifting ensures same resolution
• Called SmoothPicture algorithm
– Saves bandwidth: Use same bandwidth for 3D images
– For a 2D image, the input data is the image itself
– For showing 3D, the left-eye image is shown first,
then the right-eye image is shown after the half-pixel shift
– LCD shutter glasses are in sync during each shift
• Drawbacks:
– Obviously, the TV is expensive
– Shutter glasses are high maintenance, and expensive

• Next method: Interference Filter Technology
– Used in Dolby 3D and Panavision 3D systems
– A multispectral colour filter is used to filter
specific wavelengths of red, green and blue,
directed to the left eye
– Another colour filter used to filter different
wavelengths of red, green and blue, directed to
the right eye
– This uses glasses too → Designed to filter the
same wavelengths in tune with each colour filter


• This process is called: wavelength
multiplex visualization
• Advantages:
– No silver screen required
– Works with conventional screens
– Is not restricted to just theatres
• Disadvantages:
– Glasses are more expensive
– Colour filters must be very accurate

• Last but not least: Autostereoscopic Displays
– View 3D content without glasses
– Currently seen in small gaming systems and small
commercial 3D cameras
• Nintendo 3DS and view screen of the Fujifilm W1
– Currently not available publicly for larger screens
– Common problem with autostereoscopic: Good
for viewing over small screens, but larger screens
tend to make people dizzy or cause discomfort
– Research currently performed to minimize this

• Principle: Uses either lenticular sheets or
parallax barrier sheets
– Impose the left and right images
onto narrow alternating strips
– Half the columns show the
corresponding columns in the left
image, and other half show the
corresponding right image cols.
– In the figure, they’re represented
as green and pink respectively

– Afterwards, we either use a screen
that blocks every other strip →
Parallax Barrier
– Or we use lenses of the same size
as the strips to bend the left and
right strips so that each appears to
fill the entire image → Lenticular Sheet
– Either of these will allow the left
and right images to be directed
to the correct eye
– You just need to stand in the right spot!
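
The strip layout described above, with half the columns taken from the left image and half from the right, can be sketched as a column interleave. Even columns from the left view and odd columns from the right is an arbitrary choice made here; the function name is hypothetical.

```python
def interleave_columns(left, right):
    """Build the panel image: even columns from left, odd from right.

    left and right are lists of rows of equal size. Which view gets the
    even columns is an assumption of this sketch.
    """
    return [[left_row[x] if x % 2 == 0 else right_row[x]
             for x in range(len(left_row))]
            for left_row, right_row in zip(left, right)]


left = [["L0", "L1", "L2", "L3"]]
right = [["R0", "R1", "R2", "R3"]]
panel = interleave_columns(left, right)
```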

• This can work with multi-view systems too
– The technology can be modified to
display a different viewpoint of the
scene
• Remember multi-view stereo rigs?
– When you stand in a different
position, you will get a different
perspective of the scene
• Just like what would happen in real-life!
– Achieve this by directing the view of a particular
perspective to the right pairs of strips / lenses
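
For more than two viewpoints, one simple layout assigns columns to views in rotation, so column x shows view x mod N. This is a sketch under that assumption; real multi-view panels may use other (for example slanted) layouts, and the function name is hypothetical.

```python
def interleave_views(views):
    """Column x of the output is taken from view number x % len(views).

    views is a list of N images (lists of rows) of equal size. The simple
    round-robin column layout is an assumption of this sketch.
    """
    n = len(views)
    height, width = len(views[0]), len(views[0][0])
    return [[views[x % n][y][x] for x in range(width)]
            for y in range(height)]


# Three 1x6 views; pixel value = view_number * 10 + column.
views = [[[v * 10 + x for x in range(6)]] for v in range(3)]
panel = interleave_views(views)
```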

• Current advocates for autostereoscopic tech.
– Sharp designed their first
autostereoscopic LCD monitor
in 2004 → Discontinued in 2007
– Similar → Philips WOWvx series
• Discontinued in 2009
– Hitachi → Designed autostereoscopic
mobile phone in 2009
– Nintendo 3DS → Uses parallax barrier
– Fujifilm W1 Viewscreen →
Uses lenticular sheets

• Advantages:
– Glass-free: No maintenance req’d on equipment
– Ideal for delivering to a large group of people
• Co-ordination is required for glass-based technology
– Proven good for small screens / mobile phones
• Disadvantages:
– Larger screens still experimental and expensive
– Larger screens require you to stand far back to
appreciate 3D content


Applications
• What can 3D be used for?
– Entertainment and Gaming (obviously!)
– Real-time 3D Video Teleconferencing
– Interactive Medical Surgery
– Interactive Training Sessions
– Virtual Model Exploration 
– Robot Navigation
– Fine Art Appreciation


Conclusions
• This presentation gave a basic overview of how
3D is made, and how we display 3D
• This presentation is not exhaustive!
– Many other methods to generate 3D material
• Much research performed in this area
– Several technical conferences in 3D: IEEE 3DTVCON,
IEEE 3DIM, SPIE Electronic Imaging
– Research group in Europe working on
standardizing 3D for mobile phones:
http://sp.cs.tut.fi/mobile3dtv/

Thank You!
Questions?

