2D & 3D VIDEO PROCESSING FOR IMMERSIVE APPLICATIONS Emerging Convergence of Video, Vision & Graphics Harpreet S. Sawhney Rakesh Kumar

Post on 22-Jan-2016

2D & 3D VIDEO PROCESSING FOR IMMERSIVE APPLICATIONS

Emerging Convergence of Video, Vision & Graphics

Harpreet S. Sawhney

Rakesh Kumar

ACKNOWLEDGEMENTS

Collaborative Work with:

Hai Tao

Yanlin Guo

Steve Hsu

Supun Samarasekera

Keith Hanna

Aydin Arpa

Rick Wildes

TECHNICAL SUCCESS OF CONVERGENCE TECHNOLOGIES

PC based near real-time mosaicing

Automated Video Enhancement: VHS-to-DVD

Iris recognition, active vision

Image based modeling for Entertainment

Real-time Video Insertion

Immersive and Interactive Telepresence Modes of Operation

Observation Mode
• User observes a remote site from any perspective.
• User “walks” through site to view activities of interest “up close”.
• Example: security, facility guards, sports & entertainment

Conversation Mode
• Users talk and observe one another as if in the same room.
• Users walk around yet maintain eye contact.
• Example: immersive tele-conferencing

Interaction Mode
• Remote users share a common work space.
• Users observe each other’s hands as they manipulate shared objects, such as war room wall displays.
• Example: mission planning, remote surgery

Quality of Service for Tele-presence

Critical Issues

• High quality for immersive experience
– Artifact free recovery of 3D shape from video streams
– Efficient 3D video representation and compression
– High quality rendering of new views using 3D shape and video streams
– Bandwidth available in the Next Generation Internet

• Low latency for interactive applications
– Real time 3D geometry recovery at the content server end
– Real time new view rendering at the browser client end
– Adaptive stream management to handle user requests and network loads
– Error resilience and concealment to fill in missing packets

Convergence Technologies

… for immersive & interactive visual applications ...

• Vision algorithms: High-quality 3D shape recovery and dynamic scene analysis
• ASICs, high performance hardware: Real-time video processing
• Compact, low-cost cameras: CMOS cameras
• Low latency and high quality compression: Error resilience
• Real time view synthesis: Standard platforms, e.g. PCs
• Immersive Displays

Vision algorithm performance over time

[Figure: chart of algorithm complexity (vertical axis) against time (horizontal axis, with ticks at 1990, 1993, 1995, 1998 and 2000). Milestones shown: 2D Stabilization; Coarse 3D Depth Recovery; 2D Video Insertion; Video registration to 3D site models; Mosaicing for entertainment & surveillance; Real-time insertion in Live TV; Face Finding for Iris Recognition; Geo-registration visual databases; High Quality 3D shape extraction; Immersive Telepresence.]

HW Performance/Size/Cost over time

• Sarnoff ACADIA ASIC performance
– 100 MHz system clock, processes 100 million pixels/sec in each processing element
– 10 billion operations / sec total IC performance
– 800 MB/sec SDRAM interface using 64-bit bus

• Enables building smart 3D cameras for immersive applications.

[Figure: hardware generations: VFE-100 (1992), VFE-200 (1997), ACADIA ASIC (2000)]

Application Performance

• Parametric Motion: Stabilization & Mosaicing
– 720x240 fields @ 60 Hz OR 720x480 frames @ 30 Hz

• Pyramid based Fusion: Dynamic Range, Focus Enhancement
– 720x240 fields @ 60 Hz OR 720x480 frames @ 30 Hz

• Stereo Depth Extraction
– 720x240 field, 32 disparity levels in 4 ms (250 Hz)
– 720x240 field, 60 disparity levels in 10 ms (100 Hz)
– 60 disparities on 1k x 1k images in 55 ms (18 Hz)
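The stereo timings above imply both an update rate and a matching workload. The following is a small check of that arithmetic, not a benchmark. It assumes "1k x 1k" means 1024 x 1024 (the slide does not say) and counts one test per pixel per disparity level; the function names are illustrative.

```python
# Figures quoted on the slide: (width, height, disparity levels, time in ms).
CONFIGS = [
    (720, 240, 32, 4.0),
    (720, 240, 60, 10.0),
    (1024, 1024, 60, 55.0),  # assumption: "1k x 1k" read as 1024 x 1024
]


def rate_hz(time_ms):
    """Images processed per second for a given per-image time."""
    return 1000.0 / time_ms


def disparity_tests_per_sec(width, height, levels, time_ms):
    """Pixel-disparity hypotheses evaluated per second."""
    return width * height * levels / (time_ms / 1000.0)


for w, h, d, ms in CONFIGS:
    tests = disparity_tests_per_sec(w, h, d, ms)
    print(f"{w}x{h}, {d} levels: {rate_hz(ms):.0f} Hz, {tests / 1e9:.2f} G tests/s")
```

Under these assumptions all three configurations land at roughly one billion disparity tests per second, which suggests the quoted rates describe the same hardware throughput at different image sizes and search ranges.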

Sarnoff Compression Technology

… Required algorithm components for tele-presence are emerging ...

[Figure: chart of algorithm complexity (vertical axis) against time (horizontal axis). Milestones shown: Pyramid & Wavelet based Encoding; Still Image Compression; MPEG2: Encoding and Transmission; Just Noticeable Difference (JND): MPEG2 Encoding and Quality Measurement; Low Latency MPEG2 multiplexing service; VideoPhone: H.263; MPEG4, Progressive Encoding. Period labels: 1988-1993, 1993-1996, 1997-1998 (twice), 1998-1999, 1999. Organization labels: DIREC-TV & HDTV, ICTV, LG Electronics, E-vue, Tektronix.]

A FRAMEWORK FOR VIDEO PROCESSING

ALIGN

2D & 3D MODELS OF MOTION & STRUCTURE

MODEL-BASED IMAGE SEQUENCE ALIGNMENT

TEST

WARP/RENDER WITH 2D/3D MODELS

TEST ALIGNMENT QUALITY

SYNTHESIZE

CREATE OUTPUT REPRESENTATIONS
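The ALIGN / TEST / SYNTHESIZE loop can be sketched on tiny grayscale images stored as lists of rows. This is an illustration only: it assumes a pure integer-translation motion model found by exhaustive search, standing in for the 2D and 3D models named on the slide, and every function name is hypothetical.

```python
def shift_error(ref, img, dx, dy):
    """Mean squared difference between ref and img shifted by (dx, dy), over the overlap."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            sy, sx = y + dy, x + dx
            if 0 <= sy < h and 0 <= sx < w:
                total += (ref[y][x] - img[sy][sx]) ** 2
                count += 1
    return total / count if count else float("inf")


def align(ref, img, max_shift=2):
    """ALIGN: pick the integer translation that best maps img onto ref."""
    shifts = [(dx, dy)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: shift_error(ref, img, s[0], s[1]))


def warp(img, dx, dy):
    """WARP: resample img into the reference frame; pixels with no source become None."""
    h, w = len(img), len(img[0])
    return [[img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else None
             for x in range(w)] for y in range(h)]


def synthesize(ref, warped):
    """SYNTHESIZE: average the reference with the aligned frame where both are valid."""
    return [[r if v is None else (r + v) / 2.0 for r, v in zip(ref_row, warped_row)]
            for ref_row, warped_row in zip(ref, warped)]


def process(ref, img, tol=1e-6):
    """Run align, test and synthesize; the output is None if the quality test fails."""
    dx, dy = align(ref, img)
    quality = shift_error(ref, img, dx, dy)  # TEST: residual after alignment
    if quality > tol:
        return (dx, dy), quality, None
    return (dx, dy), quality, synthesize(ref, warp(img, dx, dy))
```

Calling `process(ref, img)` returns the estimated shift, the alignment residual, and the fused image. A real system would replace the exhaustive search with a coarse-to-fine parametric or 3D estimator, and the averaging step with whichever output representation is needed (mosaic, enhanced frame, new view).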

Highlights of Sarnoff’s Video Analysis Technologies

… framework applied to create immersive representations ...

Core Vision Algorithms for (Real-time) Motion & 3D Video Analysis

• 2D Immersive & Layered Representations
– Spherical Mosaics
– Dynamic & Synopsis Mosaics

• Stereo & Video Sequence Enhancement
– Hi-Q IBR based mixed resolution synthesis
– Video Quality Enhancement for efficient compression

• Multi-camera Immersive Dynamic Rendering
– Hi-Q Depth extraction
– Image-based rendering with dynamic depth

• Model-centric Video Visualization
– Dynamic model & video visualization
– Geo-registration with reference image database

SPHERICAL MOSAICS

Sarnoff Library Video

Captures almost the complete sphere with 380 frames

TOPOLOGY INFERENCE & LOCAL-TO-GLOBAL ALIGNMENT

[Sawhney,Hsu,Kumar ECCV98, Szeliski,Shum SIGGRAPH98]

SPHERICAL TOPOLOGY EVOLUTION

SPHERICAL MOSAIC

Sarnoff Library

ACTIVE FOCUS OF ATTENTION WFOV/NFOV CONTROL

DYNAMIC MOSAICS

Original Video

Video Stream with deleted moving object

Dynamic Mosaic Video
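One common way to obtain a stream with the moving object deleted, once the frames have been aligned to a common coordinate system, is a per-pixel temporal median. The slides do not state which method was used, so the sketch below is an assumption; it also presumes the object covers any given pixel in fewer than half of the frames. The function names are illustrative.

```python
from statistics import median


def background_from_aligned(frames):
    """Per-pixel temporal median over frames already warped to a common frame."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


def moving_mask(frame, background, threshold=20):
    """Flag pixels that differ from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]
```

The background estimate can be pasted into a mosaic, while the mask isolates the moving object so it can be removed or re-inserted when the dynamic mosaic video is rendered.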

SYNOPSIS MOSAICS

Low-Res Left

Synthesized High-Res Left

Original High-Res Right

ALIGNMENT & SYNTHESIS FOR HI-RES STEREO SYNTHESIS

A HIGH END APPLICATION OF IBMR

[Sawhney,Guo,Hanna,Kumar,Zhou,Adkins SIGGRAPH2001]

THE PROBLEM SCENARIO

INPUT / OUTPUT

Left Eye (Typically 1.5K)

Right Eye (Typically 6K)
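The scenario suggests a simple picture of the synthesis step: estimate disparity at the low resolution, scale it to the high resolution, and fetch high-resolution pixels from the right view. The sketch below does this on one scanline. It is a simplification for illustration, not the published method: it assumes a rectified pair, integer disparities, and the convention that left pixel x matches right pixel x - d. A scale of 4 corresponds to the 1.5K and 6K figures above. All names are hypothetical.

```python
def upsample_disparity(disp_lo, scale):
    """Bring a low-res disparity row to hi-res.

    Each sample is repeated scale times and multiplied by scale, because one
    low-res pixel spans scale hi-res pixels.
    """
    return [d * scale for d in disp_lo for _ in range(scale)]


def synthesize_left(right_hi, disp_hi, fill=None):
    """Backward-warp a hi-res right row into the left view.

    Left pixel x is read from right pixel x - d(x); pixels whose source falls
    outside the row are set to fill.
    """
    width = len(right_hi)
    out = []
    for x, d in enumerate(disp_hi):
        src = x - d
        out.append(right_hi[src] if 0 <= src < width else fill)
    return out
```

Pixels left at `fill` correspond to regions the right camera cannot see; a full system would fill them from the low-resolution left image or from neighboring frames in time.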

3D & Motion Alignment Based Stereo Sequence Processing

[Figure: two diagrams of the Left and Right sequences, one spanning frames t-2 to t+2 and one spanning frames t-1 to t+3. Arrows between the two views are labeled "stereo flow"; arrows between frames in time are labeled "flow".]

• Highlights:
– Scintillation effect is reduced.
– Occlusion regions are better handled.

SYNTHESIS RESULT ON REAL FOOTAGE

IMPLICATIONS FOR IMMERSIVE IBMR CAMERA CONFIGURATIONS

Lo-res camera

Hi-res camera

Multi-resolution camera configuration allows 3D capture at the highest resolution as well as a large, user-controlled range of zooms without the need for zoom control on the cameras.

Model-Centric Video Visualization OR Video-Centric Model Visualization

[Hsu,Samarasekera,Kumar,Sawhney CVPR00]

Original Video

Re-projection of video after merging with model.

Geo-registration of video to site model

Site model

Video to Site Model Alignment

• Model to frame alignment

REFINE

Correspondence-less exterior orientation from 3D-2D line pairs
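Pose from line pairs needs no matched points: a model line is consistent with an image line when both of its projected endpoints fall on that image line, wherever along it they land. The sketch below computes such a residual for one pair under an assumed pinhole camera with focal length `focal` and a pose given as a rotation matrix and a translation vector. It is a minimal illustration with hypothetical names, not the estimator used in the cited work.

```python
import math


def project(point, rotation, translation, focal):
    """Pinhole projection of a 3D point under the pose (rotation, translation)."""
    cam = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
           for i in range(3)]
    return (focal * cam[0] / cam[2], focal * cam[1] / cam[2])


def line_through(p, q):
    """Normalized implicit line a*x + b*y + c = 0 through two image points."""
    a, b = q[1] - p[1], p[0] - q[0]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, -(a * p[0] + b * p[1]) / norm)


def line_residual(model_segment, image_line, rotation, translation, focal):
    """Sum of squared distances from the projected model endpoints to the image line."""
    a, b, c = image_line
    projected = (project(p, rotation, translation, focal) for p in model_segment)
    return sum((a * u + b * v + c) ** 2 for u, v in projected)
```

Summing this residual over all 3D-2D line pairs gives a cost that a pose refinement step can minimize over rotation and translation.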

Oriented Energy Pyramid

0° 45°

90° 135°

• Goal: representation which indicates edge strength in the image at various orientations and scales

• Orientation selectivity: reduce false matches

• Coarse-to-fine: increase capture range
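A rough sketch of such a representation: at each pyramid level, take a squared central difference along each of four directions, then halve the resolution and repeat. Several choices here are assumptions made for illustration rather than details from the slides: the simple difference filters, the 2x2 block averaging, and keying each map by the direction of differencing in image coordinates. The names are hypothetical.

```python
import math

# Direction of differencing for each angle, as (dx, dy) in image coordinates
# with y pointing down. This labeling is an assumption of the sketch.
STEPS = {0: (1, 0), 45: (1, 1), 90: (0, 1), 135: (-1, 1)}


def oriented_energy(img, step):
    """Squared central difference along step; border pixels are left at 0."""
    dx, dy = step
    h, w = len(img), len(img[0])
    dist = 2.0 * math.hypot(dx, dy)
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            diff = img[y + dy][x + dx] - img[y - dy][x - dx]
            out[y][x] = (diff / dist) ** 2
    return out


def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def oriented_energy_pyramid(img, levels):
    """List of levels, finest first; each level maps an angle to its energy image."""
    pyramid = []
    for _ in range(levels):
        pyramid.append({angle: oriented_energy(img, step)
                        for angle, step in STEPS.items()})
        img = downsample(img)
    return pyramid
```

Matching on the coarsest level first tolerates large pose errors, and comparing only energy maps of the same angle keeps, for example, a vertical model edge from locking onto a horizontal image edge.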

Pose Refinement Algorithm

… iterative coarse to fine adjustment of pose ...

[Animation: gradual improvement of alignment during the coarse-to-fine iterations (regsite_animation.avi)]

Geo-Registration Video to Reference Database Alignment

[Wildes et al. ICCV01]

Current Video 3D Reference Imagery

Registration : Radical Appearance Changes

Dynamic 3D Capture & Rendering

… global modeling is not feasible ...

• Recovering depth from local views
• Depth refinement across multiple local views
• New view synthesis using multiple local views

Cross view depth checking
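Cross view checking can be illustrated with the familiar left-right consistency test on one scanline: a left-view estimate is kept only if the right view, at the pixel it points to, reports a similar value. The sketch uses integer disparities as a stand-in for depth and assumes the convention that left pixel x maps to right pixel x - d. The function name and tolerance are illustrative.

```python
def cross_check(disp_left, disp_right, tol=1):
    """Return one flag per left pixel: True if the right view agrees within tol."""
    valid = []
    for x, d in enumerate(disp_left):
        xr = x - d  # where this left pixel lands in the right view
        ok = 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tol
        valid.append(ok)
    return valid
```

Pixels that fail the check are typically occluded in one view or mismatched, and their depth can be refined or filled in from other local views.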

3D Shape/Depth Estimation from Multiple Views of a Scene

Stereo Pair

• Estimation of high quality, artifact free depth maps co-registered with video imagery for rendering new views.
• Must work both outdoors and indoors

Multi-baseline depth estimation - requirements

[Figure: depth maps and new view rendering, comparing a traditional stereo algorithm with the global matching method. Callouts: thin structures, accurate boundaries.]

[Tao,Sawhney,Kumar WACV00, ICCV01]

New view rendering using local depth estimation

Color segmentation based stereo algorithm (2000)

Multi-window plane + parallax algorithm (1998)

Local flow estimation (1992)

New view rendering

Main ideas

• Motivations
– Be able to handle textureless regions
– Handle object boundaries accurately
– Global visibility constraints should be enforced
– Hypothesize reasonable depths for unmatched regions

• Solutions
– Global matching method: an analysis-by-synthesis approach
– Representation: smooth depth representation in homogeneous regions
– Search method: neighborhood depth hypotheses generation
– Efficient algorithm: incremental warping
– Scene constraints: prior functions
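The ideas above fit together roughly as follows, shown on a single scanline. Each color segment carries one disparity. A candidate depth map is scored by rendering the second view from the reference view and comparing it with the real one (analysis by synthesis). New candidates for a segment are borrowed from its neighbors. This is a heavily simplified sketch under stated assumptions: constant integer disparity per segment, a fixed penalty for unexplained pixels, and no incremental warping or prior functions. All names are hypothetical.

```python
def render(ref, seg_of, disp, width):
    """Forward-warp the reference row by each pixel's segment disparity.

    When two pixels land on the same target, the larger (nearer) disparity
    wins, which is a simple visibility rule. Unfilled targets stay None.
    """
    out = [None] * width
    nearest = [-1] * width
    for x, value in enumerate(ref):
        d = disp[seg_of[x]]
        target = x - d
        if 0 <= target < width and d > nearest[target]:
            out[target], nearest[target] = value, d
    return out


def cost(ref, other, seg_of, disp, hole_penalty=5):
    """Global matching cost: absolute error of the synthesized view plus hole penalties."""
    synth = render(ref, seg_of, disp, len(other))
    return sum(hole_penalty if s is None else abs(s - o)
               for s, o in zip(synth, other))


def refine(ref, other, seg_of, disp, neighbors, sweeps=3):
    """Try each neighbor's disparity for each segment; keep it if the global cost drops."""
    disp = dict(disp)
    for _ in range(sweeps):
        for seg in sorted(disp):
            for n in neighbors[seg]:
                trial = dict(disp)
                trial[seg] = disp[n]
                if cost(ref, other, seg_of, trial) < cost(ref, other, seg_of, disp):
                    disp = trial
    return disp
```

Because the cost is evaluated on the whole rendered view rather than per pixel, a textureless or unmatched segment can inherit a plausible depth from a neighbor, and occlusions are handled by the visibility rule in `render` instead of by local window matching.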

Color Segmentation

Original image (frame 12) Original image (left)

Color segmentation [Comaniciu 97]

New view rendering using local depth estimation

Color segmentation based stereo algorithm

True depth

Left image

new view rendering

Depth computation from 3 views

Depth map (frame 12)

Video frame 11 Video frame 12 Video frame 13

Color segmentation (frame 12)

Multiple View Depth Recovery and New View Rendering

New view rendering from multiple views.

New view rendering from a single view. left: from frame 212, right: from frame 215

Multiple view depth recovery and new view rendering

Original 14 video frames (frame 04-17)

Depth maps of frames 12 and 15

New view rendering (71 frames)

Immersive Visualization of a Dynamic Event

• Temporally consistent motion and 3D shape extraction
• Scintillation free dynamic high-quality rendering

AN IMMERSIVE IBMR GRAND CHALLENGE

AND IF WE DO IT RIGHT
