point cloud compression & communicationl.web.umkc.edu/lizhu/docs/2019.pcc.pdfpoint cloud...

Z. Li, 2019

Point Cloud Compression & Communication

Zhu LiDirector, UMKC NSF Center for Big Learning

Dept of Computer Science & Electrical EngieeringUniversity of Missiouri, Kansas City

Email: [email protected], [email protected]: http://l.web.umkc.edu/lizhu

1

Z. Li, 2019

University of Missouri, Kansas City

Short Bio:

Research Interests:• Immersive visual communicaiton: light field, point cloud

and 360 video coding and low latency streaming• Low Light, Res and Quality Image Understanding• What DL can do for compression (intra, ibc, sr, inter,

end2end)• What compression can do for DL (compression,

acceleration)

signal processing and learning

image understanding visual communication mobile edge computing & communication

p.2

Multimedia Computing & Communication LabUniv of Missouri, Kansas City

Z. Li, 2019

Outline

• Short Self Intro & Research Overview • Point Cloud Capture and Applications• Geometry Compression• Graph Signal Processing and Attributes Compression• UMKC PCC work

– Static Geometry Compression: Plane Projection Approximation – Dynamic Geometry Compression: Kd-tree decomposition and

residual coding• Summary

3

Z. Li, 2019

Research Motivation & Highlights

2019

p.4

Z. Li, 2019

Devices Networks Applications

mmWave 5G

D2D

FD-MIMO4k/8k UHD Video Free Viewpoint TV

SDN/MEC

BIGDATA visual intelligence

Samsung VR/AR

Media Computing & Communication Horizon

p.5

Z. Li, 2019

Embedded Deep Learning for Image Understanding

• Automatic Image Tagging:• Mapping image pixel values to tags• Device based recognition:

• Distilling the knowledge into a compact form• Compact CNN model for knowledge distilling

– BIGDATA training set: >10k images per tag– training set augmentation:

» affine transforms, simulated lighting changes– Pruning filters from the CNN structure

p.6

Z. Li, 2019

Immersive Visual Communication

• Light Field Compression (6-DoF)– Sub-Aperture Image Based – Tensorial Display Decomposition Based

• Point Cloud Compression (6-DoF)– Video Based: Views + Depth– Binary Tree/Octree Geometry Coding + Graph Signal Compression

• 360 Video Compression (3-DoF)– Advanced Motion Model: Affine motion, Spherical motion models– Padding

p.7

Z. Li, 2019

Deep Learning in Video Compression

• Deep Learning Chroma prediction (from Luma)

• Results:– BD rate:

p.8

C1:128@16×16

ConvolutionKernel: 5×5 Convolution

Kernel: 3×3,5×5

C2: (96+32)@16×16

C3: (96+32)@16×16C4: 2@16×16

ConvolutionKernel: 1×1,3×3

ConvolutionKernel: 3×3

F1: 256@1×1F2: 196@1×1

F3: 128@1×1

Tile: 128@16×16

Fusion: 128@16×16

Full Connection Full Connection

Full Connection Tiling

Element wise Product

Input: 1@16×16

Input: 99@1×1

Down-sampled Reconstructed

Luma

Neighboring Reconstructed

Luma & Chroma

Predicted Chroma

deep chroma predicted blocks

Z. Li, 2019

Deep Learning in IBC prediction

• IBC with deep learning– IBC matching block fuse with intra ref

column/row pixels

– Performance :

p.9

Overall, 1.3%, 1.4%, 1.6% bits saving for Y, U, V component, respectively.

Peak performance on Luma component on BQTerrace achieves 3.6% BD-Rate reduction.

Z. Li, 2019

Low Light Image Denoising & Understanding

• Use cases– Low light imaging– Low light image understanding

• Deep Scaling & Residual learning

p.10

Z. Li, 2019

Results

• Low light image denoising

p.11

SID Unet

Ours

Z. Li, 2019

Outline

• Short Self Intro & Research Overview • Point Cloud Capture and Applications• Geometry Compression• Graph Signal Processing and Attributes Compression• UMKC PCC work• Summary

12

Z. Li, 2019

What is Point Cloud

• A collection of Un-ordered points with – Geometry: expressed as [x, y, z]– Color Attributes: [r g b], or [y u v]– Additional info: normal, timestamp, …etc.

• Key difference from mesh: no order or local topology info

p.13

Z. Li, 2019

Point Cloud Capture

• Passive: Camera array stereo depth sensor

• Active: LiDAR, mmWave, TOF sensors

p.14

Z. Li, 2019

Point Cloud Inter-Operability with Other Formats

p.15

• Provide true 6-DoF Content capacity

Z. Li, 2019

PCC in MPEG

• Part of the MPEG-Immersive grand vision

p.16

Z. Li, 2019

Point Cloud Compression - Geometry

• Can be lossy in both quantization and samples

p.17

Z. Li, 2019

Octree Based Point Cloud Compression

• Octree is a space partition solution– Iteratively partition the space into sub-blocks. – Encoding: 0 if empty, 1 if contains data points– Level of the tree controls the quantization error

p.18

Credit: Phil Chou, PacketVideo 2016

Z. Li, 2019

Lossless Compression of the Octree with Neural Network driven CABAC

• Tree Structure: – DFS scanning of the Octree node byte to have a byte stream– Compression of the byte stream via Arithmetic Coding, or

shallow neural network PAQ coding• Residual Coding:

– Range coding: coding the residual against a ref point (eg., centroids of octree leaf node centroids) – Plane/Surface approximation coding: compute the projection distances to a surface, surface can be polynomial or planar.

p.19

0 50 100 150 200 250 300-20

0

20

40

60

80

100

120

1402)uvw residual quantized

Z. Li, 2019

Scalable Point Cloud Geometry Coding

p.20

• Binary Tree embedded Quadtree (BTQT) coding: – Binary tree partition to have lossy geometry approximation– Refine each leaf node with Quadtree/Octree to offer scalable

details upto near lossless

Z. Li, 2019

Scalable Geometry Coding with BTQT

p.21

• Construct Binary Tree of Point Cloud • R1 = (2L-1)*(2+K) + 6*K, cost of signalling for resolution K bits

and binary tree depth L• Intra-Coding i.e. either Quadtree (flat surface) or

Octree(not flat)• R2 = 3*p + 3*q bits, for singalling norma at p bits and point at q

bits. q < K proportional to the leaf node size.

Z. Li, 2019

Quadtree/Octree

p.22

1 0 10

0 0 01 1 1 1 1

OctreeQuadtree

Flatness Criterian:

Z. Li, 2019

Scalable Coding with Quadtree/Octree

p.23

OctreeQuadtree

L = L1

L = L2 < L1

L = L3 < L2

Z. Li, 2019

Point cloud Visualization

• citytunnel dataset (MERL) – 1.5 km long section of a city, 21 millions point

p.24

Z. Li, 2019

Reconstructed-Zoomed

p.25

Z. Li, 2019

Result: Category 1 Geometry Coding Efficiency

p.26

Z. Li, 2019

Color Attributes Compression - DFS

• A Simple Solution: DFS scan the Octree, and compress the color attributes by reshaping it into some rectangular form

p.27

n

3

010000100011110100100

The order of color attributes scan matters !

Z. Li, 2019

Graph Signal Processing

• For signals sampled on an non-uniform grid, expressed as a graph, what are the tools for signal processing ?

p.28

Z. Li, 2019

Graph Signal

• For signals sampled on a graph– F={f(1), f(2), …, f(n)}

• Graph:

p.29

Z. Li, 2019

Laplace Operator

• What is Laplace Operator ? - 2nd order differentiation

p.30

Z. Li, 2019

Laplacian Matrix

• Consider1-D discrete signal

p.31

Z. Li, 2019

Laplacian Matrix on uniform grid (h=1)• 1-D discrete signal on uniform grid

• 2-D discrete signal

p.32

Z. Li, 2019

Graph Laplacian

• 2nd order operator on signals on graph

p.33

Z. Li, 2019

Graph Fourier Transform

• Accounting for non-uniform grid, expressed by graph Laplacian, can we have Fourier like signal analysis on graph ?

p.34

Z. Li, 2019

Normalized Graph Laplacian

• normalized by the vertex degree

p.35

Z. Li, 2019

Graph Laplacian Eigen Basis

p.36

Z. Li, 2019

Graph Lapalcian Eigen Vectors

• Graph Cut/Partition Interpretation

p.37

Z. Li, 2019

Fourier Analysis

• Recall that for uniformly sampled signal, we have Fourier analysis:

p.38

N: number of samplesxn: signal at time nXk: signal at freq k

Z. Li, 2019

Graph Fourier Transform

• Eigen vector of Laplacian provides GFT basis

p.39

Credit: Prof. Pascal Frossard, EPFL

Z. Li, 2019

Graph Spectral Freq

• Zero crossing count

p.40

Z. Li, 2019

FT is special case of GFT

• Consider 1-D 8 point discrete signal on a simple graph

– First AC Eigen Vector: u2 as a function of w1,2,

– Graph Laplacian Eigen vectors default to FFT if all graph weights are uniform, and DFT if connected into a circle.

p.41

Z. Li, 2019

Optimal Energy Compaction - KLT

• For signals on uniformly sampled grid, KLT is optimal energy compaction transform:

The direction that maximizes the variance is the eigenvector associated with the largest eigenvalue of Σ

p.42

a unitary transform with the basis vectors in A being the “orthonormalized” eigenvectors of Rx

• However, not cheap, need to signal this nxn unitary transform with good precision.

Z. Li, 2019

Graph Transform

• Trying to achieve compaction of energy by grouping signal points closer together into the same group

• Free lunch: geometry is reconstructed (vs KLT)• Design Parameters:

p.43

Z. Li, 2019

Point Cloud Color Attributes Coding

• Photometrical info are not sampled on uniform grid

– Ref: Cha Zhang, Dinei A. F. Florêncio, Charles T. Loop:Point cloud attribute compression with graph transform. ICIP 2014: 2066-2070

p.44

4x4x4 octree node: Affinity reflecting voxel distance

Z. Li, 2019

Graph Transform

• For the color attributes in the 4x4x4 node:– Graph Transform U from Laplacian

– U is invertable– Graph signal representation of photometric info I

– Quantization & Coding

p.45

Z. Li, 2019

Optimized Graph Transform

p.46

Z. Li, 2019

Nested Transform Attributes Coding

• Nested Transform of attributes on graph– 1st Transform: kd-tree partition to create super pixel

representation– L1=16, 2L1=65536 downsampled representation from local

GFT/DCT on leaf node pixels.

p.47

700k input points 65k superpixels

Z. Li, 2019

Nested DCT transform

• We then first apply DCT to the leaf nodes with 13 or 14 points– Each leaf node is then represented by the DC value are all the

leaf nodes are organized into a new kd-tree with depth 12– Each leaf node are with 16 points are organized into 4x4 blocks

p.48

Z-scan Raster-scan

Z. Li, 2019

Superpixel AC Prediction

• The coding of the AC coefficients– The AC coefficients are with less correlations– The AC coefficients are then quantized and entropy coded– According to the final results, the AC coefficients only take 12%

of the total energy but take 88% of the total bits

• Potential improvements– Decorrelate the AC coefficients– Increase the Kd depth and reduce the AC coefficients

p.49

Z. Li, 2019

Further enhance the Kd-tree depth

• Kd-tree depth equals to 17– Each leaf node will be with only 6 or 7 points– The DC image resolution is 512x256

p.50

Z-scan

Z. Li, 2019

Current results

• The R-D curve comparison of the Longdress– The proposed scheme can achieve some gains in low bitrate

case while suffers some losses in high bitrate case– AC prediction seems to be the problem.

p.51

Z. Li, 2019

Summary

• Point Cloud capture and compression can enable many applications in VR/AR, navigation and auto-driving

• Core enablers are computational geometry and graph signal processing tools.

• Geometry compression – space/data partition and prediction

• Attributes compression – signals defined on a graph, needs many new tools on graph signal sampling, filtering and smoothing.

• Related papers, check out: – http://l.web.umkc.edu/lizhu

p.52

Z. Li, 2019

• Q & A

53

point cloud compression & communicationl.web.umkc.edu/lizhu/docs/2019.pcc.pdfpoint cloud...

Documents