Post on 10-Sep-2020


Internship report: Toward Fast Deep Visual Place Recognition

Rémi Piau

Supervisor: Dr. Daniel Cremers

Advisors: Nikolaus Demmel & Lukas Köstler & Nan Yang

CVG TUM, ENS Rennes

August 2020


Table of contents

1 Requirement

VSLAM

Loop Closure

2 Loop Closure Detection Methods

Feature based

Deep Learning based

3 Benchmarking NetVLAD

NetVLAD Implementations

Metric

Against Classical Methods

Quantization for Speed?

4 Post Processing Score Results

Greedy Double Windowed Mean

5 Conclusion

6 References


Visual Simultaneous Localization And Mapping

Goal

Compute a map of the surrounding environment and the route taken within this map at the same time.

Rémi Piau VPR and Postfiltering for Loop Closure 3 / 23


VSLAM Drifts

Errors

Acquisition

Computation

Estimation

Source: Williams et al. [6]


Loop Closure to the Rescue!

Add map constraints when a loop is closed to cancel the accumulated error.

Loop detection cannot rely on the computed position, because that position is itself affected by drift.


Feature Based Visual Place Recognition

DBOW

HashBOW

HBST

(Sources: Gálvez-López & Tardós [2], Schlegel & Grisetti [4])

(Implementation: Tim Stricker [5])


Deep Learning Based Visual Place Recognition

Based on the Vector of Locally Aggregated Descriptors [3]

Source: Arandjelović et al. [1]
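To make the VLAD idea [3] concrete, here is a minimal NumPy sketch, assuming the local descriptors and the cluster centers are already available. The function name `vlad` and the toy inputs are illustrative and do not come from the slides. Each descriptor is hard-assigned to its nearest center, the residuals are summed per center, and the result is L2-normalized. NetVLAD [1] replaces this hard assignment with a soft assignment that can be trained end to end, which this sketch does not implement.

```python
import numpy as np


def vlad(descriptors, centers):
    """Aggregate local descriptors (N, D) into one VLAD vector of size K * D.

    `centers` is a (K, D) array of cluster centers (assumed precomputed,
    for example with k-means).
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)

    # Squared distance between every descriptor and every center: shape (N, K).
    diff = descriptors[:, None, :] - centers[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)

    # Hard assignment: index of the nearest center for each descriptor.
    assignment = sq_dist.argmin(axis=1)

    # Sum of residuals (descriptor - center) for each center.
    n_centers, dim = centers.shape
    aggregated = np.zeros((n_centers, dim))
    for k in range(n_centers):
        members = descriptors[assignment == k]
        if len(members) > 0:
            aggregated[k] = (members - centers[k]).sum(axis=0)

    # Flatten and L2-normalize so that images can be compared by dot product.
    vector = aggregated.ravel()
    norm = np.linalg.norm(vector)
    return vector / norm if norm > 0 else vector


toy_centers = np.array([[0.0, 0.0], [10.0, 10.0]])
toy_descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]])
toy_vlad = vlad(toy_descriptors, toy_centers)
```

Two images can then be compared through the dot product of their normalized vectors.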


Why NetVLAD?

Quite popular

Deep learning methods usually rely on a GPU

No existing benchmark covering both GPU and CPU


Current Implementations

Original implementation in MATLAB (a proprietary language)

TensorFlow 1 implementation without training

PyTorch implementation with training procedure

Keras implementation without training


Our Implementation

Key points

TensorFlow 2 Keras implementation

Training enabled

Weight transfer from the TF1 implementation & same exposed methods

Checked against reference & TF1 implementations

Quantization!


Metric

How well?

Precision: number of queries with at least one correct result retrieved, divided by the number of queries

Recall: number of correct results divided by the number of ground truth results

Ground Truth: 4 R = 12
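The two definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `precision_recall`, the list-of-ids representation of the results, and the toy data are hypothetical and are not taken from the benchmark code.

```python
def precision_recall(retrieved, ground_truth):
    """Compute precision and recall as defined on the slide.

    retrieved:    one list of retrieved image ids per query
    ground_truth: one collection of correct image ids per query
    """
    queries_with_hit = 0   # queries with at least one correct result
    correct_results = 0    # correct results over all queries
    ground_truth_total = 0  # ground truth results over all queries

    for results, truth in zip(retrieved, ground_truth):
        found = set(results) & set(truth)
        if found:
            queries_with_hit += 1
        correct_results += len(found)
        ground_truth_total += len(truth)

    n_queries = len(retrieved)
    precision = queries_with_hit / n_queries if n_queries else 0.0
    recall = correct_results / ground_truth_total if ground_truth_total else 0.0
    return precision, recall


# Toy example: three queries, the second one retrieves nothing correct.
toy_retrieved = [[1, 2, 3], [4, 5], [9]]
toy_truth = [{1, 3, 7, 8}, {6}, {9}]
toy_precision, toy_recall = precision_recall(toy_retrieved, toy_truth)
```

With these definitions, returning more results per query can only keep or raise both values, which is why the plots are drawn against the query size.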

How fast?

Runtime of the different phases of visual place recognition, and time as a function of image size
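One possible way to collect such per-phase runtimes is sketched below. It is an assumption about the measurement setup, not the harness used for the report: the helper `time_phases`, the phase names, and the stand-in functions are hypothetical.

```python
import time


def time_phases(phases, data, repeats=3):
    """Return the mean wall-clock time (in seconds) of each phase.

    phases: list of (name, function) pairs; the output of one phase is
            fed to the next one.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = {}
    for name, function in phases:
        start = time.perf_counter()
        for _ in range(repeats):
            result = function(data)
        timings[name] = (time.perf_counter() - start) / repeats
        data = result
    return timings


# Stand-in phases: a real run would call the descriptor network and the
# database query here instead.
example_phases = [
    ("describe", lambda image: [sum(row) for row in image]),
    ("query", lambda descriptor: max(descriptor)),
]

# Time the phases for several square image sizes.
timings_by_size = {}
for size in (32, 64):
    image = [[1.0] * size for _ in range(size)]
    timings_by_size[size] = time_phases(example_phases, image)
```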


Quantitative comparison

(Source: Oxford FABMAP City Centre Dataset)


Precision

[Figure: precision (axis from 0.5 to 1.0) as a function of query size (0 to 25) for DBOW, HashBOW, HBST and NetVLAD]


Recall

[Figure: recall (axis from 0.05 to 0.30) as a function of query size (0 to 25) for DBOW, HashBOW, HBST and NetVLAD]


Another Story: Score Over Time

[Figure: ground truth score over time (lip6-in); axes: query space and time]


Another Story: Score Over Time

[Figure: score over time, two panels with axes query space and time; DBOW (score scale 0.00 to 0.40) and NetVLAD (score scale 0.0 to 0.7)]


Quantization: Recall

[Figure: recall (axis from 0.2 to 0.6) as a function of query size (0 to 25) for gpu (tf1), cpu, lite, gpu, lite with dynamic range quantization, lite with float16 quantization, lite with full integer quantization]
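To give an intuition for the number formats behind the float16 and integer variants compared here, the following NumPy sketch rounds a float32 array to float16 and to 8-bit integers and measures the round-trip error. It is not the TensorFlow Lite converter and does not reproduce its calibration. The helper names `quantize_int8` and `dequantize_int8` and the sample values are assumptions made for illustration.

```python
import numpy as np


def quantize_int8(values):
    """Affine quantization of a float array to int8.

    Returns the int8 array together with the scale and zero point needed
    to map it back to floats.
    """
    values = np.asarray(values, dtype=np.float32)
    # Make sure the quantized range contains zero.
    low = min(float(values.min()), 0.0)
    high = max(float(values.max()), 0.0)
    scale = (high - low) / 255.0 or 1.0
    zero_point = int(round(-128 - low / scale))
    quantized = np.clip(np.round(values / scale) + zero_point, -128, 127)
    return quantized.astype(np.int8), scale, zero_point


def dequantize_int8(quantized, scale, zero_point):
    """Map int8 values back to approximate float32 values."""
    return (quantized.astype(np.float32) - zero_point) * scale


weights = np.linspace(-1.0, 1.0, 101).astype(np.float32)

# float16: same values stored with fewer mantissa bits.
as_float16 = weights.astype(np.float16)
float16_error = float(np.abs(as_float16.astype(np.float32) - weights).max())

# int8: values snapped to one of 256 evenly spaced levels.
as_int8, scale, zero_point = quantize_int8(weights)
int8_error = float(np.abs(dequantize_int8(as_int8, scale, zero_point) - weights).max())
```

Both formats lose little information on such a range, which is why accuracy can stay close to the float32 model. Speed is a separate matter that depends on hardware support.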


Quantization: Precision

[Figure: precision (axis from 0.5 to 0.9) as a function of query size (0 to 25) for gpu (tf1), cpu, lite, gpu, lite with dynamic range quantization, lite with float16 quantization, lite with full integer quantization]


Quantization: Speed on Small Dataset

Method                                  Query mean time (s)   Slower by (vs DBOW)   Slower by (vs GPU)

DBOW                                    0.00324               1x                    -

GPU                                     3.46                  1068x                 1x

lite with float16 quantization          6.89                  2127x                 1.99x

lite                                    7.21                  2227x                 2.08x

CPU                                     10.1                  3119x                 2.91x

lite with dynamic range quantization    38.2                  11809x                11.1x

lite with full integer quantization     552                   170725x               160x
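The "slower by" factors are ratios of mean query times, once against DBOW and once against the GPU run. The short sketch below recomputes them from the rounded times in the table. The dictionary and the helper `slowdown` are illustrative names, and the recomputed factors can differ slightly from the printed ones, which were presumably derived from unrounded timings.

```python
# Mean query times in seconds, copied from the table.
mean_times = {
    "DBOW": 0.00324,
    "GPU": 3.46,
    "lite with float16 quantization": 6.89,
    "lite": 7.21,
    "CPU": 10.1,
    "lite with dynamic range quantization": 38.2,
    "lite with full integer quantization": 552.0,
}


def slowdown(times, baseline):
    """Return how many times slower each method is than the baseline."""
    reference = times[baseline]
    return {name: value / reference for name, value in times.items()}


versus_dbow = slowdown(mean_times, "DBOW")
versus_gpu = slowdown(mean_times, "GPU")

for name in mean_times:
    print(f"{name}: {versus_dbow[name]:.0f}x vs DBOW, {versus_gpu[name]:.2f}x vs GPU")
```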


Quantization: Speed vs Size

[Figure: average time (in s, log scale, 10^-1 to 10^2) as a function of square image border size (in px, 0 to 2500) for gpu, lite (no quantization), lite (dynamic range quantization), lite (float16 quantization)]

[TODO: all curves (in generation)]


Greedy Double Windowed Mean

[Figure: score over time, two panels with axes query space and time; NetVLAD and NetVLAD with GDWM (score scale 0.0 to 0.7 in both)]


Conclusion

Takeaways

NetVLAD has better precision than classical methods

Still a speed bottleneck for deep learning methods

TensorFlow quantization makes inference slower on a desktop computer

Postfiltering is a simple way to improve results at low cost

Future Work

Test further NetVLAD configurations

Improve postfiltering (delayed & inertia)


References

[1] Arandjelović, R., Gronat, P., Torii, A., Pajdla, T., and Sivic, J. NetVLAD: CNN architecture for weakly supervised place recognition. In IEEE Conference on Computer Vision and Pattern Recognition (2016).

[2] Gálvez-López, D., and Tardós, J. D. Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics 28, 5 (2012), 1188–1197.

[3] Jégou, H., Douze, M., Schmid, C., and Pérez, P. Aggregating local descriptors into a compact image representation. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010), pp. 3304–3311.

[4] Schlegel, D., and Grisetti, G. HBST: A Hamming distance embedding binary search tree for feature-based visual place recognition. IEEE Robotics and Automation Letters 3, 4 (2018), 3741–3748.

[5] Stricker, T. Efficient techniques for accurate visual place recognition. https://vision.in.tum.de/_media/members/demmeln/stricker2020ma.pdf. Accessed: 2020-08-10.

[6] Williams, B., Cummins, M., Neira, J., Newman, P., Reid, I., and Tardós, J. An image-to-map loop closing method for monocular SLAM. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems (2008), pp. 2053–2059.


Appendix

Greedy Double Windowed Mean

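Parameters: first window size $K$, second window size $K'$, threshold $T$.

For a query result vector $(r_{n,i})_{i \in 1..S}$ of size $S$ at step $n$ and the memory of the $K$ previous results $(r_{n-j,i})_{i \in 1..S},\ j \in 1..K$, we first define the mean vector over the current and the $K$ previous results as:

$$ m_{n,i} = \frac{1}{K+1} \sum_{j=0}^{K} r_{n-j,i} $$

We define the mask at step $n$ for all $i \in 1..S$, using a window of size $K'$ centered on $i$, as:

$$ M_{n,i} = \begin{cases} r_{n,i}, & \text{if } \displaystyle\sum_{j=\max(i-K'/2,\,1)}^{\min(i+K'/2,\,S)} m_{n,j} \ge T \sum_{j=1}^{S} m_{n,j} \\ 0, & \text{otherwise} \end{cases} $$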
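The filter can be sketched in NumPy as follows. This is one reading of the definition rather than the report's implementation. It assumes that the mean covers the current result and up to $K$ previous ones (fewer during the first steps), that the second window is centered on index $i$ and clamped to the vector bounds, and that indices are 0-based. The name `gdwm_filter` and the toy scores are hypothetical.

```python
from collections import deque

import numpy as np


def gdwm_filter(results, K, K_prime, T):
    """Apply a greedy double windowed mean filter to a score matrix.

    results: array of shape (steps, S), one query result vector per step
    K:       number of previous result vectors kept in memory
    K_prime: width of the window along the query space
    T:       threshold on the fraction of the mean score mass in the window
    """
    results = np.asarray(results, dtype=float)
    n_steps, size = results.shape
    half = K_prime // 2
    memory = deque(maxlen=K + 1)  # current result plus K previous ones
    filtered = np.zeros_like(results)

    # Window bounds for every index i, clamped to [0, size - 1].
    indices = np.arange(size)
    low = np.maximum(indices - half, 0)
    high = np.minimum(indices + half, size - 1)

    for n in range(n_steps):
        memory.append(results[n])
        # First window: mean over time of the stored result vectors.
        mean = np.stack(list(memory)).mean(axis=0)
        # Second window: sum of the mean around each index, via prefix sums.
        prefix = np.concatenate(([0.0], np.cumsum(mean)))
        window_sum = prefix[high + 1] - prefix[low]
        # Keep a score only where the window holds enough of the total mass.
        keep = window_sum >= T * mean.sum()
        filtered[n] = np.where(keep, results[n], 0.0)
    return filtered


# Toy example: a stable match at index 2 and a one-off spike at index 7.
toy_scores = np.zeros((3, 10))
toy_scores[:, 2] = 1.0
toy_scores[2, 7] = 1.0
toy_filtered = gdwm_filter(toy_scores, K=2, K_prime=2, T=0.5)
```

In the toy example the score that persists over time survives, while the isolated spike in the last step is masked out.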
