SlideShare uma empresa Scribd logo
1 de 37
Baixar para ler offline
Learning Scene Geometry for Visual Localization in Challenging Conditions
Nathan Piasco1,2, D´esir´e Sidib´e1, Val´erie Gouet-Brunet2 and C´edric Demonceaux1
20/05/2019
2019 IEEE International Conference on Robotics and Automation
1. VIBOT ERL,CNRS 6000, ImViA, Universit´e Bourgogne
Franche-Comt´e
2. LaSTIG, IGN, ENSG, Universit´e Paris-Est, F-94160
Saint-Mand´e, France
Introduction
Problem statement
We aim to recover
the position of an
unknown query.
2 / 18
Problem statement
We have access
to geo-localized
data.
2 / 18
Problem statement
We cast the image
localisation prob-
lem as an image-
retrieval problem.
2 / 18
Problem statement
We transfer the
position of the re-
trieved candidate
to the query.
2 / 18
Challenge in visual based localization
Drastic visual
changes occur due
to season/day-
night cycles.
3 / 18
Geometry to the rescue
However, geometric information still remains the same.
4 / 18
Geometry to the rescue
Unfortunately, geometric information is not always available.
4 / 18
Geometry to the rescue
How to use partial geometric information to improve image descriptor for localization?
4 / 18
Learning through missing modality
System components
CNN
CNN
CNN
Image
anchor
Positive
example
Negative
example
Triplet
lossDesc.
Desc.
Desc.
Desc.
We use a CNN as trainable global image descriptor1
.
1
Relja Arandjelovi´c, Petr Gronat, Akihiko Torii, Tomas Pajdla, and Josef Sivic, NetVLAD: CNN architecture for weakly supervised place recognition,
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pages 5297–5307, 2017
5 / 18
System components
CNN
CNN
CNN
Image
anchor
Positive
example
Negative
example
Triplet
lossDesc.
Desc.
Desc.
Desc.
The deep image descriptor can be train with image triplet and triplet loss.
5 / 18
System components
CNN
CNN
CNN
Image
anchor
Positive
example
Negative
example
Triplet
lossDesc.
Desc.
Desc.
Desc.
Images/Depth
pair
CNN decoder
L1 Loss LθG
CNN encoder
We trained a encoder-decoder to produce depth map according to monocular images.
5 / 18
Our method: learning through missing modality
Image triplet
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
We use triplet loss to produce
strong image descriptor.
6 / 18
Our method: learning through missing modality
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Latent image representation is
given to a CNN decoder to
reproduce the scene geometry.
The CNN decoder is simply
trained by minimizing L1
distance between ground truth
and reconstructed depth.
6 / 18
Our method: learning through missing modality
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
We use another CNN to
produce strong depth map
descriptor.
6 / 18
Our method: learning through missing modality
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Final descriptor is obtained by
concatenating image and
depth map descriptors.
6 / 18
Training policy
Image triplet
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Data required to train the
descriptor part of our system:
7 / 18
Training policy
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Data required to train the
depth reconstruction part of
our system:
7 / 18
System deployment
NN Encoder
NN decoder
Compact descriptor
Nearest
Neighbor fast
indexing
Reference images + poses Most similar image, with pose
Unknown pose query
Inferred depth map
f mage poseg
NN Encoder
Reference descriptors
The depth information is only needed during the training step!
8 / 18
Localization in challenging conditions
Dataset for Localization in challenging conditions
Dataset: we train and test our proposal on RobotCar1
dataset
Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor
1
Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of
Robotics Research (IJRR), 2016
2
Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016
9 / 18
Dataset for Localization in challenging conditions
Dataset: we train and test our proposal on RobotCar1
dataset
Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor
Competitors: Hallucination network2
& same descriptor only trained with RGB
1
Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of
Robotics Research (IJRR), 2016
2
Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016
9 / 18
Dataset for Localization in challenging conditions
Dataset: we train and test our proposal on RobotCar1
dataset
Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor
Competitors: Hallucination network2
& same descriptor only trained with RGB
Metric: we compare the distance between the top ranked returned database image position and the
query ground truth position.
1
Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of
Robotics Research (IJRR), 2016
2
Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016
9 / 18
Results: long-term localization
Queries atabase
Example of queries and nearest image in
the database.
20 30 40 50
D - Distance to top 1 candidate (m)
0.5
0.55
0.6
0.65
0.7
0.75
0.8
0.85
Recall@D(%)
0 10 20
0.5
1
RGBtD (our) + NetVLAD (Rt)
RGB + NetVLAD (Rt)
RGBtD (our) + NetVLAD (A)
RGBtD (hall) + NetVLAD (A)
RGB + NetVLAD (A)
RGBtD (our) + MAC (R)
RGB + MAC (R)
RGBtD (our) + MAC (A)
RGBtD (hall) + MAC (A)
RGB+ MAC (A)
Competitors:
--- Only RGB [RGB]
-x- Hallucination network
[RGBtD (Hall)]
-o- Our proposal [RGBtD
(our)]
10 / 18
Results: cross-season localization
Queries atabase
Example of queries and nearest image in
the database.
20 30 40 50
D - Distance to top 1 candidate (m)
0.35
0.4
0.45
0.5
0.55
0.6
0.65
0.7
Recall@D(%)
0 10 20
0.5
1
RGBtD (our) + NetVLAD (Rt)
RGB + NetVLAD (Rt)
RGBtD (our) + NetVLAD (A)
RGBtD (hall) + NetVLAD (A)
RGB + NetVLAD (A)
RGBtD (our) + MAC (R)
RGB + MAC (R)
RGBtD (our) + MAC (A)
RGBtD (hall) + MAC (A)
RGB+ MAC (A)
Competitors:
--- Only RGB [RGB]
-x- Hallucination network
[RGBtD (Hall)]
-o- Our proposal [RGBtD
(our)]
11 / 18
Failure case: Night to day
atabaseQueries
Example of queries and nearest image in
the database.
20 30 40 50
D - Distance to top 1 candidate (m)
0.05
0.1
0.15
0.2
Recall@D(%)
0 10 20
0.5
1
RGBtD (our) + NetVLAD (Rt)
RGB + NetVLAD (Rt)
RGBtD (our) + NetVLAD (A)
RGBtD (hall) + NetVLAD (A)
RGB + NetVLAD (A)
RGBtD (our) + MAC (R)
RGB + MAC (R)
RGBtD (our) + MAC (A)
RGBtD (hall) + MAC (A)
RGB+ MAC (A)
Competitors:
--- Only RGB [RGB]
-x- Hallucination network
[RGBtD (Hall)]
-o- Our proposal [RGBtD
(our)]
12 / 18
Improving night to day localization
Night images that serve
as input for the
generative network
Ground truth depth maps
Generated depth maps
Our network is not able to generate proper depth maps from night images.
13 / 18
Recall - Training policy
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Thanks to the design of our
method, we can improve
generation performances of
the decoder without impacting
the descriptors networks.
14 / 18
Recall - Training policy
Images/Depth
pair
Reconstructed
depth
CNN decoder
L1 Loss
Triplet Loss
Descriptors
concatenation
LθG
LθD;θI
Triplet Loss
LθI
Triplet Loss
LθD
CNN
CNN
Desc.
Desc.
Thanks to the design of our
method, we can improve
generation performances of
the decoder without impacting
the descriptors networks.
We fine tune our system with
pairs of night image/depth
map.
14 / 18
Improving night to day localization
Night images that
serve as input for
the generative
network
Ground truth
depth maps
Generated depth
maps
Generated depth
maps, with fine
tuning
15 / 18
Results: Night to day (fine tuning)
20 30 40 50
D - Distance to top 1 candidate (m)
0.05
0.1
0.15
0.2
Recall@D(%)
Night vs daytime (w/o fine tuning)
20 30 40 50
D - Distance to top 1 candidate (m)
0.05
0.1
0.15
0.2
0.25
Recall@D(%)
Night vs daytime (w/ fine tuning)
With fine tuning with a small amount of data, we are able to nearly double localization performances.
16 / 18
Conclusion
Conclusion
Summary:
• new global image descriptor trained with side depth modality,
• especially well suited to long term and cross season outdoor localization,
• promising results for challenging night to day image retrieval.
17 / 18
Conclusion
Summary:
• new global image descriptor trained with side depth modality,
• especially well suited to long term and cross season outdoor localization,
• promising results for challenging night to day image retrieval.
For future works:
• use other modalities,
• investigate less trivial descriptor fusion method,
• test on other visual localisation tasks (e.g. direct pose regression).
17 / 18
References
Relja Arandjelovi´c, Petr Gronat, Akihiko Torii, Tomas Pajdla, and Josef Sivic, NetVLAD: CNN
architecture for weakly supervised place recognition, IEEE Transactions on Pattern Analysis and
Machine Intelligence (TPAMI), pages 5297–5307, 2017.
Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through
Modality Hallucination, In IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), pages 826–834, 2016.
Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford
RobotCar dataset, The International Journal of Robotics Research (IJRR), 2016.

Mais conteúdo relacionado

Mais procurados

#FTMA15 第一回 鬼コース 全PDF
#FTMA15 第一回 鬼コース 全PDF#FTMA15 第一回 鬼コース 全PDF
#FTMA15 第一回 鬼コース 全PDFYoichi Ochiai
 
Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元Tatsuya Yokota
 
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用Deep Learning JP
 
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例Unityでお手軽ロボット開発「toio SDK for Unity」最新事例
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例UnityTechnologiesJapan002
 
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキングMasahito Zembutsu
 
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​SSII
 
音楽を見る:情報可視化技術の音楽情報処理への適用
音楽を見る:情報可視化技術の音楽情報処理への適用音楽を見る:情報可視化技術の音楽情報処理への適用
音楽を見る:情報可視化技術の音楽情報処理への適用Takayuki Itoh
 
2015年12月PRMU研究会 対応点探索のための特徴量表現
2015年12月PRMU研究会 対応点探索のための特徴量表現2015年12月PRMU研究会 対応点探索のための特徴量表現
2015年12月PRMU研究会 対応点探索のための特徴量表現Mitsuru Ambai
 
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...Yoshitake Takebayashi
 
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生Takuma_OKAMOTO
 
Python nlp handson_20220225_v5
Python nlp handson_20220225_v5Python nlp handson_20220225_v5
Python nlp handson_20220225_v5博三 太田
 
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化The Whole Brain Architecture Initiative
 
広告について
広告について広告について
広告についてMooming
 
音情報処理における特徴表現
音情報処理における特徴表現音情報処理における特徴表現
音情報処理における特徴表現NU_I_TODALAB
 
Partial least squares回帰と画像認識への応用
Partial least squares回帰と画像認識への応用Partial least squares回帰と画像認識への応用
Partial least squares回帰と画像認識への応用Shohei Kumagai
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"Deep Learning JP
 
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some PreliminaryDeep Learning JP
 
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~Kei Nakagawa
 
深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴Yuya Unno
 
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜SSII
 

Mais procurados (20)

#FTMA15 第一回 鬼コース 全PDF
#FTMA15 第一回 鬼コース 全PDF#FTMA15 第一回 鬼コース 全PDF
#FTMA15 第一回 鬼コース 全PDF
 
Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元
 
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用
[DL輪読会]マテリアルズインフォマティクスにおける深層学習の応用
 
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例Unityでお手軽ロボット開発「toio SDK for Unity」最新事例
Unityでお手軽ロボット開発「toio SDK for Unity」最新事例
 
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング
普通の人が勉強会で発表するために必要な準備のすべて~入門パブリック・スピーキング
 
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​
SSII2022 [SS1] ニューラル3D表現の最新動向〜 ニューラルネットでなんでも表せる?? 〜​
 
音楽を見る:情報可視化技術の音楽情報処理への適用
音楽を見る:情報可視化技術の音楽情報処理への適用音楽を見る:情報可視化技術の音楽情報処理への適用
音楽を見る:情報可視化技術の音楽情報処理への適用
 
2015年12月PRMU研究会 対応点探索のための特徴量表現
2015年12月PRMU研究会 対応点探索のための特徴量表現2015年12月PRMU研究会 対応点探索のための特徴量表現
2015年12月PRMU研究会 対応点探索のための特徴量表現
 
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...
単一事例研究法と統計的推測:ベイズ流アプローチを架け橋として (文字飛び回避版はこちら -> https://www.slideshare.net/yos...
 
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生
[招待講演] スピーカアレイを用いた空間フーリエ変換に基づく局所再生
 
Python nlp handson_20220225_v5
Python nlp handson_20220225_v5Python nlp handson_20220225_v5
Python nlp handson_20220225_v5
 
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化
第7回WBAシンポジウム:予測符号化モデルとしての 深層予測学習とロボット知能化
 
広告について
広告について広告について
広告について
 
音情報処理における特徴表現
音情報処理における特徴表現音情報処理における特徴表現
音情報処理における特徴表現
 
Partial least squares回帰と画像認識への応用
Partial least squares回帰と画像認識への応用Partial least squares回帰と画像認識への応用
Partial least squares回帰と画像認識への応用
 
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
 
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
【DL輪読会】The Forward-Forward Algorithm: Some Preliminary
 
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~
ブラックリッターマン法による リスクベースポートフォリオの拡張 ~リスクベースポートフォリオの課題と拡張への取り組み~
 
深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴深層学習フレームワークChainerの特徴
深層学習フレームワークChainerの特徴
 
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜
SSII2020TS: 物理ベースビジョンの過去・現在・未来 〜 カメラ・物体・光のインタラクションを モデル化するには 〜
 

Semelhante a ICRA Nathan Piasco

Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithmcsandit
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving IIYu Huang
 
Udacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsUdacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsDavid Silver
 
HardNet: Convolutional Network for Local Image Description
HardNet: Convolutional Network for Local Image DescriptionHardNet: Convolutional Network for Local Image Description
HardNet: Convolutional Network for Local Image DescriptionDmytro Mishkin
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...Ravi Kiran B.
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdfmokamojah
 
Depth estimation do we need to throw old things away
Depth estimation do we need to throw old things awayDepth estimation do we need to throw old things away
Depth estimation do we need to throw old things awayNAVER Engineering
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Yu Huang
 
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODEC
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODECEFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODEC
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODECSwisscom
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsPetteriTeikariPhD
 
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...Susang Kim
 
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Lviv Data Science Summer School
 
Multispectral Domain Invariant Image for Retrieval-based Place Recognition
Multispectral Domain Invariant Image for Retrieval-based Place RecognitionMultispectral Domain Invariant Image for Retrieval-based Place Recognition
Multispectral Domain Invariant Image for Retrieval-based Place RecognitionSejong University
 
An accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognitionAn accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognitionFederico Magliani
 
Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image searchUniversitat Politècnica de Catalunya
 
Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Matthias Trapp
 
Generating super resolution images using transformers
Generating super resolution images using transformersGenerating super resolution images using transformers
Generating super resolution images using transformersNEERAJ BAGHEL
 

Semelhante a ICRA Nathan Piasco (20)

TransNeRF
TransNeRFTransNeRF
TransNeRF
 
Design and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding AlgorithmDesign and Implementation of Variable Radius Sphere Decoding Algorithm
Design and Implementation of Variable Radius Sphere Decoding Algorithm
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II
 
EIS_REVIEW_1.pptx
EIS_REVIEW_1.pptxEIS_REVIEW_1.pptx
EIS_REVIEW_1.pptx
 
Udacity-Didi Challenge Finalists
Udacity-Didi Challenge FinalistsUdacity-Didi Challenge Finalists
Udacity-Didi Challenge Finalists
 
Ad24210214
Ad24210214Ad24210214
Ad24210214
 
HardNet: Convolutional Network for Local Image Description
HardNet: Convolutional Network for Local Image DescriptionHardNet: Convolutional Network for Local Image Description
HardNet: Convolutional Network for Local Image Description
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
 
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf10.1109@ICCMC48092.2020.ICCMC-000167.pdf
10.1109@ICCMC48092.2020.ICCMC-000167.pdf
 
Depth estimation do we need to throw old things away
Depth estimation do we need to throw old things awayDepth estimation do we need to throw old things away
Depth estimation do we need to throw old things away
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)
 
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODEC
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODECEFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODEC
EFFICIENT STEREO VIDEO ENCODING FOR MOBILE APPLICATIONS USING THE 3D+F CODEC
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
 
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
 
Multispectral Domain Invariant Image for Retrieval-based Place Recognition
Multispectral Domain Invariant Image for Retrieval-based Place RecognitionMultispectral Domain Invariant Image for Retrieval-based Place Recognition
Multispectral Domain Invariant Image for Retrieval-based Place Recognition
 
An accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognitionAn accurate retrieval through R-MAC+ descriptors for landmark recognition
An accurate retrieval through R-MAC+ descriptors for landmark recognition
 
Deep image retrieval learning global representations for image search
Deep image retrieval  learning global representations for image searchDeep image retrieval  learning global representations for image search
Deep image retrieval learning global representations for image search
 
Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)Efficient LDI Representation (TPCG 2008)
Efficient LDI Representation (TPCG 2008)
 
Generating super resolution images using transformers
Generating super resolution images using transformersGenerating super resolution images using transformers
Generating super resolution images using transformers
 

Último

Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 

Último (20)

Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 

ICRA Nathan Piasco

  • 1. Learning Scene Geometry for Visual Localization in Challenging Conditions Nathan Piasco1,2, D´esir´e Sidib´e1, Val´erie Gouet-Brunet2 and C´edric Demonceaux1 20/05/2019 2019 IEEE International Conference on Robotics and Automation 1. VIBOT ERL,CNRS 6000, ImViA, Universit´e Bourgogne Franche-Comt´e 2. LaSTIG, IGN, ENSG, Universit´e Paris-Est, F-94160 Saint-Mand´e, France
  • 3. Problem statement We aim to recover the position of an unknown query. 2 / 18
  • 4. Problem statement We have access to geo-localized data. 2 / 18
  • 5. Problem statement We cast the image localisation prob- lem as an image- retrieval problem. 2 / 18
  • 6. Problem statement We transfer the position of the re- trieved candidate to the query. 2 / 18
  • 7. Challenge in visual based localization Drastic visual changes occur due to season/day- night cycles. 3 / 18
  • 8. Geometry to the rescue However, geometric information still remains the same. 4 / 18
  • 9. Geometry to the rescue Unfortunately, geometric information is not always available. 4 / 18
  • 10. Geometry to the rescue How to use partial geometric information to improve image descriptor for localization? 4 / 18
  • 12. System components CNN CNN CNN Image anchor Positive example Negative example Triplet lossDesc. Desc. Desc. Desc. We use a CNN as trainable global image descriptor1 . 1 Relja Arandjelovi´c, Petr Gronat, Akihiko Torii, Tomas Pajdla, and Josef Sivic, NetVLAD: CNN architecture for weakly supervised place recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pages 5297–5307, 2017 5 / 18
  • 14. System components CNN CNN CNN Image anchor Positive example Negative example Triplet lossDesc. Desc. Desc. Desc. Images/Depth pair CNN decoder L1 Loss LθG CNN encoder We trained a encoder-decoder to produce depth map according to monocular images. 5 / 18
  • 15. Our method: learning through missing modality Image triplet Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. We use triplet loss to produce strong image descriptor. 6 / 18
  • 16. Our method: learning through missing modality Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Latent image representation is given to a CNN decoder to reproduce the scene geometry. The CNN decoder is simply trained by minimizing L1 distance between ground truth and reconstructed depth. 6 / 18
  • 17. Our method: learning through missing modality Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. We use another CNN to produce strong depth map descriptor. 6 / 18
  • 18. Our method: learning through missing modality Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Final descriptor is obtained by concatenating image and depth map descriptors. 6 / 18
  • 19. Training policy Image triplet Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Data required to train the descriptor part of our system: 7 / 18
  • 20. Training policy Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Data required to train the depth reconstruction part of our system: 7 / 18
  • 21. System deployment NN Encoder NN decoder Compact descriptor Nearest Neighbor fast indexing Reference images + poses Most similar image, with pose Unknown pose query Inferred depth map f mage poseg NN Encoder Reference descriptors The depth information is only needed during the training step! 8 / 18
  • 23. Dataset for Localization in challenging conditions Dataset: we train and test our proposal on RobotCar1 dataset Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor 1 Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of Robotics Research (IJRR), 2016 2 Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016 9 / 18
  • 24. Dataset for Localization in challenging conditions Dataset: we train and test our proposal on RobotCar1 dataset Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor Competitors: Hallucination network2 & same descriptor only trained with RGB 1 Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of Robotics Research (IJRR), 2016 2 Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016 9 / 18
  • 25. Dataset for Localization in challenging conditions Dataset: we train and test our proposal on RobotCar1 dataset Deep architectures: Resnet18 or Alexnet encoder combined with NetVLAD or MAC descriptor Competitors: Hallucination network2 & same descriptor only trained with RGB Metric: we compare the distance between the top ranked returned database image position and the query ground truth position. 1 Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of Robotics Research (IJRR), 2016 2 Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016 9 / 18
  • 26. Results: long-term localization Queries atabase Example of queries and nearest image in the database. 20 30 40 50 D - Distance to top 1 candidate (m) 0.5 0.55 0.6 0.65 0.7 0.75 0.8 0.85 Recall@D(%) 0 10 20 0.5 1 RGBtD (our) + NetVLAD (Rt) RGB + NetVLAD (Rt) RGBtD (our) + NetVLAD (A) RGBtD (hall) + NetVLAD (A) RGB + NetVLAD (A) RGBtD (our) + MAC (R) RGB + MAC (R) RGBtD (our) + MAC (A) RGBtD (hall) + MAC (A) RGB+ MAC (A) Competitors: --- Only RGB [RGB] -x- Hallucination network [RGBtD (Hall)] -o- Our proposal [RGBtD (our)] 10 / 18
  • 27. Results: cross-season localization Queries atabase Example of queries and nearest image in the database. 20 30 40 50 D - Distance to top 1 candidate (m) 0.35 0.4 0.45 0.5 0.55 0.6 0.65 0.7 Recall@D(%) 0 10 20 0.5 1 RGBtD (our) + NetVLAD (Rt) RGB + NetVLAD (Rt) RGBtD (our) + NetVLAD (A) RGBtD (hall) + NetVLAD (A) RGB + NetVLAD (A) RGBtD (our) + MAC (R) RGB + MAC (R) RGBtD (our) + MAC (A) RGBtD (hall) + MAC (A) RGB+ MAC (A) Competitors: --- Only RGB [RGB] -x- Hallucination network [RGBtD (Hall)] -o- Our proposal [RGBtD (our)] 11 / 18
  • 28. Failure case: Night to day atabaseQueries Example of queries and nearest image in the database. 20 30 40 50 D - Distance to top 1 candidate (m) 0.05 0.1 0.15 0.2 Recall@D(%) 0 10 20 0.5 1 RGBtD (our) + NetVLAD (Rt) RGB + NetVLAD (Rt) RGBtD (our) + NetVLAD (A) RGBtD (hall) + NetVLAD (A) RGB + NetVLAD (A) RGBtD (our) + MAC (R) RGB + MAC (R) RGBtD (our) + MAC (A) RGBtD (hall) + MAC (A) RGB+ MAC (A) Competitors: --- Only RGB [RGB] -x- Hallucination network [RGBtD (Hall)] -o- Our proposal [RGBtD (our)] 12 / 18
  • 29. Improving night to day localization Night images that serve as input for the generative network Ground truth depth maps Generated depth maps Our network is not able to generate proper depth maps from night images. 13 / 18
  • 30. Recall - Training policy Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Thanks to the design of our method, we can improve generation performances of the decoder without impacting the descriptors networks. 14 / 18
  • 31. Recall - Training policy Images/Depth pair Reconstructed depth CNN decoder L1 Loss Triplet Loss Descriptors concatenation LθG LθD;θI Triplet Loss LθI Triplet Loss LθD CNN CNN Desc. Desc. Thanks to the design of our method, we can improve generation performances of the decoder without impacting the descriptors networks. We fine tune our system with pairs of night image/depth map. 14 / 18
  • 32. Improving night to day localization Night images that serve as input for the generative network Ground truth depth maps Generated depth maps Generated depth maps, with fine tuning 15 / 18
  • 33. Results: Night to day (fine tuning) 20 30 40 50 D - Distance to top 1 candidate (m) 0.05 0.1 0.15 0.2 Recall@D(%) Night vs daytime (w/o fine tuning) 20 30 40 50 D - Distance to top 1 candidate (m) 0.05 0.1 0.15 0.2 0.25 Recall@D(%) Night vs daytime (w/ fine tuning) With fine tuning with a small amount of data, we are able to nearly double localization performances. 16 / 18
  • 35. Conclusion Summary: • new global image descriptor trained with side depth modality, • especially well suited to long term and cross season outdoor localization, • promising results for challenging night to day image retrieval. 17 / 18
  • 36. Conclusion Summary: • new global image descriptor trained with side depth modality, • especially well suited to long term and cross season outdoor localization, • promising results for challenging night to day image retrieval. For future works: • use other modalities, • investigate less trivial descriptor fusion method, • test on other visual localisation tasks (e.g. direct pose regression). 17 / 18
  • 37. References Relja Arandjelovi´c, Petr Gronat, Akihiko Torii, Tomas Pajdla, and Josef Sivic, NetVLAD: CNN architecture for weakly supervised place recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), pages 5297–5307, 2017. Judy Hoffman, Saurabh Gupta, and Trevor Darrell, Learning with Side Information through Modality Hallucination, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 826–834, 2016. Will Maddern, Geoffrey Pascoe, Chris Linegar, and Paul Newman, 1 year, 1000 km: The Oxford RobotCar dataset, The International Journal of Robotics Research (IJRR), 2016.