Literature review: A Survey of Deep Learning-Based Object Detection
1. A Survey of Deep Learning-Based Object Detection
Jiao, Licheng and Zhang, Fan and Liu, Fang and Yang, Shuyuan and Li, Lingling and Feng, Zhixi and Qu, Rong
IEEE Access, 2019
2022/06/17
7. R-CNN and Fast R-CNN
◼R-CNN [Girshick+, CVPR2014]
• Features extracted with a CNN
• Classification with SVMs
• CNN applied to each region proposal separately
◼Fast R-CNN [Girshick, ICCV2015]
• RoI (region of interest) pooling
• Region proposals
8. R-CNN
◼Faster R-CNN [Ren+, NeurIPS2015]
• RPN (region proposal network) with multi-scale anchors, combined with Fast R-CNN
◼Mask R-CNN [He+, ICCV2017]
• ResNet [He+, CVPR2016]-FPN [Lin+, CVPR2017] backbone
• RoI pooling replaced by RoIAlign
◼Cascade R-CNN [Cai and Vasconcelos, CVPR2018]
• IoU
10. SSD (Single Shot MultiBox Detector)
◼DBox (default boxes)
• BBoxes filtered by NMS
Figure: SSD architecture [Liu+, ECCV2016]. 300×300 input; conv feature maps of size 38×38, 19×19, 19×19, 10×10, 5×5, 3×3, and 1×1 feed localization and confidence predictions, followed by non-maximum suppression.
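As a rough worked example of why SSD needs NMS, the number of default boxes can be counted from the feature-map sizes on this slide (38, 19, 10, 5, 3, 1). The number of default boxes per location (4, 6, 6, 6, 4, 4) is not on the slide; it is an assumption taken from the commonly cited SSD300 configuration, and `count_default_boxes` is a hypothetical helper name.

```python
# Sketch: count the default boxes (DBox) of an SSD300-style detector.
# Each entry is (feature map side, default boxes per location).
# The per-location counts are an ASSUMPTION (SSD300 configuration),
# not something stated on the slide.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]


def count_default_boxes(layers):
    """Sum side * side * boxes_per_location over all prediction layers."""
    return sum(side * side * k for side, k in layers)


total_boxes = count_default_boxes(SSD300_LAYERS)
print(total_boxes)  # 8732
```

Under these assumptions every image yields several thousand candidate BBoxes, most of them overlapping, which is why a suppression step follows the predictions.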
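11. NMS (Non-maximum Suppression)
◼Selecting BBoxes
• Keep the BBox with the highest confidence score
• Remove BBoxes that have a high IoU with it and a lower confidence score
Figure: non-maximum suppression of BBoxes [Liu+, ECCV2016]
$\mathrm{IoU} = \dfrac{\text{Area of Overlap}}{\text{Area of Union}}$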
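The IoU definition and the greedy NMS procedure can be sketched in plain Python. This is a minimal illustration, not the implementation used by any particular detector: the box format `(x1, y1, x2, y2)`, the function names `iou` and `nms`, and the default threshold of 0.5 are all assumptions made for the example.

```python
def iou(a, b):
    """IoU = area of overlap / area of union for boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest remaining confidence score
        keep.append(best)
        # drop every remaining box that overlaps the kept box too much
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

In the usage example the second box overlaps the first with IoU 81/119 (about 0.68), so it is suppressed, while the third box does not overlap and is kept.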
12. One-stage methods
◼Feature Pyramid Networks
• RetinaNet [Lin+, ICCV2017]
• Focal Loss
• M2Det [Zhao+, AAAI2019]
• Multi-Level FPN
◼RefineDet [Zhang+, CVPR2018]
• Combines one-stage and two-stage approaches
Figure panels: RetinaNet, RefineDet
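The Focal Loss used by RetinaNet can be sketched for a single binary prediction as FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The code below is a minimal illustration under assumptions: the function name `focal_loss` is hypothetical, and the defaults γ = 2 and α = 0.25 are the values usually quoted for RetinaNet rather than anything stated on the slide.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, y: label (1 or 0).
    gamma and alpha defaults are ASSUMED (commonly quoted RetinaNet values).
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # well-classified positive: strongly down-weighted
hard = focal_loss(0.1, 1)   # misclassified positive: keeps a large loss
print(easy < hard)  # True
```

With γ = 0 the modulating factor disappears and the expression reduces to α-weighted cross entropy, which is one way to sanity-check an implementation.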
14. Relation Networks [Hu+, CVPR2018]
◼SSD and similar detectors select BBoxes with NMS
◼object relation module
• End-to-end BBox detection with the object relation module
15. DCNv2 [Zhu+, CVPR2019]
◼DCN [Dai+, ICCV2017]
• receptive field
◼Modulated deformable convolution
• Modulation deformable RoI pooling
Figure: standard convolution vs. deformable convolution (3×3)
20. Datasets
◼PASCAL VOC
◼COCO
• COCO mAP
◼ImageNet
◼VisDrone 2018
◼Open Images
◼Pedestrian detection datasets
• Caltech
• KITTI
• CityPersons
• TDC
• EuroCity Persons
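21. AP, mAP, and COCO mAP
◼Precision and Recall (IoU threshold 0.5)
• $\text{Precision} = \dfrac{\text{number of predicted BBoxes with IoU} \ge 0.5}{\text{number of all predicted BBoxes}}$
• $\text{Recall} = \dfrac{\text{number of predicted BBoxes with IoU} \ge 0.5}{\text{number of all Gt BBoxes}}$
◼AP (Average Precision)
• $\text{AP} = \int_0^1 p(r)\,dr$
• Area under the Recall vs. Precision curve
◼mAP
• Mean of AP over classes
• COCO: mAP averaged over IoU = [0.5, 0.55, … , 0.95]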
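The AP integral can be approximated numerically from a ranked list of detections. The sketch below assumes each detection has already been marked as a true positive (IoU ≥ threshold with an unmatched Gt BBox) or a false positive; the function name `average_precision` and the use of a monotone precision envelope (all-point interpolation) are assumptions for illustration, and benchmark toolkits differ in such details.

```python
def average_precision(is_tp, num_gt):
    """Approximate AP = integral of p(r) dr.

    is_tp: True/False flags of detections sorted by descending confidence.
    num_gt: number of ground-truth BBoxes (must be > 0).
    """
    tp = fp = 0
    recalls, precisions = [], []
    for flag in is_tp:
        if flag:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # ASSUMED interpolation: make precision non-increasing in recall
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # sum rectangle areas under the precision-recall curve
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# IoU thresholds averaged by the COCO-style metric: 0.5, 0.55, ..., 0.95
coco_iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

print(average_precision([True, False, True], num_gt=2))
```

A COCO-style score would repeat the true-positive matching at each threshold in `coco_iou_thresholds`, compute AP per class, and average over thresholds and classes.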
23. ◼FPN
• Mask R-CNN, NAS-FPN, FCOS [Tian+, ICCV2019]
◼SSD
• WeaveNet [Chen+, arXiv2017], ESSD [Zheng+, arXiv2018]
◼
• RefineDet, R-DAD [Bae, AAAI2019]
◼
• Attention mechanism [Zhang & Kim, CVPR2019]
• SSD [Kong+, ECCV2018]
◼
• DCN, DCNv2 (see slide 15)
24. Loss
◼IoU loss
• UnitBox [Yu+, ACM MM 2016]
◼ BBox regression loss
• BBox [He+, CVPR2019]
• Softer-NMS [He+, arXiv2019]
◼
• Axially Localized Detection [Cabriel+, Nature Communications 2019]
◼one-stage
• Hard negative mining [Bucher+, arXiv2016]
◼ Hard mining
• IoU-balanced sampling [Pang+, CVPR2019]
◼loss
• RetinaNet
• AP-loss [Chen+, CVPR2019]
25. NMS
◼NMS
• Relation Networks (see slide 14)
◼IoU between BBox and Gt BBox
• Learned by IoU-Net [Jiang+, ECCV2018]
◼IoU and confidence score
• Fitness NMS [Tychsen-Smith & Petersson, CVPR2018]
◼NMS
• Softer-NMS [He+, arXiv2019]
26. 1
◼
•
◼SSD
• [Jeong+, arXiv2017]
• Context-Aware SSD [Xiang+, arXiv2018]
◼GAN [Goodfellow+, NeurIPS2014]
• Perceptual GAN [Li+, CVPR2017]
◼
◼
• Face Attention Network [Wang+, arXiv2017]
◼
• Repulsion loss [Wang+, IEEE Access 2018]
• Occlusion-aware R-CNN [Zhang+, ECCV2018]
29. ◼
• YOLO, YOLO9000 [Redmon & Farhadi, CVPR2017]
• WeaveNet [Chen+, arXiv2017], ESSD [Zheng+, arXiv2018]
• Pelee [Wang+, NeurIPS2018]
◼
• RetinaNet (see slide 12)
• RFBNet [Liu+, ECCV2018]
• pRF
Figure panels: RFBNet, RFB module
30. ◼
• ScratchDet [Zhu+, CVPR2019]
•
◼
• DetNet [Li+, ECCV2018]
•
• Light-Head R-CNN [Li+, arXiv2017]
• two-stage