Bringing Personal Robots Home [S9360]
Integrating Computer Vision & Human–Robot
Interaction for Real-World Applications
NVIDIA GTC 2019 (Mar 18, 2019)
Jun Hatori, Preferred Networks
Requirements for Robots
              Industrial                 Personal                        Key technology
Cost          high                       low                             hardware
Environment   fixed, known, structured   dynamic, unstructured, unseen   computer vision
Users         experts                    non-experts                     human–robot interaction
Goal          automation                 intelligence, personalization   task planning
A variety of real-world
environments
PR1: Wyrobek et al. 2008
Key Technologies
● Computer Vision: Generalization to different environments and tasks
○ Object detection of thousands of categories
○ Support unseen environments and unseen objects
● Human–Robot Interaction: Interface between humans and robots
○ Intuitive interface with spoken and visual language interpretation
○ Spoken and visual feedback from robots
Two Projects
● Interactive picking robot
● Autonomous tidying-up robot
Interactively Picking Real-World Objects
https://projects.preferred.jp/interactive-robot/
Challenges
● Variety of Expressions
“a bear doll”, “the animal plushie”,
“that fluffy thing”, “up-side-down grizzly”
“grab X”, “bring together X and Y”,
“move X to a diagonal box”
● Ambiguity and errors
“that brown one”, “a dog doll?”
Human: Hey, can you move that brown fluffy thing to the bottom right?
Robot: Which one do you mean?
Human: The one next to the eraser box.
Robot: I got it.
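Proposed Model
[Figure: network architecture. Recoverable labels:]
● Inputs: speech (transcription), e.g. “pick the brown fluffy thing and put in the lower bin.”; vision (RGB)
● Language branch: embedding, LSTM, MLP
● Vision branch: SSD, cropped images, CNN (+feat.), MLP
● Outputs: Target Obj., Destination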
Handling Ambiguous Commands
● Trained with hinge loss for correct sentence–object pairs [Yu+ 2017]
● Instruction is considered ambiguous if margin is below threshold
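[Figure: the instruction “pick the brown fluffy thing and put it in the lower right bin.” (LSTM, MLP) is scored against cropped object images (CNN, MLP); the margin is the score gap between the 1st- and 2nd-ranked objects.]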
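The two bullets on this slide can be illustrated with a minimal sketch. It assumes each candidate object already has a scalar matching score against the instruction; the function names, the margin value, and the threshold are hypothetical and are not taken from the talk or the paper.

```python
from typing import Sequence, Tuple


def hinge_loss(pos_score: float, neg_scores: Sequence[float], margin: float = 0.1) -> float:
    """Max-margin loss: the correct sentence-object pair should outscore
    every incorrect pair by at least `margin` (value is an assumption)."""
    return sum(max(0.0, margin - pos_score + s) for s in neg_scores)


def pick_or_ask(scores: Sequence[float], threshold: float = 0.1) -> Tuple[int, bool]:
    """Return (index of the best-scoring object, whether to ask for clarification).

    The instruction is treated as ambiguous when the gap between the
    1st- and 2nd-ranked scores is below `threshold` (hypothetical value).
    """
    if not scores:
        raise ValueError("no candidate objects")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = ranked[0]
    if len(ranked) == 1:
        return best, False
    gap = scores[best] - scores[ranked[1]]
    return best, gap < threshold


if __name__ == "__main__":
    print(pick_or_ask([0.90, 0.20, 0.10]))  # clear winner: no question needed
    print(pick_or_ask([0.52, 0.50, 0.10]))  # two close candidates: ask the user
```

Thresholding the margin rather than the top score alone means the robot asks a question only when a second object is nearly as plausible as the first.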
Interactive Picking Dataset
Example instructions:
○ “grab the human face labeled object and …”
○ “move the pop red can from the top …”
○ “move the pink horse plushie …”
○ “put the box with a 50 written on it that is …”
Publicly available as PFN-PIC dataset:
https://github.com/pfnet-research/picking-instruction
● 1,200 scenes (26k objects in total)
● 100 types of commodities
● 73k unconstrained instructions (vocabulary size: 5,000)
Results
Accuracy of target object matching
● Single instruction: 88.0%
● Interactive: 92.7%
● 4.7-point improvement (39% error reduction) by interactive clarification
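The two figures on this slide are related by simple arithmetic, sketched here. The function name is illustrative; the accuracies are the ones reported above.

```python
def error_reduction(acc_before: float, acc_after: float) -> float:
    """Fraction of the original errors that were eliminated."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


single, interactive = 0.880, 0.927
print(f"absolute gain: {(interactive - single) * 100:.1f} points")                      # 4.7 points
print(f"relative error reduction: {error_reduction(single, interactive) * 100:.0f}%")   # 39%
```

The error rate drops from 12.0% to 7.3%, so 4.7 of the original 12.0 points of error are removed, which is about 39%.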
Summary
● We proposed an interactive picking system that can be controlled by
unconstrained spoken language instructions.
● We achieved an object matching accuracy of 92.7%.
● Accuracy on unseen objects is still insufficient (~70%).
* Hatori+ 2018. Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions. ICRA 2018 Best Paper on HRI.
Tidying-up Robot
https://projects.preferred.jp/tidying-up-robot/
CEATEC JAPAN 2018 (Oct 16–19, 2018)
Environment
● Furnished living room
○ Coffee table, couch, bookshelf, trash bins, laundry bag, toy box
● Two Toyota HSRs working in parallel
Object Recognition
● Sensors
○ HSR’s head camera (RGBD)
○ 4 ceiling cameras (RGB)
● Supported objects: ~300
● PFDet as CNN base model
○ 2nd place accuracy at Google AI Open Images Challenge – Object Detection (Sep 2018)
PFDet: Basic Architecture [1]
● Feature Pyramid Network (FPN) (SENet-154 and SE-ResNeXt-101)
● Multi-node batch normalization
● Non-maximum weighted (NMW) suppression [2]
● Global context
○ Additional FPN block
○ PSP (pyramid spatial pooling) module
○ Context head [3]
[1] Akiba+ 2018. PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track.
[2] Zhou+. CAD: Scale invariant framework for real-time object detection. ICCVW 2017.
[3] Zhu+. CoupleNet: Coupling global structure with local parts for object detection. ICCV 2017.
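A minimal sketch of the idea behind non-maximum weighted (NMW) suppression, assuming the common formulation in which boxes overlapping the top-scoring box are averaged with weights score × IoU instead of being discarded. The function names, the 0.5 IoU threshold, and keeping the top score for the merged box are illustrative assumptions, not PFDet's actual implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nmw(boxes: List[Box], scores: List[float], iou_thresh: float = 0.5) -> List[Tuple[Box, float]]:
    """Merge each cluster of overlapping boxes into one weighted-average box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    merged = []
    while order:
        top, rest = order[0], []
        cluster, weights = [top], [scores[top]]  # the top box has IoU 1 with itself
        for i in order[1:]:
            overlap = iou(boxes[top], boxes[i])
            if overlap >= iou_thresh:
                cluster.append(i)
                weights.append(scores[i] * overlap)
            else:
                rest.append(i)
        total = sum(weights)
        if total > 0:
            box = tuple(
                sum(w * boxes[i][k] for w, i in zip(weights, cluster)) / total
                for k in range(4)
            )
        else:  # all scores are zero: fall back to the top box
            box = boxes[top]
        merged.append((box, scores[top]))
        order = rest
    return merged


if __name__ == "__main__":
    dets = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    for box, score in nmw(dets, [0.9, 0.8, 0.7]):
        print([round(v, 2) for v in box], score)
```

Compared with plain NMS, the merged box uses the coordinates of lower-scoring neighbours as well, which can smooth out localization noise from any single detection.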
PFDet: High Scalability
Hardware: In-house GPU Cluster
● NVIDIA Tesla V100 (32GB) × 512
● InfiniBand
Scalability Results
● Training of 16 epochs completed in 33 hours
● Scaling efficiency is 83% compared to 8 GPUs
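A small sketch of how a scaling-efficiency figure like this is usually computed, assuming the standard definition (actual speedup divided by ideal linear speedup) and assuming the 33-hour run used all 512 GPUs. Neither assumption is stated on the slide, and the function names are illustrative.

```python
def scaling_efficiency(t_base_h: float, n_base: int, t_scaled_h: float, n_scaled: int) -> float:
    """Actual speedup relative to ideal linear speedup."""
    ideal_speedup = n_scaled / n_base
    actual_speedup = t_base_h / t_scaled_h
    return actual_speedup / ideal_speedup


def implied_base_time(t_scaled_h: float, n_base: int, n_scaled: int, efficiency: float) -> float:
    """Training time on the baseline GPU count implied by a given efficiency."""
    return t_scaled_h * (n_scaled / n_base) * efficiency


if __name__ == "__main__":
    # Assumption: 33 hours on 512 GPUs at 83% efficiency relative to 8 GPUs.
    t8 = implied_base_time(33.0, 8, 512, 0.83)
    print(f"implied 8-GPU training time: {t8:.0f} hours")
    print(f"efficiency check: {scaling_efficiency(t8, 8, 33.0, 512):.2f}")
```

Under these assumptions, the same 16 epochs on 8 GPUs would take on the order of weeks rather than hours, which is the practical point of the scalability result.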
Software Framework
Data Collection
System Performance
● Object detection
○ Accuracy: 0.90 mIoU (segmentation mask)
● Robot system (actual measurement at CEATEC)
○ Tidying-up speed: 1.9 objects / minute
○ Grasp success rate: ~90%
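For reference, a minimal sketch of a mask-based mean IoU. The slide does not say whether the 0.90 figure is averaged per instance or per class; this sketch assumes a plain average over (predicted, ground-truth) mask pairs, and all names are illustrative.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0  # two empty masks count as a perfect match


def mean_iou(pairs: List[Tuple[Mask, Mask]]) -> float:
    """Average IoU over (prediction, ground truth) pairs."""
    return sum(mask_iou(p, t) for p, t in pairs) / len(pairs)


if __name__ == "__main__":
    exact = ([[1, 1], [0, 0]], [[1, 1], [0, 0]])
    half = ([[1, 1], [0, 0]], [[1, 0], [0, 0]])
    print(mean_iou([exact, half]))  # 0.75
```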
Robustness of Object Detection
[Figure: detection results in sparse and dense scenes]
Typical Errors
● Mango vs. lemon
● Mis-recognition on humans
● Whiteout
● False negative in clutter
Human–Robot Interaction (HRI)
● From user to robot
○ Update where the current item should be stored
○ Inquire about object locations
● From robot to user
○ Spoken and audio feedback
○ Tablet App for monitoring
■ User can also provide feedback
■ AR-based visualization
● Technologies involved: speech recognition, NLP, gesture, AR
Tablet UI
Remaining Challenges with Tidying-up
● Standalone computation (no external sensor or computer)
● Recognition of unlimited items in domestic environments
● Generalization to unseen environments
● Easy setup
Robots as an Interface with the Physical World
● Domestic robots can track household items while tidying up,
connecting everything in the physical world to the virtual world.
● Potential applications:
○ E-commerce
○ Recommendations on item purchase or disposal
Key Takeaways
● Robust computer vision and an intuitive human–robot interface are
prerequisites for successful personal robot applications.
● Some simple domestic tasks, such as tidying up, are getting close to
production level.
● Robots are an interface with the physical world, computerizing household
items and connecting them to online services.
Thank you!
Interactive picking: https://pfnet.github.io/interactive-robot/
Tidying-up robot: https://projects.preferred.jp/tidying-up-robot/en/
Related talks
● S9380 - The Frontier of Define-by-Run Deep Learning Frameworks
Wed, Mar 20, 11:00 AM - 11:50 AM – SJCC Room 210E
● S9738 - Using GPU Power for NumPy-syntax Calculations
Tue, Mar 19, 2:00 PM - 2:50 PM – SJCC Room 210F

[GTC 2019] Bringing Personal Robots Home: Integrating Computer Vision and Human–Robot Interaction for Real-World Applications

  • 1. Bringing Personal Robots Home [S9360] Integrating Computer Vision & Human–Robot Interaction for Real-World Applications NVIDIA GTC 2019 (Mar 18, 2019) Jun Hatori, Preferred Networks
  • 2.
  • 3. Requirements for Robots (Industrial / Personal) ● Cost: high / low ● Environment: fixed, known, structured / dynamic, unstructured, unseen ● Users: experts / non-experts ● Goal: automation / intelligence, personalization
  • 4. Requirements for Robots (Industrial / Personal / Key technology) ● Cost: high / low / hardware ● Environment: fixed, known, structured / dynamic, unstructured, unseen / computer vision ● Users: experts / non-experts / human–robot interaction ● Goal: automation / intelligence, personalization / task planning
  • 6. A variety of real-world environments
  • 7. PR1: Wyrobek et al. 2008
  • 8. Key Technologies ● Computer Vision: Generalization to different environments and tasks ○ Object detection of thousands of categories ○ Support for unseen environments and unseen objects ● Human–Robot Interaction: Interface between humans and robots ○ Intuitive interface with spoken and visual language interpretation ○ Spoken and visual feedback from robots
  • 9. Two Projects ● Interactive picking robot ● Autonomous tidying-up robot
  • 10. Interactively Picking Real-World Objects https://projects.preferred.jp/interactive-robot/
  • 11. Challenges ● Variety of Expressions “a bear doll”, “the animal plushie”, “that fluffy thing”, “up-side-down grizzly” “grab X”, “bring together X and Y”, “move X to a diagonal box” ● Ambiguity and errors “that brown one”, “a dog doll?”
  • 12. Human: the one next to the eraser box. Robot: I got it. Human: hey can you move that brown fluffy thing to the bottom right? Robot: which one do you mean?
  • 13.
  • 14. Proposed Model [Architecture diagram] ● Language branch: speech (transcription), embedding, LSTM, MLP ● Vision branch: vision (RGB), SSD, cropped images, CNN (+feat.), MLP ● Outputs: Target Obj., Destination ● Example instruction: “pick the brown fluffy thing and put in the lower bin.”
  • 15. Handling Ambiguous Commands ● Trained with a hinge loss on correct sentence–object pairs [Yu+ 2017] ● An instruction is considered ambiguous if the margin between the 1st- and 2nd-ranked candidate objects is below a threshold [Diagram: CNN and MLP blocks for candidate objects, LSTM and MLP for the instruction; example instruction: “pick the brown fluffy thing and put it in the lower right bin.”]
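The margin rule on slide 15 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `hinge_loss` and `pick_or_ask`, the margin of 0.1, the threshold of 0.1, and the example scores are all assumptions chosen for illustration. The scores stand in for the sentence–object matching scores that the model would produce.

```python
def hinge_loss(score_correct, score_wrong, margin=0.1):
    """Max-margin ranking loss for one (correct, incorrect) score pair.

    The loss is zero once the correct pair outscores the incorrect one
    by at least `margin` (an assumed value, not taken from the slides).
    """
    return max(0.0, margin + score_wrong - score_correct)


def pick_or_ask(scores, threshold=0.1):
    """Return (best_index, is_ambiguous) from per-object matching scores.

    The instruction is flagged as ambiguous when the gap between the
    best and second-best score falls below `threshold` (assumed value).
    """
    if not scores:
        raise ValueError("no candidate objects")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best = ranked[0]
    if len(ranked) == 1:
        return best, False
    gap = scores[best] - scores[ranked[1]]
    return best, gap < threshold


# One clear winner: the robot can act directly.
print(pick_or_ask([0.82, 0.15, 0.10]))
# Two close candidates: the robot should ask a clarifying question.
print(pick_or_ask([0.48, 0.45, 0.05]))
```

In a dialogue such as the one on slide 12, an ambiguous result would trigger a question like "which one do you mean?" instead of a pick.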
  • 16. Interactive Picking Dataset ● Example instructions: “grab the human face labeled object and …”, “move the pop red can from the top …”, “move the pink horse plushie …”, “put the box with a 50 written on it that is …” ● Publicly available as PFN-PIC dataset: https://github.com/pfnet-research/picking-instruction ● 1200 scenes (26k objects in total) ● 100 types of commodities ● 73k unconstrained instructions (vocabulary size: 5000)
  • 17.
  • 19. Results ● Accuracy of target object matching: single instruction 88.0%, interactive 92.7% ● 4.7-point improvement (39% error reduction) by interactive clarification
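The two figures on slide 19 follow from the reported accuracies. The short Python sketch below reproduces the arithmetic; the helper name `error_reduction` is an illustrative assumption.

```python
def error_reduction(acc_before, acc_after):
    """Relative reduction in error rate when accuracy improves."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return (err_before - err_after) / err_before


single, interactive = 0.880, 0.927           # accuracies from the slide
points = (interactive - single) * 100        # absolute gain in percentage points
relative = error_reduction(single, interactive)

# Error rate drops from 12.0% to 7.3%, i.e. 4.7 / 12.0 of the errors are removed.
print(f"{points:.1f} points, {relative:.0%} relative error reduction")
# prints "4.7 points, 39% relative error reduction"
```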
  • 20. Summary ● We proposed an interactive picking system that can be controlled by unconstrained spoken language instructions. ● We achieved an object matching accuracy of 92.7%. ● Accuracy on unseen objects is still insufficient (~70%). * Hatori+ 2018. Interactively Picking Real-World Objects with Unconstrained Spoken Language Instructions. ICRA 2018 Best Paper on HRI.
  • 21.
  • 23. CEATEC JAPAN 2018 (Oct 16–19, 2018)
  • 24.
  • 25. Environment ● Furnished living room ○ Coffee table, couch, bookshelf, trash bins, laundry bag, toy box ● Two Toyota HSRs working in parallel
  • 26. Object Recognition ● Sensors ○ HSR’s head camera (RGB-D) ○ 4 ceiling cameras (RGB) ● Supported objects: ~300 ● PFDet as CNN base model ○ 2nd place in the Google AI Open Images Challenge 2018, Object Detection track (Sep 2018)
  • 27. PFDet: Basic Architecture [1] ● Feature Pyramid Network (FPN) (SENet-154 and SE-ResNeXt-101) ● Multi-node batch normalization ● Non-maximum weighted (NMW) suppression [2] ● Global context ○ Additional FPN block ○ PSP (pyramid spatial pooling) module ○ Context head [3] [1] Akiba+ 2018. PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track. [2] Zhou+. CAD: Scale invariant framework for real-time object detection. ICCVW 2017. [3] Zhu+. CoupleNet: Coupling global structure with local parts for object detection. ICCV 2017.
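Non-maximum weighted (NMW) suppression, listed on slide 27, merges overlapping detections instead of discarding them. The Python sketch below shows one common formulation and is not the PFDet code: the weighting (score times IoU with the top-scoring box), the IoU threshold of 0.5, and the names `iou` and `nmw` are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nmw(boxes, scores, iou_thresh=0.5):
    """Merge overlapping boxes into weighted averages.

    Returns a list of (merged_box, score) pairs, where the score is that
    of the highest-scoring box in each cluster.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    merged = []
    while order:
        top = order[0]
        # The top box always belongs to its own cluster.
        cluster = [top] + [i for i in order[1:]
                           if iou(boxes[top], boxes[i]) >= iou_thresh]
        weights = [scores[top]] + [scores[i] * iou(boxes[top], boxes[i])
                                   for i in cluster[1:]]
        total = sum(weights)
        if total > 0:
            box = tuple(sum(w * boxes[i][k] for w, i in zip(weights, cluster)) / total
                        for k in range(4))
        else:
            box = tuple(float(v) for v in boxes[top])
        merged.append((box, scores[top]))
        order = [i for i in order if i not in cluster]
    return merged


detections = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
confidences = [0.9, 0.6, 0.8]
for box, score in nmw(detections, confidences):
    print([round(v, 2) for v in box], score)
```

Compared with plain non-maximum suppression, the first two boxes here produce one averaged box rather than the second simply being dropped.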
  • 28. PFDet: High Scalability ● Hardware: In-house GPU Cluster ○ NVIDIA Tesla V100 (32GB) × 512 ○ InfiniBand ● Scalability Results ○ Training of 16 epochs completed in 33 hours ○ Scaling efficiency is 83% compared to 8 GPUs ● Software Framework
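Scaling efficiency is usually defined as the measured speedup divided by the ideal linear speedup. The Python sketch below uses that definition and assumes the 83% figure on slide 28 refers to the full 512-GPU run measured against an 8-GPU baseline. The function names and the throughput numbers in the example are hypothetical.

```python
def scaling_efficiency(throughput_base, gpus_base, throughput_scaled, gpus_scaled):
    """Measured speedup divided by the ideal (linear) speedup."""
    ideal_speedup = gpus_scaled / gpus_base
    actual_speedup = throughput_scaled / throughput_base
    return actual_speedup / ideal_speedup


def implied_speedup(efficiency, gpus_base, gpus_scaled):
    """Speedup over the baseline implied by a given scaling efficiency."""
    return efficiency * gpus_scaled / gpus_base


# With 83% efficiency, 512 GPUs are about 53x faster than 8 GPUs,
# compared with 64x for perfectly linear scaling.
print(implied_speedup(0.83, 8, 512))

# Hypothetical throughputs (images/s) that correspond to 83% efficiency.
print(scaling_efficiency(100.0, 8, 5312.0, 512))
```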
  • 30. System Performance ● Object detection ○ Accuracy: 0.90 mIoU (segmentation mask) ● Robot system (actual measurement at CEATEC) ○ Tidying-up speed: 1.9 objects/minute ○ Grasp success rate: ~90%
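The mIoU figure on slide 30 is a mean intersection over union between predicted and ground-truth segmentation masks. The slides do not say whether the mean is taken per object, per class, or per image, so the Python sketch below assumes a plain average over mask pairs. The function names and the tiny binary masks are illustrative only.

```python
def mask_iou(pred, truth):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (predicted, ground-truth) mask pairs."""
    return sum(mask_iou(p, t) for p, t in pairs) / len(pairs)


pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),   # IoU = 1 / 2
    ([[0, 1], [0, 1]], [[0, 1], [0, 1]]),   # IoU = 2 / 2
]
print(mean_iou(pairs))
```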
  • 31. Robustness of Object Detection (panels: Sparse, Dense)
  • 32.
  • 33. Typical Errors ● Mango vs. lemon ● Mis-recognition on humans ● Whiteout ● False negative in clutter
  • 34.
  • 35. Human–Robot Interaction (HRI) ● From user to robot ○ Update where the current item should be stored ○ Inquire about object locations ● From robot to user ○ Spoken and audio feedback ○ Tablet App for monitoring ■ User can also provide feedback ■ AR-based visualization ● Technologies involved: speech recognition, NLP, gesture, AR
  • 39.
  • 40. Remaining Challenges with Tidying-up ● Standalone computation (no external sensor or computer) ● Recognition of unlimited items in domestic environments ● Generalization to unseen environments ● Easy setup
  • 41. Robots as Interface with Physical World ● Domestic robots can track household items while tidying up, connecting everything in the physical world to the virtual world. ● Potential applications: ○ E-commerce ○ Recommendations on item purchase or disposal
  • 42. Key Takeaways ● Robust computer vision and an intuitive human–robot interface are prerequisites for successful personal robot applications. ● Some simple domestic tasks, such as tidying up, are getting close to production level. ● Robots are an interface with the physical world, computerizing household items and connecting them to online services.
  • 43. Thank you! Interactive picking: https://pfnet.github.io/interactive-robot/ Tidying-up robot: https://projects.preferred.jp/tidying-up-robot/en/ Related talks ● S9380 - The Frontier of Define-by-Run Deep Learning Frameworks Wed, Mar 20, 11:00 AM - 11:50 AM – SJCC Room 210E ● S9738 - Using GPU Power for NumPy-syntax Calculations Tue, Mar 19, 2:00 PM - 02:50 PM – SJCC Room 210F