This was presented at Augmented World Expo in Santa Clara (#AWE2017). A video of the live Natural Feature Tracking demo will be uploaded and linked from here soon.
The key benefit of using AR in your web browser is how quick and easy it is to share. You can send a single web link through social media or email, and the recipient can just tap the link and it works. But until recently this has not included Computer Vision based AR, for a number of technical and market reasons. This presentation will position the web browser in the overall context of Computer Vision history, and we'll look at how this has evolved through developments including jsartoolkit.js, tracking.js and AR.js. We'll then dive deeper into the latest developments to show how OpenCV performs running in the browser and how this compares to native applications. This deep dive will compare the different feature detection/extraction algorithms and how they perform on some well-known image data sets. The session will conclude with demos that show how this all works right now in over 2 billion capable web browsers.
Computer Vision - now working in over 2 Billion Web Browsers!
CEO & co-founder
Computer Vision Engineer
Mixed Reality. In the web. On any device.
So what is Mixed Reality?
Here’s a short demo of Milgram’s Mixed Reality Continuum - all running in a browser.
A brief/biased history of Computer Vision
1957 - Russell A. Kirsch scans first photo with a computer
1960 - Larry Roberts publishes thesis at MIT
1964 - First facial recognition system (unnamed intelligence agency)
1976 - UK Police create first License Plate recognition system
1978 - David Marr proposes edge detection framework at MIT
1985 - Lockheed Martin/Carnegie Mellon create first self-driving land vehicle
1992 - Tom Caudell at Boeing coins the term Augmented Reality
1999 - Billinghurst & Kato publish/demo ARToolkit at IWAR/SIGGRAPH
2000 - Windows only alpha version of OpenCV launched at CVPR
2007 - OpenCV 1.0 released
2008 - ARToolkit ported to Flash by @saqoosha
2011 - FastCV/Vuforia 1.0 released
2017 - Facebook adds Computer Vision to their camera app
2017 - OpenCV in the browser demonstrated here
How does Computer Vision work in the browser?
camera -> gUM -> video -> canvas -> pixels -> vision algorithms
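The last three stages of that pipeline (video -> canvas -> pixels -> vision algorithms) can be sketched roughly as below. This is a minimal illustration, not the code from the talk: `grabFrame`, `toGrayscale`, `FrameLike` and `Context2DLike` are hypothetical names, the small interfaces stand in for the browser's 2D canvas context so the sketch also compiles outside a browser, and the BT.601 luma weights are an assumed (but common) choice for the grayscale step.

```typescript
// Minimal structural types (hypothetical names) mirroring the shape of a
// browser 2D canvas context and the image data it returns.
interface FrameLike {
  data: Uint8ClampedArray; // RGBA bytes, 4 per pixel, row-major
  width: number;
  height: number;
}

interface Context2DLike {
  drawImage(source: unknown, dx: number, dy: number, dw: number, dh: number): void;
  getImageData(sx: number, sy: number, sw: number, sh: number): FrameLike;
}

// video -> canvas -> pixels: paint the current video frame onto the canvas,
// then read the painted region back as raw RGBA bytes.
function grabFrame(
  ctx: Context2DLike,
  video: unknown,
  width: number,
  height: number
): FrameLike {
  ctx.drawImage(video, 0, 0, width, height);
  return ctx.getImageData(0, 0, width, height);
}

// pixels -> vision algorithms: many detectors start from a grayscale image.
// Assumption: BT.601 luma weights; the talk does not say which were used.
function toGrayscale(frame: FrameLike): Uint8Array {
  const gray = new Uint8Array(frame.width * frame.height);
  for (let i = 0, p = 0; i < gray.length; i++, p += 4) {
    const r = frame.data[p];
    const g = frame.data[p + 1];
    const b = frame.data[p + 2];
    gray[i] = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
  }
  return gray;
}
```

In a real page the `video` argument would be a video element whose `srcObject` is the stream returned by `navigator.mediaDevices.getUserMedia({ video: true })`, and `ctx` would come from `canvas.getContext("2d")`. Calling `grabFrame` once per animation frame and passing the result to `toGrayscale` gives the vision code a fresh image to work on.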
The HTML5 video element is a container for decoding and presenting video streams.
It brought plugin-free video to the web.
Canvas, WebGL & the ArrayBuffer
The 2D Canvas gave us the ability to convert a video stream into pixel data.
WebGL brought 3D Canvases with access to the GPU.
But most importantly, WebGL gave us ArrayBuffers, which allowed us to access the pixel data for the first time.
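To make the ArrayBuffer point concrete, here is a small sketch of how pixel data sits in one buffer and can be read through different typed-array views without copying. The variable and function names (`pixelBuffer`, `channelBytes`, `pixelWords`, `pixelOffset`) are illustrative only, and the 2x2 image is made up for the example.

```typescript
// One ArrayBuffer holding a 2x2 RGBA image (4 bytes per pixel).
const imageWidth = 2;
const imageHeight = 2;
const pixelBuffer = new ArrayBuffer(imageWidth * imageHeight * 4);

// Two views over the same memory: per-channel bytes and whole 32-bit pixels.
const channelBytes = new Uint8ClampedArray(pixelBuffer);
const pixelWords = new Uint32Array(pixelBuffer);

// Byte offset of pixel (x, y) in a row-major RGBA layout.
function pixelOffset(x: number, y: number, width: number): number {
  return (y * width + x) * 4;
}

// Write an opaque orange pixel at (1, 0) through the byte view.
const offset = pixelOffset(1, 0, imageWidth);
channelBytes[offset] = 255;     // R
channelBytes[offset + 1] = 128; // G
channelBytes[offset + 2] = 0;   // B
channelBytes[offset + 3] = 255; // A

// The 32-bit view sees the same pixel immediately, because both views share
// the buffer. The byte order inside each word depends on platform endianness,
// so a DataView with an explicit little-endian read is used for a portable value.
const wordFromView = pixelWords[offset / 4];
const packedLittleEndian = new DataView(pixelBuffer).getUint32(offset, true);
```

Sharing one buffer between views is what makes per-pixel work affordable in JavaScript: a vision routine can scan `pixelWords` one pixel at a time, or `channelBytes` one channel at a time, over exactly the same memory.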
Enter WebRTC's getUserMedia()
Some claim this has a latency that makes the web unusable for AR.
But here are the numbers running on a Pixel - the max difference is ~200ms (400ms vs 200ms)
200-250ms - Camera stream in a native AR app
350-400ms - gUM stream in a web app
FAST feature detection & Tigerstail in 2012
Tracking.js released in 2012
AR.js released in 2017
This brings a more general computer vision toolkit to the web!
But there's no gUM on iOS?
For Vision based functionality we fall back to Visual Search
For Location based apps we fall back to 360°/VR (like Pokemon Go with the camera off)
And remember “video see-through” is not the only form of AR