First Thesis Presentation

More information about our thesis can be found on our blog: http://augmentjapan.wordpress.com/

Published in: Education, Technology, Business

  1. 1. Matthias Vandenbussche, Annelies Van der Borght
     Promotor: Erik Duval
     Supervisor: Sten Govaerts
     Stand-in Supervisor: Gonzalo Parra
     http://augmentjapan.wordpress.com/
  2. 2.
     • Introduction
     • Application
     • Related Work
     • Survey
     • Locating Text
     • Preprocessing Text Regions
     • Optical Character Recognition (OCR)
     • Translations
     • Paper Prototype
     • Technologies
     • Schedule
     • Statistics
  3. 6.
     • Mobile application
     • Camera feed
     • Find text
     • Translate
     • Augmented reality display
  4. 7.
     • Input: image or video
     • Locating text: automatic, user input, or a combination
     • Translation: word per word or whole sign
  5. 8. 225
  6. 17. Problems:
     • Most lack detail
     • Most use Support Vector Machines (SVM) or Neural Networks (NN)
  7. 20.
     • TiRG
     • Stroke Width Transform (B. Epshtein et al., 2010)
     • Requiring user interaction
  8. 23. NHOCR, Tesseract, WiseTREND, Aspire, ABBYY FineReader
  9. 24.
     • Best recognition rates: NHOCR 39.5%, Tesseract 49%, ABBYY 65.5%
     • Best perfect match rates: NHOCR 11%, Tesseract 3%, ABBYY 31%
  10. 25.
     • Translation services: Google Translate, Bing Translator, SYSTRAN, InterTran, WordLingo, myGengo, OneHourTranslation, Apertium
     • Evaluation metrics: F-measure, BLEU, NIST
     • Ranges: [0;1], [0;1], ? [0;11]
  11. 30.
     • “We are very sorry for the inconvenience”
     • “Sorry, my wife might read over the evil described in 菩”
  12. 35.
     • Pre-test questionnaire
     • Briefing
     • 4 scenarios
     • Post-test questionnaire
     • System Usability Scale (SUS)
  13. 37.
     • Video feed on startup
       ► Information message on startup
     • Option menu flow on Android
       ► Remove buttons
     • “X” in iPhone option menu
       ► Remove or rename to “Done”
     • “Languages”, “Translation Services”
       ► Renaming of options
  14. 38.
     • Access camera feed
     • Alter camera feed
     • Locate text
     • Track text
     • Access to accelerometer
  15. 39.
     • Columns: Camera, Augment video, Locating text, Tracking, Sensor
     • Rows: HTML5 (iPhone, Android phone); PhoneGap (iPhone, Android phone); Native app (iPhone, Android phone)
  16. 44.
     • Now to March: finishing implementation
     • Early March: first iteration
     • Early April: second iteration
     • ~16 April: comparative user tests
     • Early May to the end: writing the final text
  17. 45.
     • Own blog:
         • 19 posts
         • 15 comments
     • Matthias:
         • 26 comments on other blogs
         • 172 #thesis11
         • 243h 45min worked
     • Annelies:
         • 15 comments on other blogs
         • 110 #thesis11
         • 247h 45min worked
  18. 46.
     • Camera-based Kanji OCR for Mobile-phones: Practical Issues (M. Koga et al., 2005)
     • Translation camera on mobile phone (Y. Watanabe et al., 2003)
     • TranslatAR: A mobile augmented reality translator (V. Fragoso et al., 2011)
     • Automatic detection and translation of text from natural scenes (J. Yang et al., 2002)
     • TiRG: http://sourceforge.net/projects/tirg/
     • Detecting Text in Natural Scenes with Stroke Width Transform (B. Epshtein et al., 2010)
       Implementation at: https://sites.google.com/site/roboticssaurav/strokewidthnokia
     • Text/Graphics Separation and Skew Correction of Text Regions of Business Card Images for Mobile Devices (A. F. Mollah et al., 2010)
     • Correction of perspective text image based on gradient method (L. Tong and Y. Zhang, 2010)
     • Reliable measures for aligning Japanese-English news articles and sentences (M. Utiyama and H. Isahara, 2003)
       The set can be found at: http://mastarpj.nict.go.jp/%7Emutiyama/jea/
     • F-measure: Evaluation of machine translation and its evaluation (J. Turian et al., 2003)
     • NIST: Automatic evaluation of machine translation quality using n-gram co-occurrence statistics (G. Doddington, 2002)
     • BLEU: a method for automatic evaluation of machine translation (K. Papineni et al., 2002)
     • SUS - A quick and dirty usability scale (J. Brooke, 1996)
  19. 47.
     • A comprehensive method for multilingual video text detection, localization, and extraction (M. R. Lyu et al., 2005)
     • A real-time tracker for markerless augmented reality (A. I. Comport et al., 2003)
     • A robust text detection algorithm in images and video frames (Qixiang Ye et al., 2003)
     • Automatic detection and recognition of signs from natural scenes (Xilin Chen et al., 2004)
     • Automatic detection and translation of text from natural scenes (Jie Yang et al., 2002)
     • Automatic text location in images and video frames (A. K. Jain and Bin Yu, 1998)
     • Camera-based Kanji OCR for Mobile-phones: Practical Issues (M. Koga et al., 2005)
     • Comparative Evaluation of Online Machine Translation Systems with Legal Texts (Chunyu Kit and Tak Ming Wong, 2008)
     • Correction of perspective text image based on gradient method (Lijing Tong and Yan Zhang, 2010)
     • Design-based research: what we learn when we engage in design of interactive systems (Željko Obrenović, 2007)
     • Detecting Text in Natural Scenes with Stroke Width Transform (Boris Epshtein et al., 2010)
     • Detection of Text on Road Signs From Video (W. Wu et al., 2005)
     • Evaluation of machine translation and its evaluation (Joseph P. Turian et al., 2003)
     • Fast and robust text detection in images and video frames (Q. Ye et al., 2005)
     • Kanji recognition in scene images without detection of text fields - robust against variation of viewpoint, contrast, and background texture (A. Suzuki et al., 2004)
     • Markerless augmented reality with a real-time affine region tracker (V. Ferrari et al., 2001)
     • Multiple target detection and tracking with guaranteed framerates on mobile phones (D. Wagner et al., 2009)
     • Performance Evaluation for Text Localization Algorithms: An Empirical Study (Yi-Feng Pan and Cheng-Lin Liu, 2010)
     • Real-time vision-based camera tracking for augmented reality applications (Dieter Koller et al., 1997)
     • Robust text detection in natural images with edge-enhanced maximally stable extremal regions (Huizhong Chen et al., 2011)
     • Sequential correction of perspective warp in camera-based documents (Camille Monnier et al., 2005)
     • Snoopertext: A multiresolution system for text detection in complex visual scenes (R. Minetto et al., 2010)
     • Text Detection on Nokia N900 Using Stroke Width Transform (Saurav Kumar and Andrew Perrault, 2010)
     • Text extraction of street level images (J. Fabrizio et al., 2009)
     • Text information extraction in images and video: a survey (K. Jung, 2004)
     • Text locating from natural scene images using image intensities (Jisoo Kim et al., 2005)
     • Text/Graphics Separation and Skew Correction of Text Regions of Business Card Images for Mobile Devices (Ayatullah Faruk Mollah et al., 2010)
     • TranslatAR: A mobile augmented reality translator (Victor Fragoso et al., 2011)
     • Translation and the Internet: Evaluating the Quality of Free Online Machine Translators (Stephen Hampshire and Carmen Porta Salvia, 2010)
     • Translation camera on mobile phone (Y. Watanabe et al., 2003)
     • Video text recognition using feature compensation as category-dependent feature extraction (M. Mori, 2003)
  20. 48.
     • A Fast Skew Correction Technique for Camera Captured Business Card Images (A. F. Mollah, 2009)
     • A new robust algorithm for video text extraction (E. Wong, 2003)
     • An evaluation tool for machine translation: Fast evaluation for MT research (S. Nießen et al., 2000)
     • An Overview of the Tesseract OCR Engine (Ray Smith, 2007)
     • Automatic evaluation of machine translation quality using n-gram co-occurrence statistics (George Doddington, 2002)
     • Automatic location of text in video frames (Xian-Sheng Hua et al., 2001)
     • BLEU: a Method for Automatic Evaluation of Machine Translation (Kishore Papineni et al., 2002)
     • Camera-based analysis of text and documents: a survey (Jian Liang et al., 2005)
     • Character extraction of license plates from video (Y. T. Cui and Q. Huang, 1997)
     • Color Edge Detection Using Multiscale Quaternion Convolution (Jiangyan Xu et al., 2010)
     • Connected components labeling - algorithms in Mathematica, Java, C++ and C# (Mariusz Jankowski and Jens-Peer Kuska, 2004)
     • End-to-End Scene Text Recognition (Kai Wang et al., 2011)
     • Error Evaluation and Applicability of OCR Systems (V. Alexandrov, 2003)
     • Extraction of illusory linear clues in perspectively skewed documents (M. Pilu, 2001)
     • Fast, cheap, and creative: Evaluating translation quality using Amazon's Mechanical Turk (Chris Callison-Burch, 2009)
     • From Mirroring to Guiding: A Review of State of the Art Technology for Supporting Collaborative Learning (Amy Soller et al., 2005)
     • Improvement of video text recognition by character selection (T. Mita and O. Hori, 2001)
     • JEIDA's Test-Sets for Quality Evaluation of MT Systems: Technical Evaluation from the Developer's Point of View (Hitoshi Isahara, 1995)
     • Kanji Character Detection from Complex Real Scene Images based on Character Properties (Lianli Xu et al., 2008)
     • Localizing and segmenting text in images and videos (Rainer Lienhart and Axel Wernicke, 2002)
     • Locating text in complex color images (Y. Zhong et al., 1995)
     • Marker-less Vision Based Tracking for Mobile Augmented Reality (D. Beier et al., 2003)
     • Objective evaluation criteria for machine translation (A. J. Petit, 1977)
     • Perspective Correction Methods for Camera-Based Document Analysis (L. Jagannathan and C. V. Jawahar, 2005)
     • Re-evaluating machine translation results with paraphrase support (Liang Zhou et al., 2006)
     • Re-evaluating the Role of BLEU in Machine Translation Research (Chris Callison-Burch et al., 2006)
     • Reliable measures for aligning Japanese-English news articles and sentences (Masao Utiyama and Hitoshi Isahara, 2003)
     • SUS - A quick and dirty usability scale (John Brooke, 1996)
     • Text detection and segmentation in complex color images (C. Garcia and X. Apostolidis, 2000)
     • Text scanner with text detection technology on image sequences (T. Kurata and M. Kourogi, 2002)
     • TextFinder: An Automatic System to Detect and Recognize Text In Images (Victor Wu et al., 1999)
     • Using multiple edit distances to automatically grade outputs from Machine translation systems (Yasuhiro Akiba et al., 2006)
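
Slide 24 reports recognition rates and perfect match rates for NHOCR, Tesseract, and ABBYY, but the transcript does not define either measure. The sketch below shows one plausible reading, not the thesis's actual method: it assumes the recognition rate is character-level accuracy derived from edit distance, and the perfect match rate is the share of samples recognized without any error. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def recognition_rate(truth, output):
    """Assumed definition: 1 - (edit distance / ground-truth length), floored at 0."""
    if not truth:
        return 1.0 if not output else 0.0
    return max(0.0, 1.0 - edit_distance(truth, output) / len(truth))


def perfect_match_rate(pairs):
    """Assumed definition: fraction of (truth, output) pairs that match exactly."""
    if not pairs:
        return 0.0
    return sum(1 for truth, output in pairs if truth == output) / len(pairs)
```

Under these assumptions an engine can reach a fairly high recognition rate while still having a low perfect match rate, because a single wrong character spoils an otherwise correct sample.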

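Slide 25 lists F-measure among the translation evaluation metrics, with a range of [0;1]. As a rough illustration of how such a score behaves, here is a simplified unigram version, assuming whitespace tokenization and a single reference translation. The measure of Turian et al. (2003) cited in the references also rewards longer matching runs, which this sketch leaves out. The function name is hypothetical.

```python
from collections import Counter


def unigram_f_measure(candidate, reference):
    """Harmonic mean of unigram precision and recall, in the range [0, 1]."""
    candidate_tokens = candidate.lower().split()
    reference_tokens = reference.lower().split()
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Multiset intersection: each reference word can be matched only once.
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)
```

A candidate identical to the reference scores 1.0, and one that shares no words with it scores 0.0.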
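
Slide 35 closes the paper-prototype test with the System Usability Scale (SUS). A minimal sketch of standard SUS scoring (Brooke, 1996) follows, assuming each participant answers all ten items on a 1 to 5 scale in the original item order. The function name is hypothetical.

```python
def sus_score(responses):
    """Return the SUS score (0 to 100) for ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution is response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution is 5 - response.
            total += 5 - response
    # The summed contributions (0 to 40) are scaled to 0 to 100.
    return total * 2.5
```

For example, `sus_score([3] * 10)` gives 50.0, the score of a participant who is neutral on every item.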