Multimodal Interaction: An Introduction

Abdallah 'Abdo' El Ali
http://staff.science.uva.nl/~elali/

Some slides adapted from:
Gabriel Skantze (KTH Royal Institute of Technology, Sweden),
Denis Lalanne (University of Fribourg, Switzerland)
Who am I?

- Currently: PhD in Mobile Human-Computer Interaction - UvA
  - Crossmodal Interaction in Mobile Environments
- MSc in Cognitive Science - UvA
  - Cognition, Language, & Communication track
- BSc in English Language & Literature - American University of Beirut
  - Screenwriting, Copywriting, Editing
Outline

I.   Multimodal Interaction & Interfaces
II.  Multimodal Input
III. Multimodal Output
IV.  Practical Matters
Multimodal Interaction & Interfaces
A Brief History of Computer Interfaces

- Punched cards (late 19th century)
  - Herman Hollerith - Tabulating Machine Company (1896)
- The Command Line Interface (1960s)
- Sketchpad (1963) by Ivan Sutherland: a light-pen, pointer-based system to create and manipulate objects in drawings
- Alto personal computer (1973) developed at Xerox PARC
  - Desktop metaphor, WIMP (windows, icons, menus, pointing device)
  - WYSIWYG
- Xerox 8010 Star Information System (1981)
- Apple Macintosh (1984)
- Windows 1.01 (1985)
- Microsoft Windows 3.0 (1990)
- Mac OS X (2000s)
- [...]
Multimodal Interfaces
[Images: Project Natal for Xbox 360, Kinect for Xbox 360, PlayStation EyePet, PlayStation Move]
HCI and Human Characteristics

- HCI is a multi-disciplinary topic
  - Computer Science & AI
  - Cognitive Science
  - Sociology
  - Psychology
  - Design
  - [...]

- In HCI design, it is important to understand something about
  - Human information-processing (cognitive architecture, memory, perception, motor skills, etc.)
  - How human action is structured
  - The nature of human communication
  - Human physical and physiological requirements/constraints
Why HCI?

- Humans are limited in their capacity to process information
  - Implications for the interaction design
  - Multitasking says it all

- Important considerations
  - Input-output channels (senses and effectors)
  - Memory
  - Learning (acquiring skills)
  - Reasoning / problem solving (cognitive activity)
  - Decision making
Use Case: Mobile Interaction

Distinctive aspects of mobile interaction (Chittaro, 2010):

- Hardware: small screen, limited I/O
- Perceptual: noisy street, sunlight reflection, no device contact
- Motor: involuntary movements when in-vehicle, fat-finger problem
- Social: phone ring at a conference, gestures in front of strangers
- Cognitive: limited attention span, high stress & load, limited memory
Embodiment

Embodied Cognition, Situated Cognition, Embodied Interaction, EEC, Social Computing, Tangible Computing, Active Perception, [...]

- Gibson (1979) "The Ecological Approach to Visual Perception"
  - "...perceiving is an act not a response, an act of attention, not a triggered impression, an achievement, not a reflex"

- Heidegger (1927) "Being and Time"
  - Present-at-hand vs. ready-to-hand
  - e.g., hammer as object (presence) vs. hammer as tool (cognitive extension)
  - e.g., mouse as hardware vs. mouse as tool for performing GUI operations

- Dourish (1999) "Foundations of Embodied Interaction"
  - "...interaction is an embodied phenomenon. It happens in the world, and that world (a physical world and a social world) lends form, substance and meaning to the interaction."

- Sensori-motor coordination (Agent <-> World)
  - Perception for action
  - Action for perception
Sensation & Perception

- Humans perceive the world through their senses (sensory input) and act on it through the motor control of their effectors

- Five major senses
  - Sight
  - Hearing
  - Touch
  - Taste
  - Smell
  - (Proprioception, thermoception, nociception, ...)

- Effectors
  - Limbs (arms, legs, body position, ...)
  - Fingers
  - Eyes
  - Head / Face
  - Body
  - Vocal system
Man-Machine Interaction

- Interaction can be seen as a dialog between the computer and the user

- Interaction styles:
  - Command language / command line interface
  - Form-fills and spreadsheets
  - Menus
  - Natural language and query language
  - Question/answer dialog
  - WIMP
  - Point-and-click
  - Direct manipulation
  - 3D interfaces (virtual reality)
  - Brain-computer interface
Multimodal Interfaces

- Multimodal Interaction: the situation where the user is provided with multiple modes for interacting with a system

- Multimodal Interfaces "...process two or more combined user input modes (such as speech, pen, touch, manual gesture, gaze, and head and body movements) in a coordinated manner with multimedia system output. They are a new class of interfaces that aim to recognize naturally occurring forms of human language and behavior, and which incorporate one or more recognition-based technologies (e.g. speech, pen, vision)" (Oviatt et al., 2002)
Multimodality vs. Multimedia

- Modality "refers to the type of communication channel used to convey or acquire information. It also covers the way an idea is expressed or perceived, or the manner an action is performed" (Nigay & Coutaz, 1993)
  - Visual, Auditory, Haptic, etc.
  - Multi- refers to 2 or more such modalities used

- Mode "refers to a state that determines the way information is interpreted to extract or convey meaning" (Nigay & Coutaz, 1993)

- Multimedia "focuses on the medium or technology rather than the application or user" (Buxton, 1986)
  - e.g., sound clip attached to a presentation
  - Media channels: text, graphics, animation, video, etc.
Early Example

"Put That There" system (Bolt, 1980)

Speech and gestures used simultaneously (a minimal sketch of the underlying deixis binding follows below)
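The mechanism underlying Bolt's demo, binding a spoken deictic word ("that", "there") to wherever the user is pointing at the moment the word is uttered, can be sketched compactly. The sketch below is an illustration only, not Bolt's implementation; the event format and the 0.5-second tolerance are assumptions:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Word:
        text: str
        t: float      # time (s) at which the word was spoken

    @dataclass(frozen=True)
    class PointingSample:
        t: float      # timestamp of the arm-tracker sample
        x: float      # pointed-at screen coordinates
        y: float

    DEICTICS = {"this", "that", "here", "there"}

    def resolve_deixis(words, pointing, tolerance=0.5):
        """Bind each deictic word to the pointing sample nearest to it in time."""
        bindings = {}
        for word in words:
            if word.text.lower() not in DEICTICS:
                continue
            nearest = min(pointing, key=lambda p: abs(p.t - word.t))
            if abs(nearest.t - word.t) <= tolerance:
                bindings[word] = (nearest.x, nearest.y)
        return bindings

    # "Put that there": 'that' picks up the object's location,
    # 'there' picks up the target location.
    words = [Word("put", 0.0), Word("that", 0.4), Word("there", 1.1)]
    track = [PointingSample(0.4, 2.1, 3.3), PointingSample(1.1, 7.8, 1.2)]
    print(resolve_deixis(words, track))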
Why Multimodal Interaction?

Advantages over GUI and unimodal systems:

- Natural/realism: making use of more (appropriate) senses
- New ways of interacting
- Flexible: different modalities excel at different tasks
- Wearable computers and small devices
  - e.g., keyboard typing devices require training
- Helps the visually/physically impaired
- Faster, more efficient, higher information processing bandwidth
- Robust: mutual disambiguation of recognition errors
- Multimodal interfaces are more engaging
Why Multimodal Interaction?

- Human - Human protocols
  - Initiating conversation, turn-taking, interrupting, directing attention, ...

- Human - Computer protocols
  - Shell interaction, drag-and-drop, dialog boxes, ...

- Use more of users' senses
- Users perceive multiple things at once
- Users do multiple things at once (e.g., speak and use hand gestures, body position, orientation, and gaze)
Questions?
Multimodal Input
Multimodal Input Overview

Multimodal Input:
- allows humans to communicate naturally
- provides the user with multiple input modalities
- permits multiple styles of interaction
- may be simultaneous or not
- must consider modality fusion and temporal constraints (see the sketch below)
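As a concrete illustration of the last point, events from different input recognizers are typically grouped into a single multimodal command only when they fall within a short integration window; otherwise they are treated as separate unimodal inputs. A minimal sketch of such temporal grouping, assuming a hypothetical event format and an illustrative 1.5-second window (real systems tune this empirically):

    # Temporal grouping sketch: each event is (timestamp_s, modality, payload).
    events = [
        (0.0, "speech", "delete this file"),
        (0.3, "touch",  "icon:report.pdf"),  # close in time -> same command
        (4.0, "touch",  "icon:trash"),       # too late -> separate command
    ]

    INTEGRATION_WINDOW = 1.5  # seconds; an assumed, illustrative value

    def group_events(events, window=INTEGRATION_WINDOW):
        """Group time-sorted events whose inter-event gaps stay below the window."""
        groups, current = [], []
        for event in sorted(events):
            if current and event[0] - current[-1][0] > window:
                groups.append(current)
                current = []
            current.append(event)
        if current:
            groups.append(current)
        return groups

    for group in group_events(events):
        print([modality for _, modality, _ in group])  # ['speech', 'touch'] then ['touch']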
Multimodal Input

- Pointing (deixis), (multi-)touch
- Motion controller
  - Accelerometer, gyro
- Speech
  - Free form, fixed, non-speech sounds
- Body movement/gestures
  - Gait, posture
- Head position & movements
  - Facial expression, gaze
- Tangibles
- Digital pen and paper
- Biometrics
  - Sweat, pulse, respiration, skin conductance
- Brain activity (neural)
  - EEG signals, fMRI signals, blood oxygenation
- Scent?
  - Odor detection
- Taste?
Speech and Gesture Interaction

- Speech
  - User satisfaction is highly dependent on their profiles and tasks
  - The learning rate is fast
  - Error handling is getting better
  - Perceptual & social usage constraints are important (ambient noise, confidentiality, disturbance, etc.)
  - Good spoken languages: short sentences with prosody clearly demarcating end of words

- Gesture
  - Habits are inherited from the usage of the mouse
  - Gesture pointing is direct and reliable (deixis)
  - Gesture signs may not be natural, making recognition hard
Fundamental Problems

- Aligning HCI tasks with modalities (and vice versa)

- Aligning multimodal usage to user profiles (and vice versa)

- Multimodal Fusion
  - the integration of communication modalities in interactive systems (Input)

- Multimodal Fission
  - the repartitioning of information among several communication modalities (Output)
Multimodal Man-Machine Interaction Model

[Figure: multimodal man-machine interaction model (Dumas et al., 2009)]
Levels of Multimodal Fusion

- Data level:
  - e.g., combining 2 webcam video streams, multiple perspectives

- Feature level:
  - e.g., combining speech and lip movements

- Decision level:
  - e.g., combining gestures and speech (sketched below)
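Decision-level fusion can be pictured as combining the scored hypothesis lists (N-best lists) that each recognizer outputs: interpretations supported by both modalities get boosted, which is also the mechanism behind the mutual disambiguation of recognition errors mentioned earlier. A toy sketch with invented scores and a simple weighted-sum rule (real systems may use statistical or rule-based integration instead):

    # Decision-level fusion sketch: merge N-best lists from two recognizers.
    # The hypothesis scores below are invented for illustration.
    speech_nbest  = {"zoom_in": 0.6, "zoom_out": 0.4}
    gesture_nbest = {"zoom_in": 0.7, "pan": 0.3}

    def fuse(speech, gesture, w_speech=0.5, w_gesture=0.5):
        """Weighted linear combination over the union of hypotheses."""
        hypotheses = set(speech) | set(gesture)
        scored = {h: w_speech * speech.get(h, 0.0) + w_gesture * gesture.get(h, 0.0)
                  for h in hypotheses}
        best = max(scored, key=scored.get)
        return best, scored

    best, scored = fuse(speech_nbest, gesture_nbest)
    print(best)  # 'zoom_in': the interpretation both modalities support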
Unimodal or Multimodal?
MATCH: Multimodal Access to City Help (Johnston et al., 2002)

- Interactive city guide and navigation application: provides restaurant and subway information for NY and DC
- Dynamic map-based interface on tablet
- Input modalities:
  - Speech, pen gesture, handwriting, GUI
  - Commands can be speech, pen, or multimodal
  - Visual parsing of complex gestural input
- Output modalities:
  - Coordinated multimodal output combining synthetic speech and dynamic graphics
- Example:
  - Speech: "show inexpensive italian places in chelsea"
  - Multimodal: "cheap italian places in this area" (pen gesture; right). A toy sketch of this speech + pen fusion follows below.
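MATCH-style commands can be thought of as partial semantic frames contributed by each modality, which fusion merges into one query: speech supplies the predicate ("cheap italian places") while the pen gesture fills in the missing location. A hypothetical sketch of that merge (not the actual MATCH implementation, which uses finite-state multimodal grammars):

    # Hypothetical frame-merging sketch in the spirit of MATCH's
    # speech + pen commands; slot names and values are invented.
    speech_frame  = {"type": "restaurant_query", "cuisine": "italian",
                     "price": "cheap", "where": None}  # 'this area' unresolved
    gesture_frame = {"where": {"polygon": [(0, 0), (4, 0), (4, 3), (0, 3)]}}

    def merge(frames):
        """Fill each frame's unresolved slots with values from the other modalities."""
        result = {}
        for frame in frames:
            for slot, value in frame.items():
                if value is not None:
                    result[slot] = value
        return result

    query = merge([speech_frame, gesture_frame])
    print(query)  # cuisine/price from speech, 'where' from the pen region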
NUMACK (Foster and White, 2005)

- NUMACK (Northwestern University Multimodal Autonomous Conversational Kiosk)
- Embodied Conversational Agent (ECA) that gives directions around Northwestern's campus
- Combination of speech, gestures and facial expressions
- Uses a grammar-based, computational model of language and a gesture planning system
- NUMACK's verbal, non-verbal and multimodal behaviors are realized through synthesized speech and a kinematic body model
- System updates its model of context and the world by fusing multimodal user input
  - Stereoscopic, head-tracking system
  - Speech
  - Pen
Multimodal Input Advantages

- Improved error handling & efficiency
  - fewer errors
  - faster task completion
- Greater expressive power
- Greater precision in visual-spatial tasks (e.g., map scrolling & item localization)
- Support for users' preferred interaction style
- Accommodation to diverse users, tasks & usage environments
  - e.g., accented speakers & mobile environments
- Shorter & less complex linguistic constructions
  - e.g., fewer locative descriptions
Questions?
Multimodal Output
Multimodal Output

- Visual
  - Text
  - Graphics
  - Animations
  - Virtual/Augmented Reality
- Auditory
  - Speech (e.g., Embodied Conversational Agent)
  - Non-speech sound
- Haptics (tactile)
  - Force feedback (e.g., PS3 controller)
  - Vibrotactile (e.g., phone vibrate)
- Scent?
  - Scented mobile phones
- Taste?
Multimodal Output

- Advantages (Sarter, 2006; Oviatt, 2002):
  - Synergy
  - Redundancy
  - Higher information bandwidth

- Wickens' Multiple Resource Theory (1984)

- More modalities = better?
  - Higher resource competition when people have to attend to two sources at once (Reeves et al., 2004)
Mobile Multimodal Interfaces

- Mobile context means attentional and memory resources are limited (Tamminen et al., 2004)
  - e.g., map scrolling, talking with a friend, crossing the street

- Potential of multimodal feedback cues in:
  1. addressing issues of accessibility (e.g., to support blind users in navigation) (Magnusson et al., 2009)
  2. developing pedestrian navigation aids to support situational impairment and awareness (Brewster et al., 2003)

- Examples:
  - PocketNavigator (Pielot et al., 2010)
  - AudioGPS (Holland et al., 2002)

(Image sources: http://www.lalyagaye.com/, http://feelspace.cogsci.uni-osnabrueck.de/)
Tactile and Non-Speech Auditory Feedback

- Tactons: "Structured, abstract messages that can be used to communicate non-visually" (Brown, 2005). Information encoded in parameters such as:
  - Waveform, duration, rhythm, spatial location, frequency, [...]

- Earcons: "Non-verbal audio messages that are used in the computer/user interface to provide information to the user about some computer object, operation or interaction" (Blattner, 1989). Information encoded in:
  - Pitch, amplitude, duration, spatial location, [...]

- Amodal parameters: consist of information that is not specific to any one sensory modality (Lewkowicz, 1994). Parameters common to both tactile and auditory domains (Lewkowicz, 1994; Hoggan et al., 2009):
  - Spatial location, rhythm, texture, duration, frequency, intensity/amplitude

A sketch of encoding a message into such parameters follows below.
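The parameter lists above translate directly into data structures: a tacton is essentially a vibration pattern with a value chosen for each parameter, and different messages map to different parameter combinations. A hypothetical sketch, where the message-to-rhythm mapping and the concrete parameter values are invented for illustration (not taken from Brown et al.):

    from dataclasses import dataclass

    @dataclass
    class Tacton:
        """A structured tactile message, parameterized as in Brown et al. (2005)."""
        rhythm: list       # on/off durations in ms, e.g. [100, 50, 100]
        frequency: float   # carrier frequency in Hz
        intensity: float   # 0.0-1.0 drive amplitude
        location: str      # which actuator on the device/body

    def make_tacton(message: str, urgent: bool) -> Tacton:
        # Hypothetical mapping: message type -> rhythm, urgency -> intensity.
        rhythms = {
            "new_email": [100, 50, 100],
            "calendar":  [300, 100, 300, 100, 300],
        }
        return Tacton(rhythm=rhythms[message],
                      frequency=250.0,  # skin is most sensitive near 250 Hz
                      intensity=1.0 if urgent else 0.4,
                      location="phone_motor")

    print(make_tacton("new_email", urgent=True))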
Crossmodal Interaction

- Subset of multimodal interaction where the senses receive the 'same' information content across invoked sensory modalities (Gibson, 1966; Lewkowicz, 1994)
  - Cf. sensory substitution (Visell, 2008)
  - vOICe: Seeing with Sound application; Braille

- Crossmodal interaction refers to situations where characteristics of one sensory modality may be bi-directionally transformed into the characteristics of another (e.g., audio <-> tactile) (Hoggan, 2007; 2009) -> Redundancy

A sketch of rendering one icon to both modalities follows below.
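Because the amodal parameters listed on the previous slide (rhythm, duration, intensity, spatial location) exist in both the auditory and tactile domains, a crossmodal icon can be specified once in amodal terms and then rendered to whichever modality is currently available. A rough sketch of that idea; the shared parameters follow Hoggan & Brewster's amodal-parameter approach, but the concrete values and rendering rules are illustrative assumptions:

    # Crossmodal rendering sketch: one amodal specification, two presentations.
    amodal_icon = {
        "rhythm": [100, 50, 100],  # ms on/off pattern (shared)
        "intensity": 0.8,          # 0.0-1.0 (shared)
        "location": "left",        # spatial location (shared)
    }

    def to_audio(icon):
        """Render as an earcon: intensity -> amplitude, location -> stereo pan."""
        return {"pattern_ms": icon["rhythm"],
                "amplitude": icon["intensity"],
                "pan": -1.0 if icon["location"] == "left" else 1.0,
                "pitch_hz": 440.0}  # audio-specific parameter

    def to_tactile(icon):
        """Render as a tacton: intensity -> drive level, location -> actuator."""
        return {"pattern_ms": icon["rhythm"],
                "drive": icon["intensity"],
                "actuator": icon["location"] + "_motor",
                "carrier_hz": 250.0}  # tactile-specific parameter

    print(to_audio(amodal_icon))
    print(to_tactile(amodal_icon))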
Crossmodal Output Advantages

- Unlike multimodal interaction, little risk of information processing overload
- When one sensory modality is knocked out (e.g., noisy environment, body contact), information is still received
- Permits both 'eyes-free' and 'hands-free' interaction
Questions?
Practical Matters
Multimodal Input Research Areas

- Applied Machine Learning
  - Speech recognition, speech synthesis
  - Gesture recognition, motion tracking
  - Head, gait and pose estimation
  - Multimodal fusion

- HCI
  - Usability issues in diverse tasks
  - Social acceptability
  - Context-aware and ubiquitous computing (which modality to use when)
  - Design/prototyping of multimodal interfaces (e.g., Wizard of Oz)
Multimodal Output Research Areas

- Virtual and Mixed Reality (immersive environments)
  - Embodied Conversational Agents
  - Haptics: force-feedback, vibrotactile feedback
  - Audio: feedback, synthesis
  - Crossmodal integration

- HCI (Usability, Satisfaction)
  - Multimodal feedback (in-vehicle/pedestrian navigation, safety and control, surgery, ergonomics, etc.)
  - Crossmodal feedback
  - (Mobile) multimodal interface design
International Communities

- CHI: ACM CHI Conference on Human Factors in Computing Systems
- MobileHCI: ACM Conference on Human-Computer Interaction with Mobile Devices and Services
- ICMI: ACM International Conference on Multimodal Interaction
- CSCW: ACM Conference on Computer Supported Cooperative Work
- ACM MM: ACM Multimedia Conference
- INTERACT: IFIP Conference on Human-Computer Interaction
- WHC: World Haptics Conference
Resources

Books:
- Paul Dourish (2004) "Where the Action Is: The Foundations of Embodied Interaction"
- Andy Clark (2003) "Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence"
- Bill Buxton (2007) "Sketching User Experiences: Getting the Design Right and the Right Design"
- Adam Greenfield (2006) "Everyware: The Dawning Age of Ubiquitous Computing"

Articles:
- Mark Weiser (1991) "The Computer for the 21st Century", Scientific American
- Sharon Oviatt (2002) "Perceptual user interfaces: multimodal interfaces that process what comes naturally", Communications of the ACM
- Sharon Oviatt (1999) "Ten myths of multimodal interaction", Communications of the ACM
- Nadine Sarter (2006) "Multimodal information presentation: Design guidance and research challenges", International Journal of Industrial Ergonomics
- Leah Reeves et al. (2004) "Guidelines for multimodal user interface design", Communications of the ACM
Summary

- We are embodied and embedded creatures, and this influences the way we interact with the world and computational artifacts

- Multimodal Interfaces aim at making communication with machines more natural, more efficient, and more engaging

- Multimodal Input and Output focus on different aspects within HCI, requiring different skill sets, but multimodal research and development requires both

- Multimodal Interaction is an exciting and rapidly growing area that hugely benefits from HCI work
The Future of Computing is Multimodal…
Contact

Abdo El Ali
e: elali@uva.nl
w: http://staff.science.uva.nl/~elali/
t: +31 (0)20 525 8661

Address:
Room C3.258, Informatics Institute, Science Park 904, 1098 XH Amsterdam, NL

Slides available at: http://staff.science.uva.nl/~elali/hci_abdo_2011.pdf
References (1)

- Blattner, M. M., Sumikawa, D. A., & Greenberg, R. M. (1989). Earcons and icons: Their structure and common design principles. Human-Computer Interaction, 4(1), 11-44.
- Bolt, R. A. (1980). "Put-that-there": Voice and gesture at the graphics interface. SIGGRAPH Comput. Graph., 14(3), 262-270.
- Brown, L. M., Brewster, S. A., & Purchase, H. C. (2005). A first investigation into the effectiveness of Tactons. In Proceedings of the First Joint Eurohaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems (WHC '05). IEEE Computer Society, Washington, DC, USA, 167-176.
- Brewster, S., Lumsden, J., Bell, M., Hall, M., & Tasker, S. (2003). Multimodal 'eyes-free' interaction techniques for wearable devices. In Proc. of CHI '03. ACM Press, New York, NY.
- Buxton, W. (1986). There's more to interaction than meets the eye: Some issues in manual input. In Norman, D. A., & Draper, S. W. (Eds.), User Centered System Design: New Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, New Jersey, 319-337.
- Chittaro, L. (2009). Distinctive aspects of mobile interaction and their implications for the design of multimodal interfaces. Journal on Multimodal User Interfaces, 3(3), 157-165.
- Dourish, P. (2000). Embodied interaction: Exploring the foundations of a new approach to HCI. Transactions on Computer-Human Interaction.
- Dumas, B., Lalanne, D., & Oviatt, S. (2009). Multimodal interfaces: A survey of principles, models and frameworks. In Lalanne, D., & Kohlas, J. (Eds.), Human Machine Interaction. Lecture Notes in Computer Science, Vol. 5440. Springer-Verlag, Berlin, Heidelberg, 3-26.
- Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Houghton Mifflin, Boston.
- Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Houghton Mifflin, Boston.
- Heidegger, M. (1927). Being and Time (trans. John Macquarrie & Edward Robinson, London: SCM Press, 1962).
- Hoggan, E., & Brewster, S. A. (2007). Designing audio and tactile crossmodal icons for mobile devices. In ACM International Conference on Multimodal Interfaces (Nagoya, Japan). ACM Press, 162-169.
References (2)

- Hoggan, E., Raisamo, R., & Brewster, S. A. (2009). Mapping information to audio and tactile icons. In Proceedings of ACM ICMI 2009 (Cambridge, MA, USA). ACM Press, 327-334.
- Holland, S., Morse, D. R., & Gedenryd, H. (2002). AudioGPS: Spatial audio navigation with a minimal attention interface. Personal Ubiquitous Comput., 6(4), 253-259.
- Kopp, S., Tepper, P., & Cassell, J. (2004). Towards integrated microplanning of language and iconic gesture for multimodal output. ICMI 2004.
- Lewkowicz, D. J. (1994). Development of intersensory perception in human infants. In Lewkowicz, D. J., & Lickliter, R. (Eds.), Development of Intersensory Perception: Comparative Perspectives. Norwood, NJ: Lawrence Erlbaum Associates.
- Magnusson, C., Tollmar, K., Brewster, S., Sarjakoski, T., Sarjakoski, T., & Roselier, S. (2009). Exploring future challenges for haptic, audio and visual interfaces for mobile maps and location based services. In Proceedings of the 2nd International Workshop on Location and the Web, 8:1-8:4. New York, NY, USA: ACM.
- Nigay, L., & Coutaz, J. (1993). A design space for multimodal systems: Concurrent processing and data fusion. In Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems (CHI '93). ACM, New York, NY, USA, 172-178.
- Pielot, M., Krull, O., & Boll, S. (2010b). Where is my team: Supporting situation awareness with tactile displays. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10). ACM, New York, NY, USA, 1705-1714.
- Pielot, M., Poppinga, B., & Boll, S. (2010). PocketNavigator: Vibro-tactile waypoint navigation for everyday mobile devices. MobileHCI 2010, Lisboa, Portugal.
- Reeves, L. M., Lai, J., Larson, J. A., Oviatt, S., Balaji, T. S., Buisine, S., Collings, P., Kraal, B., Martin, J. C., McTear, M., Raman, T. V., Stanney, K. M., Su, H., & Wang, Q. Y. (2004). Guidelines for multimodal user interface design. Commun. ACM, 47(1), 57-59.
- Visell, Y. (2009). Tactile sensory substitution: Models for enaction in HCI. Interact. Comput., 21(1-2), 38-53.

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Human Computer Interaction (HCI)
Human Computer Interaction (HCI)Human Computer Interaction (HCI)
Human Computer Interaction (HCI)
 
Design process interaction design basics
Design process interaction design basicsDesign process interaction design basics
Design process interaction design basics
 
HCI 3e - Ch 5: Interaction design basics
HCI 3e - Ch 5:  Interaction design basicsHCI 3e - Ch 5:  Interaction design basics
HCI 3e - Ch 5: Interaction design basics
 
HCI 3e - Ch 10: Universal design
HCI 3e - Ch 10:  Universal designHCI 3e - Ch 10:  Universal design
HCI 3e - Ch 10: Universal design
 
Hci
HciHci
Hci
 
Hypertext, multimedia and www
Hypertext, multimedia and wwwHypertext, multimedia and www
Hypertext, multimedia and www
 
HCI 3e - Ch 13: Socio-organizational issues and stakeholder requirements
HCI 3e - Ch 13:  Socio-organizational issues and stakeholder requirementsHCI 3e - Ch 13:  Socio-organizational issues and stakeholder requirements
HCI 3e - Ch 13: Socio-organizational issues and stakeholder requirements
 
WEB INTERFACE DESIGN
WEB INTERFACE DESIGNWEB INTERFACE DESIGN
WEB INTERFACE DESIGN
 
interaction norman model in Human Computer Interaction(HCI)
interaction  norman model in Human Computer Interaction(HCI)interaction  norman model in Human Computer Interaction(HCI)
interaction norman model in Human Computer Interaction(HCI)
 
Exploring GOMs
Exploring GOMsExploring GOMs
Exploring GOMs
 
HCI 3e - Ch 11: User support
HCI 3e - Ch 11:  User supportHCI 3e - Ch 11:  User support
HCI 3e - Ch 11: User support
 
Interaction Paradigms
Interaction ParadigmsInteraction Paradigms
Interaction Paradigms
 
HCI Presentation
HCI PresentationHCI Presentation
HCI Presentation
 
Mobile hci
Mobile hciMobile hci
Mobile hci
 
Human Computer Interaction Chapter 5 Universal Design and User Support - Dr....
Human Computer Interaction Chapter 5 Universal Design and User Support -  Dr....Human Computer Interaction Chapter 5 Universal Design and User Support -  Dr....
Human Computer Interaction Chapter 5 Universal Design and User Support - Dr....
 
HCI 3e - Ch 4: Paradigms
HCI 3e - Ch 4:  ParadigmsHCI 3e - Ch 4:  Paradigms
HCI 3e - Ch 4: Paradigms
 
Iteration and prototyping
Iteration and prototypingIteration and prototyping
Iteration and prototyping
 
Human Computer Interaction HCI
Human Computer Interaction HCI Human Computer Interaction HCI
Human Computer Interaction HCI
 
HCI Lecture 1
HCI Lecture 1HCI Lecture 1
HCI Lecture 1
 
HCI - Chapter 1
HCI - Chapter 1HCI - Chapter 1
HCI - Chapter 1
 

Destaque

Multimodal Semiotics
Multimodal SemioticsMultimodal Semiotics
Multimodal Semiotics
svngl
 
human computer interface
human computer interfacehuman computer interface
human computer interface
Santosh Kumar
 
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHYMULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
chelsiageorge20
 
Stellmach.2011.designing gaze supported multimodal interactions for the explo...
Stellmach.2011.designing gaze supported multimodal interactions for the explo...Stellmach.2011.designing gaze supported multimodal interactions for the explo...
Stellmach.2011.designing gaze supported multimodal interactions for the explo...
mrgazer
 
Ricoueur la memoria archivada
Ricoueur la memoria archivadaRicoueur la memoria archivada
Ricoueur la memoria archivada
SANTIAGO GARCIA V
 

Destaque (20)

Multimodality
MultimodalityMultimodality
Multimodality
 
Multimodality
MultimodalityMultimodality
Multimodality
 
Multimodal Discourse Analysis Systemic Functional Perspectives
Multimodal Discourse Analysis Systemic Functional PerspectivesMultimodal Discourse Analysis Systemic Functional Perspectives
Multimodal Discourse Analysis Systemic Functional Perspectives
 
Multimodal Semiotics
Multimodal SemioticsMultimodal Semiotics
Multimodal Semiotics
 
Multimodal texts
Multimodal textsMultimodal texts
Multimodal texts
 
human computer interface
human computer interfacehuman computer interface
human computer interface
 
Multimodality
MultimodalityMultimodality
Multimodality
 
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHYMULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
MULTIMODAL INTERFACE OF BRAQIN COMPUTER INTERFACE AND ELECTOOCULOGRAPHY
 
Sensor based interaction
Sensor based interaction Sensor based interaction
Sensor based interaction
 
Multimodal man machine interaction
Multimodal man machine interactionMultimodal man machine interaction
Multimodal man machine interaction
 
Human-Computer Interaction
Human-Computer InteractionHuman-Computer Interaction
Human-Computer Interaction
 
HCI 3e - Ch 9: Evaluation techniques
HCI 3e - Ch 9:  Evaluation techniquesHCI 3e - Ch 9:  Evaluation techniques
HCI 3e - Ch 9: Evaluation techniques
 
Stellmach.2011.designing gaze supported multimodal interactions for the explo...
Stellmach.2011.designing gaze supported multimodal interactions for the explo...Stellmach.2011.designing gaze supported multimodal interactions for the explo...
Stellmach.2011.designing gaze supported multimodal interactions for the explo...
 
A Generic Approach for Multi-Device User Interface Rendering with UIML
A Generic Approach for Multi-Device User Interface Rendering with UIMLA Generic Approach for Multi-Device User Interface Rendering with UIML
A Generic Approach for Multi-Device User Interface Rendering with UIML
 
A Fusion Framework for Multimodal Interactive Applications
A Fusion Framework for Multimodal Interactive ApplicationsA Fusion Framework for Multimodal Interactive Applications
A Fusion Framework for Multimodal Interactive Applications
 
MobileHCI 2016 - Technology Literacy in Poor Infrastructure Environments: Cha...
MobileHCI 2016 - Technology Literacy in Poor Infrastructure Environments: Cha...MobileHCI 2016 - Technology Literacy in Poor Infrastructure Environments: Cha...
MobileHCI 2016 - Technology Literacy in Poor Infrastructure Environments: Cha...
 
Presentación1 sábado 28 enero 2012
Presentación1 sábado 28 enero 2012Presentación1 sábado 28 enero 2012
Presentación1 sábado 28 enero 2012
 
Ricoueur la memoria archivada
Ricoueur la memoria archivadaRicoueur la memoria archivada
Ricoueur la memoria archivada
 
Top 10 buzz words in call center technology
Top 10 buzz words in call center technologyTop 10 buzz words in call center technology
Top 10 buzz words in call center technology
 
MobiMed: Comparing Object Identification Techniques on Smartphones
MobiMed: Comparing Object Identification Techniques on SmartphonesMobiMed: Comparing Object Identification Techniques on Smartphones
MobiMed: Comparing Object Identification Techniques on Smartphones
 

Semelhante a Multimodal Interaction: An Introduction

Multimodal, crossmedia, multi platform
Multimodal, crossmedia, multi platformMultimodal, crossmedia, multi platform
Multimodal, crossmedia, multi platform
Hans Kemp
 
Presentatie Wijnand IJsselsteijn, TU/e
Presentatie Wijnand IJsselsteijn, TU/ePresentatie Wijnand IJsselsteijn, TU/e
Presentatie Wijnand IJsselsteijn, TU/e
#devdate
 
Haptics
Haptics Haptics
Haptics
pmvin
 
The artificiality of natural user interfaces alessio malizia
The artificiality of natural user interfaces   alessio maliziaThe artificiality of natural user interfaces   alessio malizia
The artificiality of natural user interfaces alessio malizia
Marco Ajovalasit
 
Cognitive Computing for Tacit Knowledge1
Cognitive Computing for Tacit Knowledge1Cognitive Computing for Tacit Knowledge1
Cognitive Computing for Tacit Knowledge1
Lucia Gradinariu
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
Hans Kemp
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
Hans Kemp
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
Hans Kemp
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
Hans Kemp
 
New Media New Technology 2011 - Back to the future
New Media New Technology 2011 - Back to the futureNew Media New Technology 2011 - Back to the future
New Media New Technology 2011 - Back to the future
Peter Van Der Putten
 

Semelhante a Multimodal Interaction: An Introduction (20)

Multimodal, crossmedia, multi platform
Multimodal, crossmedia, multi platformMultimodal, crossmedia, multi platform
Multimodal, crossmedia, multi platform
 
Presentatie Wijnand IJsselsteijn, TU/e
Presentatie Wijnand IJsselsteijn, TU/ePresentatie Wijnand IJsselsteijn, TU/e
Presentatie Wijnand IJsselsteijn, TU/e
 
My Robot
My RobotMy Robot
My Robot
 
Haptics ppt
Haptics pptHaptics ppt
Haptics ppt
 
Haptics
Haptics Haptics
Haptics
 
User defined gestures for surface computing
User defined gestures for surface computingUser defined gestures for surface computing
User defined gestures for surface computing
 
Designing Kansei Experience For Interaction
Designing Kansei Experience For InteractionDesigning Kansei Experience For Interaction
Designing Kansei Experience For Interaction
 
Introduction to Prototyping: What, Why, How
Introduction to Prototyping: What, Why, HowIntroduction to Prototyping: What, Why, How
Introduction to Prototyping: What, Why, How
 
From Interaction to Understanding
From Interaction to UnderstandingFrom Interaction to Understanding
From Interaction to Understanding
 
The artificiality of natural user interfaces alessio malizia
The artificiality of natural user interfaces   alessio maliziaThe artificiality of natural user interfaces   alessio malizia
The artificiality of natural user interfaces alessio malizia
 
Human Computer Interaction
Human Computer InteractionHuman Computer Interaction
Human Computer Interaction
 
Cognitive Computing for Tacit Knowledge1
Cognitive Computing for Tacit Knowledge1Cognitive Computing for Tacit Knowledge1
Cognitive Computing for Tacit Knowledge1
 
Mis 008
Mis 008Mis 008
Mis 008
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
 
Using Augmented Reality to Create Empathic Experiences
Using Augmented Reality to Create Empathic ExperiencesUsing Augmented Reality to Create Empathic Experiences
Using Augmented Reality to Create Empathic Experiences
 
New Media New Technology 2011 - Back to the future
New Media New Technology 2011 - Back to the futureNew Media New Technology 2011 - Back to the future
New Media New Technology 2011 - Back to the future
 
Life Comes At Us Polydimensionally
Life Comes At Us PolydimensionallyLife Comes At Us Polydimensionally
Life Comes At Us Polydimensionally
 

Mais de Abdallah El Ali

Mais de Abdallah El Ali (6)

CHI 2018 - Measuring, Understanding, and Classifying News Media Sympathy on T...
CHI 2018 - Measuring, Understanding, and Classifying News Media Sympathy on T...CHI 2018 - Measuring, Understanding, and Classifying News Media Sympathy on T...
CHI 2018 - Measuring, Understanding, and Classifying News Media Sympathy on T...
 
Minimal Mobile Human Computer Interaction
Minimal Mobile Human Computer InteractionMinimal Mobile Human Computer Interaction
Minimal Mobile Human Computer Interaction
 
Photographer Paths: Sequence Alignment of Geotagged Photos for Exploration-ba...
Photographer Paths: Sequence Alignment of Geotagged Photos for Exploration-ba...Photographer Paths: Sequence Alignment of Geotagged Photos for Exploration-ba...
Photographer Paths: Sequence Alignment of Geotagged Photos for Exploration-ba...
 
Fishing or a Z?: Investigating the Effects of Error on Mimetic and Alphabet D...
Fishing or a Z?: Investigating the Effects of Error on Mimetic and Alphabet D...Fishing or a Z?: Investigating the Effects of Error on Mimetic and Alphabet D...
Fishing or a Z?: Investigating the Effects of Error on Mimetic and Alphabet D...
 
Understanding Contextual Factors in Location-aware Multimedia Messaging
Understanding Contextual Factors in Location-aware Multimedia MessagingUnderstanding Contextual Factors in Location-aware Multimedia Messaging
Understanding Contextual Factors in Location-aware Multimedia Messaging
 
What's in an Android?
What's in an Android?What's in an Android?
What's in an Android?
 

Último

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Último (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Multimodal Interaction: An Introduction

  • 1. Multimodal Interaction! An Introduction! Abdallah  ‘Abdo’  El  Ali   h"p://staff.science.uva.nl/~elali/   Some slides adapted from: Gabriel Skantze (KTH Royal Institute of Technology, Sweden), Denis Lalanne (University of Fribourg, Switzerland)
  • 2. Who am I?!   Currently:  PhD  in  Mobile  Human-­‐Computer  Interac<on  -­‐UvA     Crossmodal  Interac=on  in  Mobile  Environments     Msc  in  Cogni<ve  Science  -­‐  UvA       Cogni=on,  Language,  &  Communica=on  track     Bsc  in  English  Language  &  Literature  -­‐  American  University  of   Beirut     Screenwri=ng,  Copywri=ng,  Edi=ng   2
  • 3. Outline! I.  Mul=modal  Interac=on  &  Interfaces   II.  Mul=modal  Input   III.  Mul=modal  Output   IV.  Prac=cal  Ma"ers     3
  • 4. Multimodal Interaction & Interfaces! 4
  • 5. A Brief History of Computer Interfaces!   Punched  cards  (late  19th  century)     Herman  Hollerith    -­‐  Tabula=ng  Machine  Company  (1896)     The  Command  Line  Interface  (1960s)       Sketchpad  (1963)  by  Ivan  Sutherland  –  light-­‐pen   pointer-­‐based  system  to  create  and  manipulate   objects  in  drawings     Alto  personal  computer  (1973)  developed  at   Xerox  PARC     Desktop  metaphor,  WIMP  (windows,  icons,   menus,  poin=ng  device)     WYSIWYG     Xerox  8010  Star  Informa=on  System  (1981)     Apple  Macintosh  (1984)     Windows  1.01  (1987)     Microsoc  Windows  3.0  (1990)     Mac  OSX  (2000’s)     […]   5
  • 7. Project NATAL for Xbox 360 Playstation EyePet 7 Kinect for Xbox 360 Playstation Move
  • 8. HCI and Human Characteristics !   HCI  is  a  mul=-­‐disciplinary  topic     Computer  Science  &  AI     Cogni=ve  Science     Sociology     Psychology     Design     […]     In  HCI  design,  important  to  understand   something  about     Human  informa=on-­‐processing   (cogni=ve  architecture,  memory,   percep=on,  motor  skills,  etc.)     How  human  ac=on  is  structured     The  nature  of  human  communica=on     Human  physical  and  physiological   requirements/constraints   8
  • 9. Why HCI?!   Humans  are  limited  in  their   capacity  to  process  informa=on       Implica=ons  for  the  interac=on   design     Mul=tasking  says  it  all     Important  considera=ons     Input-­‐output  channels  (senses  and   effectors)     Memory     Learning  (acquiring  skills)     Reasoning  /  Problem  solving   (cogni=ve  ac=vity)     Decision  making   9
  • 10. Use Case: Mobile Interaction! Dis=nc=ve  aspects  of  mobile  interac=on   (Chi"aro,  2010):     Hardware:  small  screen,  limited  I/O     Perceptual:  noisy  street,  sunlight  reflec=on,   no  device  contact     Motor:  voluntary  movements  when  in-­‐ vehicle,  fat-­‐finger  problem     Social:  phone  ring  at  a  conference,  gestures   in  front  of  strangers     Cogni<ve:  limited  a"en=on  span,  high   stress  &  load,  limited  memory     10
  • 11. Embodiment!   Embodied  cogni=on,  Situated  Cogni=on,  Embodied  Interac=on,  EEC,    Social  Compu=ng,  Tangible   Compu=ng,  Ac=ve  percep=on,  […]     Gibson  (1979)  “ The  Ecological  Approach  to  Visual  Percep=on”     “....perceiving  is  an  act  not  a  response,  an  act  of  a"en=on,  not  a  triggered  impression,  an  achievement,  not   a  reflex”     Heidegger  (1927)  “Being  and  Time”     Present-­‐at-­‐hand  vs.  ready-­‐to-­‐hand       e.g.,  hammer  as  object  (presence)  vs.  hammer  as  tool  (cogni=ve  extension)     E.g.,  mouse  as  hardware  vs.  mouse  as  tool  for  performing  GUI  opera=ons     Dourish  (1999)  “Founda=ons  of  Embodied  Interac=on”       “…interac=on  is  an  embodied  phenomenon.  It  happens  in  the  world,  and  that  world  (a  physical  world  and  a   social  world)  lends  form,  substance  and  meaning  to  the  interac=on.     Sensori-­‐motor  coordina=on     Percep=on  for  ac=on   Agent   Ac=on  for  percep=on   World
• 12. Sensation & Perception!
  - Humans perceive the world through their senses (sensory input) and act on it through the motor control of their effectors
  - Five major senses: sight, hearing, touch, taste, smell
    - (Proprioception, thermoception, nociception, ...)
  - Effectors: limbs (arms, legs, body position, ...), fingers, eyes, head/face, body, vocal system
• 13. Man-Machine Interaction!
  - Interaction can be seen as a dialog between the computer and the user
  - Interaction styles:
    - Command language / command-line interface
    - Form fill-ins and spreadsheets
    - Menus
    - Natural language and query language
    - Question/answer dialog
    - WIMP
    - Point-and-click
    - Direct manipulation
    - 3D interfaces (virtual reality)
    - Brain-computer interface
• 14. Multimodal Interfaces!
  - Multimodal Interaction: the situation where the user is provided with multiple modes for interacting with a system
  - Multimodal Interfaces "...process two or more combined user input modes (such as speech, pen, touch, manual gesture, gaze, and head and body movements) in a coordinated manner with multimedia system output. They are a new class of interfaces that aim to recognize naturally occurring forms of human language and behavior, and which incorporate one or more recognition-based technologies (e.g. speech, pen, vision)" (Oviatt et al., 2002)
• 15. Multimodality vs. Multimedia!
  - Modality "refers to the type of communication channel used to convey or acquire information. It also covers the way an idea is expressed or perceived, or the manner an action is performed" (Nigay & Coutaz, 1993)
    - Visual, auditory, haptic, etc.
    - Multi- refers to 2 or more such modalities used
  - Mode "refers to a state that determines the way information is interpreted to extract or convey meaning" (Nigay & Coutaz, 1993)
  - Multimedia "focuses on the medium or technology rather than the application or user" (Buxton, 1986)
    - e.g., sound clip attached to a presentation
    - Media channels: text, graphics, animation, video, etc.
• 16. Early Example!
  - "Put That There" system (Bolt, 1980)
  - Speech and gestures used simultaneously
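To make the mechanics concrete, here is a minimal sketch (in Python; not Bolt's actual implementation, and all names are hypothetical) of the core idea behind "Put That There": binding deictic words such as "that" and "there" to the pointing events closest to them in time.

```python
from dataclasses import dataclass

@dataclass
class PointingEvent:
    t: float       # time of the pointing gesture (seconds)
    target: tuple  # (x, y) screen/map coordinates being pointed at

def resolve_deixis(words, pointing_events, max_gap=1.0):
    """Bind each deictic word ('that', 'there') to the pointing
    event closest in time, within a max_gap-second window."""
    bindings = {}
    for word, t_word in words:
        if word not in ("that", "there"):
            continue
        candidates = [p for p in pointing_events if abs(p.t - t_word) <= max_gap]
        if candidates:
            best = min(candidates, key=lambda p: abs(p.t - t_word))
            bindings[(word, t_word)] = best.target
    return bindings

# "Put that there": two deictic words, two pointing gestures.
words = [("put", 0.0), ("that", 0.4), ("there", 1.2)]
points = [PointingEvent(0.5, (120, 80)), PointingEvent(1.3, (300, 200))]
print(resolve_deixis(words, points))
# {('that', 0.4): (120, 80), ('there', 1.2): (300, 200)}
```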
• 17. Why Multimodal Interaction?! Advantages over GUI and unimodal systems:
  - Natural/realism: making use of more (appropriate) senses
  - New ways of interacting
  - Flexible: different modalities excel at different tasks
  - Suits wearable computers and small devices
    - e.g., keyboard typing devices require training
  - Helps the visually/physically impaired
  - Faster, more efficient: higher information-processing bandwidth
  - Robust: mutual disambiguation of recognition errors
  - Multimodal interfaces are more engaging
• 18. Why Multimodal Interaction?!
  - Human-Human protocols: initiating conversation, turn-taking, interrupting, directing attention, ...
  - Human-Computer protocols: shell interaction, drag-and-drop, dialog boxes, ...
  - Use more of users' senses
  - Users perceive multiple things at once
  - Users do multiple things at once (e.g., speak and use hand gestures, body position, orientation, and gaze)
• 21. Multimodal Input Overview! Multimodal input:
  - allows humans to communicate naturally
  - provides the user with multiple input modalities
  - permits multiple styles of interaction
  - may be simultaneous or not
  - must consider modality fusion and temporal constraints
• 22. Multimodal Input!
  - Pointing (deixis), (multi-)touch
  - Motion controller: accelerometer, gyro
  - Speech: free form, fixed, non-speech sounds
  - Body movement/gestures: gait, posture
  - Head position & movements
  - Facial expression, gaze
  - Tangibles
  - Digital pen and paper
  - Biometrics: sweat, pulse, respiration, skin conductance
  - Brain activity (neural): EEG signals, fMRI signals, blood oxygenation
  - Scent? Odor detection
  - Taste?
• 23. Speech and Gesture Interaction!
  - Speech
    - User satisfaction is highly dependent on user profiles and tasks
    - The learning rate is fast
    - Error handling is getting better
    - Perceptual & social usage constraints are important (ambient noise, confidentiality, disturbance, etc.)
    - Spoken language works best with short sentences and prosody clearly demarcating word boundaries
  - Gesture
    - Habits are inherited from mouse usage
    - Gesture pointing is direct and reliable (deixis)
    - Gesture signs may not be natural, making recognition hard
• 24. Fundamental Problems!
  - Aligning HCI tasks with modalities (and vice versa)
  - Aligning multimodal usage to user profiles (and vice versa)
  - Multimodal Fusion: the integration of communication modalities in interactive systems (input)
  - Multimodal Fission: the repartitioning of information among several communication modalities (output)
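As a toy illustration of fission (a hand-rolled sketch under invented rules, not a model from the literature), a system can repartition one output message across modalities depending on context:

```python
def fission(message, context):
    """Repartition one output message across modalities, given a simple
    context: ambient noise level and whether the screen is visible.
    A rule-based toy; real systems plan this far more carefully."""
    channels = {}
    if context["screen_visible"]:
        channels["display"] = message["text"]      # full detail on screen
    if context["noise_db"] < 70:
        channels["speech"] = message["summary"]    # short spoken summary
    else:
        channels["vibration"] = "alert_pattern"    # fall back to a tactile alert
    return channels

msg = {"text": "Turn left onto Science Park in 200 m",
       "summary": "Turn left in 200 meters"}
print(fission(msg, {"screen_visible": False, "noise_db": 85}))
# {'vibration': 'alert_pattern'}
```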
• 25. Multimodal Man-Machine Interaction Model! [Figure: the multimodal man-machine interaction model of Dumas et al. (2009)]
• 26. Levels of Multimodal Fusion!
  - Data level: e.g., combining 2 webcam video streams, multiple perspectives
  - Feature level: e.g., combining speech and lip movements
  - Decision level: e.g., combining gestures and speech
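The feature and decision levels can be contrasted in a few lines of code. The sketch below is illustrative only: the feature shapes, scores, and weights are made up, and a real audio-visual recognizer would align and model the streams far more carefully.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, already time-aligned feature streams: one row per 10 ms frame.
audio_feats = rng.normal(size=(100, 13))  # e.g., 13 acoustic features per frame
lip_feats   = rng.normal(size=(100, 4))   # e.g., 4 lip-shape parameters per frame

# Feature-level (early) fusion: concatenate per frame, feed ONE classifier.
fused = np.concatenate([audio_feats, lip_feats], axis=1)  # shape (100, 17)

# Decision-level (late) fusion, by contrast, combines per-modality scores:
p_audio_only = 0.6   # classifier confidence from audio alone (made-up number)
p_lips_only  = 0.9   # classifier confidence from video alone (made-up number)
p_late = 0.5 * p_audio_only + 0.5 * p_lips_only  # weighted vote: 0.75

print(fused.shape, p_late)
```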
• 28. MATCH: Multimodal Access to City Help (Johnston et al., 2002)!
  - Interactive city guide and navigation application: provides restaurant and subway information for NY and DC
  - Dynamic map-based interface on a tablet
  - Input modalities: speech, pen gesture, handwriting, GUI
    - Commands can be speech, pen, or multimodal
    - Visual parsing of complex gestural input
  - Output modalities: coordinated multimodal output combining synthetic speech and dynamic graphics
  - Example:
    - Speech: "show inexpensive italian places in chelsea"
    - Multimodal: "cheap italian places in this area" (with pen gesture)
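A toy sketch of the "cheap italian places in this area" example (invented data and logic; MATCH itself uses finite-state multimodal parsing, not this): the speech channel contributes attribute constraints, the pen gesture a spatial region, and the fused query needs both.

```python
# Hypothetical restaurant database with map positions.
RESTAURANTS = [
    {"name": "Trattoria Uno", "cuisine": "italian", "price": "cheap",  "pos": (2, 3)},
    {"name": "Da Marco",      "cuisine": "italian", "price": "pricey", "pos": (2, 4)},
    {"name": "Chez Luc",      "cuisine": "french",  "price": "cheap",  "pos": (8, 8)},
]

def in_region(pos, region):
    """Is a point inside the rectangle circled by the pen gesture?"""
    (x, y), (x1, y1, x2, y2) = pos, region
    return x1 <= x <= x2 and y1 <= y <= y2

def multimodal_query(speech_slots, pen_region):
    """Fuse parsed speech slots with the pen's region of interest."""
    return [r["name"] for r in RESTAURANTS
            if all(r[k] == v for k, v in speech_slots.items())
            and in_region(r["pos"], pen_region)]

# "cheap italian places in this area" + a pen gesture around (0,0)-(5,5)
print(multimodal_query({"cuisine": "italian", "price": "cheap"}, (0, 0, 5, 5)))
# ['Trattoria Uno']
```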
• 29. NUMACK (Foster and White, 2005)!
  - NUMACK (Northwestern University Multimodal Autonomous Conversational Kiosk)
  - Embodied Conversational Agent (ECA) that gives directions around Northwestern's campus
  - Combination of speech, gestures and facial expressions
  - Uses a grammar-based, computational model of language and a gesture-planning system
  - NUMACK's verbal, non-verbal and multimodal behaviors are realized through synthesized speech and a kinematic body model
  - System updates its model of context and the world by fusing multimodal user input:
    - Stereoscopic head-tracking system
    - Speech
    - Pen
• 30. Multimodal Input Advantages!
  - Improved error handling & efficiency: fewer errors, faster task completion
  - Greater expressive power
  - Greater precision in visual-spatial tasks (e.g., map scrolling & item localization)
  - Support for users' preferred interaction style
  - Accommodation of diverse users, tasks & usage environments
    - e.g., accented speakers & mobile environments
  - Shorter & less complex linguistic constructions
    - e.g., fewer locative descriptions
• 33. Multimodal Output!
  - Visual: text, graphics, animations, virtual/augmented reality
  - Auditory: speech (e.g., Embodied Conversational Agent), non-speech sound
  - Haptics (tactile): force feedback (e.g., PS3 controller), vibrotactile (e.g., phone vibrate)
  - Scent? Scented mobile phones
  - Taste?
• 34. Multimodal Output!
  - Advantages (Sarter, 2006; Oviatt, 2002):
    - Synergy
    - Redundancy
    - Higher information bandwidth
    - Wickens' Multiple Resource Theory (1984)
  - More modalities = better? Not necessarily: there is higher resource competition when people have to attend to two sources at once (Reeves et al., 2004).
• 35. Mobile Multimodal Interfaces!
  - Mobile contexts mean attentional and memory resources are limited (Tamminen et al., 2004)
    - e.g., map scrolling, talking with a friend, crossing the street
  - Potential of multimodal feedback cues in:
    1. addressing issues of accessibility (e.g., to support blind users in navigation) (Magnusson et al., 2009)
    2. developing pedestrian navigation aids to support situational impairment and awareness (Brewster et al., 2003)
  - Examples:
    - PocketNavigator (Pielot et al., 2010)
    - AudioGPS (Holland et al., 2002)
    - http://www.lalyagaye.com/
    - http://feelspace.cogsci.uni-osnabrueck.de/
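As an illustration of such a navigation aid, here is a small sketch in the spirit of PocketNavigator (not its published implementation; the coordinates, thresholds, and patterns are invented): compute the bearing to the next waypoint and map the turn direction onto a vibration pattern, so the user can navigate eyes-free.

```python
import math

def bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing from (lat1, lon1) to (lat2, lon2), degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(y, x)) + 360) % 360

def vibration_pattern(heading, target_bearing):
    """Map the angle to the next waypoint onto on/off durations in ms.
    The signed-angle trick keeps diff in [-180, 180), so left vs. right
    is a simple sign test."""
    diff = (target_bearing - heading + 540) % 360 - 180
    if abs(diff) < 15:
        return [100, 100, 100]   # short-short: keep going straight
    elif diff > 0:
        return [100, 50, 400]    # short-long: turn right
    else:
        return [400, 50, 100]    # long-short: turn left

b = bearing(52.354, 4.955, 52.356, 4.950)   # hypothetical Amsterdam coordinates
print(round(b), vibration_pattern(heading=90.0, target_bearing=b))
```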
• 36. Tactile and Non-Speech Auditory Feedback!
  - Tactons: "structured, abstract messages that can be used to communicate non-visually" (Brown, 2005). Information is encoded in parameters such as: waveform, duration, rhythm, spatial location, frequency, [...]
  - Earcons: "non-verbal audio messages that are used in the computer/user interface to provide information to the user about some computer object, operation or interaction" (Blattner, 1989). Information is encoded in: pitch, amplitude, duration, spatial location, [...]
  - Amodal parameters consist of information that is not specific to any one sensory modality (Lewkowicz, 1994). Parameters common to both the tactile and auditory domains (Lewkowicz, 1994; Hoggan et al., 2009): spatial location, rhythm, texture, duration, frequency, intensity/amplitude
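To show how such parameters carry the message, here is a minimal sketch that renders an earcon from explicit pitch/duration/rhythm parameters (pure NumPy; the "new message" motif is a made-up example, not a published earcon design):

```python
import numpy as np

SR = 44100  # sample rate (Hz)

def tone(freq, dur, amp=0.5):
    """One sine-wave note of the earcon: freq in Hz, dur in seconds."""
    t = np.linspace(0, dur, int(SR * dur), endpoint=False)
    return amp * np.sin(2 * np.pi * freq * t)

def earcon(notes):
    """An earcon as a sequence of (frequency, duration, gap) triples.
    Pitch, duration, and rhythm are the information-carrying parameters."""
    parts = []
    for freq, dur, gap in notes:
        parts.append(tone(freq, dur))
        parts.append(np.zeros(int(SR * gap)))  # silence encodes the rhythm
    return np.concatenate(parts)

# Hypothetical "new message" earcon: a rising two-note motif.
signal = earcon([(440, 0.15, 0.05), (660, 0.25, 0.0)])
print(signal.shape)  # write to a WAV file or audio device to actually hear it
```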
• 37. Crossmodal Interaction!
  - A subset of multimodal interaction where the senses receive the 'same' information content across the invoked sensory modalities (Gibson, 1966; Lewkowicz, 1994)
  - Cf. Sensory Substitution (Visell, 2009)
    - The vOICe "seeing with sound" application; Braille
  - Crossmodal interaction refers to situations where characteristics of one sensory modality may be bi-directionally transformed into the characteristics of another (e.g., audio ⇿ tactile) (Hoggan, 2007; 2009)
  - Redundancy
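Following the amodal-parameter idea, the same abstract message can be re-rendered in another modality. The sketch below (a hand-rolled illustration, not Hoggan et al.'s published mapping) transforms the earcon specification from the previous sketch into a vibrotactile pattern, preserving the amodal parameters (rhythm, duration, intensity class) while swapping the audio-specific one (pitch):

```python
def earcon_to_tacton(notes):
    """Crossmodal transform: keep the amodal parameters (rhythm, duration,
    intensity) and swap the modality-specific one (audio pitch -> vibration
    strength class). Returns (on_ms, off_ms, strength) triples."""
    tacton = []
    for freq, dur, gap in notes:
        strength = "strong" if freq >= 600 else "weak"  # crude pitch -> intensity map
        tacton.append((int(dur * 1000), int(gap * 1000), strength))
    return tacton

notes = [(440, 0.15, 0.05), (660, 0.25, 0.0)]  # the earcon motif from above
print(earcon_to_tacton(notes))
# [(150, 50, 'weak'), (250, 0, 'strong')]
```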
• 38. Crossmodal Output Advantages!
  - Unlike multimodal interaction, little risk of information-processing overload
  - When one sensory modality is knocked out (e.g., a noisy environment, no body contact with the device), the information is still received
  - Permits both 'eyes-free' and 'hands-free' interaction
• 41. Multimodal Input Research Areas!
  - Applied Machine Learning
    - Speech recognition, speech synthesis
    - Gesture recognition, motion tracking
    - Head, gait and pose estimation
    - Multimodal fusion
  - HCI
    - Usability issues in diverse tasks
    - Social acceptability
    - Context-aware and ubiquitous computing (which modality to use when)
    - Design/prototyping of multimodal interfaces (e.g., Wizard of Oz)
• 42. Multimodal Output Research Areas!
  - Virtual and Mixed Reality (immersive environments)
  - Embodied Conversational Agents
  - Haptics: force feedback, vibrotactile feedback
  - Audio: feedback, synthesis
  - Crossmodal integration
  - HCI (usability, satisfaction)
    - Multimodal feedback (in-vehicle/pedestrian navigation, safety and control, surgery, ergonomics, etc.)
    - Crossmodal feedback
    - (Mobile) multimodal interface design
• 43. International Communities!
  - CHI: ACM Conference on Human Factors in Computing Systems
  - MobileHCI: ACM Conference on Human-Computer Interaction with Mobile Devices and Services
  - ICMI: ACM International Conference on Multimodal Interaction
  - CSCW: ACM Conference on Computer Supported Cooperative Work
  - ACM MM: ACM Multimedia Conference
  - INTERACT: IFIP Conference on Human-Computer Interaction
  - WHC: World Haptics Conference
• 44. Resources!
  - Books:
    - Paul Dourish (2004) "Where the Action Is: The Foundations of Embodied Interaction"
    - Andy Clark (2003) "Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence"
    - Bill Buxton (2007) "Sketching User Experiences: Getting the Design Right and the Right Design"
    - Adam Greenfield (2006) "Everyware: The Dawning Age of Ubiquitous Computing"
  - Articles:
    - Mark Weiser (1991) "The Computer for the 21st Century", Scientific American
    - Sharon Oviatt (2002) "Perceptual user interfaces: multimodal interfaces that process what comes naturally", Communications of the ACM
    - Sharon Oviatt (1999) "Ten myths of multimodal interaction", Communications of the ACM
    - Nadine Sarter (2006) "Multimodal information presentation: Design guidance and research challenges", International Journal of Industrial Ergonomics
    - Leah Reeves et al. (2004) "Guidelines for multimodal user interface design", Communications of the ACM
• 45. Summary!
  - We are embodied and embedded creatures, and this influences the way we interact with the world and with computational artifacts
  - Multimodal Interfaces aim at making communication with machines more natural, more efficient, and more engaging
  - Multimodal Input and Output focus on different aspects within HCI, requiring different skill sets, but multimodal research and development requires both
  - Multimodal Interaction is an exciting and rapidly growing area that benefits hugely from HCI work
• 46. The Future of Computing is Multimodal...!
• 47. Contact!
  Abdo El Ali
  e: elali@uva.nl
  w: http://staff.science.uva.nl/~elali/
  t: +31 (0)20 525 8661
  Address: Room C3.258, Informatics Institute, Science Park 904, 1098 XH Amsterdam, NL
  Slides available at: http://staff.science.uva.nl/~elali/hci_abdo_2011.pdf
• 48. References (1)!
  Blattner, M. M., Sumikawa, D. A., & Greenberg, R. M. (1989). Earcons and icons: Their structure and common design principles. Human-Computer Interaction, 4(1), 11-44.
  Bolt, R. A. (1980). "Put-that-there": Voice and gesture at the graphics interface. SIGGRAPH Comput. Graph., 14(3), 262-270.
  Brown, L. M., Brewster, S. A., & Purchase, H. C. (2005). A first investigation into the effectiveness of Tactons. In Proceedings of the First Joint Eurohaptics Conference and Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems (WHC '05). IEEE Computer Society, Washington, DC, USA, 167-176.
  Brewster, S., Lumsden, J., Bell, M., Hall, M., & Tasker, S. (2003). Multimodal 'eyes-free' interaction techniques for wearable devices. In Proc. of CHI '03. ACM Press, New York, NY.
  Buxton, W. (1986). There's more to interaction than meets the eye: Some issues in manual input. In Norman, D. A. & Draper, S. W. (Eds.), User Centered System Design: New Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates, Hillsdale, New Jersey, pp. 319-337.
  Chittaro, L. (2010). Distinctive aspects of mobile interaction and their implications for the design of multimodal interfaces. Journal on Multimodal User Interfaces, 3(3), 157-165.
  Dourish, P. (2000). Embodied interaction: Exploring the foundations of a new approach to HCI. Transactions on Computer-Human Interaction.
  Dumas, B., Lalanne, D., & Oviatt, S. (2009). Multimodal interfaces: A survey of principles, models and frameworks. In Lalanne, D. & Kohlas, J. (Eds.), Human Machine Interaction. Lecture Notes in Computer Science, Vol. 5440. Springer-Verlag, Berlin, Heidelberg, 3-26.
  Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Houghton Mifflin, Boston.
  Gibson, J. J. (1979). The Ecological Approach to Visual Perception. Houghton Mifflin, Boston.
  Heidegger, M. (1927). Being and Time. Trans. John Macquarrie & Edward Robinson. London: SCM Press, 1962.
  Hoggan, E., & Brewster, S. A. (2007). Designing audio and tactile crossmodal icons for mobile devices. In ACM International Conference on Multimodal Interfaces (Nagoya, Japan). ACM Press, pp. 162-169.
• 49. References (2)!
  Hoggan, E., Raisamo, R., & Brewster, S. A. (2009). Mapping information to audio and tactile icons. In Proceedings of ACM ICMI 2009 (Cambridge, MA, USA). ACM Press, pp. 327-334.
  Holland, S., Morse, D. R., & Gedenryd, H. (2002). AudioGPS: Spatial audio navigation with a minimal attention interface. Personal and Ubiquitous Computing, 6(4), 253-259.
  Kopp, S., Tepper, P., & Cassell, J. (2004). Towards integrated microplanning of language and iconic gesture for multimodal output. ICMI 2004.
  Lewkowicz, D. J. (1994). Development of intersensory perception in human infants. In Lewkowicz, D. J. & Lickliter, R. (Eds.), Development of Intersensory Perception: Comparative Perspectives. Norwood, NJ: Lawrence Erlbaum Associates.
  Magnusson, C., Tollmar, K., Brewster, S., Sarjakoski, T., Sarjakoski, T., & Roselier, S. (2009). Exploring future challenges for haptic, audio and visual interfaces for mobile maps and location based services. In Proceedings of the 2nd International Workshop on Location and the Web (pp. 8:1-8:4). New York, NY, USA: ACM.
  Nigay, L., & Coutaz, J. (1993). A design space for multimodal systems: Concurrent processing and data fusion. In Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems (CHI '93). ACM, New York, NY, USA, 172-178.
  Pielot, M., Krull, O., & Boll, S. (2010b). Where is my team: Supporting situation awareness with tactile displays. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10). ACM, New York, NY, USA, 1705-1714.
  Pielot, M., Poppinga, B., & Boll, S. (2010). PocketNavigator: Vibro-tactile waypoint navigation for everyday mobile devices. MobileHCI 2010, Lisbon, Portugal.
  Reeves, L. M., Lai, J., Larson, J. A., Oviatt, S., Balaji, T. S., Buisine, S., Collings, P., Kraal, B., Martin, J. C., McTear, M., Raman, T. V., Stanney, K. M., Su, H., & Wang, Q. Y. (2004). Guidelines for multimodal user interface design. Communications of the ACM, 47(1), 57-59.
  Visell, Y. (2009). Tactile sensory substitution: Models for enaction in HCI. Interacting with Computers, 21(1-2), 38-53.