ADVANCED METHODS FOR
USER EVALUATION IN
ENTERPRISE AR
Mark Billinghurst
mark.billinghurst@unisa.edu.au
September 29th 2021
[Image slides: February 2011; CityViewAR]
From Idea to Product (CityViewAR example)
1. Define Requirements
2. Sketch Interface
3. Rough Wireframes
4. Interactive Prototype
5. High Fidelity Prototype
6. Developer Coding
7. User Testing
8. Deploy App
Design, Develop, Evaluate; Iterate
Interaction Design Process
Needs analysis → (Re)Design → Build an interactive version → Evaluate → (repeat) → Final Product
“Iterative design, with its repeating cycle of design
and testing, is the only validated methodology in
existence that will consistently produce successful
results.
If you don’t have user-testing as an integral part
of your design process you are going to throw
buckets of money down the drain.”
Bruce Tognazzini
USER EVALUATION
What is Evaluation?
• Concerned with gathering data about the
quality of a design/product (UI)
• User performance, Usability
• User experience, User acceptance, …
• Why Evaluate?
• To validate/refine the prototype/solution
• To learn more about the user and the problem
• To move forward to the next iteration
When to evaluate?
• Once the product has been developed
• pros : rapid development, small evaluation cost
• cons : problems are found late and are more costly to rectify
• During design and development
• pros : find and rectify problems early
• cons : higher evaluation cost, longer development
[Diagram: design → implementation → evaluation, followed by redesign & reimplementation; contrasted with design → implementation]
Types of Evaluation
• Formative testing
• early stage of development (Low-fidelity prototypes)
• focus on user perception of experience
• comparing multiple design options
• Summative testing
• later development (High-fidelity interactive prototype)
• evaluate the effectiveness of specific design choices
• focus on performance and usability
CityViewAR: From Idea to Product (revisited)
1. Define Requirements
2. Sketch Interface
3. Rough Wireframes
4. Interactive Prototype
5. High Fidelity Prototype
6. Developer Coding
7. User Testing
8. Deploy App
Design, Develop, Evaluate; Iterate
Formative evaluation fits the early steps of this pipeline; summative evaluation fits the later, high-fidelity steps.
Four Evaluation Paradigms
1. ‘quick and dirty’
2. User testing
3. Field studies
4. Predictive evaluation
1. Quick and Dirty Testing (Formative)
• Informal feedback from users to confirm that the design team’s
ideas are in line with users’ needs and are liked.
• Quick and dirty evaluations are done any time.
• Emphasis is on fast input to the design process
rather than carefully documented findings.
Pop Demo
https://www.youtube.com/watch?v=2XTgvjHDKlo
2. User Testing (Formative/Summative)
• Investigations of users and their use of
human-computer interfaces
• Observe and Describe
• Explain
• Predict
• Determination of the Causes
• Main research method in HCI field
• Closely related to ...
• Behavioral Science in Psychology
• Ergonomics & Human factors in Industrial Engineering
User Testing Methods
• Interview
• Focus group
• Survey
• Questionnaire
• Usability Testing
• User Experiment
These methods range from informal, qualitative feedback (formative testing; descriptive/relational investigation) to formal, quantitative feedback (summative testing; experimental investigation).
Mix and match to your needs: triangulation.
Interview and Focus Group
• Ask the user
• Direct conversations as tools
for feedback collection
• Understand requirements,
needs, problems
• Interviews – one user at a time
• Focus groups – several users at once
Survey and Questionnaire
• Survey
• Using a questionnaire to which a user is asked to respond
• Questionnaire
• a well-defined and well-written set of questions
• Typically self-administered
• Surveys are good at:
• getting a large number of responses quickly from a
geographically dispersed population
• collecting sensitive/private information
• You can capture the “big picture” relatively quickly
Example: Net Promoter Score
• Standardized measure used across a large number of industries
Evaluating NPS
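Because the NPS formula is standardized (percentage of promoters, scores 9–10, minus percentage of detractors, scores 0–6), it is easy to compute from raw responses. A minimal Python sketch with invented response data:

```python
# Minimal sketch: computing a Net Promoter Score from 0-10 survey responses.
# Promoters score 9-10, passives 7-8, detractors 0-6.
# NPS = (% promoters) - (% detractors), ranging from -100 to +100.

def net_promoter_score(responses):
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return 100.0 * (promoters - detractors) / len(responses)

# Ten hypothetical responses to "How likely are you to recommend this app?"
print(net_promoter_score([10, 9, 9, 8, 7, 7, 6, 5, 9, 10]))  # 30.0
```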
Experimental Study (Summative)
• Experiments discover/verify new knowledge by investigating the
causal effect between two or more things (i.e. variables).
• Independent Variable (IV)
• Manipulated to create different experimental conditions
• e.g. type of tool used, design alternatives
• Dependent Variable (DV)
• Measured to find out the effects of changing the independent variables
• e.g. user performance, satisfaction, usability
Experimental Design
• Between-Subject: subjects are randomly assigned to a single condition (Condition 1, 2, or 3), each group performs the experimental task, and the resulting data are compared using statistical analysis.
• Within-Subject: subjects are randomly assigned to an order of conditions (e.g. 1-2-3, 2-3-1, 3-1-2), every subject performs the experimental tasks under all conditions, and the resulting data are compared using statistical analysis.
Between vs. Within-Subject Design

Between-subject:
- Avoids interference effects (e.g. practice / learning effects)
- Shorter time for each participant (less fatigue and frustration)
- Impact of individual differences
- Harder to detect differences between conditions
- Requires a larger sample size
- Gender, age, and experience must be between-subject factors
- Important: randomised assignment to conditions

Within-subject:
- Learning effects possible
- Longer time for each participant (larger impact of fatigue and frustration)
- Individual differences can be isolated
- Easier to detect differences between conditions
- Requires a smaller sample size
- Change over time must be a within-subject factor
- Important: counterbalance/randomise the order of presenting conditions
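As a concrete illustration of the counterbalancing advice above, here is a minimal Python sketch that builds a balanced Latin square of condition orders and assigns participants to them cyclically. The condition names are placeholders; for an odd number of conditions a fully counterbalanced design would also need each order’s reverse.

```python
# Minimal sketch: counterbalancing condition order for a within-subject study.
# Builds a balanced Latin square: every condition appears in every serial
# position equally often across participants.

def balanced_latin_square(conditions):
    n = len(conditions)
    # First row interleaves indices from both ends: 0, 1, n-1, 2, n-2, ...
    first = [0]
    step = 1
    while len(first) < n:
        first.append(step % n)
        if len(first) < n:
            first.append((n - step) % n)
        step += 1
    # Each subsequent row shifts every index by one
    return [[conditions[(idx + shift) % n] for idx in first] for shift in range(n)]

conditions = ["Baseline", "FoV", "Head-gaze", "Eye-gaze"]   # placeholder names
orders = balanced_latin_square(conditions)

for participant in range(8):                 # assign participants cyclically
    print(f"P{participant + 1}: {orders[participant % len(orders)]}")
```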
Objective vs. Subjective Measures
• Objective measures
• Not influenced by personal feeling/opinion
• Based on observation, compared against standardized scale.
• More consistent
• Subjective measures
• Based on user's opinions, interpretations, points of view,
emotions and judgment.
• More vulnerable to context and the user’s current state
Data Types
• Subjective (Qualitative)
• Subjective survey
• Likert Scale, condition rankings
• Observations
• Think Aloud
• Interview responses
• Objective (Quantitative)
• Performance measures
• Time, accuracy, errors
• Process measures
• Video/audio analysis
Example Likert item: “How easy was the task?”, rated 1 (Not very easy) to 5 (Very easy)
Standard Questionnaires in HCI
• Existing validated questionnaires in the HCI literature:
• System Usability Scale (SUS)
• Computer System Usability Questionnaire (CSUQ)
• Interface Consistency Testing Questionnaire (ICTQ)
• Questionnaire for User Interaction Satisfaction (QUIS)
• User Experience Questionnaire (UEQ)
• NASA Task Load Index (TLX)
• See book and web for more questionnaires
• http://oldwww.acm.org/perlman/question.html
• http://www.usabilitynet.org/tools/r_questionnaire.htm
• http://www.measuringu.com/blog/ux-questions.php
• Surveys on: Game Experience, Presence, Engagement, User Experience
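Many of these instruments have simple published scoring rules. For example, the System Usability Scale converts ten 1–5 ratings into a 0–100 score: odd items contribute (response - 1), even items contribute (5 - response), and the sum is multiplied by 2.5. A minimal sketch with one participant’s hypothetical responses:

```python
# Minimal sketch: scoring the System Usability Scale (SUS).
# Input: one participant's responses to the 10 SUS items on a 1-5 scale.
# Odd-numbered items are positively worded (contribution = response - 1),
# even-numbered items are negatively worded (contribution = 5 - response).
# The summed contributions are multiplied by 2.5 to give a 0-100 score.

def sus_score(responses):
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# Hypothetical responses from one participant
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0
```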
3. Field Studies (Formative/Summative)
• Field studies are done in natural settings
• The aim is to understand what users do naturally
and how technology impacts them.
• In product design field studies can be used to:
- identify opportunities for new technology
- determine design requirements
- decide how to introduce new technology
- evaluate technology in use.
Example Field Study
• AR map application
• Digital Map (D) vs. AR Map (M)
• Experimenter followed pairs of people
• Made observations, interviewed afterwards
• Found interesting behaviours – e.g. Map as shared artifact in (M), separate in (D)
Morrison, A., Oulasvirta, A., Peltonen, P., Lemmela, S., Jacucci, G., Reitmayr, G., ... & Juustila, A. (2009, April). Like bees around
the hive: a comparative study of a mobile augmented reality map. In Proceedings of the SIGCHI conference on human factors in
computing systems (pp. 1889-1898).
[Images: Digital Map vs. AR Enhanced Map]
4. Predictive Evaluation (Formative)
• Experts apply their knowledge of typical users, often
guided by heuristics, to predict usability problems.
• Many heuristics available
• Nielsen’s 10 principles, Tognazzini’s 16 principles, etc..
• A key feature of predictive evaluation is that users
need not be present
• Relatively quick and inexpensive
How many Experts do you need?
• Nielsen
• 5 experts will find approximately
80% of problems
• However
• depends on how complex an interface is
• how many interface flaws exist
→ Get as many as you can within the
timeframe/resources available
Example: Sharing Communication Cues (2019)
• Using AR/VR to share communication cues
• Gaze, gesture, head pose, body position
• Collaboration between AR/VR
• VR user appears in AR user’s space
• Sharing same environment
• Virtual copy of real world
• What is the effect of gaze cues?
Piumsomboon, T., Dey, A., Ens, B., Lee, G., & Billinghurst, M. (2019). The effects of sharing awareness cues
in collaborative mixed reality. Frontiers in Robotics and AI, 6, 5.
Sharing Virtual Communication Cues
• AR/VR displays
• Gesture input (Leap Motion)
• Room scale tracking
Conditions
• Baseline: In the Baseline condition, we showed only the head and hands of the
collaborator in the scene. The head and hands were presented in all conditions
• Field-of-view (FoV): We showed the FoV frustum of each collaborator to the
other. This enabled collaborators to understand roughly where their partner was
looking and how much area the other person could see at any point in time.
• Head-gaze (FoV + Head-gaze ray): FoV frustum plus a ray originating from the
user's head to identify the center of the FoV, which provided a more precise
indication of where the other collaborator was looking.
• Eye-gaze (FoV + Eye-gaze ray): In this cue, we showed a ray originating from
the user's eye to show exactly where the user was looking.
Task
• Search task
• Find specific blocks together
• Two phases:
• Object identification
• Object placement
• Designed to force collaboration
• Each person seeing different information
• Within-subject Design
• Everyone experiences all conditions
Measures
• Performance (Objective)
• Rate of Mutual Gaze (see the sketch after this list)
• Task completion time
• Observed (Objective)
• Number of hand gestures
• Physical movement
• Distance between collaborators
• Subjective
• Usability Survey (SUS)
• Social Presence Survey
• Interview
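The rate-of-mutual-gaze measure above can be derived from logged gaze data. A minimal sketch, assuming the session log records which object each collaborator’s gaze ray hits on every frame; the log format and field names are assumptions, not the pipeline used in the study:

```python
# Minimal sketch: estimating the rate of mutual gaze from logged data.

def mutual_gaze_rate(gaze_a, gaze_b, frame_rate_hz):
    """Return mutual gaze events per minute.

    gaze_a, gaze_b: per-frame lists of gazed-at object IDs (None = no hit).
    A mutual gaze event starts on a frame where both users look at the
    same object and the previous frame was not mutual.
    """
    events, prev_mutual = 0, False
    for a, b in zip(gaze_a, gaze_b):
        mutual = a is not None and a == b
        if mutual and not prev_mutual:
            events += 1
        prev_mutual = mutual
    minutes = len(gaze_a) / frame_rate_hz / 60.0
    return events / minutes if minutes > 0 else 0.0

# Tiny hypothetical log: two users, 8 frames at 4 Hz
a = ["cube", "cube", None, "ball", "ball", "ball", None, "cube"]
b = [None,  "cube", "cube", "ball", None,  "ball", None, "cube"]
print(mutual_gaze_rate(a, b, frame_rate_hz=4))  # 4 events in 2 s = 120.0 per minute
```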
Data Collected
• Participants
• 16 pairs = 32 people
• 9 women
• Aged 20 – 55, average 31 years
• Experience
• No experience with VR (6), no experience AR (10), no HMD (7).
• Data collection
• Objective
• 4 (conditions) × 8 (trials per condition) × 16 pairs = 512 data points
• Subjective
• 4 (conditions) × 32 (participants) = 128 data points.
Motion Data
• Map user x,y
position over time
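A movement map like this can be produced directly from the tracking log. A minimal matplotlib sketch; the file name and column names (time, x, y) are assumptions:

```python
# Minimal sketch: plotting a user's x,y position over time from a motion log.
import csv
import matplotlib.pyplot as plt

xs, ys, ts = [], [], []
with open("motion_log.csv") as f:          # hypothetical log: time,x,y per row
    for row in csv.DictReader(f):
        ts.append(float(row["time"]))
        xs.append(float(row["x"]))
        ys.append(float(row["y"]))

plt.plot(xs, ys, linewidth=0.8)                  # movement path on the floor plane
plt.scatter(xs[::100], ys[::100], c=ts[::100])   # colour occasional samples by time
plt.xlabel("x (m)")
plt.ylabel("y (m)")
plt.axis("equal")
plt.title("Participant movement over one trial")
plt.show()
```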
Results
• Predictions
• Eye/Head pointing better than no cues
• Eye/head pointing could reduce need for pointing
• Results
• No difference in task completion time
• Head-gaze/eye-gaze gave a greater mutual gaze rate
• Head-gaze gave greater ease of use than baseline
• All cues provide higher co-presence than baseline
• Pointing gestures reduced in cue conditions
• But
• No difference between head-gaze and eye-gaze
Lessons Learned
• Decide on type of experiment
• Within subject vs. between subject
• Have well designed task with measurable outcomes
• Use both qualitative and quantitative measures
• Performance + user preference
• Have enough subjects for significant results
• Use the appropriate statistics (see the sketch below)
• Compare conditions + perform post hoc analysis
• Provide subject training on task
• Observe user behavior and interview subjects
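For the statistics and post hoc analysis points above, one possible sketch of a within-subject comparison uses a Friedman omnibus test followed by Bonferroni-corrected pairwise Wilcoxon tests from SciPy. The completion-time numbers are invented; a repeated-measures ANOVA may be preferable when its assumptions hold.

```python
# Minimal sketch: comparing conditions in a within-subject design, then
# running post hoc pairwise tests with Bonferroni correction.
from itertools import combinations
from scipy import stats

times = {                                   # hypothetical completion times (s)
    "Baseline":  [42, 51, 38, 47, 55, 44, 49, 40],
    "Head-gaze": [35, 44, 33, 41, 48, 37, 43, 36],
    "Eye-gaze":  [34, 45, 32, 40, 47, 38, 42, 35],
}

# Omnibus test across all conditions
stat, p = stats.friedmanchisquare(*times.values())
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")

# Post hoc pairwise comparisons, Bonferroni-corrected
pairs = list(combinations(times.keys(), 2))
for a, b in pairs:
    w, p_pair = stats.wilcoxon(times[a], times[b])
    print(f"{a} vs {b}: W = {w:.1f}, corrected p = {min(1.0, p_pair * len(pairs)):.4f}")
```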
BEYOND QUESTIONNAIRES
Moving Beyond Questionnaires
• Consider the entire user
• Cultural, social factors
• Move data capture from post experiment to during experiment
• Move from performance measures to process measures
• Richer types of data captured
• Physiological Cues - EEG, GSR, EMG, Heart rate, etc.
• Richer Behavioural Cues - Body motion, user positioning, etc.
• Higher level understanding
• Map data to Emotion recognition, Cognitive load, etc.
• Use better analysis tools
• Video analysis, conversation analysis, multi-modal analysis, etc.
Consider the Whole User
Social Acceptance
• People don’t want to look silly
• Only 12% of 4,600 adults would be willing to wear AR glasses
• 20% of mobile AR browser users experience social issues
• Acceptance more due to Social than Technical issues
• Needs further study (ethnographic, field tests, longitudinal)
TAT AugmentedID
Physical Ergonomics
• Evaluate the human motion range
• Consider human comfort and natural posture
• Example: Ergonomics for hand input
• Coarse and fine scale motions, gripping and grasping
• Avoid “Gorilla arm syndrome” from holding arm pose
Gorilla Arm in AR
• Design interface to reduce mid-air gestures
XRgonomics
• Uses physiological model to calculate ergonomic interaction cost
• Difficulty of reaching points around the user
• Customizable for different users
• Programmable API, Hololens demonstrator
• GitHub Repository
• https://github.com/joaobelo92/xrgonomics
Evangelista Belo, J. M., Feit, A. M., Feuchtner, T., & Grønbæk, K. (2021, May). XRgonomics: Facilitating the Creation of
Ergonomic 3D Interfaces. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-11).
XRgonomics
https://www.youtube.com/watch?v=AlWbqP19MGs
New Tools
• New types of sensors
• EEG, ECG, GSR, etc
• Sensors integrated into AR/VR systems
• Integrated into HMDs
• Data processing and capture tools
• iMotions, etc
• AR/VR Analytics tools
• Cognitive3D, etc
HP Reverb G2 Omnicept
• Wide FOV, high resolution, best in class VR display
• Eye tracking, heart rate, pupillometry, and face camera
• SDK for measuring cognitive load
https://www.youtube.com/watch?v=PH3JnH_7CN8
Looxid Link
• Real time data collection and processing
Example Analytics Tool - Cognitive3D
• Data capture and analytics for VR
• Multiple sensory input (eye tracking, HR, EEG, body movement, etc)
• https://cognitive3d.com/
https://www.youtube.com/watch?v=AblJ_v2BZJY
Example: Measuring Trust
• How to reliably measure trust?
• Using physiological sensors (EEG, GSR, HRV)
• Subjective measures (STS, SMEQ, NASA-TLX)
• Relationship between cognitive load (CL) and trust?
• Novelty:
• Use EEG, GSR, HRV to evaluate trust at different CL
• Implemented custom VR environment with virtual agent
• Compare physiological, behavioral, subjective measures
Gupta, K., Hajika, R., Pai, Y. S., Duenser, A., Lochner, M., & Billinghurst, M. (2020, March).
Measuring human trust in a virtual assistant using physiological sensing in virtual reality.
In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR) (pp. 756-765). IEEE.
Experimental Task
• Target selection + N-back memory task
• Agent voice guidance
Experiment Design
• Two factors
• Cognitive Load (Low, High)
• Low = N-Back with N = 1
• High = N-Back with N = 2
• Agent Accuracy (No, Low, High)
• No = No agent
• Low = 50% accurate
• High = 100% accurate
• Within Subject Design
• 24 subjects (12 Male), 23-35 years old
• All experienced with virtual assistants
2 x 3 Expt Design
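A minimal sketch of enumerating the six cells of this 2 x 3 within-subject design and giving each participant a presentation order; the seeded shuffle is illustrative, not the counterbalancing scheme actually used in the study:

```python
# Minimal sketch: enumerating the 2 x 3 within-subject conditions and giving
# each participant a randomised presentation order.
import itertools
import random

cognitive_load = ["Low", "High"]          # N-back with N = 1 or N = 2
agent_accuracy = ["No", "Low", "High"]    # no agent, 50% accurate, 100% accurate

conditions = list(itertools.product(cognitive_load, agent_accuracy))  # 6 cells

for participant in range(1, 25):          # 24 participants, all see all conditions
    order = conditions[:]                          # copy the condition list
    random.Random(participant).shuffle(order)      # seeded for reproducibility
    print(participant, order)
```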
Results
• Physiological Measures
• EEG sign. diff. in alpha band power level with CL
• GSR/HRV – sign. diff. in FFT mean/peak frequency
• Performance
• Better with more accurate agent, no effect of CL
• Subjective Measures
• Sign. diff. in STS scores with accuracy, and CL
• SMEQ had a significant effect of CL
• NASA-TLX significant effect of CL and accuracy
• Overall
• Trust for virtual agents can be measured using combo
of physiological, performance, and subjective measures
”I don’t trust you anymore!!”
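The GSR/HRV result above refers to frequency-domain features. A minimal sketch of extracting the peak and mean (spectral centroid) frequency from a GSR trace with NumPy’s FFT; the synthetic signal, sampling rate, and frequency band are assumptions, not the paper’s exact processing:

```python
# Minimal sketch: FFT-based mean/peak frequency features from a GSR trace.
import numpy as np

fs = 4.0                                   # hypothetical GSR sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)               # one minute of data
gsr = 0.5 * np.sin(2 * np.pi * 0.15 * t) + 0.05 * np.random.randn(t.size)  # fake signal

spectrum = np.abs(np.fft.rfft(gsr - gsr.mean()))
freqs = np.fft.rfftfreq(gsr.size, d=1 / fs)

band = (freqs > 0.05) & (freqs < 0.5)      # keep a plausible GSR band
peak_freq = freqs[band][np.argmax(spectrum[band])]
mean_freq = np.sum(freqs[band] * spectrum[band]) / np.sum(spectrum[band])
print(f"peak frequency: {peak_freq:.3f} Hz, mean (centroid): {mean_freq:.3f} Hz")
```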
RESEARCH DIRECTIONS
Meta-Review
Review of 10 years of AR user studies
Dey, A., Billinghurst, M., Lindeman, R. W.,
& Swan, J. (2018). A systematic review of
10 years of augmented reality usability
studies: 2005 to 2014. Frontiers in
Robotics and AI, 5, 37.
Paper Analysis
[Charts: breakdown of studies by application area]
Summary
• Few AR papers have a formal experiment (~10%)
• Most papers use within-subjects design (73%)
• Most experiments in controlled environments (76%)
• Lack of experimentation in real-world conditions; few heuristic evaluations or pilot studies
• About half of papers collect both qualitative and quantitative measures (48%)
• Performance measures (76%), surveys (50%)
• Most papers focus on visual senses (96%)
• Young participants dominate (University students) (62%)
• Females in minority (36%)
• Most use HMD (35%) or handheld displays (34%)
• Handheld/mobile AR studies becoming more common
• Most studies are in interaction (23%), very few collaborative studies (4%)
Research Opportunities
• Need for increased user studies in collaboration
• More use of field studies and studies of natural use
• Need a wider range of evaluation methods
• Use a more diverse selection of participants
• Increase number of participants
• More user studies conducted outdoors are needed
• Fully report participant demographics, study design, and experimental tasks
CONCLUSION
Conclusions
•Evaluate throughout the design process
•Move from formative to summative evaluation
•Use multiple types of evaluation
•Collect multiple types of data
•Go beyond questionnaires
•Many directions for future research
To study more …
Jonathan Lazar, Jinjuan Heidi Feng, Harry Hochheiser
Research Methods In Human-Computer Interaction
www.empathiccomputing.org
@marknb00
mark.billinghurst@auckland.ac.nz