Experiences with Remote Usability Testing?


Jan Stage

Professor, PhD
Head of Research in Information Systems (IS)/Human-Computer Interaction (HCI)
Aalborg University, Department of Computer Science, HCI-Lab

jans@cs.aau.dk
Overview
       • Study 1
       • Study 2




Institut for Datalogi     2
Overview
       • Study 1: synchronous or asynchronous
               •        Method
               •        Results
               •        Conclusion
       • Study 2




Empirical Study 1
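       Four methods: conventional laboratory test (LAB), remote synchronous (RS), remote asynchronous with experts (AE), remote asynchronous with users (AU)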
       Test subjects: 6 in each condition (18 users and 6 with usability expertise), all
          students at Aalborg University
       System: Email client (Mozilla Thunderbird 1.5)
       9 defined tasks (typical email functions)
       Setting, procedure and data collection in accordance with method
       Data analysis: the 24 outputs were analysed by three evaluators, each in a
          random and different order
       Each evaluator generated an individual list of usability problems with their
         own categorizations (also for the AE and AU conditions)
       These were merged into an overall problem list through negotiation
Results: Task Completion
       No significant difference in task completion
       Significant difference in task completion time
       The users in the two asynchronous conditions spent considerably more time
       We do not know the reason




Results: Usability Problems Identified




       A total of 46 usability problems
       No significant difference between LAB and RS
       AE/AU identified significantly fewer problems, also critical problems
       No significant difference between AE and AU in terms of problems identified


Conclusion
       RS is the most widely described and used remote method. Its performance
          is virtually equivalent to LAB (or slightly better)
       AE and AU perform surprisingly well
       Experts do not perform significantly better than users
       Video analysis (LAB and RS) required considerably more evaluator effort than
          the user-based reporting (AU and AE)
       Users can actually contribute to usability evaluation – not with the same
          quality, but reasonably well, and there are plenty of them




Overview
       • Study 1
       • Study 2: which asynchronous method
               •        Method
               •        Results
               •        Conclusion




Empirical Study 2
       Purpose: examine and compare remote asynchronous methods
       Focus on usability problems identified
       Comparable with the previous study
       Selection of asynchronous methods based on literature survey




The 3 Remote Asynchronous Methods
       User-reported critical incident (UCI)
               •    Well-defined method (Castillo et al. CHI 1998)
       Forum-based online reporting and discussion (Forum)
               •    Assumption: through collaboration participants may give input which increases data
                    quality and richness (Thompson, 1999)
               •    A source for collecting qualitative data in a study of auto logging (Millen, 1999): the
                    participants turned out to report detailed usability feedback
       Diary-based longitudinal user reporting (Diary)
               •    Used on a longitudinal basis for participants in a study of auto logging to provide
                    qualitative information (Steves et al. CSCW 2001)
               •    First day: same tasks as the other conditions (first part of diary delivered)
               •    Four more days: new tasks (same type) sent daily (complete diary delivered)
       Conventional user-based laboratory test (Lab)
               •    Included as benchmark


Empirical Study (1)
       Participants:
               • 40 test subjects, 10 for each condition
               • Students, age 20 to 30
               • Distributed evenly: gender and tech/non-tech education
       Setting:
               • LAB: in our usability lab
               • Remote asynchronous: in the participants’ homes
       Participants in the remote asynchronous conditions received the software and
          installed it on their computer
       Training material for the remote asynchronous conditions
               • Identification and categorisation of usability problems
               • A minimalist approach that was strictly remote and asynchronous (via email)



Empirical Study (2)
       Tasks:
               • Nine fixed tasks
               • The same across the four conditions to ensure that all participants used the
                 same parts of the system
               • Typical email tasks (same as previous study)
       Data collection in accordance with the method
               •    LAB: video recordings
               •    UCI: web-based system for generating problem descriptions while solving tasks
               •    Forum: after solving tasks, one week for posting and discussing problems
               •    Diary: a diary with no imposed structure; first part after the first day




Data Analysis
       All data collected before the data analysis started
       3 evaluators did the whole data analysis
       The 40 data sets were analysed by the 3 evaluators
               • In random order: by a draw
               • In different order between them
       The user input from the three remote conditions was transformed into
          usability problem descriptions
       Each evaluator generated his/her own individual lists of usability problems with
          their own severity ratings
               • A problem list for each condition
               • A complete problem list (joined)
       These were merged into an overall problem list through negotiation
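
The draw described on this slide (a random order per evaluator, and a different order for each of them) could be sketched as follows. This is only an illustration, not the procedure the evaluators actually followed: the function name `draw_analysis_orders`, the evaluator labels and the optional seed are assumptions made for the example.

```python
import random


def draw_analysis_orders(n_datasets=40, evaluators=("E1", "E2", "E3"), seed=None):
    """Give each evaluator a random order of the data sets, all orders distinct.

    Hypothetical helper: the names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    orders = {}
    for evaluator in evaluators:
        while True:
            order = list(range(1, n_datasets + 1))
            rng.shuffle(order)  # the "draw": a uniformly random permutation
            if order not in orders.values():  # redraw in the unlikely event of a repeat
                break
        orders[evaluator] = order
    return orders


orders = draw_analysis_orders(seed=2024)
for evaluator, order in orders.items():
    print(evaluator, "starts with data sets", order[:5])
```

Analysing the data sets in a different random order per evaluator is a common way to keep learning effects from systematically favouring one condition.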

Results: Task Completion Time
       Considerable variation in task completion times




       Participants in the remote conditions worked in their home at a time they
          selected
       For each task there was a hint that allowed them to check if they had solved
          the task correctly
       As we have no data on the task solving process in the remote conditions, we
          cannot explain this variation
Results: Usability Problems Identified
                                     Lab            UCI             Forum          Diary
                                     N=10           N=10            N=10           N=10
         Task completion time in
         minutes: Average (SD)       24.24 (6.3)    34.45 (14.33)   15.45 (5.83)   32.57 (28.34) (tasks 1-9)

         Usability problems:         #     %        #     %         #     %        #     %
         Critical (21)               20    95       10    48        9     43       11    52
         Serious (17)                14    82       2     12        1     6        6     35
         Cosmetic (24)               12    50       1     4         5     21       12    50
         Total (62)                  46    74       13    21        15    24       29    47



       LAB: identified significantly more problems than each of the 3 remote conditions
       UCI vs. Forum: no significant difference
       UCI vs. Diary: significant difference overall in favour of Diary, and also for cosmetic problems
       Forum vs. Diary: significant difference overall in favour of Diary, but not at any single severity level
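
The percentage columns in the table can be reproduced from the problem counts. The sketch below only recomputes the shares shown on the slide; it does not reproduce the significance tests, which the slide does not specify. The names `totals`, `found` and `detection_rates` are chosen for this example.

```python
# Number of problems per severity in the merged problem list (from the table).
totals = {"Critical": 21, "Serious": 17, "Cosmetic": 24}

# Problems identified per condition and severity (from the table).
found = {
    "Lab":   {"Critical": 20, "Serious": 14, "Cosmetic": 12},
    "UCI":   {"Critical": 10, "Serious": 2,  "Cosmetic": 1},
    "Forum": {"Critical": 9,  "Serious": 1,  "Cosmetic": 5},
    "Diary": {"Critical": 11, "Serious": 6,  "Cosmetic": 12},
}


def detection_rates(condition):
    """Return {severity: percent of problems found}, plus an overall 'Total'."""
    counts = found[condition]
    rates = {sev: round(100 * counts[sev] / totals[sev]) for sev in totals}
    rates["Total"] = round(100 * sum(counts.values()) / sum(totals.values()))
    return rates


for condition in found:
    print(condition, detection_rates(condition))
```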

Results: Evaluator Effort
         Time (hours:minutes)       Lab     UCI    Forum    Diary
         Problems identified       (46)    (13)    (15)     (29)
         Preparation               6:00    2:40    2:40     2:40
         Conducting test          10:00    1:00    1:00     1:30
         Analysis                 33:18    2:52    3:56     9:38
         Merging problem lists    11:45    1:41    1:42     4:58
         Total time spent         61:03    8:13    9:18    18:46
         Avg. time per problem     1:20    0:38    0:37     0:39


       Times are the sum for all evaluators involved in each activity
       Time for finding test subjects is not included (8h, common for all)
       Task specifications were reused from an earlier study. Preparation in the
          remote conditions consisted of working out written instructions
       Considerable differences between the remote conditions for analysis and
         merging of problem lists
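
The "Total time spent" and "Avg. time per problem" rows follow from the four activity rows and the problem counts, if the times are read as hours:minutes. A minimal sketch of that arithmetic, with helper names (`to_minutes`, `to_hhmm`, `summarise`) that are assumptions for the example:

```python
def to_minutes(hhmm):
    """Convert an 'h:mm' string to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total_minutes):
    """Convert minutes back to an 'h:mm' string."""
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


# Preparation, conducting test, analysis, merging problem lists (from the table).
effort = {
    "Lab":   ["6:00", "10:00", "33:18", "11:45"],
    "UCI":   ["2:40", "1:00", "2:52", "1:41"],
    "Forum": ["2:40", "1:00", "3:56", "1:42"],
    "Diary": ["2:40", "1:30", "9:38", "4:58"],
}
problems = {"Lab": 46, "UCI": 13, "Forum": 15, "Diary": 29}


def summarise(condition):
    """Return (total time, average time per problem) as 'h:mm' strings."""
    total = sum(to_minutes(t) for t in effort[condition])
    per_problem = round(total / problems[condition])
    return to_hhmm(total), to_hhmm(per_problem)


for condition in effort:
    print(condition, *summarise(condition))
```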
Conclusion
       The three remote methods performed significantly below the classical lab test
          in terms of the number of usability problems identified
       The Diary was the best remote method – it identified half of the problems
          found in the Lab condition
       UCI and Forum performed similarly to Diary for critical problems but worse
         for serious problems
       UCI and Forum took about 13% of the evaluator time of the lab test; Diary took 30%
       The productivity of the remote methods was considerably higher
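
The effort shares and the productivity claim can be checked against the totals in the evaluator effort table. The sketch below assumes the totals are hours:minutes and defines productivity as problems identified per evaluator hour; that definition and the function names are assumptions for the example, not taken from the slides.

```python
def minutes(hhmm):
    """Convert an 'h:mm' string to minutes."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


# Totals from the evaluator effort table.
total_time = {"Lab": "61:03", "UCI": "8:13", "Forum": "9:18", "Diary": "18:46"}
problems = {"Lab": 46, "UCI": 13, "Forum": 15, "Diary": 29}


def share_of_lab_effort(condition):
    """Evaluator time as a whole-number percentage of the lab test's time."""
    return round(100 * minutes(total_time[condition]) / minutes(total_time["Lab"]))


def problems_per_hour(condition):
    """Problems identified per evaluator hour (assumed productivity measure)."""
    return round(problems[condition] / (minutes(total_time[condition]) / 60), 2)


for condition in total_time:
    print(condition, share_of_lab_effort(condition), "%", problems_per_hour(condition))
```

With these totals the shares come out at roughly 13% for UCI, 15% for Forum and 31% for Diary, so the figures on the slide appear to be rounded.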




Interaction Design and Usability Evaluation
       Master in IT
       Continuing education under IT-Vest
       Course package in Interaction Design and Usability Evaluation starts 1 February 2012
       Admits bachelor's graduates, with an entry route for datamatikere (AP graduates in computer science) as well
       Information: http://www.master-it-vest.dk/




