Slide 1: Assessing Interactive System Effectiveness with Usability Design Heuristics and Markov Models of User Behavior
Presented by: Lashanda Lee

Slide 2: Motivation
- For HCI to be successful, interfaces must be designed to:
  - Effectively translate the intentions, actions, and inputs of the operator for the computer
  - Effectively translate machine outputs for human comprehension
- Available HCI frameworks can aid in evaluating interface designs and generating subjective data.
- Quantitative, objective data is also needed as a basis for cost justification and determination of ROI.
- Modeling human behavior may reduce the need for experimentation, saving both time and expense.
- Combining data from operations research (OR) techniques with subjective data can generate a score for overall system effectiveness, allowing comparison of alternative interface designs.

Slide 3: Literature Review - HCI frameworks
- Norman's model of HCI
  - Two stages
  - Does not focus on the continuous cycle of communication
- Pipeline model
  - Inputs and outputs of the system operate in parallel
  - Complex model with many states
  - Does not show the human's cognitive process
  - Explains the computer's processing
- Dix et al. model
  - Focuses on the distances between user and system
  - Focuses on the continuous cycle of communication
  - Chosen as the basis for evaluation in the present research

Slide 4: Literature Review - Usability paradigms and principles
- Paradigms: how humans interact with computers
  - Ubiquitous computing, intelligent systems, virtual reality, and WIMP
- Principles: how paradigms work
  - Flexibility, consistency, robustness, recoverability, and learnability
- Each paradigm focuses on different usability principles; specific usability measures can be used to assess certain paradigms.
- Paradigms address the figurative distance of articulation in Dix's model in different ways. Examples:
  - Intelligent interfaces using NLP: greatest reduction in articulation distance for users, but furthest from the system language
  - Command-line interface: farthest from the user in the Dix framework, but easy for the computer to understand
  - WIMP: easy for both the system and the user to interpret, and equidistant between user and system in the Dix interaction framework

Slide 5: Literature Review - Measures of usability - Qualitative measures
- Low cost but low discovery
- Support comparisons of designs based on interface qualities
- Data hard to analyze
- May not lead to design changes because management considers the data unreliable
- Subjective data:
  - Inspection methods
    - Low cost and quick discovery of problems using low-skill evaluators
    - Often fail to find many serious problems and do not provide enough evidence to create design recommendations
    - Types include heuristic methods, guidelines, and style and rule inspections
  - Verbal reports
    - Provide insight into cognition
    - Hard to find an appropriate way to use the data
  - Surveys
    - Inexpensive and help find trouble spots
    - Some information is lost due to short-term memory (STM) limitations

Slide 6: Literature Review - Measures of usability - Quantitative measures
- Used to compare designs based on quantities associated with certain interface features
- Useful in presenting information to management
- Goals may be too ambitious, or there may be too many goals
- Cannot cover entire systems
- Subjective responses:
  - Rankings, ratings, or fuzzy-set ratings
  - Considered quantitative because they involve manipulation and analysis of data as a basis for comparing interface alternatives
- Objective responses:
  - Measures of effectiveness: binary task completion, number of correct tasks completed, and task performance accuracy
  - Measures of efficiency: task completion time, time in mode, usage patterns, and degree of variation from an optimal solution
  - Fuzzy sets
  - User modeling
  - Counts of concrete occurrences, not based on the opinions of users

Slide 7: Literature Review - Quantitative objective measures - Fuzzy sets and user modeling
- Fuzzy sets
  - Used to compare interface alternatives
  - An aggregate score is produced based on a count of interface inadequacies
  - Fuzzy-set logic is used to determine membership for the aggregate score
  - The method uses both subjective and objective measures
  - Requires multiple cycles of user testing to compare scores
  - Does not use variable weights for dimensions; treats them all as equal
- User modeling
  - Used to predict interface action sequences based on prior use data
  - Limited in revealing actual human performance; not exact
  - Can be used to help guide users while performing a task with an interface
  - GOMS:
    - Estimates task performance times
    - Produces accurate predictions of user actions
    - Takes a long time to create
  - Benefits include:
    - Modeling one or more types of users
    - Analysis without additional user testing

Slide 8: Literature Review - Usability measures - Summary
- Qualitative:
  - Used iteratively
  - Low discovery
  - Hard to analyze
  - Usually does not effect change in a display because management considers the data unreliable
- Quantitative:
  - Appears better for detailed usability problem analysis and design recommendations
  - User modeling can decrease cost
  - Necessary to gain management support
- Combining an objective, quantitative user-modeling approach with subjective usability measures may provide:
  - An approach effective in finding problems
  - A basis for interface redesign

Slide 9: Literature Review - Operations Research methods of usability evaluation
- Use of techniques such as mathematical modeling to analyze complex situations
- Used to optimize systems
- Limited use so far in usability evaluation or interface improvements
- Methods used:
  - Markov models
    - Stochastic processes
    - Used for website customization
    - Predict user behavior
    - Research by Kitajima et al., Thimbleby et al., and Jenamani et al.
  - Probabilistic finite-state models
    - Include time distributions and transitional probabilities
    - Generate predictions of user behavior
    - Research by Sholl et al.
  - Critical path models
    - Algorithm determines the longest time
    - Can also incorporate stochastic process predictions
    - Research by Baber and Mellor (2001)

Slide 10: Literature Review - OR methods of usability evaluation - Markov models: Kitajima et al. and Thimbleby et al.
- Kitajima et al.
  - Used Markov models to predict user behavior
  - Determined the number of clicks needed to find relevant articles
  - After interface improvements, used the model to predict the number of clicks; the number of clicks was reduced
  - Used the equation u(i) = 1 + Σ_k P_ik u(k) (a worked sketch of solving it follows this slide)
- Thimbleby et al.
  - Applied Markov chains to several applications: a microwave oven and a cell phone
  - Used Markov chains to predict the number of steps
  - Used a Mathematica simulation of the microwave to gather information
  - Simulated user behavior by mixing a perfect, error-free transition matrix with random behavior, using knowledge factors from 0 to 1 (1 being a fully knowledgeable user)
  - The original design took 120 steps (for a random user); the improved design took fewer steps for the same random user
  - Fewer steps were considered "easier"

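The Kitajima et al. recurrence can be solved directly with linear algebra once a transition matrix is available. Below is a minimal sketch, assuming a hypothetical interface with two transient states and one absorbing goal state; the matrix values are invented for illustration and are not from any of the cited studies.

```python
import numpy as np

# u(i) = 1 + sum_k P_ik * u(k), with u(goal) = 0. Collecting the transient
# states gives (I - Q) u = 1, where Q is the transient-to-transient block.
# The 3-state matrix below (2 transient states + absorbing goal) is invented.
P = np.array([
    [0.2, 0.5, 0.3],   # state 0 -> state 0, state 1, goal
    [0.1, 0.3, 0.6],   # state 1 -> state 0, state 1, goal
    [0.0, 0.0, 1.0],   # goal is absorbing
])

Q = P[:2, :2]                                   # transitions among transient states
u = np.linalg.solve(np.eye(2) - Q, np.ones(2))  # expected clicks per start state
print(u)   # u[0]: expected number of clicks from the start state
```
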
Slide 11: Literature Review - OR methods of usability evaluation - Summary
- Markov modeling appears to be a viable and useful approach to evaluating interface usability
- Provides objective, quantitative data without the need for several iterations of testing
- Has been used repeatedly to predict behavior, such as number of clicks and task times
- Accurately predicts user behavior

Slide 12: Summary and Problem Statement
- A framework describing communication between humans and computers is needed to guide design improvements (Dix et al. was chosen for its simplicity and cyclic structure).
- Usability paradigms help identify types of technology that can improve systems and provide direction in how to evaluate them. The WIMP paradigm was chosen for its simplicity and its accommodation of both user and system.
- Many subjective measures exist, but they are not adequate for assessing performance and supporting design changes.
- Objective, quantitative measures often gain management support for design changes but are expensive.
- OR methods: Markov models accurately predict human behavior.
- Need: define an approach that uses both types of measures to evaluate usability while requiring minimal user testing.
- Approach: combine the Dix et al. model, subjective system evaluations, and OR modeling techniques to predict user behavior with an interface.
- Both methods are used to produce an overall system effectiveness score for comparing alternative designs.

Slide 13: Method - Overview of the system effectiveness score
- Based on the Dix et al. framework
- Survey for designers: captures perceptions of the importance of each link in the HCI framework
- Survey for users, plus a Markov model prediction of the average number of interface actions (clicks): users rated the interfaces with respect to the links in the framework
- Novelty: the measure reflects both the designer's intent for the application and the user's perception of the system
- Designer weights and user ratings are multiplied and summed across links
- The weighted sum is divided by the Markov model's prediction of the average number of clicks
- The score represents perceived usability per action

Slide 14: Method - Weighting factor determination
- Designers were expected to be most concerned with cognitive load.
- Four designers were surveyed using the Dix et al. framework: given the paradigm for the application (WIMP), how important is each link to system effectiveness?
- Pair-wise comparisons of links; values ranged between 0 and 0.5
- Weighting factors were averaged across designers to determine the weight for each dimension (see the sketch after this list)
- The weights were used in calculating the overall subjective system score (designer weights × user ratings)

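A minimal sketch of the averaging step, assuming four hypothetical designer surveys over the four framework links; the weight values below are invented for illustration, not the study's data.

```python
import numpy as np

# Each row: one designer's pair-wise-comparison-derived weights for the
# four links of the Dix et al. framework. All values are invented.
links = ["articulation", "performance", "presentation", "observation"]
surveys = np.array([
    [0.40, 0.10, 0.15, 0.35],
    [0.35, 0.15, 0.15, 0.35],
    [0.30, 0.20, 0.20, 0.30],
    [0.45, 0.05, 0.10, 0.40],
])

weights = surveys.mean(axis=0)   # one averaged weight per dimension
print(dict(zip(links, weights)))
```
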
Slide 15: Method - Experimental task
- Participants used a version of the Lenovo.com prototype to find and order a ThinkPad R60
- Twenty participants: 11 males, 9 females; ages 17-25
- Half of the participants used the old version of the Lenovo.com website:
  - Required 11 clicks to buy (optimal path)
  - Tabs separated the feature information from the ability to purchase
- Half of the participants used a new prototype:
  - Required 9 clicks to buy (optimal path)
  - All information about a given computer model contained on one page
  - Multi-level navigation structure
  - More salient buttons

Slide 16: Method - Developing the Markov chain models
- JavaScript recorded user actions
- The old online ordering system was used to identify states: links, tabs, and menu options (radio buttons and popups not included)
- Action sequences were used to create transitional probability matrices, based on the actual number of users going from state i to state k (see the sketch after this list)
- Assumptions of the Markov model:
  - Each row of the matrix must sum to 1
  - The probability of the next interface state depends only on the current state
- To determine the average number of clicks to task completion, the Kitajima et al. (2005) equation u(i) = 1 + Σ_k P_ik u(k) was used, which requires:
  - A state probability matrix based on the action sequences
  - The average number of steps from one state to another (based on designer analysis)

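As a sketch of how a transitional probability matrix could be estimated from logged action sequences; the state names and click sequences below are hypothetical, not the recorded Lenovo.com data.

```python
import numpy as np

# Hypothetical interface states and logged click sequences.
states = ["home", "product", "features", "customize", "cart"]
idx = {s: i for i, s in enumerate(states)}
sequences = [
    ["home", "product", "features", "product", "customize", "cart"],
    ["home", "product", "customize", "cart"],
]

# Count observed i -> k transitions, then row-normalize so each
# non-empty row sums to 1 (the Markov model assumption).
counts = np.zeros((len(states), len(states)))
for seq in sequences:
    for a, b in zip(seq, seq[1:]):
        counts[idx[a], idx[b]] += 1

row_sums = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```
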
Slide 17: Method - Rating system effectiveness (based on the Dix framework)
- Used the Dix et al. framework
- End users rated the links on a scale from 1 to 10
- The framework was presented at the end of the task
- Average ratings for each link were determined and used in the overall system effectiveness score

Slide 18: Method - Overall system effectiveness score and Markov model validation
- Overall score
  - Used to compare alternative interface designs
  - For each dimension, the average designer weight is multiplied by the average end-user rating; the sum of these products is the partial score
  - The partial score divided by the predicted average number of clicks is the overall score (a computational sketch follows this slide)
  - The higher ratio is taken to indicate higher overall system effectiveness
- Validation
  - A t-test was used to determine whether the actual observed number of clicks differed significantly from the number predicted by the Markov model
- System effectiveness = Σ over links (designer weight × user rating) / (predicted average number of clicks)

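A minimal sketch of the score computation, assuming illustrative weights, ratings, and click prediction rather than the study's data:

```python
# Designer weights and average user ratings per framework link;
# all numbers below are invented for illustration.
designer_weights = {"articulation": 0.35, "performance": 0.15,
                    "presentation": 0.15, "observation": 0.35}
user_ratings = {"articulation": 8.2, "performance": 7.0,
                "presentation": 7.5, "observation": 8.0}

partial_score = sum(designer_weights[k] * user_ratings[k]
                    for k in designer_weights)
predicted_clicks = 9.0                             # Markov model prediction
effectiveness = partial_score / predicted_clicks   # usability per click
print(partial_score, effectiveness)
```
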
Slide 19: Results - Assessment of the Markov model assumption
- The transition to the next state must depend only on the current state
- The Durbin-Watson test was used to assess autocorrelation among user steps in the interaction (a computational sketch follows this slide)
  - Test statistics: 1.2879 (old) and 2.0815 (new)
- A normalization procedure was applied to the original transitional probability matrices, and the Durbin-Watson test was repeated on the normalized data
  - Test statistics: 1.3920 (old) and 2.27 (new)
- The tests revealed mixed evidence; the model was nonetheless accepted and applied to predict the average number of clicks

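For reference, a sketch of computing the Durbin-Watson statistic on a step series; the data here is synthetic, and a value near 2 suggests little first-order autocorrelation.

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

# Synthetic sequence of per-step values from one user's interaction;
# the statistic is computed on mean-centered residuals.
steps = np.array([1, 3, 2, 4, 2, 5, 3, 6, 4, 5], dtype=float)
residuals = steps - steps.mean()
print(durbin_watson(residuals))   # ~2 implies little autocorrelation
```
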
Slide 20: Results - Computation of the average number of steps
- The average number of steps it takes to get from any one state to another
- Represents the individual u(k) terms in the Kitajima et al. equation
- The matrix was created by the designers of the interface

Slide 21: Results - Computation of the average number of clicks
- Using u(i) = 1 + Σ_k P_ik u(k), paths to the absorbing state were considered to determine the average number of clicks
- The Markov model predicted the number of clicks for each interface: 11.5 for the old (actual 12.9) and 9 for the new (actual 9.2)
- A t-test compared actual clicks across interfaces: t = -4.30, p = 0.0004; the actual click counts differed, with the new interface significantly lower
- T-tests compared actual to predicted click counts for all subjects: p = 0.439 (new) and p = 0.0605 (old); no significant difference between actual and predicted for either interface
- A t-test compared predicted clicks across interfaces: p = 0.0033; the new interface reduced the number of clicks (the t-test mechanics are sketched below)

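A sketch of the two kinds of validation t-tests with scipy, using invented per-participant click counts: a one-sample test of observed clicks against the Markov prediction, and a two-sample test across interfaces.

```python
from scipy import stats

# Invented per-participant click counts, not the study's raw data.
old_clicks = [12, 14, 13, 11, 15, 12, 13, 14, 12, 13]
new_clicks = [9, 10, 8, 9, 11, 9, 8, 10, 9, 9]

# Actual vs. Markov-predicted mean for the new interface.
t1, p1 = stats.ttest_1samp(new_clicks, popmean=9.0)

# Actual clicks across the two interfaces.
t2, p2 = stats.ttest_ind(new_clicks, old_clicks)
print(p1, p2)   # large p1: prediction fits; small p2: interfaces differ
```
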
Slide 22: Results - Partial system effectiveness score
- Each participant rated the interface on each dimension using a scale of 1 to 10
- Designers completed pair-wise comparisons; they were expected to rate articulation and observation higher
- A t-test compared designer ratings of articulation and observation with those of performance and presentation: articulation and observation were rated higher
- Average designer weights were multiplied by average user ratings
- A t-test compared the partial score of the new interface against the old for all subjects: t = 5.08, p < .0001; the partial score for the new interface was significantly higher

Slide 23: Results - Overall system effectiveness score
- The partial score was divided by the predicted average number of clicks to yield perceived usability per click: 0.939 (new) vs. 0.475 (old)
- A t-test compared the overall scores of the new and old interfaces for all subjects: t = 5.62, p < .0001
- The overall system effectiveness score for the new interface was significantly higher than for the old

Slide 24: Results - Reducing experimentation
- The purpose of the Markov model was to predict the number of clicks and reduce the need for additional user testing.
- Designers can estimate the average number of steps to transition among states in a new interface and multiply by the probabilities determined for the original interface (through user testing)
- The predicted number of clicks for the new interface was 9.35 (actual 9.2)
- A t-test assessed whether the actual number of clicks differed from the predicted number: t = 1.15, p = 0.270
- The Markov model was accurate in predicting the average number of clicks
- Focus groups would still be necessary to obtain user ratings
- The approach significantly reduces the time and money needed for user testing

Slide 25: Discussion - Designer ratings
- Hypothesis: average designer weighting factors for articulation and observation will be higher than those for performance and presentation
- Designers were concerned with cognitive load, as represented by articulation and observation
- If a customer cannot find what he or she is looking for, this may lead to frustration, lost customers, and lost revenue
- Designers recognize that effectively reducing cognitive load is important

Slide 26: Discussion - Improved usability
- Hypothesis: the new interface will improve perceived usability
- Multi-level navigation was used to reduce cognitive load:
  - Made it easier to find and view all options
  - Users could reach many states with one click
  - Identified by users of the new interface as one of the most usable features
- More prominent buttons aided in identifying next steps:
  - In the original interface, users had a difficult time finding the customize button
  - They often scrolled up and down the page or backtracked to determine what to do next
- The partial system effectiveness score was higher for the new interface (8.6) than for the old (5.2)

Slide 27: Discussion - Higher system effectiveness score
- Hypothesis: the new interface will produce a higher score because of its higher perceived usability
- The old interface degraded performance:
  - From the features tab, some users found it difficult to identify what to do next
  - Once users found the product tab, some scrolled up and down trying to determine what to do next
  - The new interface alleviated both problems by putting all information on one page
- Higher perceived usability and fewer clicks led to a higher ratio

Slide 28: Discussion - Markov model accurately predicted the average number of clicks
- Hypothesis: the Markov model will accurately predict the average number of clicks, using the equation detailed by Kitajima et al.
- Because Markov models represent stochastic behavior, they proved valid in the present work
- The model reflected the variability among participants but does not show the exact magnitude of the error

Slide 29: Conclusion
- The objective was to create a new measure of usability, motivated by:
  - Few quantitative, objective measures being available
  - Many subjective measures being insufficient to justify design changes
- The research supports a subjective measure based on the Dix et al. framework combined with an objective measure based on Markov models
- The method is:
  - Effective in objectively selecting among alternative designs and reducing the amount of experimentation necessary
  - Easy to implement
  - Usable with several alternatives without the need for testing
  - Not applicable to interfaces where the selection of the next state depends on previous states rather than only the current state
- Future research:
  - Use Markov models to predict the next steps a user will take and make the relevant interface options more salient to improve usability
  - Find a way to incorporate time-on-task in the overall effectiveness score, since perceived time-on-task will impact customer retention
  - Research a method to accurately predict