Benchmarking Web Accessibility Evaluation Tools:
Measuring the Harm of Sole Reliance on Automated Tests
Markel Vigo, University of Manchester (UK)
Justin Brown, Edith Cowan University (Australia)
Vivienne Conway, Edith Cowan University (Australia)
10th International Cross-Disciplinary Conference on Web Accessibility
W4A2013
http://dx.doi.org/10.6084/m9.figshare.701216
Problem & Fact
W4A2013, 13 May 2013
WWW is not accessible
Evidence
Webmasters are familiar with accessibility
guidelines
Lazar et al., 2004
Improving web accessibility: a study of webmaster perceptions
Computers in Human Behavior 20(2), 269–288
Hypothesis I
Assuming guidelines do a good job...
H1: Awareness of accessibility guidelines is not that
widespread.
Evidence II
Webmasters put compliance logos on non-compliant websites
Gilbertson and Machin, 2012
Guidelines, icons and marketable skills: an accessibility evaluation of 100 web development company
homepages
W4A 2012
Hypothesis II
Assuming webmasters are not trying to cheat...
H2: There is a lack of awareness of the negative effects
of overreliance on automated tools.
Expanding on H2
Why we rely on automated tests
• It's easy
• In some scenarios it seems like the only
option: web observatories, real-time...
• We don't know how harmful they can be
Expanding on H2
Knowing the limitations of tools
• If we are able to measure these
limitations we can raise awareness
• Inform developers and researchers
• We run a study with 6 tools
• Compute coverage, completeness and
correctness with respect to WCAG 2.0
Method
Computed Metrics
• Coverage: whether a given Success
Criterion (SC) is reported at least once
• Completeness: $\dfrac{\text{true\_positives}}{\text{actual\_violations}}$
• Correctness: $\dfrac{\text{false\_positives}}{\text{true\_positives} + \text{false\_positives}}$
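The three metrics on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names and the modelling of violations as (page, success criterion) pairs are assumptions. The second ratio is coded exactly as the fraction appears on the slide, false positives over all reported violations; its complement, true positives over all reported violations, is what is usually called precision.

```python
def completeness(reported, actual):
    """true_positives / actual_violations."""
    true_positives = len(reported & actual)
    return true_positives / len(actual) if actual else 0.0


def false_positive_ratio(reported, actual):
    """false_positives / (true_positives + false_positives), as on the slide."""
    if not reported:
        return 0.0
    false_positives = len(reported - actual)
    # Every reported item is either a true or a false positive,
    # so the denominator is simply the number of reported items.
    return false_positives / len(reported)


def covered_criteria(reported):
    """Success criteria that are reported at least once (coverage)."""
    return {sc for _page, sc in reported}


# Hypothetical data: violations as (page, success criterion) pairs.
actual = {("home", "1.1.1"), ("home", "1.3.1"), ("home", "2.4.4"), ("news", "1.1.1")}
reported = {("home", "1.1.1"), ("news", "1.1.1"), ("news", "3.1.1")}

print(completeness(reported, actual))          # 2 of the 4 actual violations are found
print(false_positive_ratio(reported, actual))  # 1 of the 3 reported items is not in the ground truth
print(sorted(covered_criteria(reported)))
```

Sets are used so that a violation reported twice for the same page is counted once; whether the study counted repeated instances separately is not stated on the slide.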
Method
Stimuli
Vision Australia
www.visionaustralia.org.au
• Non-profit
• Non-government
• Accessibility resource
Prime Minister
www.pm.gov.au
• Federal Government
• Should abide by the
Transition Strategy
Transperth
www.transperth.wa.gov.au
• Government affiliated
• Used by people with
disabilities
Method
Obtaining the "Ground Truth"
Ad-hoc sampling
Manual evaluation
Agreement
Ground truth
Method
Computing Metrics
[Diagram: for every page in the sample, evaluate it with each tool (T1 to T6), get the reports (R1 to R6), compare them with the ground truth (GT), and compute the metrics (M1 to M6).]
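The per-page procedure in this diagram could be sketched as below. It is only an illustration under assumptions: `run_tool` is a hypothetical stand-in for invoking an evaluation tool and parsing its report, tools and the ground truth are modelled as plain dictionaries, completeness is the only metric computed, and pooling counts over all pages is one possible way to aggregate.

```python
def run_tool(tool, page):
    """Hypothetical stand-in: the set of success criteria a tool reports for a page."""
    return tool["reports"].get(page, set())


def benchmark(tools, ground_truth):
    """Return {tool name: completeness pooled over all pages in the sample}."""
    metrics = {}
    for tool in tools:
        true_positives = 0
        actual_violations = 0
        for page, actual in ground_truth.items():
            report = run_tool(tool, page)           # get the report (R)
            true_positives += len(report & actual)  # compare with the GT
            actual_violations += len(actual)
        # compute the metric (M) for this tool
        metrics[tool["name"]] = (
            true_positives / actual_violations if actual_violations else 0.0
        )
    return metrics


# Hypothetical sample: two pages and two tools.
ground_truth = {"home": {"1.1.1", "1.3.1", "2.4.4"}, "news": {"1.1.1"}}
tools = [
    {"name": "T1", "reports": {"home": {"1.1.1"}, "news": {"1.1.1"}}},
    {"name": "T2", "reports": {"home": {"1.1.1", "1.3.1", "2.4.4", "3.1.1"}}},
]
print(benchmark(tools, ground_truth))
```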
Accessibility of Stimuli
[Figure: three bar charts, one per site (Vision Australia, www.visionaustralia.org.au; Prime Minister, www.pm.gov.au; Transperth, www.transperth.wa.gov.au). Axes: violated success criteria (1.1.1 to 4.1.2) and frequency (0 to 80).]
Results
Coverage
• 650 WCAG Success Criteria violations
(A and AA)
• 23-50% of SC are covered by
automated tests
• Coverage varies across guidelines and
tools
Results
Completeness per tool
• Completeness ranges from 14% to 38%
• Variable across tools and principles
Results
Completeness per type of SC
• How conformance levels influence
completeness
• Wilcoxon Signed Rank: W=21, p<0.05
• Completeness levels are higher for
'A level' SC
Results
Completeness vs. accessibility
• How accessibility levels influence
completeness
• ANOVA: F(2,10)=19.82, p<0.001
• The less accessible a page is, the
higher the level of completeness
Results
Tool Similarity on Completeness
• Cronbach's α = 0.96
• Multidimensional Scaling (MDS)
• Tools behave similarly
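Cronbach's α is commonly computed as k/(k-1) multiplied by (1 - sum of item variances / variance of the summed scores), where k is the number of items. A minimal sketch follows, treating each tool as an item. What the study used as cases (the rows) is not stated on the slide, so the function name and the scores below are hypothetical and will not reproduce the reported 0.96.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per tool, all of equal length (one score per case)."""
    k = len(items)
    # Total score of each case, summed across tools.
    totals = [sum(case) for case in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical completeness scores: three tools, four cases each.
scores = [
    [0.10, 0.30, 0.50, 0.70],
    [0.20, 0.35, 0.55, 0.80],
    [0.15, 0.25, 0.60, 0.75],
]
print(round(cronbach_alpha(scores), 2))
```

Values close to 1 indicate that the tools rank the cases in much the same way, which is the sense in which the slide says the tools behave similarly.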
Results
Correctness
• Tools with lower completeness scores
exhibit higher levels of correctness: 93-96%
• Tools that obtain higher completeness
yield lower correctness: 66-71%
• Tools with higher completeness are
also the most incorrect ones
Implications
Coverage
• We corroborate that 50% is the upper limit
for automating guidelines
• Natural Language Processing?
– Language: 3.1.2 Language of parts
– Domain: 3.3.4 Error prevention
Implications
Completeness I
• Automated tests do a better job...
...on non-accessible sites
...on 'A level' success criteria
• Automated tests aim at catching
stereotypical errors
Implications
Completeness II
• Strengths of tools can be identified across
WCAG principles and SC
• A method to inform decision making
• Maximising completeness in our sample
of pages
– On all tools: 55% (+17 percentage points)
– On non-commercial tools: 52%
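One way to read "maximising completeness on all tools" is to pool the violations reported by several tools and measure the completeness of the union. The sketch below assumes that reading; the function names and the data are hypothetical and do not come from the study.

```python
def completeness(reported, actual):
    """true_positives / actual_violations."""
    return len(reported & actual) / len(actual) if actual else 0.0


def combined_completeness(reports, actual):
    """Completeness of the union of everything the given tools report."""
    pooled = set().union(*reports.values()) if reports else set()
    return completeness(pooled, actual)


# Hypothetical ground truth and reports for two tools.
actual = {"1.1.1", "1.3.1", "2.4.4", "3.1.1"}
reports = {
    "T1": {"1.1.1", "1.3.1"},
    "T2": {"1.3.1", "2.4.4"},
}

best_single = max(completeness(r, actual) for r in reports.values())
print(best_single)                             # best tool on its own
print(combined_completeness(reports, actual))  # both tools pooled
```

Pooling can only raise completeness or leave it unchanged, because the union contains every true positive found by any single tool. It also pools the false positives, so correctness would have to be checked separately.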
Conclusions
• Coverage: 23-50%
• Completeness: 14-38%
• Higher completeness leads to lower
correctness
Follow up
Contact
@markelvigo | markel.vigo@manchester.ac.uk
Presentation DOI
http://dx.doi.org/10.6084/m9.figshare.701216
Datasets
http://www.markelvigo.info/ds/bench12/index.html

Benchmarking Web Accessibility Evaluation Tools: Measuring Limitations of Automated Tests

  • 1. Benchmarking Web Accessibility Evaluation Tools: Measuring the Harm of Sole Reliance on Automated Tests. 10th International Cross-Disciplinary Conference on Web Accessibility (W4A 2013). Markel Vigo, University of Manchester (UK); Justin Brown, Edith Cowan University (Australia); Vivienne Conway, Edith Cowan University (Australia). http://dx.doi.org/10.6084/m9.figshare.701216
  • 2. Problem & Fact (W4A 2013, 13 May 2013): WWW is not accessible
  • 3. Evidence: Webmasters are familiar with accessibility guidelines. Lazar et al., 2004. Improving web accessibility: a study of webmaster perceptions. Computers in Human Behavior 20(2), 269–288
  • 4. Hypothesis I: Assuming guidelines do a good job... H1: Awareness of accessibility guidelines is not that widespread.
  • 5. Evidence II: Webmasters put compliance logos on non-compliant websites. Gilbertson and Machin, 2012. Guidelines, icons and marketable skills: an accessibility evaluation of 100 web development company homepages. W4A 2012
  • 6. Hypothesis II: Assuming webmasters are not trying to cheat... H2: There is a lack of awareness of the negative effects of overreliance on automated tools.
  • 7. Expanding on H2: Why we rely on automated tests • It's easy • In some scenarios it seems like the only option: web observatories, real-time... • We don't know how harmful they can be
  • 8. Expanding on H2: Knowing the limitations of tools • If we are able to measure these limitations we can raise awareness • Inform developers and researchers • We run a study with 6 tools • Compute coverage, completeness and correctness with respect to WCAG 2.0
  • 9. Method: Computed Metrics • Coverage: whether a given Success Criterion (SC) is reported at least once • Completeness: true_positives / actual_violations • Correctness: false_positives / (true_positives + false_positives)
  • 10. Method: Stimuli • Vision Australia (www.visionaustralia.org.au): non-profit, non-government, accessibility resource • Prime Minister (www.pm.gov.au): Federal Government, should abide by the Transition Strategy • Transperth (www.transperth.wa.gov.au): government affiliated, used by people with disabilities
  • 11. Method: Obtaining the "Ground Truth" • Ad-hoc sampling • Manual evaluation • Agreement • Ground truth
  • 12. Method: Computing Metrics • For every page in the sample... • Evaluate: tools T1 to T6 • Get reports: R1 to R6 • Compare with the GT (ground truth) • Compute metrics: M1 to M6
  • 13. Accessibility of Stimuli [Figure: frequency (0 to 80) of violated success criteria, from 1.1.1 to 4.1.2, in three panels: Vision Australia (www.visionaustralia.org.au), Prime Minister (www.pm.gov.au), Transperth (www.transperth.wa.gov.au)]
  • 14. Results: Coverage • 650 WCAG Success Criteria violations (A and AA) • 23-50% of SC are covered by automated tests • Coverage varies across guidelines and tools
  • 15. Results: Completeness per tool • Completeness ranges from 14% to 38% • Variable across tools and principles
  • 16. Results: Completeness per type of SC • How conformance levels influence completeness • Wilcoxon Signed Rank: W=21, p<0.05 • Completeness levels are higher for 'A level' SC
  • 17. Results: Completeness vs. accessibility • How accessibility levels influence completeness • ANOVA: F(2,10)=19.82, p<0.001 • The less accessible a page is, the higher the level of completeness
  • 18. Results: Tool Similarity on Completeness • Cronbach's α = 0.96 • Multidimensional Scaling (MDS) • Tools behave similarly
  • 19. Results: Correctness • Tools with lower completeness scores exhibit higher levels of correctness: 93-96% • Tools that obtain higher completeness yield lower correctness: 66-71% • Tools with higher completeness are also the most incorrect ones
  • 20. Implications: Coverage • We corroborate that 50% is the upper limit for automatising guidelines • Natural Language Processing? – Language: 3.1.2 Language of parts – Domain: 3.3.4 Error prevention
  • 21. Implications: Completeness I • Automated tests do a better job... ...on non-accessible sites ...on 'A level' success criteria • Automated tests aim at catching stereotypical errors
  • 22. Implications: Completeness II • Strengths of tools can be identified across WCAG principles and SC • A method to inform decision making • Maximising completeness in our sample of pages – On all tools: 55% (+17 percentage points) – On non-commercial tools: 52%
  • 23. Conclusions • Coverage: 23-50% • Completeness: 14-38% • Higher completeness leads to lower correctness
  • 24. Follow up • Contact: @markelvigo | markel.vigo@manchester.ac.uk • Presentation DOI: http://dx.doi.org/10.6084/m9.figshare.701216 • Datasets: http://www.markelvigo.info/ds/bench12/index.html
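
The metrics on slide 9 and the per-page comparison on slide 12 can be sketched roughly as follows. This is a minimal illustration, not the authors' code. Several things are assumptions made for the example: violations are keyed as (page, success criterion) pairs, the page names, tool names and data are invented, and `false_positive_ratio` returns the ratio exactly as written on the slide. Combining tools as the union of their reports, to approximate the "maximising completeness" idea on slide 22, is also an assumption.

```python
from typing import Dict, Set, Tuple

# Assumption: a violation is identified by (page, success criterion).
Violation = Tuple[str, str]


def coverage(reported: Set[Violation], criteria: Set[str]) -> float:
    """Share of the given success criteria that the tool reports at least once."""
    if not criteria:
        return 0.0
    reported_sc = {sc for _page, sc in reported}
    return len(reported_sc & criteria) / len(criteria)


def completeness(reported: Set[Violation], ground_truth: Set[Violation]) -> float:
    """true_positives / actual_violations."""
    if not ground_truth:
        return 0.0
    true_positives = reported & ground_truth
    return len(true_positives) / len(ground_truth)


def false_positive_ratio(reported: Set[Violation], ground_truth: Set[Violation]) -> float:
    """false_positives / (true_positives + false_positives), as on slide 9."""
    true_positives = len(reported & ground_truth)
    false_positives = len(reported - ground_truth)
    total = true_positives + false_positives
    return false_positives / total if total else 0.0


# Hypothetical ground truth from manual evaluation (invented data).
ground_truth: Set[Violation] = {
    ("home", "1.1.1"), ("home", "1.3.1"), ("home", "2.4.4"), ("contact", "3.3.2"),
}
# Hypothetical success criteria under consideration.
criteria: Set[str] = {"1.1.1", "1.3.1", "1.4.3", "2.4.4", "3.3.2"}
# Hypothetical reports from two tools.
reports: Dict[str, Set[Violation]] = {
    "T1": {("home", "1.1.1"), ("home", "1.3.1"), ("home", "1.4.3")},
    "T2": {("home", "1.1.1"), ("contact", "3.3.2")},
}

for tool, reported in reports.items():
    print(
        tool,
        coverage(reported, criteria),
        completeness(reported, ground_truth),
        false_positive_ratio(reported, ground_truth),
    )

# Assumption: tools are combined by taking the union of their reports.
combined: Set[Violation] = set().union(*reports.values())
print("combined completeness:", completeness(combined, ground_truth))
```

Set operations keep the comparison with the ground truth explicit: the intersection gives the true positives, and the difference gives the false positives. Note that slide 19 reports correctness as a percentage where higher appears to be better, so the reported figure may be the complement of the ratio computed here; the transcript does not settle this.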