SlideShare uma empresa Scribd logo
1 de 27
On Parameter Tuning in Search-Based
Software Engineering:
A Replicated Empirical Study
Abdel Salam Sayyad
Katerina Goseva-Popstojanova
Tim Menzies
Hany Ammar
West Virginia University, USA
International Workshop on Replication in Software
Engineering Research (RESER)
Oct 9, 2013
Sound bites
Search-based Software Engineering
Is here… to stay.
A helper… Not an alternative to human SE

Randomness…
is an essential part of Search Algorithms
… hence the need for statistical examination (A lot to learn from Empirical SE)

Parameter Tuning
A real problem…
Default values (rules of thumb) do exist… and (sadly?) they are being followed

Default parameter values fail to optimize performance…
… As seen in the original study, and in this replication…
No Free Lunch Theorems for Optimization [Wolpert and Macready ‘97+
the same parameter values don’t optimize all algorithms for all problems.
2
Roadmap

①
②
③
④

Randomness of Search
The original study
The replication
Conclusion
Roadmap

①
②
③
④

Randomness of Search
The original study
The replication
Conclusion
Searching for what?
• Correct solutions…
– Conform to system relationships and constraints.

• Optimal solutions…
– Achieve user objectives/preferences…

• Complex problems have big Search spaces
– Exhaustive search not a practical idea.
5
Genetic Algorithm
• Start with a large population of candidate
solutions… (How large?)
• Evaluate the fitness of your solutions.
• Let your candidate solutions crossover –
exchange genes… (How often?)
• Mutate a small portion of your solutions.
(How small?)
• How do those choices affect performance?
6
Multi-objective Optimization

The Pareto Front

Higher-level
Decision Making

The Chosen Solution

7
Survival of the fittest
(according to NSGA-II [Deb et al. 2002])
Boolean dominance (x Dominates y, or does not):
- In no objective is x worse than y
- In at least one objective, x is better than y

Crowd
pruning

8
Indicator-Based Evolutionary
Algorithm (IBEA) [Zitzler and Kunzli ‘04+
1) For {old generation + new generation} do
– Add up every individual’s amount of dominance with
respect to everyone else

– Sort all instances by F
– Delete worst, recalculate, delete worst, recalculate, …

2) Then, standard GA (cross-over, mutation) on the
survivors  Create a new generation  Back to 1.
9
NSGA-II… the default algorithm
• Much prior work in SBSE (*)
Used NSGA-II

Didn’t state why!

-------------------------(*) Sayyad and Ammar, RAISE’13

10
Roadmap

①
②
③
④

Randomness of Search
The original study
The replication
Conclusion
The Original Study
• A. Arcuri and G. Fraser, "On Parameter Tuning in Search
Based Software Engineering," in Proc. SSBSE, 2011, pp.
33-47.
• A. Arcuri and G. Fraser, "Parameter Tuning or Default
Values? An Empirical Investigation in Search-Based
Software Engineering," Empirical Software Engineering,
Feb 2013.

• Problem: generating test vectors for objectoriented software.
• Fitness function: percentage of test coverage.
12
Results of original study
• Different parameter settings cause very large
variance in the performance.
• Default parameter settings perform relatively well,
but are far from optimal on individual problem
instances.

13
Roadmap

①
②
③
④

Randomness of Search
The original study
The replication
Conclusion
Feature–oriented domain analysis [Kang 1990]
• Feature models = a
lightweight method for
defining a space of options
• De facto standard for
modeling variability, e.g.
Software Product Lines
Cross-Tree Constraints

Cross-Tree Constraints
15
What are the user preferences?
• Suppose each feature had the following metrics:
1. Boolean USED_BEFORE?
2. Integer DEFECTS
3. Real
COST
• Show me the space of “best options” according to the objectives:
1. That satisfies most domain constraints (0 ≤ #violations ≤ 100%)
2. That offers most features
3. Maximize overall feature that were used before. (promote re-use)
4. Minimize overall known defects.
5. Minimize cost.

16
Previous Work *Sayyad et al. ICSE’13+
• IBEA (continuous dominance criterion) beats NSGA-II
and a host of other algorithms based on Boolean
dominance criterion.
• Especially with a high number of objectives.
• Quality indicators:
– Percentage of conforming (useable) solutions
• We’re interested in 100% conforming solutions.

– Hypervolume (how close to optimal?)
– Spread (how diverse?)

17
Setup

18
What are “default settings”?
• Population size = 100
• Crossover rate = 80%
– 60% < Crossover rate < 90%
• [A. E. Eiben and J. E. Smith, Introduction to Evolutionary
Computing.: Springer, 2003.]

• Mutation rate = 1/Features
• [one bit out of the whole string]
19
Research Questions

20
Results [10 sec / algorithm / FM]

21
Answer to RQ1
• RQ1: How Large is the Potential Impact of a
Wrong Choice of Parameter Settings?
• We confirm Arcuri and Fraser’s conclusion:
“Different parameter settings cause very large
variance in the performance.”

22
Answer to RQ2
• RQ2: How Does a “Default” Setting Compare to the
Best and Worst Achievable Performance?
• Arcuri and Fraser concluded that: “Default parameter
settings perform relatively well, but are far from
optimal on individual problem instances.”
• We make a stronger conclusion: “Default parameter
settings perform generally poorly, but might perform
relatively well on individual problem instances.”
23
Answer to RQ3
• RQ3: How does the performance of IBEA’s
best tuning compare to NSGA-II’s best
tuning?

• Our results show that “IBEA’s best tuning
performs generally much better than NSGA-II’s
best tuning.”

24
RQ4: Parameter Training
• Find best tuning for a group of problem instances, apply it
to a new problem instance, would it be best tuning for the
new problem?
• Arcuri and Fraser concluded that: “Tuning should be done
on a very large sample of problem instances. Otherwise, the
obtained parameter settings are likely to be worse than
arbitrary default values.”
• Our conclusion: “Tuning on a sample of problem instances
does not, in general, result in the best parameter values for
a new problem instance, but the obtained setting are
generally better than the defaults settings.”
25
Roadmap

①
②
③
④

Randomness of Search
The original study
The replication
Conclusion
Conclusion
• Default parameter values fail
to optimize performance…

• And, sadly, many SBSE
researchers choose “default”
algorithms (e.g. NSGA-II) along
with “default” parameters.
• Alternatives?
– A long way to go!

Acknowledgment
This research work
was funded by the
Qatar National
Research Fund under
the National Priorities
Research Program

• Parameter control
• Adaptive parameter control
27

Mais conteúdo relacionado

Mais procurados

Using Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorUsing Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorTim Menzies
 
Experimental design
Experimental designExperimental design
Experimental designDan Toma
 
Software testing using genetic algorithms
Software testing using genetic algorithmsSoftware testing using genetic algorithms
Software testing using genetic algorithmsNurhussen Menza
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesLionel Briand
 
Sound Empirical Evidence in Software Testing
Sound Empirical Evidence in Software TestingSound Empirical Evidence in Software Testing
Sound Empirical Evidence in Software TestingJaguaraci Silva
 
Case Study Research in Software Engineering
Case Study Research in Software EngineeringCase Study Research in Software Engineering
Case Study Research in Software Engineeringalessio_ferrari
 
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...An Empirical Comparison of Model Validation Techniques for Defect Prediction ...
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...Chakkrit (Kla) Tantithamthavorn
 
Scenario $4$
Scenario $4$Scenario $4$
Scenario $4$Jason121
 
AI-Driven Software Quality Assurance in the Age of DevOps
AI-Driven Software Quality Assurance in the Age of DevOpsAI-Driven Software Quality Assurance in the Age of DevOps
AI-Driven Software Quality Assurance in the Age of DevOpsChakkrit (Kla) Tantithamthavorn
 
Practical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A ReviewPractical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A Reviewinventionjournals
 
Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...Chakkrit (Kla) Tantithamthavorn
 
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Chakkrit (Kla) Tantithamthavorn
 
Complexity Measures for Secure Service-Orieted Software Architectures
Complexity Measures for Secure Service-Orieted Software ArchitecturesComplexity Measures for Secure Service-Orieted Software Architectures
Complexity Measures for Secure Service-Orieted Software ArchitecturesTim Menzies
 
Odin2018_Minh_ML_Risk_Prediction
Odin2018_Minh_ML_Risk_PredictionOdin2018_Minh_ML_Risk_Prediction
Odin2018_Minh_ML_Risk_PredictionMinh Nguyen
 

Mais procurados (19)

VST2022.pdf
VST2022.pdfVST2022.pdf
VST2022.pdf
 
[Tho Quan] Fault Localization - Where is the root cause of a bug?
[Tho Quan] Fault Localization - Where is the root cause of a bug?[Tho Quan] Fault Localization - Where is the root cause of a bug?
[Tho Quan] Fault Localization - Where is the root cause of a bug?
 
Using Developer Information as a Prediction Factor
Using Developer Information as a Prediction FactorUsing Developer Information as a Prediction Factor
Using Developer Information as a Prediction Factor
 
Experimental design
Experimental designExperimental design
Experimental design
 
Software testing using genetic algorithms
Software testing using genetic algorithmsSoftware testing using genetic algorithms
Software testing using genetic algorithms
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
 
Wcre13b.ppt
Wcre13b.pptWcre13b.ppt
Wcre13b.ppt
 
Sound Empirical Evidence in Software Testing
Sound Empirical Evidence in Software TestingSound Empirical Evidence in Software Testing
Sound Empirical Evidence in Software Testing
 
Case Study Research in Software Engineering
Case Study Research in Software EngineeringCase Study Research in Software Engineering
Case Study Research in Software Engineering
 
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...An Empirical Comparison of Model Validation Techniques for Defect Prediction ...
An Empirical Comparison of Model Validation Techniques for Defect Prediction ...
 
Scenario $4$
Scenario $4$Scenario $4$
Scenario $4$
 
Ssbse12b.ppt
Ssbse12b.pptSsbse12b.ppt
Ssbse12b.ppt
 
Wcre13a.ppt
Wcre13a.pptWcre13a.ppt
Wcre13a.ppt
 
AI-Driven Software Quality Assurance in the Age of DevOps
AI-Driven Software Quality Assurance in the Age of DevOpsAI-Driven Software Quality Assurance in the Age of DevOps
AI-Driven Software Quality Assurance in the Age of DevOps
 
Practical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A ReviewPractical Guidelines to Improve Defect Prediction Model – A Review
Practical Guidelines to Improve Defect Prediction Model – A Review
 
Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...
 
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
Explainable Artificial Intelligence (XAI) 
to Predict and Explain Future Soft...
 
Complexity Measures for Secure Service-Orieted Software Architectures
Complexity Measures for Secure Service-Orieted Software ArchitecturesComplexity Measures for Secure Service-Orieted Software Architectures
Complexity Measures for Secure Service-Orieted Software Architectures
 
Odin2018_Minh_ML_Risk_Prediction
Odin2018_Minh_ML_Risk_PredictionOdin2018_Minh_ML_Risk_Prediction
Odin2018_Minh_ML_Risk_Prediction
 

Destaque

Evolución de la web
Evolución de la webEvolución de la web
Evolución de la webJuliana Punk
 
Bases sobre teoria da cor aplicada aos sistemas
Bases sobre teoria da cor aplicada aos sistemasBases sobre teoria da cor aplicada aos sistemas
Bases sobre teoria da cor aplicada aos sistemasJoana Andrino
 
Plasticity: Workplace Social Engagement Software
Plasticity: Workplace Social Engagement SoftwarePlasticity: Workplace Social Engagement Software
Plasticity: Workplace Social Engagement SoftwareJim Moss
 
Funciones inversas y compuestas
Funciones inversas y compuestasFunciones inversas y compuestas
Funciones inversas y compuestasalbertoalamos09
 
Fracturamiento hidráulico de yacimientos de hidrocarburos
Fracturamiento hidráulico de yacimientos de hidrocarburosFracturamiento hidráulico de yacimientos de hidrocarburos
Fracturamiento hidráulico de yacimientos de hidrocarburosPhirored
 
Top Ten Devices to Get on the Web
Top Ten Devices to Get on the WebTop Ten Devices to Get on the Web
Top Ten Devices to Get on the Webmatthewjfrederick2
 
Legacy Games 2013 - Leader in Branded Games
Legacy Games 2013 - Leader in Branded GamesLegacy Games 2013 - Leader in Branded Games
Legacy Games 2013 - Leader in Branded GamesAriella Lehrer
 
Sistema de llenado
Sistema de llenadoSistema de llenado
Sistema de llenadoyanirys26
 

Destaque (8)

Evolución de la web
Evolución de la webEvolución de la web
Evolución de la web
 
Bases sobre teoria da cor aplicada aos sistemas
Bases sobre teoria da cor aplicada aos sistemasBases sobre teoria da cor aplicada aos sistemas
Bases sobre teoria da cor aplicada aos sistemas
 
Plasticity: Workplace Social Engagement Software
Plasticity: Workplace Social Engagement SoftwarePlasticity: Workplace Social Engagement Software
Plasticity: Workplace Social Engagement Software
 
Funciones inversas y compuestas
Funciones inversas y compuestasFunciones inversas y compuestas
Funciones inversas y compuestas
 
Fracturamiento hidráulico de yacimientos de hidrocarburos
Fracturamiento hidráulico de yacimientos de hidrocarburosFracturamiento hidráulico de yacimientos de hidrocarburos
Fracturamiento hidráulico de yacimientos de hidrocarburos
 
Top Ten Devices to Get on the Web
Top Ten Devices to Get on the WebTop Ten Devices to Get on the Web
Top Ten Devices to Get on the Web
 
Legacy Games 2013 - Leader in Branded Games
Legacy Games 2013 - Leader in Branded GamesLegacy Games 2013 - Leader in Branded Games
Legacy Games 2013 - Leader in Branded Games
 
Sistema de llenado
Sistema de llenadoSistema de llenado
Sistema de llenado
 

Semelhante a On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study

On the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringOn the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringAbdel Salam Sayyad
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Lionel Briand
 
Artificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingArtificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingLionel Briand
 
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyPareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyAbdel Salam Sayyad
 
In the age of Big Data, what role for Software Engineers?
In the age of Big Data, what role for Software Engineers?In the age of Big Data, what role for Software Engineers?
In the age of Big Data, what role for Software Engineers?CS, NcState
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceLionel Briand
 
Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUCS, NcState
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairClaire Le Goues
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsLionel Briand
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel BriandPtidej Team
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Lionel Briand
 
What Metrics Matter?
What Metrics Matter? What Metrics Matter?
What Metrics Matter? CS, NcState
 
AI in SE: A 25-year Journey
AI in SE: A 25-year JourneyAI in SE: A 25-year Journey
AI in SE: A 25-year JourneyLionel Briand
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Lionel Briand
 
SMART International Symposium for Next Generation Infrastructure: The roles o...
SMART International Symposium for Next Generation Infrastructure: The roles o...SMART International Symposium for Next Generation Infrastructure: The roles o...
SMART International Symposium for Next Generation Infrastructure: The roles o...SMART Infrastructure Facility
 
Recommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmRecommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmVaibhav Varshney
 
Principles of effort estimation
Principles of effort estimationPrinciples of effort estimation
Principles of effort estimationCS, NcState
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016MLconf
 

Semelhante a On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study (20)

On the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software EngineeringOn the Value of User Preferences in Search-Based Software Engineering
On the Value of User Preferences in Search-Based Software Engineering
 
Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...Scalable Software Testing and Verification of Non-Functional Properties throu...
Scalable Software Testing and Verification of Non-Functional Properties throu...
 
Artificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingArtificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software Testing
 
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyPareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
 
In the age of Big Data, what role for Software Engineers?
In the age of Big Data, what role for Software Engineers?In the age of Big Data, what role for Software Engineers?
In the age of Big Data, what role for Software Engineers?
 
Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
 
Automated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSUAutomated Software Enging, Fall 2015, NCSU
Automated Software Enging, Fall 2015, NCSU
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
 
What Metrics Matter?
What Metrics Matter? What Metrics Matter?
What Metrics Matter?
 
AI in SE: A 25-year Journey
AI in SE: A 25-year JourneyAI in SE: A 25-year Journey
AI in SE: A 25-year Journey
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
 
SMART International Symposium for Next Generation Infrastructure: The roles o...
SMART International Symposium for Next Generation Infrastructure: The roles o...SMART International Symposium for Next Generation Infrastructure: The roles o...
SMART International Symposium for Next Generation Infrastructure: The roles o...
 
Recommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic AlgorithmRecommendation engine Using Genetic Algorithm
Recommendation engine Using Genetic Algorithm
 
Principles of effort estimation
Principles of effort estimationPrinciples of effort estimation
Principles of effort estimation
 
Software Testing
Software TestingSoftware Testing
Software Testing
 
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
Scott Clark, Co-Founder and CEO, SigOpt at MLconf SF 2016
 

Mais de Abdel Salam Sayyad

Slide set 3 honesty, academic ethics
Slide set 3  honesty, academic ethicsSlide set 3  honesty, academic ethics
Slide set 3 honesty, academic ethicsAbdel Salam Sayyad
 
Slide set 1 intro to professional ethics
Slide set 1  intro to professional ethicsSlide set 1  intro to professional ethics
Slide set 1 intro to professional ethicsAbdel Salam Sayyad
 
Teaching methods - Active learning
Teaching methods - Active learningTeaching methods - Active learning
Teaching methods - Active learningAbdel Salam Sayyad
 
Software Engineering Code of Ethics
Software Engineering Code of EthicsSoftware Engineering Code of Ethics
Software Engineering Code of EthicsAbdel Salam Sayyad
 
Of Machines and Men: AI and Decision Making
Of Machines and Men: AI and Decision MakingOf Machines and Men: AI and Decision Making
Of Machines and Men: AI and Decision MakingAbdel Salam Sayyad
 
Scalable Product Line Configuration - ASE 2013 Palo Alto, CA
Scalable Product Line Configuration - ASE 2013 Palo Alto, CAScalable Product Line Configuration - ASE 2013 Palo Alto, CA
Scalable Product Line Configuration - ASE 2013 Palo Alto, CAAbdel Salam Sayyad
 

Mais de Abdel Salam Sayyad (11)

Slide set 5 workplace rights
Slide set 5  workplace rightsSlide set 5  workplace rights
Slide set 5 workplace rights
 
Slide set 4 safety and risk
Slide set 4  safety and riskSlide set 4  safety and risk
Slide set 4 safety and risk
 
Slide set 3 honesty, academic ethics
Slide set 3  honesty, academic ethicsSlide set 3  honesty, academic ethics
Slide set 3 honesty, academic ethics
 
Slide set 2 moral dilemmas
Slide set 2  moral dilemmasSlide set 2  moral dilemmas
Slide set 2 moral dilemmas
 
Slide set 1 intro to professional ethics
Slide set 1  intro to professional ethicsSlide set 1  intro to professional ethics
Slide set 1 intro to professional ethics
 
Teaching methods - Active learning
Teaching methods - Active learningTeaching methods - Active learning
Teaching methods - Active learning
 
Software Engineering Code of Ethics
Software Engineering Code of EthicsSoftware Engineering Code of Ethics
Software Engineering Code of Ethics
 
Of Machines and Men: AI and Decision Making
Of Machines and Men: AI and Decision MakingOf Machines and Men: AI and Decision Making
Of Machines and Men: AI and Decision Making
 
Scalable Product Line Configuration - ASE 2013 Palo Alto, CA
Scalable Product Line Configuration - ASE 2013 Palo Alto, CAScalable Product Line Configuration - ASE 2013 Palo Alto, CA
Scalable Product Line Configuration - ASE 2013 Palo Alto, CA
 
My summary 6-24-2013
My summary 6-24-2013My summary 6-24-2013
My summary 6-24-2013
 
Guest Lecture 1/30/2013
Guest Lecture 1/30/2013Guest Lecture 1/30/2013
Guest Lecture 1/30/2013
 

Último

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Último (20)

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study

  • 1. On Parameter Tuning in Search-Based Software Engineering: A Replicated Empirical Study Abdel Salam Sayyad Katerina Goseva-Popstojanova Tim Menzies Hany Ammar West Virginia University, USA International Workshop on Replication in Software Engineering Research (RESER) Oct 9, 2013
  • 2. Sound bites Search-based Software Engineering Is here… to stay. A helper… Not an alternative to human SE Randomness… is an essential part of Search Algorithms … hence the need for statistical examination (A lot to learn from Empirical SE) Parameter Tuning A real problem… Default values (rules of thumb) do exist… and (sadly?) they are being followed Default parameter values fail to optimize performance… … As seen in the original study, and in this replication… No Free Lunch Theorems for Optimization [Wolpert and Macready ‘97+ the same parameter values don’t optimize all algorithms for all problems. 2
  • 3. Roadmap ① ② ③ ④ Randomness of Search The original study The replication Conclusion
  • 4. Roadmap ① ② ③ ④ Randomness of Search The original study The replication Conclusion
  • 5. Searching for what? • Correct solutions… – Conform to system relationships and constraints. • Optimal solutions… – Achieve user objectives/preferences… • Complex problems have big Search spaces – Exhaustive search not a practical idea. 5
  • 6. Genetic Algorithm • Start with a large population of candidate solutions… (How large?) • Evaluate the fitness of your solutions. • Let your candidate solutions crossover – exchange genes… (How often?) • Mutate a small portion of your solutions. (How small?) • How do those choices affect performance? 6
  • 7. Multi-objective Optimization The Pareto Front Higher-level Decision Making The Chosen Solution 7
  • 8. Survival of the fittest (according to NSGA-II [Deb et al. 2002]) Boolean dominance (x Dominates y, or does not): - In no objective is x worse than y - In at least one objective, x is better than y Crowd pruning 8
  • 9. Indicator-Based Evolutionary Algorithm (IBEA) [Zitzler and Kunzli ‘04+ 1) For {old generation + new generation} do – Add up every individual’s amount of dominance with respect to everyone else – Sort all instances by F – Delete worst, recalculate, delete worst, recalculate, … 2) Then, standard GA (cross-over, mutation) on the survivors  Create a new generation  Back to 1. 9
  • 10. NSGA-II… the default algorithm • Much prior work in SBSE (*) Used NSGA-II Didn’t state why! -------------------------(*) Sayyad and Ammar, RAISE’13 10
  • 11. Roadmap ① ② ③ ④ Randomness of Search The original study The replication Conclusion
  • 12. The Original Study • A. Arcuri and G. Fraser, "On Parameter Tuning in Search Based Software Engineering," in Proc. SSBSE, 2011, pp. 33-47. • A. Arcuri and G. Fraser, "Parameter Tuning or Default Values? An Empirical Investigation in Search-Based Software Engineering," Empirical Software Engineering, Feb 2013. • Problem: generating test vectors for objectoriented software. • Fitness function: percentage of test coverage. 12
  • 13. Results of original study • Different parameter settings cause very large variance in the performance. • Default parameter settings perform relatively well, but are far from optimal on individual problem instances. 13
  • 14. Roadmap ① ② ③ ④ Randomness of Search The original study The replication Conclusion
  • 15. Feature–oriented domain analysis [Kang 1990] • Feature models = a lightweight method for defining a space of options • De facto standard for modeling variability, e.g. Software Product Lines Cross-Tree Constraints Cross-Tree Constraints 15
  • 16. What are the user preferences? • Suppose each feature had the following metrics: 1. Boolean USED_BEFORE? 2. Integer DEFECTS 3. Real COST • Show me the space of “best options” according to the objectives: 1. That satisfies most domain constraints (0 ≤ #violations ≤ 100%) 2. That offers most features 3. Maximize overall feature that were used before. (promote re-use) 4. Minimize overall known defects. 5. Minimize cost. 16
  • 17. Previous Work *Sayyad et al. ICSE’13+ • IBEA (continuous dominance criterion) beats NSGA-II and a host of other algorithms based on Boolean dominance criterion. • Especially with a high number of objectives. • Quality indicators: – Percentage of conforming (useable) solutions • We’re interested in 100% conforming solutions. – Hypervolume (how close to optimal?) – Spread (how diverse?) 17
  • 19. What are “default settings”? • Population size = 100 • Crossover rate = 80% – 60% < Crossover rate < 90% • [A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing.: Springer, 2003.] • Mutation rate = 1/Features • [one bit out of the whole string] 19
  • 21. Results [10 sec / algorithm / FM] 21
  • 22. Answer to RQ1 • RQ1: How Large is the Potential Impact of a Wrong Choice of Parameter Settings? • We confirm Arcuri and Fraser’s conclusion: “Different parameter settings cause very large variance in the performance.” 22
  • 23. Answer to RQ2 • RQ2: How Does a “Default” Setting Compare to the Best and Worst Achievable Performance? • Arcuri and Fraser concluded that: “Default parameter settings perform relatively well, but are far from optimal on individual problem instances.” • We make a stronger conclusion: “Default parameter settings perform generally poorly, but might perform relatively well on individual problem instances.” 23
  • 24. Answer to RQ3 • RQ3: How does the performance of IBEA’s best tuning compare to NSGA-II’s best tuning? • Our results show that “IBEA’s best tuning performs generally much better than NSGA-II’s best tuning.” 24
  • 25. RQ4: Parameter Training • Find best tuning for a group of problem instances, apply it to a new problem instance, would it be best tuning for the new problem? • Arcuri and Fraser concluded that: “Tuning should be done on a very large sample of problem instances. Otherwise, the obtained parameter settings are likely to be worse than arbitrary default values.” • Our conclusion: “Tuning on a sample of problem instances does not, in general, result in the best parameter values for a new problem instance, but the obtained setting are generally better than the defaults settings.” 25
  • 26. Roadmap ① ② ③ ④ Randomness of Search The original study The replication Conclusion
  • 27. Conclusion • Default parameter values fail to optimize performance… • And, sadly, many SBSE researchers choose “default” algorithms (e.g. NSGA-II) along with “default” parameters. • Alternatives? – A long way to go! Acknowledgment This research work was funded by the Qatar National Research Fund under the National Priorities Research Program • Parameter control • Adaptive parameter control 27