Software Evolution anno 2014: 
directions and challenges 
Alexander Serebrenik 
@aserebrenik 
a.serebrenik@tue.nl
Time for a new book!
2008 vs. 2014 
From systems to ecosystems
Business-oriented view 
“a set of actors functioning as a unit and 
interacting with a shared market for 
software and services, together with the 
relationships among them.” 
with thanks to International Data Corporation (IDC)
Development-centric view 
a collection of software projects 
that are developed and evolve 
together in the same environment 
with thanks to Bram Adams
Socio-technical view 
a community of persons (end-users, 
developers, debuggers, 
…) contributing to a collection 
of projects
Technical 
Scientific 
Practical 
Legal and ethical
Technical 
challenges
Technical 
challenges 
• eliminate non-names 
• eliminate specific quirks 
• group “similar” names 
– first/last name 
– textual similarity 
– latent semantic analysis 
• (correct groups 
manually)
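The identity-merging steps listed above can be sketched roughly as follows. This is a minimal illustration only: it covers the "eliminate non-names" and "textual similarity" steps, uses Python's standard `difflib` instead of latent semantic analysis, and the `NON_NAMES` set, the 0.8 threshold and all function names are assumptions made for the example, not the procedure used in the talk.

```python
import re
from difflib import SequenceMatcher

# "root" and "info" come from the speaker notes; the other entries are assumed.
NON_NAMES = {"root", "info", "admin", "unknown"}


def normalize(alias: str) -> str:
    """Lower-case an alias, drop any e-mail domain, keep letters only."""
    local_part = alias.lower().split("@")[0]
    letters_only = re.sub(r"[^a-z\s]", " ", local_part)
    return " ".join(letters_only.split())


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Textual similarity test based on difflib's matching ratio."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def group_aliases(aliases, threshold: float = 0.8):
    """Greedily group aliases whose normalized forms look alike."""
    groups = []
    for alias in aliases:
        name = normalize(alias)
        if not name or name in NON_NAMES:
            continue  # eliminate non-names
        for group in groups:
            if any(similar(name, normalize(m), threshold) for m in group):
                group.append(alias)
                break
        else:
            groups.append([alias])
    return groups


if __name__ == "__main__":
    print(group_aliases(["John Smith", "john.smith@example.org", "root", "Jane Doe"]))
```

Groups produced this way would still need the manual correction step mentioned on the slide, since a greedy similarity pass can both merge distinct people and split one person across groups.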
Technical 
challenges 
Structured 
data 
2008 
Unstructured 
data 
2014
Scientific 
challenges
Scientific 
challenges 
Raw data   Processed data set   Tools & scripts   #MSR papers 2004-2009
Y          Y                    Y                 2
Y          Y                    N                 2
Y          P                    Y                 1
Y          P                    P                 2
Y          P                    N                 2
Y          N                    Y                 16
Y          N                    P                 19
Y          N                    N                 64
P          N                    Y                 1
P          N                    N                 2
N          Y                    N                 2
N          P                    N                 1
N          N                    Y                 7
N          N                    P                 2
N          N                    N                 31
N/A        N/A                  N/A               17
We share raw data 
but rarely share tools 
– reinventing the 
wheel anybody?
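The claim on this slide can be checked by tallying the table. The sketch below copies the rows from the slide and sums the paper counts per column value; reading Y, P and N as "yes", "partially" and "no" is an assumption, as the slide gives no legend.

```python
# Rows copied from the slide:
# (raw data, processed data set, tools & scripts, number of MSR papers 2004-2009)
ROWS = [
    ("Y", "Y", "Y", 2), ("Y", "Y", "N", 2), ("Y", "P", "Y", 1),
    ("Y", "P", "P", 2), ("Y", "P", "N", 2), ("Y", "N", "Y", 16),
    ("Y", "N", "P", 19), ("Y", "N", "N", 64), ("P", "N", "Y", 1),
    ("P", "N", "N", 2), ("N", "Y", "N", 2), ("N", "P", "N", 1),
    ("N", "N", "Y", 7), ("N", "N", "P", 2), ("N", "N", "N", 31),
    ("N/A", "N/A", "N/A", 17),
]

COLUMNS = {"raw": 0, "processed": 1, "tools": 2}


def papers_where(column: str, value: str) -> int:
    """Sum the paper counts of all rows whose given column equals value."""
    index = COLUMNS[column]
    return sum(row[3] for row in ROWS if row[index] == value)


TOTAL = sum(row[3] for row in ROWS)

if __name__ == "__main__":
    for column in COLUMNS:
        print(column, papers_where(column, "Y"), "of", TOTAL)
```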
Practical 
challenges 
• How can we share our big data with other researchers? 
• Different formats, different tools, storage problems, … 
• How can we make our research results useful to practitioners and development communities? 
• How can we build tools and dashboards that integrate our findings?
http://www.intracto.com/blog/online-privacy-belangrijk 
Legal and ethical 
challenges 
(especially for survey data)
k-anonymity 
l-diversity 
t-closeness
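As a rough illustration of the first two notions, the sketch below measures k-anonymity as the size of the smallest group of records sharing the same quasi-identifier values, and a simple "distinct" form of l-diversity as the smallest number of different sensitive values within any such group. The record fields and function names are hypothetical, and t-closeness (which compares distributions) is not covered.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values within any group."""
    values_per_group = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        values_per_group.setdefault(key, set()).add(r[sensitive])
    if not values_per_group:
        return 0
    return min(len(values) for values in values_per_group.values())


# Hypothetical survey answers; field names are made up for the example.
SURVEY = [
    {"age": "20-29", "country": "NL", "answer": "yes"},
    {"age": "20-29", "country": "NL", "answer": "no"},
    {"age": "30-39", "country": "BR", "answer": "yes"},
    {"age": "30-39", "country": "BR", "answer": "yes"},
]

if __name__ == "__main__":
    print(k_anonymity(SURVEY, ["age", "country"]))
    print(distinct_l_diversity(SURVEY, ["age", "country"], "answer"))
```

In this toy table every group has two members (k = 2), yet the second group has only one distinct answer, which is the low-diversity situation that k-anonymity alone does not rule out.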
2008 vs. 2014 
From “traditional” to 
“non-traditional” artifacts: 
What is 
software?
http://ctms.engin.umich.edu/CTMS/index.php?example=Introduction&section=SimulinkModeling 
Maintainability??? 
Evolution???
BumbleBee: a 
refactoring tool 
for spreadsheets 
with thanks to Felienne Hermans
http://help.eclipse.org/juno/index.jsp?topic=%2Forg.eclipse.m2m.atl.doc%2Fguide%2Fconcepts%2FModel-Transformation.html
• describe evolutionary steps 
• relate to changes of other 
artifacts 
• describe prevalence in 
practice 
• support automation 
New kind of 
verification 
artifacts 
2008 
2009 
2012 
2013
2008 vs. 2014 
From technical to socio-technical 
perspective: 
Who are these 
people? 
What do they do?
> 90% in WordPress & Drupal 
> 95% in FLOSS surveys 
> 87% in GNOME 
> 70% in software-related jobs (NSF) 
MEN
FLOSS 
2013 
Europe,US,CA,AU 
Brazil/Argentina
How can we reliably and efficiently 
identify gender, age, location? 
Technical 
challenges
Name + 
Location = 
Gender
Lonzo ⇒ Alonzo 
w35l3y ⇒ wesley 
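The two name quirks shown above could be handled along these lines. The digit-to-letter table, the 0.8 cutoff and the list of known first names are assumptions made for this sketch; a real pipeline would use a much larger, location-specific name list.

```python
from difflib import get_close_matches

# Assumed leetspeak substitutions; only 3 -> e and 5 -> s are needed for "w35l3y".
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})


def deleet(alias: str) -> str:
    """Replace common digit substitutions with the letters they stand for."""
    return alias.lower().translate(LEET)


def resolve_first_name(alias: str, known_names, cutoff: float = 0.8):
    """Map an alias to the closest known first name, or None if nothing is close."""
    candidate = deleet(alias)
    matches = get_close_matches(candidate, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    names = ["alonzo", "wesley", "alexander"]  # hypothetical name list
    print(resolve_first_name("w35l3y", names))
    print(resolve_first_name("Lonzo", names))
```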
Heuristics: 
title + first h1 
<title>Ben Kamens</title> 
… 
<h1>We&#8217;re willing to 
be embarrassed about what 
we 
<em>haven&#8217;t</em> 
done&#8230;</h1> 
Ben Kamens We’re willing to be 
embarrassed about what we 
haven’t done… 
Stanford Named 
Entity Tagger 
<PERSON>Ben Kamens</PERSON> 
We’re willing to be embarrassed 
about what we haven’t done…
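The "title + first h1" heuristic can be sketched with Python's standard `html.parser`. The class and function names are made up for this example, and the subsequent named-entity tagging step (the Stanford tagger on the slide) is not reproduced here; the sketch only produces the plain text that would be handed to a tagger.

```python
from html.parser import HTMLParser


class TitleAndFirstH1(HTMLParser):
    """Collect the text of <title> and of the first <h1> on a page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &#8217; and friends
        self.title = []
        self.h1 = []
        self._current = None
        self._h1_done = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = self.title
        elif tag == "h1" and not self._h1_done:
            self._current = self.h1

    def handle_endtag(self, tag):
        if tag == "title":
            self._current = None
        elif tag == "h1":
            if self._current is self.h1:
                self._h1_done = True  # ignore any later <h1>
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)


def title_and_first_h1(html: str) -> str:
    """Return title and first h1 as one whitespace-normalized string."""
    parser = TitleAndFirstH1()
    parser.feed(html)
    parser.close()
    text = "".join(parser.title) + " " + "".join(parser.h1)
    return " ".join(text.split())


if __name__ == "__main__":
    page = ("<html><head><title>Ben Kamens</title></head><body>"
            "<h1>We&#8217;re willing to be embarrassed about what we "
            "<em>haven&#8217;t</em> done&#8230;</h1></body></html>")
    print(title_and_first_h1(page))
```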
Quality of gender resolution: Survey 
Self-identification   As inferred          Total
                      M      F      ?
M                     60     3      43     106
F                     2      5      4      11
+ avatars, other social media sites (manually) 
Self-identification   As inferred          Total
                      M      F      ?
M                     90     3      13     106
F                     2      9      0      11
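The two tables can be summarized with a little arithmetic. The sketch below assumes that rows are self-identified genders, columns are inferred genders and "?" means that no gender could be inferred; the summary measures (share of respondents resolved, share of resolved respondents inferred correctly) are chosen for illustration and are not taken from the slides.

```python
def summarize(matrix):
    """Count respondents, resolved cases and correct inferences."""
    total = sum(sum(row.values()) for row in matrix.values())
    resolved = sum(
        count
        for row in matrix.values()
        for inferred, count in row.items()
        if inferred != "?"
    )
    correct = sum(row[gender] for gender, row in matrix.items())
    return {"total": total, "resolved": resolved, "correct": correct}


# Numbers copied from the slides: {self-identified: {inferred: count}}
AUTOMATIC = {"M": {"M": 60, "F": 3, "?": 43}, "F": {"M": 2, "F": 5, "?": 4}}
WITH_MANUAL = {"M": {"M": 90, "F": 3, "?": 13}, "F": {"M": 2, "F": 9, "?": 0}}

if __name__ == "__main__":
    for label, matrix in (("automatic", AUTOMATIC), ("with manual step", WITH_MANUAL)):
        s = summarize(matrix)
        print(label,
              "resolved: %.1f%%" % (100 * s["resolved"] / s["total"]),
              "correct among resolved: %.1f%%" % (100 * s["correct"] / s["resolved"]))
```

Under these assumptions the manual step raises the number of resolved respondents from 70 to 104 out of 117, while the number of wrong inferences stays at 5.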
.cpp .po 
.jpg 
/test/ 
/library/ .doc 
makefile .sql .conf
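Mapping files to activity types by extension and path, as hinted at by the examples above, might look like the following. The activity labels, the rule tables and the precedence (path fragments first, then special file names, then extensions) are assumptions for this sketch; the full rule set used in the study is not shown on the slide.

```python
import posixpath

# Assumed rules, built only from the examples visible on the slide.
PATH_RULES = [("/test/", "test"), ("/library/", "library")]
FILENAME_RULES = {"makefile": "build"}
EXTENSION_RULES = {
    ".cpp": "code",
    ".po": "l10n",
    ".jpg": "img",
    ".doc": "doc",
    ".sql": "db",
    ".conf": "config",
}


def classify(path: str) -> str:
    """Return the activity type a repository path is assumed to belong to."""
    lowered = "/" + path.lower().lstrip("/")
    for fragment, activity in PATH_RULES:
        if fragment in lowered:
            return activity
    name = posixpath.basename(lowered)
    if name in FILENAME_RULES:
        return FILENAME_RULES[name]
    extension = posixpath.splitext(name)[1]
    return EXTENSION_RULES.get(extension, "unknown")


if __name__ == "__main__":
    for p in ["src/main.cpp", "po/nl.po", "test/unit.cpp", "Makefile", "README"]:
        print(p, "->", classify(p))
```

Counting the classified files touched by each commit author would then give a per-developer activity profile.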
Occasional 
contributors 
Frequent 
contributors
How can we reliably and efficiently 
identify human activities? 
Technical 
challenges
Invited Talk MESOCA 2014: Evolving software systems: emerging trends and challenges


Editor's Notes

  1. Software maintenance is an area of software engineering with deep financial implications. Indeed, maintenance and evolution costs were forecast to account for more than half of North American and European software budgets in 2010. Similar or even higher figures were reported for countries such as Norway and Chile. In this talk we discuss recent advances in two popular approaches to assessing the evolution of software projects: measuring and mining software. Software metrics, commonly used to measure software, are usually defined at the micro level (method, class, package), while the analysis of maintainability and evolution requires insights at the macro (system) level. Metrics should, therefore, be aggregated. We discuss recent work on software metrics aggregation techniques, and advocate econometric inequality indices to perform aggregation. A complementary approach to studying software evolution consists in mining software repositories, e.g., version control systems, bug trackers and mail archives. While abundant information is usually present in such repositories, successful information extraction is often challenged by the necessity to simultaneously analyze different repositories and to combine the information obtained. We propose to apply process mining techniques, originally developed for business process analysis, to address this challenge. However, in order for process mining to become applicable, different software repositories should be combined, and “related” software development events should be matched: e.g., mails sent about a file, modifications of the file and bug reports that can be traced back to it. In this talk we discuss the approach proposed, as well as a series of case studies addressing such aspects of the development process as roles of different developers, the way bug reports are handled and conformance to software engineering standards.
  2. Software ecosystem – collection of software products that are developed and evolve in the same environment [Lungu, 2008]
  3. Examples: Eclipse; the Android and iOS app stores.
  4. Technical challenges: extracting and combining data from different sources; identifying correspondences across different data sources (identity merging); dealing with inconsistent and incomplete data; big data analytics, where special skills and tools are needed to store, process and analyse huge amounts of data.
  5. Example of a technical challenge that has to be addressed: Identifying correspondences across different data sources (identity merging). Non-names are root, info….
  7. From structured data to unstructured data
  8. From structured data to unstructured data. Still there are different stemming algorithms, different information retrieval approaches etc.
  9. Accessibility of data (e.g., many apps in Google Play are proprietary and historical information is not accessible; focus on open source software). Reproducibility of results. Generalisability of results. Which research methodology, which metrics, which statistical tools, …?
  10. Privacy issues: Can we use and combine information about actual developers? Can we make these results freely available? How can privacy be reconciled with reproducibility?
  11. Several approaches have been proposed to ensure secure anonymization, and we would like to study them in the near future. The concept of k-anonymity [9] tries to ensure that, for k greater than 1, even someone who knows all fields cannot narrow a record down to a single person, only to a group of k people. Still, k-anonymity has been shown to be insufficient, as attackers can discover sensitive attributes in data with low diversity and, together with other information, identify a single person. Only data with sufficient diversity (l-diversity) should therefore be published [5]. Finally, t-closeness requires the distribution of a sensitive attribute within each group to be close to the distribution of that attribute in the overall table (i.e., the distance between the two distributions should be less than a threshold t) [4]. In the meantime, we will combine the data internally, as we have done in the case study shown next.
  13. Software ecosystem – collection of software products that are developed and evolve in the same environment [Lungu, 2008]
  14. A toy train consisting of an engine and a car
  15. Evolutionary problems specific to model-driven engineering are related to presence of multiple co-evolving artefacts: meta-models, models and model transformations
  17. Male dominated
  18. Stack Overflow (SO): the age of programmers is roughly normally distributed around 29, though skewed right.
  19. Modern developers have to juggle multiple activities, including source code updates, mails, bug trackers, and questions and answers on Stack Overflow. We start by looking into data coming from the version control repositories of GNOME, and then proceed with analysing mail archives and Stack Overflow question answering.
  20. Contributing to a modern software system (or an ecosystem of software systems) is not only coding but also localising, testing, creating images/multimedia, developing libraries, writing documentation, creating build or configuration scripts and/or designing databases. All these activities are somehow reflected in the version control system archives. We use file extensions and file paths to map each of the activities to groups of files.
  21. Arrows indicate that a statistical analysis reveals significant differences between the activities linked by the arrow: localization (l10n) has more commits related to it than code, code more than doc or img, etc. Occasional = fewer than 14 commits (the median); frequent = 14 commits or more.