Social Interactions around Cross-System Bug Fixings:
The Case of FreeBSD and OpenBSD
Gerardo Canfora, Luigi Cerulo, Marta Cimitile, Massimiliano Di Penta
dipenta@unisannio.it
Context
  Source code is often reused across different systems
    Unixes (FreeBSD, OpenBSD, Linux)
    Office applications (NeoOffice, OpenOffice)
    Desktop environment apps (KDE or GNOME apps)
  Maintenance may require propagating bug fixes from one system to another
    We call this “Cross-System Bug Fixing” (CSBF)

  Example:
     FreeBSD, 1996/01/19, file ip_icmp.h:
       – “Added definitions for ICMP router discovery. Reviewed by: wollman”
     OpenBSD, 1996/08/02, file ip_icmp.h:
       – “ICMP Router Discovery definitions; from FreeBSD”
What we propose
  A method to track CSBFs
  A study of the social characteristics
   and development activity of
   CSBF committers
    degree, betweenness, brokerage
    commits, lines changed
Detecting CSBF - I
  Step 1: mine cross-referencing commits
    openbsd, atphy.c,2008/09/25 20:47:16,brad,
     Add a driver for the Attansic F1 PHY. From FreeBSD via
     kevlo@
  Step 2: mine commits previously made to files
   with the same name in the other system
    freebsd,atphy.c,2008/05/19 01:12:10,yongari,
     Add Attansic/Atheros F1 PHY driver.
    openbsd, atphy.c,2008/09/25 20:47:16,brad,
     Add a driver for the Attansic F1 PHY. From FreeBSD via
     kevlo@
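
Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' tool: the `Commit` record, the `XREF` pattern, and both function names are hypothetical, and the pattern only covers phrasings like the "From FreeBSD" notes shown on the slides.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Commit(NamedTuple):
    """Hypothetical record mirroring the fields shown on the slide."""
    system: str
    path: str
    when: datetime
    author: str
    note: str


# Assumed pattern: a commit note that says "from <system>" or "via <system>".
XREF = re.compile(r"\b(?:from|via)\s+(freebsd|openbsd)\b", re.IGNORECASE)


def referenced_system(commit):
    """Step 1: return the other system named in the note, or None."""
    match = XREF.search(commit.note)
    if match is None:
        return None
    other = match.group(1).lower()
    return other if other != commit.system else None


def candidate_origins(commit, history):
    """Step 2: earlier commits on a same-named file in the referenced system."""
    other = referenced_system(commit)
    if other is None:
        return []
    return [c for c in history
            if c.system == other and c.path == commit.path and c.when < commit.when]


history = [
    Commit("freebsd", "atphy.c", datetime(2008, 5, 19, 1, 12, 10), "yongari",
           "Add Attansic/Atheros F1 PHY driver."),
    Commit("openbsd", "atphy.c", datetime(2008, 9, 25, 20, 47, 16), "brad",
           "Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@"),
]
origins = candidate_origins(history[1], history)
```

Matching on file name alone is deliberately loose; the later steps (clone detection and commit-note similarity) are what narrow the candidates down.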
Detecting CSBF - II
  Step 3: compute file similarity with clone detection
    CCFinder
    Threshold: at least 10% of cloned lines
  Step 4: take the previous change whose commit note has
   the highest textual similarity
    Vector Space Model representation of commit notes
    Cosine similarity; threshold (0.20) to filter out unrelated
     commits

    Example: cosine similarity = 0.72 between
      “Add Attansic/Atheros F1 PHY driver.” and
      “Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@”
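
A rough sketch of Step 4, assuming plain term-frequency vectors and a simple lowercase tokenizer. The slides do not say which term weighting or preprocessing was used, so this code is not expected to reproduce the 0.72 reported above; the helper names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(note_a, note_b):
    """Cosine of the angle between two term-frequency vectors."""
    va, vb = Counter(tokenize(note_a)), Counter(tokenize(note_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(n * n for n in va.values()))
    norm_b = math.sqrt(sum(n * n for n in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(target_note, candidate_notes, threshold=0.20):
    """Return (score, note) for the most similar candidate at or above the threshold."""
    scored = [(cosine_similarity(target_note, c), c) for c in candidate_notes]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return max(kept, key=lambda pair: pair[0], default=None)


openbsd_note = "Add a driver for the Attansic F1 PHY. From FreeBSD via kevlo@"
candidates = ["Add Attansic/Atheros F1 PHY driver.", "Fix typo in man page"]
match = best_match(openbsd_note, candidates)
```

The 0.20 default mirrors the threshold on the slide; candidates scoring below it are discarded as unrelated.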
Building Committers' Network
  We extract communication from mailing
   lists
    Bug fixing mailing lists
  Heuristic similar to that of Bird et al.
   [2006] to unify inconsistent names /
   emails
    Also used to map committer IDs to mailing list
     names/emails
  Nodes of the network labeled as:
    Committer / other mailing list contributors
    CSBFs committer
Empirical Study
 Goal: analyze the phenomenon of CSBFs
 Purpose: understanding its relevance with
  respect to the social characteristics of the
  involved developers
 Context: CVS repositories and mailing list
  archives of FreeBSD and OpenBSD
   Period: 1993-2009 (FreeBSD), 1998-2009
    (OpenBSD)
   Commits: 119,000 (FreeBSD), 70,000 (OpenBSD)
Research Questions
  RQ1: How do the source code committers
   and contributors of the two systems
   overlap?
  RQ2: How frequent is the phenomenon of
   CSBFs?
  RQ3: Who are the contributors involved in
   CSBFs?
  RQ4: Are mailing list contributors involved
   in CSBFs more active than others?
RQ1 – Team overlap
                                              FreeBSD   OpenBSD   Both
  Committers                                      383       211     26
  Mailing list contributors                      8035      3843    359
  Committers and mailing list contributors        213       122     17


  The two projects share fewer than 10% of
  their contributors →
  the FreeBSD and OpenBSD development
  teams are largely distinct
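
The "fewer than 10%" figure can be checked with a few lines of Python. The slide does not state the denominator, so this sketch assumes the share is taken over the union of both projects and that the per-project counts already include the people listed under "Both"; `overlap_share` is a hypothetical helper.

```python
# (FreeBSD, OpenBSD, Both) counts copied from the table above.
counts = {
    "Committers": (383, 211, 26),
    "Mailing list contributors": (8035, 3843, 359),
    "Committers and mailing list contributors": (213, 122, 17),
}


def overlap_share(freebsd, openbsd, both):
    """Share of shared people over the union of the two projects (assumed definition)."""
    return both / (freebsd + openbsd - both)


shares = {role: overlap_share(*row) for role, row in counts.items()}
for role, share in shares.items():
    print(f"{role}: {share:.1%}")
```

Under the alternative reading, where the per-project columns exclude shared people, the denominators only grow, so the shares stay below 10% either way.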
RQ2 – Commit filtering
  [Bar chart: items remaining after each filtering step]
                         FreeBSD   OpenBSD
  Referring commits          439       933
  Cloned files               133       296
  Linked commits              59       120

  After filtering, not many linked commits remain, but...
RQ2 – Cloned lines in CSBF files
  [Figure: percentage of cloned lines in CSBF files; panels: C source files, header files]
  Percentage smaller for .h files
  Use of preprocessor conditionals to make header files system-dependent
    #if defined(__FreeBSD__)
RQ3 – CSBF Graph (excerpt)
Blue/cyan: FreeBSD
Red/orange: OpenBSD
Yellow: common
RQ3 – social characteristics
  Importance in terms of
    (in/out) degree: number of (incoming/outgoing)
     communication links
    Betweenness: number of communications for which the
     node lies on the shortest path
  Brokerage metrics: useful to analyze the
   communication between two clusters
  [Figure: three brokerage configurations for a broker node B]
    B is a coordinator
    B is a gatekeeper
    B is a representative
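
The diagrams for the three roles did not survive extraction. As a sketch, assuming the slides follow the standard Gould and Fernandez brokerage taxonomy, the roles can be assigned for every open triad A → B → C (no direct A → C link) from the group membership of the three nodes. The function names and the toy network below are hypothetical.

```python
from collections import Counter, defaultdict


def brokerage_role(group, a, b, c):
    """Role of broker b on the path a -> b -> c, by group membership."""
    ga, gb, gc = group[a], group[b], group[c]
    if ga == gb == gc:
        return "coordinator"      # all three in the same group
    if gb == gc:
        return "gatekeeper"       # sender is the outsider
    if ga == gb:
        return "representative"   # receiver is the outsider
    if ga == gc:
        return "consultant"       # broker is the outsider
    return "liaison"              # three different groups


def count_roles(edges, group):
    """Count brokerage roles per node over all open directed triads."""
    edge_set = set(edges)
    successors = defaultdict(set)
    for src, dst in edge_set:
        successors[src].add(dst)
    roles = defaultdict(Counter)
    for a, b in edge_set:
        if a == b:
            continue
        for c in successors.get(b, ()):
            # Skip degenerate triads and those where a already reaches c directly.
            if c in (a, b) or (a, c) in edge_set:
                continue
            roles[b][brokerage_role(group, a, b, c)] += 1
    return roles


# Toy network: who wrote to whom, and which project each person belongs to.
group = {"ann": "FreeBSD", "bob": "OpenBSD", "cy": "OpenBSD", "dee": "OpenBSD"}
edges = [("ann", "bob"), ("bob", "cy"), ("cy", "dee")]
roles = count_roles(edges, group)
```

In this toy network `bob` receives a message from a FreeBSD contributor and relays within OpenBSD, so he is counted once as a gatekeeper, while `cy` brokers entirely inside OpenBSD and is counted once as a coordinator.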
RQ3 – social characteristics
  [Bar chart: CSBF contributors vs. others; metrics: degree, in-degree, out-degree, betweenness / 1000, coordinator / 10, gatekeeper, representative]
  All differences statistically significant
  Large effect size (Cohen's d > 1)
  Contributors involved in CSBFs play a more important role in
   communication and in the flow of communication
   between systems
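
For reference, Cohen's d expresses the difference between two group means in units of standard deviation. The sketch below uses the pooled-standard-deviation form, which is an assumption, since the slides do not say which variant was computed; the sample values are made up for illustration only.

```python
import math
from statistics import mean, variance


def cohens_d(sample_a, sample_b):
    """Cohen's d with a pooled standard deviation (assumed variant)."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_var)


# Hypothetical metric values for two groups of contributors (not study data).
csbf_group = [2, 4, 6]
other_group = [1, 3, 5]
d = cohens_d(csbf_group, other_group)
```

By the usual convention, values around 0.8 and above count as a large effect, which is why d > 1 is reported as high.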
RQ3 – committers with highest social metrics
RQ4 – change activity of CSBF committers and others
  [Bar charts: LOC added/removed and commits; CSBF committers vs. others, for FreeBSD and OpenBSD]

    All differences statistically significant
    Large effect size (Cohen's d ≈ 1)
    Contributors involved in CSBF are more active
     than others
Conclusions and Work-in-Progress
  We proposed a method to mine CSBFs
  We reported a study on FreeBSD and OpenBSD where:
    The development teams are almost disjoint
    There is a small, though not negligible, share of CSBFs
    Committers involved in CSBF have
     – Higher social importance
     – Higher brokerage level
     – Higher activity in source code commits
  Work-in-progress:
    Better approaches to identify implicit CSBF, tracking and
     linking changes occurring on both systems
    More extensive study on less obvious cases
