An Adaptive Filter-Framework for the Quality Improvement of Open-Source Software Analysis

Anna Hannemann, Michael Hackstein, Ralf Klamma, Matthias Jarke
Advanced Community Information Systems (ACIS)
RWTH Aachen University, Germany
Lehrstuhl Informatik 5
(Information Systems)
   Prof. Dr. M. Jarke
This slide deck is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Open Source Software Projects

    Community-driven development
    Voluntary participation
    Communication, project management and development via Web tools
    Some successful and famous examples
    Smaller niche projects
    A long tail of unsuccessful projects


Open Source Software Analysis for Software Engineering

    Understand, model, simulate and organize community-driven development
    Agile development practices
    Distributed and intercultural practices
    New success factors
    Long-term freely available datasets
    Low-cost empirical studies


Open Source Software Analysis: Research Results

    [Figure. Source: Scacchi, “The Future of Research in Free/Open Source Software Development”, 2010]

Techniques for Knowledge Mining in Development Repositories

    Results are only as good as the data!
    Remember the DNA phantom?
        –  “A hypothesized unknown female serial killer as a result of contaminated cotton swabs used for collecting DNA”
    Mine data, not noise!
        –  Cleaning of artifacts from communication and development repositories is needed

Data Cleaning for Knowledge Mining in Development Repositories

    Data-structure independence: variable artifact types
    Additive filtering: filter only new data
    Filter nesting: sequence of arbitrary order
    Consistent data format: cross-medium analysis
    Consistent and easy-to-use interface
    Extensibility: continuous evolution
    Adaptive database insertion

Adaptive-Filtering Approach: Cross-Media Mapping

    Artifact types
        –  Mail
        –  Comment
        –  Post
        –  ...
    Cross-media mapping
        –  Assignment of semantic meaning to artifact elements
        –  Extensibility to new data sources
        –  Same filters for different data

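The cross-media mapping idea can be illustrated with a minimal Python sketch. It assumes that each data source delivers raw records as dictionaries and that the common format has the roles author, date and text; the `Artifact` class, the `FIELD_MAPS` table and all field names are hypothetical and not taken from the framework itself.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """Common format shared by all artifact types (fields are assumed)."""
    source: str
    author: str
    date: str
    text: str


# Per-source mapping from semantic role to raw field name (names are assumed).
FIELD_MAPS = {
    "mail": {"author": "from", "date": "sent", "text": "body"},
    "comment": {"author": "user", "date": "created", "text": "comment"},
    "post": {"author": "poster", "date": "posted_at", "text": "message"},
}


def to_artifact(source, raw):
    """Assign semantic meaning to the elements of one raw record."""
    mapping = FIELD_MAPS[source]
    roles = {role: raw[field] for role, field in mapping.items()}
    return Artifact(source=source, **roles)


def register_source(source, mapping):
    """Extend the mapping to a new data source without touching any filter."""
    FIELD_MAPS[source] = mapping
```

Filters written against `Artifact` would then work unchanged for mails, comments and posts, which is one way to read "same filters for different data".
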
Adaptive-Filtering Approach: Filter Nesting

    Sequence of filters F1, F2, …, FN
    Results in the same predefined format
    One filter – one cleaning (analysis) task
    Each filter triggers its predecessor
    Complex filter as a combination of several filters
    Filtering triggered on demand
    Filtering of a subset possible
    Simple filters first, then analysis of the reduced data set with more filters of higher complexity

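A minimal Python sketch of filter nesting, assuming artifacts are dictionaries with a `text` key. The `Filter` base class and the two example filters are hypothetical; they only illustrate how a filter can trigger its predecessor on demand while every stage returns the same format.

```python
class Filter:
    """One filter performs one cleaning (analysis) task."""

    def __init__(self, predecessor=None):
        self.predecessor = predecessor

    def process(self, artifacts):
        raise NotImplementedError

    def run(self, artifacts):
        # Filtering is triggered on demand: asking the last filter for
        # results first triggers its predecessor, and so on up the chain.
        if self.predecessor is not None:
            artifacts = self.predecessor.run(artifacts)
        return self.process(artifacts)


class NonEmptyFilter(Filter):
    """Simple, cheap filter: drops artifacts without text."""

    def process(self, artifacts):
        return [a for a in artifacts if a["text"].strip()]


class LowercaseFilter(Filter):
    """Second stage: runs only on the already reduced data set."""

    def process(self, artifacts):
        return [{**a, "text": a["text"].lower()} for a in artifacts]


# A complex filter is a combination of several simple filters.
pipeline = LowercaseFilter(predecessor=NonEmptyFilter())
result = pipeline.run([{"text": "Hello"}, {"text": "   "}])
```

Because each stage accepts and returns a list of artifacts, filters can be nested in arbitrary order, and a subset of the data can be passed to `run` just as well as the full set.
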
Adaptive-Filtering Approach: Multi-Threading

    Only new data is filtered
    Asynchronous processing: the filtered data subset is provided directly to the next analysis task
    Synchronous processing: wait until the complete data set is filtered

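The difference between the two processing modes can be sketched with Python's standard `concurrent.futures` module. This is an illustration under assumptions, not the framework's implementation: `clean` is a stand-in for an arbitrary filter, artifacts are dictionaries with `id` and `text` keys, and the set of already filtered ids is kept in memory.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def clean(artifact):
    """Stand-in for a filter applied to a single artifact."""
    return {**artifact, "text": artifact["text"].strip()}


def only_new(artifacts, seen_ids):
    """Additive filtering: skip artifacts that were filtered before."""
    fresh = [a for a in artifacts if a["id"] not in seen_ids]
    seen_ids.update(a["id"] for a in fresh)
    return fresh


def filter_async(artifacts, consume):
    """Hand each filtered artifact to the next task as soon as it is ready."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(clean, a) for a in artifacts]
        for future in as_completed(futures):
            consume(future.result())


def filter_sync(artifacts):
    """Wait until the complete data set is filtered, then return it."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(clean, artifacts))
```

In the asynchronous variant the order of results is not guaranteed, so the consuming task must not depend on it; the synchronous variant preserves input order.
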
Dataset Reduction and Content Cleaning Filters

    Dataset Reduction Filter (DRF)
        –  Reduces the number of artifacts
        –  Selects artifacts that fulfill certain criteria
        –  Example
            –  Spam detection
            –  Artifact classification based on Bayes decision rule
    Content Cleaning Filter (CRF)
        –  Modifies content of artifacts
        –  Example
            –  Quotation Filter
            –  Detection of predefined patterns in content

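Both filter kinds can be sketched in a few lines of Python. The spam filter below is a plain naive Bayes classifier with Laplace smoothing, chosen here as one common way to apply the Bayes decision rule; the slides do not specify the features or the model. The quotation filter assumes that quoted lines start with `>`, as is usual in plain-text mail. All class and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Dataset reduction: keeps only artifacts classified as non-spam."""

    def __init__(self, labeled):
        # labeled: (text, label) pairs, label is "spam" or "ham";
        # assumes that both labels occur at least once.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        for text, label in labeled:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def log_posterior(self, text, label):
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / denom)  # Laplace smoothing
        return score

    def is_spam(self, text):
        # Bayes decision rule: choose the class with the larger posterior.
        return self.log_posterior(text, "spam") > self.log_posterior(text, "ham")

    def apply(self, texts):
        return [t for t in texts if not self.is_spam(t)]


QUOTE_LINE = re.compile(r"^\s*>")


def strip_quotations(text):
    """Content cleaning: drops lines quoted from earlier mails."""
    kept = [line for line in text.splitlines() if not QUOTE_LINE.match(line)]
    return "\n".join(kept).strip()
```

The first filter changes which artifacts remain in the data set; the second leaves the set unchanged and rewrites the content of each artifact.
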
Artifact Transformation Filters

    Filter as analysis task
    Modifies artifact attributes
    Example:
        –  Core-Periphery Filter: separates core of community from periphery
        –  Hierarchical clustering based on power law distribution


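A transformation filter adds or changes attributes instead of removing artifacts. The Python sketch below is a deliberately simplified stand-in for the Core-Periphery Filter: instead of hierarchical clustering, it labels as core the smallest group of most active authors that together account for a given share of all artifacts. The function name, the `role` attribute and the 80% default are assumptions made for illustration only.

```python
from collections import Counter


def core_periphery(artifacts, share=0.8):
    """Adds a 'role' attribute to each artifact based on author activity."""
    counts = Counter(a["author"] for a in artifacts)
    total = sum(counts.values())
    core, covered = set(), 0
    # With power-law distributed activity, a few authors cover most artifacts.
    for author, n in counts.most_common():
        if covered >= share * total:
            break
        core.add(author)
        covered += n
    return [
        {**a, "role": "core" if a["author"] in core else "periphery"}
        for a in artifacts
    ]
```

The output has the same number of artifacts as the input, so later filters and analyses can use the new attribute without any change to the data format.
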
Validation in BioJava, Biopython and BioPerl OSS: Spam Detection

    [Figure: BioJava]

    Spam and spammer level in mailing lists of OSS
        –  Significant amount (up to 60%)
        –  Non-monotonic
        –  Distortion of dynamics

Validation in BioJava, Biopython and BioPerl OSS: Results Distortion

    [Figure: Year 2004, BioJava]

    Mood within project community
        –  Summarized sentiment of project mails per month
        –  Positive sentiment of spam advertisement
        –  Incorrect sentiment assignment due to quotation

Adaptive Filter-Framework and OSS Analysis

    OSS Analysis for SE
        –  Methods/metrics for knowledge mining in company communication and development repositories
        –  Understanding of community-oriented development: principles, obstacles and advantages
    !  Data Cleaning: Results are only as good as the data!
    Adaptive Filter-Framework
        –  Significant noise level in data
        –  Adaptable for any Web artifact format
        –  Filter nesting
        –  Filter as analysis method

08448380779 Call Girls In Civil Lines Women Seeking Men
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

An Adaptive Filter-Framework for the Quality Improvement of Open-Source Software Analysis

  • 1. An Adaptive Filter-Framework for the Quality Improvement of Open-Source Software Analysis. Advanced Community Information Systems (ACIS), RWTH Aachen University, Germany. Anna Hannemann, Michael Hackstein, Ralf Klamma, Matthias Jarke. Lehrstuhl Informatik 5 (Information Systems), Prof. Dr. M. Jarke. This slide deck is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
  • 2. Open Source Software Projects
      - Community-driven development
      - Voluntary participation
      - Communication, project management and development via Web tools
      - Some successful and famous examples
      - Smaller niche projects
      - A long tail of unsuccessful projects
  • 3. Open Source Software Analysis for Software Engineering
      - Understand, model, simulate and organize community-driven development
      - Agile development practices
      - Distributed and intercultural practices
      - New success factors
      - Long-term, freely available datasets
      - Low-cost empirical studies
  • 4. Open Source Software Analysis: Research Results (figure; source: Scacchi, “The Future of Research in Free/Open Source Software Development”, 2010)
  • 5. Techniques for Knowledge Mining in Development Repositories
      - Results are only as good as the data!
      - Remember the DNA phantom? “A hypothesized unknown female serial killer as a result of contaminated cotton swabs used for collecting DNA”
      - Mine data, not noise! Cleaning of artifacts from communication and development repositories is needed
  • 6. Data Cleaning for Knowledge Mining in Development Repositories
      - Data-structure independence: variable artifact types
      - Additive filtering: filter only new data
      - Filter nesting: sequence of arbitrary order
      - Consistent data format: cross-medium analysis
      - Consistent and easy-to-use interface
      - Extensibility: continuous evolution
      - Adaptive database insertion
  • 7. Adaptive-Filtering Approach: Cross-Media Mapping
      - Artifact types
          - Mail
          - Comment
          - Post
          - ...
      - Cross-media mapping
          - Assignment of semantic meaning to artifact elements
          - Extensibility to new data sources
          - Same filters for different data
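The cross-media mapping on slide 7 can be illustrated with a minimal Python sketch. The `Artifact` fields, the `FIELD_MAPS` table, the raw field names and `register_source` are all hypothetical; the slides do not specify the framework's actual data model. The sketch only shows the idea of assigning one semantic meaning to differently named elements, so that the same filter code can work on mails, comments and posts alike.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """Medium-independent record; the field names are assumptions."""
    source: str
    author: str
    timestamp: str
    text: str


# Hypothetical mappings: common field name -> field name in the raw record.
FIELD_MAPS = {
    "mail": {"author": "From", "timestamp": "Date", "text": "Body"},
    "comment": {"author": "user", "timestamp": "created", "text": "comment"},
    "post": {"author": "poster", "timestamp": "posted_at", "text": "message"},
}


def register_source(source, mapping):
    """Extend the mapping table with a new data source."""
    FIELD_MAPS[source] = mapping


def to_artifact(source, raw):
    """Translate a raw, source-specific record into the common format."""
    mapping = FIELD_MAPS[source]
    fields = {common: raw[field] for common, field in mapping.items()}
    return Artifact(source=source, **fields)


mail = to_artifact(
    "mail",
    {"From": "alice@example.org", "Date": "2004-03-01", "Body": "Hello list"},
)
comment = to_artifact(
    "comment",
    {"user": "bob", "created": "2004-03-02", "comment": "Works for me"},
)

# A new source only needs a mapping, not new filter code.
register_source("ticket", {"author": "reporter", "timestamp": "opened", "text": "summary"})
ticket = to_artifact(
    "ticket",
    {"reporter": "carol", "opened": "2004-03-03", "summary": "Parser crashes"},
)
```

Keeping the mapping in a plain table means that supporting a new medium is a data change rather than a code change, which is one plausible reading of "extensibility to new data sources".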
  • 8. Adaptive-Filtering Approach: Filter Nesting
      - Sequence of filters F1, F2, …, FN
      - Results in the same predefined format
      - One filter, one cleaning (analysis) task
      - Each filter triggers its predecessor
      - Complex filter as a combination of several filters
      - Filtering triggered on demand
      - Filtering of a subset possible
      - Simple filters first, then analysis of the reduced data set with further filters of higher complexity
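The nesting rules on slide 8 can be sketched in a few lines of Python. This is an illustration under assumptions, not the framework's implementation: the `Filter` base class, the two example filters and the dictionary-based artifact format are hypothetical. Each filter holds a reference to its predecessor and triggers it when its own result is requested, and every filter returns artifacts in the same format, so filters can be combined in any order.

```python
from typing import Iterable, Iterator, Optional


class Filter:
    """One filter, one cleaning task; an optional predecessor runs first."""

    def __init__(self, predecessor: Optional["Filter"] = None):
        self.predecessor = predecessor

    def process(self, artifacts: Iterable[dict]) -> Iterator[dict]:
        raise NotImplementedError

    def run(self, artifacts: Iterable[dict]) -> Iterator[dict]:
        # Trigger the predecessor on demand, then apply this filter's task.
        upstream = self.predecessor.run(artifacts) if self.predecessor else artifacts
        return self.process(upstream)


class NonEmptyFilter(Filter):
    """Simple reduction step: drop artifacts without text."""

    def process(self, artifacts):
        return (a for a in artifacts if a["text"].strip())


class LowercaseFilter(Filter):
    """Simple cleaning step: normalise the text to lower case."""

    def process(self, artifacts):
        return ({**a, "text": a["text"].lower()} for a in artifacts)


raw = [
    {"id": 1, "text": "Hello List"},
    {"id": 2, "text": "   "},
    {"id": 3, "text": "Re: Build fails"},
]

# A complex filter is a combination of simple ones: the cheap reduction
# runs first, the later filter only sees the reduced data set.
pipeline = LowercaseFilter(NonEmptyFilter())
result = list(pipeline.run(raw))
```

Because `run` returns a generator, nothing is filtered until a consumer iterates over the result, which matches the "triggered on demand" bullet; passing only part of `raw` filters a subset.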
  • 9. Adaptive-Filtering Approach: Multi-Threading
      - Only new data is filtered
      - Asynchronous processing: the filtered data subset is provided directly to the next analysis task
      - Synchronous processing: wait until the complete data set is filtered
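The two processing modes on slide 9 can be contrasted in a small Python sketch. All names here (`AdditiveFilter`, `filter_sync`, `filter_async`, `drain`) are hypothetical, and the sketch assumes every artifact carries a unique `id`. The filter remembers which artifacts it has already handled, so a second call only touches new data; the synchronous variant returns once the whole batch is filtered, while the asynchronous variant hands each surviving artifact to a queue as soon as it is ready.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of an asynchronous run


class AdditiveFilter:
    """Filters only artifacts it has not seen before (ids assumed unique)."""

    def __init__(self, keep):
        self.keep = keep   # predicate: True if the artifact should survive
        self.seen = set()  # ids that were already filtered

    def new_artifacts(self, artifacts):
        for artifact in artifacts:
            if artifact["id"] not in self.seen:
                self.seen.add(artifact["id"])
                yield artifact

    def filter_sync(self, artifacts):
        """Synchronous: return once the complete batch is filtered."""
        return [a for a in self.new_artifacts(artifacts) if self.keep(a)]

    def filter_async(self, artifacts, out):
        """Asynchronous: push each result to `out` as soon as it is ready."""
        def worker():
            for artifact in self.new_artifacts(artifacts):
                if self.keep(artifact):
                    out.put(artifact)
            out.put(_DONE)

        thread = threading.Thread(target=worker)
        thread.start()
        return thread


def drain(out):
    """Stand-in for the next analysis task: consume results as they arrive."""
    items = []
    while True:
        item = out.get()
        if item is _DONE:
            return items
        items.append(item)


batch1 = [{"id": 1, "text": "patch attached"}, {"id": 2, "text": ""}]
batch2 = batch1 + [{"id": 3, "text": "thanks"}]

has_text = AdditiveFilter(keep=lambda a: bool(a["text"]))
first = has_text.filter_sync(batch1)

results = queue.Queue()
worker_thread = has_text.filter_async(batch2, results)
second = drain(results)
worker_thread.join()
```

In the second run only artifact 3 is examined, because artifacts 1 and 2 were already filtered in the first run.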
  • 10. Dataset Reduction and Content Cleaning Filters
      - Dataset Reduction Filter (DRF)
          - Reduces the number of artifacts
          - Selects artifacts that fulfill certain criteria
          - Example: spam detection, with artifact classification based on the Bayes decision rule
      - Content Cleaning Filter (CRF)
          - Modifies the content of artifacts
          - Example: quotation filter, with detection of predefined patterns in content
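The two filter kinds on slide 10 can be sketched as follows. The slides only name "Bayes decision rule" and "predefined patterns"; the concrete choices below are assumptions made for illustration: a multinomial naive Bayes classifier with add-one smoothing for the spam filter, and the common `>` prefix convention for quoted mail lines. The class and function names and the tiny training set are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Dataset reduction: drops artifacts classified as spam."""

    def __init__(self, labeled_texts):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        for text, label in labeled_texts:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def log_posterior(self, text, label):
        # log P(label) + sum of log P(word | label), with add-one smoothing.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        total_words = sum(self.word_counts[label].values())
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def is_spam(self, text):
        # Bayes decision rule: choose the class with the higher posterior.
        return self.log_posterior(text, "spam") > self.log_posterior(text, "ham")

    def apply(self, artifacts):
        return [a for a in artifacts if not self.is_spam(a["text"])]


QUOTE_LINE = re.compile(r"^\s*>")  # assumed pattern for quoted lines


def strip_quotations(artifact):
    """Content cleaning: keep the artifact, remove quoted lines from its text."""
    kept = [line for line in artifact["text"].splitlines() if not QUOTE_LINE.match(line)]
    return {**artifact, "text": "\n".join(kept)}


training = [
    ("cheap pills buy now", "spam"),
    ("win money now click here", "spam"),
    ("patch for the sequence parser", "ham"),
    ("build fails on the new release", "ham"),
]
artifacts = [
    {"id": 1, "text": "> build fails on the new release\nTry the attached patch"},
    {"id": 2, "text": "buy cheap pills now"},
]

spam_filter = NaiveBayesSpamFilter(training)
reduced = spam_filter.apply(artifacts)             # fewer artifacts
cleaned = [strip_quotations(a) for a in reduced]   # same artifacts, cleaner text
```

The reduction filter changes how many artifacts remain, while the cleaning filter changes what each remaining artifact contains; a real training set would of course need far more than four labeled messages.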
  • 11. Artifact Transformation Filters
      - Filter as analysis task
      - Modifies artifact attributes
      - Example:
          - Core-Periphery Filter: separates the core of the community from the periphery
          - Hierarchical clustering based on power law distribution
  • 12. Validation in BioJava, Biopython and BioPerl OSS: Spam Detection
      - Figure (BioJava): spam and spammer level in mailing lists of OSS
      - Significant amount (up to 60%)
      - Non-monotonic
      - Distortion of dynamics
  • 13. Validation in BioJava, Biopython and BioPerl OSS: Results Distortion
      - Figure (BioJava, year 2004): mood within project community, mails per month
      - Summarized sentiment of project
      - Positive sentiment of spam advertisement
      - Incorrect sentiment assignment due to quotation
  • 14. Adaptive Filter-Framework and OSS Analysis
      - OSS Analysis for SE
          - Methods/metrics for knowledge mining in company communication and development repositories
          - Understanding of community-oriented development: principles, obstacles and advantages
      - Data Cleaning: Results are only as good as the data!
      - Adaptive Filter-Framework
          - Significant noise level in data
          - Adaptable for any Web artifact format
          - Filter nesting
          - Filter as analysis method