Topological methods




            Presented by:
            Sukhpal Singh
            Thapar University
Topological methods
Topological methods are based on the simple
premise that, given a query that describes
some required features, we are interested in
identifying library assets that come closest to
providing these features. Such methods are
critically dependent on what it means to come
closest, which in turn depends on some
definition of distance between the query and
candidate assets [1].
Categories of Topological methods
• Exclusive approximate retrieval: Methods that
  fall into this category make a distinction
  between two retrieval goals: exact retrieval,
  whereby we seek to identify library assets that
  completely satisfy all the requirements of the
  query, and approximate retrieval, whereby we
  settle for the assets that come closest when no
  exact match exists.
• Inclusive approximate retrieval: Methods that
  fall into this category make no distinction
  between exact retrieval and approximate
  retrieval. Rather, they focus on identifying
  library assets that minimize some measure of
  distance to the query.
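
The two categories can be sketched in a few lines of Python. This is only an illustration, not the method of the cited survey: it assumes that the query and every asset are described by keyword sets, and it uses Jaccard distance as a stand-in distance measure. The function names and the sample library are hypothetical.

```python
def jaccard_distance(a, b):
    """Distance between two keyword sets: 0.0 = identical, 1.0 = disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def inclusive_retrieve(query, library, k=1):
    """Inclusive: rank every asset by distance and return the k closest."""
    ranked = sorted(library, key=lambda name: jaccard_distance(query, library[name]))
    return ranked[:k]


def exclusive_retrieve(query, library, k=1):
    """Exclusive: try exact retrieval first, then fall back to approximate."""
    exact = [name for name, kws in library.items() if set(query) <= set(kws)]
    if exact:
        return exact
    return inclusive_retrieve(query, library, k)


# Hypothetical library: asset name -> keywords describing its features.
library = {
    "stack": {"push", "pop", "lifo"},
    "queue": {"enqueue", "dequeue", "fifo"},
    "deque": {"push", "pop", "enqueue", "dequeue"},
}

exact_hits = exclusive_retrieve({"push", "pop"}, library)
closest = inclusive_retrieve({"push", "pop", "fifo"}, library)
```

In this sketch the exclusive variant only computes distances when no asset covers every query keyword, whereas the inclusive variant always ranks by distance.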
Measures of distance can be divided into two broad classes
• Measures of functional (semantic) distance,
  which reflect the extent of similarity between
  the functional properties of the query and those
  of candidate components.

• Measures of structural (syntactic) distance,
  which reflect the extent of similarity between
  the structure of (solutions to) the query and
  the structure of candidate components.
Characterizing topological methods
The Google PageRank algorithm is
  used in topological methods to
  retrieve software assets from a
  software repository.
What is PageRank?
• In short, PageRank is a “vote”, by all the other
  pages on the Web, about how important a
  page is [3].
• A link to a page counts as a vote of support.
• PR(A) = (1 - d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
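
As a minimal sketch, the formula can be written as a single Python function. The name `page_rank_of` and the sample backlink values are hypothetical; each backlink is passed as a pair (PR(Ti), C(Ti)).

```python
def page_rank_of(backlinks, d=0.85):
    """Apply PR(A) = (1 - d) + d * sum(PR(Ti) / C(Ti)) for one page A.

    backlinks: list of (PR(Ti), C(Ti)) pairs, one per page Ti linking to A.
    """
    return (1 - d) + d * sum(pr / c for pr, c in backlinks)


# Hypothetical page A with two backlinks: one from a page with PR 1.0 and
# 2 outgoing links, one from a page with PR 1.0 and 4 outgoing links.
pr_a = page_rank_of([(1.0, 2), (1.0, 4)])   # 0.15 + 0.85 * (0.5 + 0.25)

# A page with no backlinks keeps only the (1 - d) term.
pr_orphan = page_rank_of([])
```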
Breaking Down the Equation
• PR(Tn) - Each page has a notion of its own importance. T1 … Tn are the
  pages that link to page A: that’s “PR(T1)” for the first of them, all the
  way up to “PR(Tn)” for the last.

• C(Tn) - Each page spreads its vote out evenly amongst all of its outgoing
  links. The count of outgoing links for page T1 is “C(T1)”, for page Tn it
  is “C(Tn)”, and so on for all pages.

• PR(Tn)/C(Tn) - So if our page (page A) has a backlink from page Tn, the
  share of the vote page A gets from it is “PR(Tn)/C(Tn)”.

• d(… - All these fractions of votes are added together but, to stop the other
  pages having too much influence, this total vote is “damped down” by
  multiplying it by the factor “d” (0.85 here).

• (1 - d) - The (1 – d) term at the beginning adds back the share removed by
  the d(… damping. The source describes the result as a probability
  distribution in which the “sum of all web pages’ PageRanks will be one”;
  with the formula as written here, it is the average PageRank that comes out
  at 1, provided every page has at least one outgoing link. It also means
  that a page with no links to it (no backlinks) still gets a small PR of
  0.15 (i.e. 1 – 0.85).
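
The role of the (1 - d) term can be checked by summing the formula over all pages. The short derivation below is a sketch under two assumptions that the slides do not state: there are N pages in total, and every page has at least one outgoing link. Under those assumptions the total PageRank is N, which is an average of 1 per page; dividing the (1 - d) term by N would give a total of 1 instead.

```latex
% S = total PageRank over all N pages; each page T has C(T) >= 1 outgoing links.
\begin{align*}
S &= \sum_{A} PR(A)
   = \sum_{A} \Bigl[ (1 - d) + d \sum_{T \to A} \frac{PR(T)}{C(T)} \Bigr] \\
  &= N(1 - d) + d \sum_{T} C(T) \cdot \frac{PR(T)}{C(T)}
   = N(1 - d) + d\,S \\
\Rightarrow\quad S &= N, \qquad \text{average } \frac{S}{N} = 1 .
\end{align*}
```

The second line uses the fact that each page T appears in exactly C(T) of the inner sums, once for every page it links to.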
How is it Calculated?
• The PR of each page depends on the PR of the
  pages pointing to it.
• But we won’t know what PR those pages have
  until the pages pointing to them have their PR
  calculated and so on.
• So what we do is make a guess.
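
The guess-and-repeat idea can be sketched as a small iterative routine. This is an illustrative implementation, not code from the cited sources: the function name `pagerank`, the stopping tolerance, and the three-page link graph are assumptions, and the sketch assumes every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, guess=1.0, tol=1e-9, max_iter=1000):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) until values settle.

    links maps each page to the list of pages it links to.
    Assumes every page appears as a key and has at least one outgoing link.
    """
    pr = {page: guess for page in links}          # the initial guess
    for _ in range(max_iter):
        new_pr = {page: 1 - d for page in links}  # start from the (1 - d) term
        for source, targets in links.items():
            share = pr[source] / len(targets)     # PR(T) / C(T)
            for target in targets:
                new_pr[target] += d * share
        change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if change < tol:                          # values have settled
            break
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
ranks_from_zero = pagerank(web, guess=0.0)
```

Starting from a guess of 1 or of 0 leads to the same values within the tolerance; only the number of iterations differs.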
Simple Example



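• Two pages, A and B, link only to each other, so each page has one
  outgoing link and one backlink. That means [2]:

• C(T1) = 1 for A (its only backlink comes from B)
      and
• C(T2) = 1 for B (its only backlink comes from A)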
We don’t know what their PR should be to begin with, so we
will just guess 1 as a starting value.

• d (damping factor) = 0.85
• PR(A) = (1 – d) + d(PR(T1)/C(T1)) = (1 – d) + d(1/1)

  i.e.

• PR(A) = 0.15 + 0.85 * 1
  = 1
• PR(B) = 0.15 + 0.85 * 1
  = 1
Let’s Do It Again with Another Number. Let’s try 0 and recalculate…
• PR(A) = 0.15 + 0.85 * 0
  = 0.15
• PR(B) = 0.15 + 0.85 * 0.15
  = 0.2775
• Now we have calculated a “next best guess”, so we just plug it into the
  equation again…

• PR(A)= 0.15 + 0.85 * 0.2775
  = 0.385875
• PR(B)= 0.15 + 0.85 * 0.385875
  = 0.47799375

And again…
• PR(A)= 0.15 + 0.85 * 0.47799375
  = 0.5562946875
• PR(B)= 0.15 + 0.85 * 0.5562946875
  = 0.622850484375
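
The hand calculation can be reproduced with a short loop. The sketch below assumes the two-page setup of this example (A and B link only to each other) and, like the slides, uses each new PR(A) immediately when computing PR(B). The variable names are illustrative.

```python
d = 0.85
pr_a, pr_b = 0.0, 0.0           # starting guess of 0
history = []

for _ in range(3):              # the three rounds worked out by hand
    pr_a = (1 - d) + d * pr_b   # A's only backlink is B, and C(B) = 1
    pr_b = (1 - d) + d * pr_a   # B's only backlink is A, and C(A) = 1
    history.append((pr_a, pr_b))

for _ in range(100):            # keep going until the values settle
    pr_a = (1 - d) + d * pr_b
    pr_b = (1 - d) + d * pr_a

for a, b in history:
    print(f"PR(A) = {a:.12f}   PR(B) = {b:.12f}")
print(f"settled: PR(A) = {pr_a:.6f}   PR(B) = {pr_b:.6f}")
```

The first three rounds match the figures above, and with further rounds both values approach 1, the same result as the starting guess of 1.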
Principle
• It doesn’t matter where you start your guess:
  once the PageRank calculations have settled
  down, the average PageRank over all pages
  (the “normalized probability distribution”)
  will be 1.0.
• In a software repository, software assets take
  the place of pages, and relationships among
  assets based on their keywords take the place
  of links.
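
How keyword relationships might stand in for links can be sketched as follows. The slides do not say how the relationships are derived, so this example makes a loud assumption: two assets are linked, in both directions, whenever they share at least one keyword. The asset names, keywords, and function names are hypothetical.

```python
from itertools import combinations


def keyword_links(assets):
    """Link two assets (in both directions) when they share a keyword."""
    links = {name: set() for name in assets}
    for a, b in combinations(assets, 2):
        if assets[a] & assets[b]:
            links[a].add(b)
            links[b].add(a)
    return links


def rank_assets(assets, d=0.85, rounds=100):
    """Rank assets with the PageRank formula over keyword-based links."""
    links = keyword_links(assets)
    pr = {name: 1.0 for name in assets}            # initial guess
    for _ in range(rounds):
        pr = {
            name: (1 - d) + d * sum(pr[other] / len(links[other])
                                    for other in links[name])
            for name in assets
        }
    return sorted(pr, key=pr.get, reverse=True)


# Hypothetical repository: asset name -> descriptive keywords.
assets = {
    "stack": {"push", "pop", "container"},
    "queue": {"enqueue", "dequeue", "container"},
    "list": {"container", "iterate"},
    "parser": {"tokens", "iterate"},
    "logger": {"log"},
}

ranking = rank_assets(assets)
```

In this sketch an asset that shares keywords with many others ranks highest, and an asset with no shared keywords keeps only the (1 - d) share.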
Summary
References:
[1]   A. Mili, R. Mili, and R.T. Mittermeir, "A survey of software reuse
      libraries," Annals of Software Engineering 5 (1998) 349–414.

[2]   http://www-db.stanford.edu/~backrub/google.html

[3]   "Semantic Component Retrieval in Software Engineering," inaugural
      dissertation for the degree of Doctor of Natural Sciences,
      Universität Mannheim, Mannheim, 2008.
