Motivation When searching for information on the WWW, a user submits a query to a search engine. The engine returns, as the query's result, a list of Web sites, which is usually a huge set, so the ranking of these Web sites is very important. Because much information is contained in the link structure of the WWW, information such as which pages link to which others can be used to augment search algorithms.
SALSA----Idea SALSA is based on the theory of Markov chains and relies on the stochastic properties of random walks performed on a collection of sites. The input to the scheme is a collection of sites C built around a topic t. Intuition suggests that authoritative sites on topic t should be visible from many sites in the subgraph induced by C. Thus, a random walk on this subgraph will visit t-authorities with high probability.
SALSA----Idea SALSA combines the theory of random walks with the notion of two distinct types of Web sites, hubs and authorities, and analyzes two different Markov chains: a chain of hubs and a chain of authorities. Analyzing both chains gives each Web site two distinct scores, a hub score and an authority score.
SALSA The principal community of authorities (hubs) found by SALSA is composed of the sites whose entries in the principal eigenvector of A (H) are the highest.
SALSA----Conclusion SALSA is a stochastic approach to link-structure analysis that examines random walks on graphs derived from the link structure. The principal community of authorities (hubs) corresponds to the sites most frequently visited by the random walk defined by the authority (hub) Markov chain.
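The authority random walk can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the usual SALSA formulation in which one step of the authority chain follows a random in-link backward to a hub and then a random out-link of that hub forward. The function name `salsa_authority_scores` and the toy edge list are hypothetical.

```python
from collections import defaultdict

def salsa_authority_scores(edges, iterations=100):
    """Approximate the visit frequencies of the authority chain.

    edges: list of (hub, authority) pairs, one per link hub -> authority.
    Assumption: one chain step = one link backward, then one link forward.
    """
    in_links = defaultdict(list)   # authority -> hubs pointing to it
    out_links = defaultdict(list)  # hub -> authorities it points to
    for hub, auth in edges:
        in_links[auth].append(hub)
        out_links[hub].append(auth)

    authorities = sorted(in_links)
    score = {a: 1.0 / len(authorities) for a in authorities}
    for _ in range(iterations):
        nxt = dict.fromkeys(authorities, 0.0)
        for a in authorities:
            # Probability mass sent backward along each in-link of a.
            p_back = score[a] / len(in_links[a])
            for h in in_links[a]:
                # The hub forwards that mass evenly along its out-links.
                for b in out_links[h]:
                    nxt[b] += p_back / len(out_links[h])
        score = nxt
    return score

# Toy collection: page "a" is linked from two hubs, page "b" from one.
toy_edges = [("h1", "a"), ("h2", "a"), ("h1", "b")]
toy_scores = salsa_authority_scores(toy_edges)
```

On this toy graph the walk visits "a" about twice as often as "b", so "a" would lead the principal community of authorities. The hub chain can be sketched the same way by swapping the two directions.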
The PageRank Citation Ranking: Bringing Order to the Web Larry Page et al. Stanford University
PageRank----Idea Every page has some number of forward links (outedges) and backlinks (inedges).
PageRank----Idea A page has high rank if the sum of the ranks of its backlinks is high. This covers both the case when a page has many backlinks and the case when a page has a few highly ranked backlinks.
PageRank----Definition u: a web page. F_u: the set of pages u points to. B_u: the set of pages that point to u. N_u = |F_u|: the number of links from u. c: a factor used for normalization. The equation is recursive, but it may be computed by starting with any set of ranks and iterating the computation until it converges.
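A minimal sketch of that iteration, assuming the simplified ranking from the cited paper, R(u) = c * sum over v in B_u of R(v) / N_v, where R(u) denotes the rank of page u. The function name `simple_pagerank`, the dictionary layout, and the toy graph are hypothetical, and every page is assumed to have at least one forward link.

```python
def simple_pagerank(links, iterations=100):
    """Iterate R(u) = c * sum_{v in B_u} R(v) / N_v.

    links: dict mapping each page v to the list F_v of pages it points to.
    Assumption: every page has at least one forward link.
    """
    pages = sorted(set(links) | {u for targets in links.values() for u in targets})
    rank = {p: 1.0 / len(pages) for p in pages}  # any starting ranks would do
    for _ in range(iterations):
        nxt = dict.fromkeys(pages, 0.0)
        for v, targets in links.items():
            for u in targets:
                # v splits its rank evenly among its N_v forward links.
                nxt[u] += rank[v] / len(targets)
        # The factor c rescales the ranks so that they sum to 1.
        total = sum(nxt.values())
        rank = {p: r / total for p, r in nxt.items()}
    return rank

toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
toy_rank = simple_pagerank(toy_links)
```

Here "c" collects rank from both "a" and "b", and passes all of it on to "a", so "a" and "c" end up with equal rank and "b" with half as much.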
PageRank----Definition A problem with the above definition: the rank sink. If two web pages point to each other but to no other page, and some other page points to one of them, then during the iteration this loop will accumulate rank but never distribute any rank.
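The effect is easy to reproduce on a three-page example. The sketch below is illustrative only: the helper `propagate` and the page names are hypothetical, and normalization is left out so the movement of rank is visible.

```python
def propagate(links, rank):
    """One step of the simplified ranking, without the factor c."""
    nxt = dict.fromkeys(rank, 0.0)
    for v, targets in links.items():
        for u in targets:
            nxt[u] += rank[v] / len(targets)
    return nxt

# "b" and "c" point only to each other; "a" points into the loop.
sink_links = {"a": ["b"], "b": ["c"], "c": ["b"]}
sink_rank = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
for _ in range(3):
    sink_rank = propagate(sink_links, sink_rank)

loop_share = sink_rank["b"] + sink_rank["c"]
```

After the first step "a" has handed its rank to the loop and receives nothing back, so the pair "b", "c" holds all of the rank from then on.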
PageRank----Definition Definition modified: E(u) is some vector over the web pages (for example uniform, or concentrated on a favorite page) that corresponds to a source of rank. E(u) is a user-designed parameter.
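A sketch of the modified iteration, assuming the form given in the cited paper, R'(u) = c * (sum over v in B_u of R'(v) / N_v + E(u)), with the ranks rescaled to sum to 1. The function name `pagerank_with_source`, the uniform value 0.15, and the toy graph are hypothetical choices for illustration.

```python
def pagerank_with_source(links, source, iterations=100):
    """Iterate R'(u) = c * (sum_{v in B_u} R'(v) / N_v + E(u)).

    links:  dict mapping each page to the list of pages it points to.
    source: dict mapping every page u to E(u), the source of rank.
    """
    pages = sorted(source)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page first receives its share of the rank source E.
        nxt = {p: source[p] for p in pages}
        for v, targets in links.items():
            for u in targets:
                nxt[u] += rank[v] / len(targets)
        # c rescales the ranks so that they sum to 1.
        total = sum(nxt.values())
        rank = {p: r / total for p, r in nxt.items()}
    return rank

# Same shape as a rank sink: "b" and "c" point only to each other.
loop_links = {"a": ["b"], "b": ["c"], "c": ["b"]}
uniform_source = {"a": 0.15, "b": 0.15, "c": 0.15}  # assumed uniform E
sourced_rank = pagerank_with_source(loop_links, uniform_source)
```

With a uniform E, page "a" keeps a positive rank even though nothing links to it, so the loop no longer absorbs everything. Concentrating E on a single favorite page would bias the ranking toward that page instead.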
PageRank----Random Surfer Model
