LOCAL PAGERANK
APPROXIMATION
Group 21
Shashank Juyal (201305537)
Roopgundeep S Sodhi (201101047)
K Prathyusha (201025173)
Bharathi S (201350885)
CONTENT
• What is PageRank?
• Objective
• PageRank for whole Dataset
• Local Approximation of PageRank
• Experiments and Results
• Challenges and Issues
• Conclusion and Future Scope
• References
WHAT IS PAGERANK?
• Named after Larry Page, co-founder of Google.
• PageRank is an algorithm used by Google Search to rank web pages in its search results.
• It is a way of measuring the importance of web pages.
• It works by counting the number and quality of links to a page to produce a rough estimate of how important that page is.
OBJECTIVE
• In general, calculating PageRank requires a global computation.
• But there are situations in which PageRank scores are required for just a small subset of the nodes.
• Suppose a website owner wants to promote their website in search engine rankings in order to attract traffic from potential clients.
• The owner is interested only in the PageRank score of their own website, not in the PageRank scores of all other web pages.
OBJECTIVE
• Global PageRank computation for the entire web graph is out of the question for most users, as it requires significant resources and know-how.
• That is why a local approximation of PageRank is required.
PAGERANK FOR WHOLE DATASET
• We traversed the dataset and applied the algorithm proposed by Page and Brin to it directly.
1. In that approach, the PageRank of each page is calculated from the back-links pointing to that page.
2. The PageRank value of a page is divided equally among the forward-links of that page. Each page it points to uses its share to calculate its own PageRank.
3. An additional factor (the damping factor) must also be included to make sure that the PageRank algorithm converges, especially when loops are present.
PAGERANK FOR WHOLE DATASET
• Algorithm (proposed by Larry Page and Sergey Brin)
Iterate over the pages and, for each page X, calculate
PR(X) = (1-d) + d (PR(T1) / C(T1) + ... + PR(Tn) / C(Tn))
until PR(X) no longer changes between successive iterations for any page.
Where,
- PR(X) is the PageRank of page X, with an initial value of 1,
- PR(Ti) is the PageRank of a page Ti that links to page X,
- C(Ti) is the number of outbound (forward) links on page Ti, and
- d is a damping factor, which can be set between 0 and 1.
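The iteration above can be sketched in Python as follows. This is a minimal illustration, not the group's implementation: the function name `pagerank`, the `out_links` dictionary format, the default d = 0.85, and the tolerance `tol` used in place of exact equality between iterations are all assumptions made for the example.

```python
def pagerank(out_links, d=0.85, tol=1e-9, max_iter=1000):
    """Iterate PR(X) = (1 - d) + d * sum(PR(T) / C(T)) over all pages T linking to X.

    out_links maps each page to the list of pages it links to (hypothetical format).
    """
    # Collect every page that appears as a source or as a target.
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)

    # Build the back-link lists: in_links[p] holds the pages that link to p.
    in_links = {p: [] for p in pages}
    for src, targets in out_links.items():
        for dst in targets:
            in_links[dst].append(src)

    pr = {p: 1.0 for p in pages}  # initial value of 1 for every page
    for _ in range(max_iter):
        new_pr = {
            p: (1 - d) + d * sum(pr[t] / len(out_links[t]) for t in in_links[p])
            for p in pages
        }
        # Largest change of any page's score in this iteration.
        delta = max(abs(new_pr[p] - pr[p]) for p in pages)
        pr = new_pr
        if delta < tol:  # stop once the values have effectively stopped changing
            break
    return pr


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a"], "c": ["a"]}  # toy graph, assumed for illustration
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

Comparing successive iterations against a small tolerance rather than testing for exact equality avoids looping forever on floating-point noise.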
LOCAL PAGERANK APPROXIMATION
• Given a node (page), we have to calculate its approximate PageRank:
• The algorithm crawls the sub-graph of radius r around the given node "backwards" (following in-links) in BFS order. For each node v at layer t, the algorithm calculates the influence of v on the given node at radius t.
• It sums up the influence values, weighted by a factor that depends on t. To do so, the algorithm uses the recursive property of influence: the influence of v on the given node at radius t equals the average influence of the out-neighbours of v on the given node at radius t−1.
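A minimal Python sketch of this backward crawl is given below. It is an illustration under stated assumptions rather than the group's code: the function name `local_pagerank`, the `in_links` and `out_degree` dictionaries, and the default d = 0.85 are hypothetical, and the layer weight (1 - d) * d^t is the one that follows from unrolling PR(X) = (1-d) + d * sum(PR(Ti) / C(Ti)).

```python
def local_pagerank(target, in_links, out_degree, r, d=0.85):
    """Approximate the PageRank of `target` from its backward neighbourhood of radius r.

    in_links[p]   -> pages that link to p (only needed for pages near the target)
    out_degree[p] -> number of forward links on page p
    Both structures are assumed inputs for this sketch.
    """
    # Layer 0: the target has influence 1 on itself.
    influence = {target: 1.0}
    estimate = 1 - d

    for t in range(1, r + 1):
        next_influence = {}
        for w, inf_w in influence.items():
            for v in in_links.get(w, ()):
                # Recursive property: v passes on the influence of its
                # out-neighbour w, divided by the out-degree of v.
                next_influence[v] = next_influence.get(v, 0.0) + inf_w / out_degree[v]
        influence = next_influence
        if not influence:  # no more pages link into the current layer
            break
        # Add the summed influence of layer t, weighted by (1 - d) * d**t.
        estimate += (1 - d) * d**t * sum(influence.values())
    return estimate


if __name__ == "__main__":
    # Toy graph: a <-> b, and c -> a (assumed for illustration).
    in_links = {"a": ["b", "c"], "b": ["a"], "c": []}
    out_degree = {"a": 1, "b": 1, "c": 1}
    for radius in (1, 3, 10):
        print(radius, local_pagerank("a", in_links, out_degree, radius))
```

Only the in-links of pages within radius r and the out-degrees of those pages are touched, which is what makes the computation local.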
LOCAL PAGERANK APPROXIMATION
• There are two approaches to choosing the value of r:
1. Run the algorithm with a fixed r that is guaranteed to be an upper bound on the radius needed.
2. Run the algorithm without knowing r a priori, and stop whenever the PageRank estimate no longer changes by much.
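The second approach could look like the sketch below, which grows the radius one layer at a time and stops once a layer adds less than a tolerance to the estimate. The function name `local_pagerank_adaptive`, the parameters `tol` and `max_radius`, and the input dictionaries are assumptions made for this example. A small contribution from one layer is a heuristic stopping signal, not a guarantee that later layers are also small.

```python
def local_pagerank_adaptive(target, in_links, out_degree, d=0.85, tol=1e-4, max_radius=50):
    """Grow the radius until the estimate stops changing by more than `tol`.

    Returns (estimate, radius_reached). All parameter names are hypothetical.
    """
    influence = {target: 1.0}  # layer 0: the target itself
    estimate = 1 - d

    for t in range(1, max_radius + 1):
        next_influence = {}
        for w, inf_w in influence.items():
            for v in in_links.get(w, ()):
                next_influence[v] = next_influence.get(v, 0.0) + inf_w / out_degree[v]
        influence = next_influence

        # Change in the estimate caused by adding layer t.
        contribution = (1 - d) * d**t * sum(influence.values())
        estimate += contribution
        if contribution < tol:
            return estimate, t
    return estimate, max_radius


if __name__ == "__main__":
    in_links = {"a": ["b", "c"], "b": ["a"], "c": []}  # toy graph, assumed
    out_degree = {"a": 1, "b": 1, "c": 1}
    print(local_pagerank_adaptive("a", in_links, out_degree, tol=1e-6, max_radius=200))
```

The `max_radius` cap plays the role of the first approach: it bounds the crawl even when the tolerance is never reached.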
EXPERIMENTS AND RESULTS
1. Error Percentage for different pageids and radius
values
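The slides do not state how the error percentage was computed. One plausible definition, assumed here purely for illustration, is the relative difference between the local estimate and the global PageRank value; the function name `error_percentage` is hypothetical.

```python
def error_percentage(exact, approx):
    """Relative error of `approx` with respect to `exact`, in percent.

    This definition is an assumption; the original measurement may differ.
    """
    if exact == 0:
        raise ValueError("exact PageRank value must be non-zero")
    return abs(approx - exact) / abs(exact) * 100.0


if __name__ == "__main__":
    # Made-up values, only to show the call.
    print(error_percentage(2.0, 1.7))
```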
EXPERIMENTS AND RESULTS
2. Time taken by local approximation for different
pageids and radius values
CHALLENGES AND ISSUES
• Loading small indexes into memory caused problems. We resolved this by increasing the heap size allocated to the virtual machine.
• Deciding the threshold value when implementing pruning.
• There is no single best threshold, as the suitable value varies widely with the PageRank value of the page.
• Choose wisely!
CONCLUSIONS AND FUTURE SCOPE
• The normal procedure for calculating PageRank considers the whole dataset, which is time- and resource-consuming and not feasible in most situations.
• A local approximation of PageRank can instead be obtained by calculating over the nodes of a smaller graph, without calculating PageRank for all the nodes in the dataset.
• The results obtained are near the original PageRank results, with an average error rate of 15-20%.
CONCLUSIONS AND FUTURE SCOPE
• The implementation of the algorithm and the accuracy of the resulting value depend on the radius defined for the smaller graph.
• The smaller the radius, the higher the error rate, and vice versa.
• However, as the radius increases, the complexity grows exponentially, because the number of in-links to process becomes very large.
• Generally, a value of r = 3 to 4 is used.
• Pruning techniques can be used to allow a larger value of r: the procedure removes from layer r all nodes whose influence is below some threshold value T.
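A pruned variant of the backward crawl might be sketched as follows. This is an illustration under assumptions, not the group's implementation: the function name `local_pagerank_pruned` and the input dictionaries are hypothetical, pruning is applied to every layer as it is built, and pruned nodes are assumed neither to contribute to the estimate nor to be expanded further.

```python
def local_pagerank_pruned(target, in_links, out_degree, r, threshold, d=0.85):
    """Backward crawl of radius r that drops low-influence nodes from each layer.

    `threshold` corresponds to the value T on the slide; how it is chosen is
    left open here, as it is in the slides.
    """
    influence = {target: 1.0}  # layer 0: the target itself
    estimate = 1 - d

    for t in range(1, r + 1):
        next_influence = {}
        for w, inf_w in influence.items():
            for v in in_links.get(w, ()):
                next_influence[v] = next_influence.get(v, 0.0) + inf_w / out_degree[v]

        # Pruning step: keep only nodes whose influence reaches the threshold,
        # so the next layer is expanded from fewer nodes.
        influence = {v: x for v, x in next_influence.items() if x >= threshold}
        if not influence:
            break
        estimate += (1 - d) * d**t * sum(influence.values())
    return estimate


if __name__ == "__main__":
    # Toy graph (assumed): "a" links only to "t"; "b" has 100 forward links,
    # one of which points to "t".
    in_links = {"t": ["a", "b"]}
    out_degree = {"a": 1, "b": 100}
    print(local_pagerank_pruned("t", in_links, out_degree, r=1, threshold=0.0))
    print(local_pagerank_pruned("t", in_links, out_degree, r=1, threshold=0.05))
```

A higher threshold shrinks each layer and so lowers the cost of a larger r, at the price of underestimating the score by the influence that was discarded.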
REFERENCES
• Ziv Bar-Yossef and Li-Tal Mashiach, Google Haifa Engineering Center, Haifa, Israel, Local Approximation of PageRank and Reverse PageRank, CIKM'08, October 26–30, 2008, ACM 978-1-59593-991-3/08/10
• Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd, The PageRank Citation Ranking: Bringing Order to the Web, January 29, 1998, Stanford InfoLab
• Yen-Yu Chen, Qingqing Gan, Torsten Suel, Local Methods for Estimating PageRank Values, CIKM'04, November 8–13, 2004