This document discusses personalized search: re-ranking search results based on a user's profile and past behaviour. It describes extracting features from query logs covering 27 days of search data to train a classifier. Features include documents clicked and time spent, both by the same user and by different users, for a given query. The model is trained with the LambdaMART ranking algorithm on 24 days of data and validated on 3 days. It then re-ranks the top 10 search results for test queries based on the extracted features to produce a personalized ranking. Evaluation on a test platform showed an NDCG score above the baseline, indicating more relevant results.
Personalized Search Features Group 44
1. April 17, 2014 Group 44
Personalized Re-rank Features
Faculty Mentor: Dr. Vasudev Verma
Swapna Kidambi
Meenal Goyal
Sumit Mishra
Chetan Jain
2. What is personalized search?
“Search results that vary based on the searcher’s profile and past behaviour.”
• Today’s problems:
• Search engines are impersonal.
• Users may not find relevant results because search does not consider their expertise level.
• As the number of web-page results grows, the information overload problem becomes severe; a remedy is to rank results according to the user’s preferences.
3. Why Personalization?
Search engines return results based purely on the submitted query text, not on the context intended, and users favour context-based personalised search results.
Advantages:
• The user gets the expected results faster.
• Only relevant data is shown.
Challenges:
• The dataset given is fully anonymised: everything is encoded as numbers.
• The specific emphasis on adaptation efficiency prevents us from directly applying most existing domain adaptation methods. For a generic ranking model and personalised search, adaptation efficiency is crucial because:
• Such an operation must be executable at the scale of all search engine users.
• We must handle the dynamic nature of users’ search intent while still offering searchers a great experience quickly.
4. Elements of Personalized Search
We are provided with a 27-day dataset containing:
• Session ID
• User ID
• Queries issued in a session and the top 10 results each fetched
• Documents clicked, and
• The time duration for which those documents were viewed.
We use the last 3 days of the dataset as the test data.
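The dataset fields above could be read along these lines. The record layout and the M/Q/C record tags are assumptions for illustration; the actual log format is not shown in the slides:

```python
# Minimal sketch of parsing an anonymised query log.
# Assumed layout (NOT from the slides): tab-separated lines where
# "M" marks session metadata, "Q" a query with its top-10 results,
# and "C" a click (with dwell time) on the most recent query's results.
from collections import namedtuple

Query = namedtuple("Query", "session user query urls clicks")  # clicks: url -> dwell seconds

def parse_log(lines):
    sessions = {}   # session id -> user id
    queries = []    # parsed Query records, in log order
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if parts[1] == "M":                    # <session>  M  <user>
            sessions[parts[0]] = parts[-1]
        elif parts[1] == "Q":                  # <session>  Q  <query id>  <url1..url10>
            sid, _, qid, *urls = parts
            queries.append(Query(sid, sessions.get(sid), qid, urls, {}))
        elif parts[1] == "C":                  # <session>  C  <url>  <dwell>
            sid, _, url, dwell = parts
            queries[-1].clicks[url] = int(dwell)
    return queries
```

Clicks are attached to the most recent query in the same session, which is how session logs of this shape are usually interpreted.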
5. Our Approach
• Divided the training dataset (27 days) into:
• Training data (24 days)
• Validation data (3 days)
Extraction of features to train the classifier:
• Broadly, features for a given query take into account:
• The same query issued by the same user in history, and the results it fetched
• The same query issued by different users in history, and the results they fetched
• Different queries issued by the same user in history, and their results
6. Our Approach
• Features also embed information about:
• Documents clicked among the retrieved documents.
• Time spent on clicked documents.
So, for a query, we have information about:
• All documents that a user clicked, skipped, or missed
• Time spent on documents
• Documents relevant to the user in previous searches
• Documents relevant to the query in previous searches
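The clicked / skipped / missed distinction above can be sketched as follows. The convention used here is an assumption, not stated in the slides: unclicked results ranked above the lowest click count as "skipped", results below it as "missed":

```python
# Sketch of per-document interaction features for one query impression.
# Assumed convention: results above the lowest-clicked rank that were
# not clicked are "skipped"; everything after that rank is "missed".
def interaction_features(urls, clicks):
    """urls: top-10 results in rank order; clicks: url -> dwell seconds."""
    last_click = max((i for i, u in enumerate(urls) if u in clicks), default=-1)
    feats = {}
    for i, u in enumerate(urls):
        if u in clicks:
            status = "clicked"
        elif i < last_click:
            status = "skipped"
        else:
            status = "missed"
        feats[u] = {"rank": i + 1, "status": status, "dwell": clicks.get(u, 0)}
    return feats
```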
7. Our Approach
• We have a set of features for each query in the training data.
• Trained a classifier on the extracted features and improved the model with the help of the validation data.
• On receiving a query, computed its features from the dataset.
• The model, together with this feature set, retrieves the top relevant documents.
Feature extraction:
• Our aim was to extract features for every training, validation, and test user–query–document triplet (u, q(u), d(q,u)).
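Per-triplet extraction can be sketched by aggregating historical clicks along the three axes listed earlier (same user and same query, any user with the same query, same user with any query). The exact feature set is not given in the slides, so these counters are illustrative assumptions:

```python
from collections import Counter

# Historical click counters keyed along the three axes from the slides.
# These particular counts are illustrative, not the group's exact features.
def build_history(log):
    """log: iterable of (user, query, doc) click events."""
    uqd = Counter()   # same user, same query
    qd = Counter()    # any user, same query
    ud = Counter()    # same user, any query
    for user, query, doc in log:
        uqd[(user, query, doc)] += 1
        qd[(query, doc)] += 1
        ud[(user, doc)] += 1
    return uqd, qd, ud

def triplet_features(u, q, d, history):
    """Feature vector for one (u, q(u), d(q,u)) triplet."""
    uqd, qd, ud = history
    return [uqd[(u, q, d)], qd[(q, d)], ud[(u, d)]]
```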
8. Workflow
Training path: Training data (24 GB) + Validation data (3 GB) → [feature extraction] → set of features for all queries in the data → [train a model using LambdaMART] → Model.
Ranking path: Query terms → [feature extraction for query terms] → set of features for the query terms → [given to the LambdaMART model] → ranked output of 10 documents.
9. Our Approach (Tools Used)
How do we train the model and get the results?
• RankLib - a library of learning-to-rank algorithms.
• LambdaMART - the boosted-tree version of LambdaRank, which in turn is based on RankNet.
• It takes as input a set of URLs with the feature values for each URL and produces a ranked output.
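RankLib reads its input in the SVMlight/LETOR text format: one line per query-document pair, with a relevance label, a `qid`, and indexed feature values. A minimal sketch of writing that format, with made-up feature values:

```python
# Write (u, q, d) feature rows in the SVMlight/LETOR text format that
# RankLib consumes: "<relevance> qid:<query id> 1:<f1> 2:<f2> ... # <doc>".
def to_letor(rows):
    """rows: iterable of (relevance, query_id, doc_id, [feature values])."""
    lines = []
    for rel, qid, doc, feats in rows:
        pairs = " ".join(f"{i}:{v}" for i, v in enumerate(feats, start=1))
        lines.append(f"{rel} qid:{qid} {pairs} # {doc}")
    return "\n".join(lines)
```

RankLib's command line can then train LambdaMART on such a file (ranker id 6 in current RankLib releases), optimising a metric such as NDCG@10.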
10. Observations
• To check the results, we uploaded the output file to the Yandex website.
• We obtained an accuracy above the baseline, which is a 0.49 NDCG score (NDCG, normalized discounted cumulative gain, measures how well a ranking places relevant results near the top).
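NDCG@10, the metric cited above, can be computed as follows. This is a standard sketch using the common 2^rel - 1 gain; the evaluation platform's exact gain and discount conventions may differ:

```python
import math

# NDCG@k: DCG of the ranked relevance labels, normalised by the DCG of
# the ideal (sorted) ordering, so a perfect ranking scores 1.0.
def dcg(rels, k):
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg(rels, k=10):
    ideal = dcg(sorted(rels, reverse=True), k)
    return dcg(rels, k) / ideal if ideal > 0 else 0.0
```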