SlideShare uma empresa Scribd logo
1 de 68
Baixar para ler offline
Apache Mahout - Tutorial
                     Cataldo Musto, Ph.D.



 Corso di Accesso Intelligente all’Informazione ed Elaborazione del Linguaggio Naturale
Università degli Studi di Bari – Dipartimento di Informatica – A.A. 2012/2013

                                    11/01/2013
Outline
• What is Mahout ?
  – Overview
• How to use Mahout ?
  – Java snippets
Part 1



What is Mahout?
Goal
• Mahout is a Java library
  – Implementing Machine Learning techniques
Goal
• Mahout is a Java library
  – Implementing Machine Learning techniques
     •   Clustering
     •   Classification
     •   Recommendation
     •   Frequent ItemSet
Goal
• Mahout is a Java library
  – Implementing Machine Learning technique
     •   Clustering
     •   Classification
     •   Recommendation
     •   Frequent ItemSet
  – Scalable
What can we do?
• Currently Mahout supports mainly four use cases:
   – Recommendation - takes users' behavior and from that
     tries to find items users might like.
   – Clustering - takes e.g. text documents and groups them
     into groups of topically related documents.
   – Classification - learns from exisiting categorized
     documents what documents of a specific category look like
     and is able to assign unlabelled documents to the
     (hopefully) correct category.
   – Frequent itemset mining - takes a set of item groups
     (terms in a query session, shopping cart content) and
     identifies, which individual items usually appear together.
Algorithms
• Recommendation
  – User-based Collaborative Filtering
  – Item-based Collaborative Filering
  – SlopeOne Recommenders
  – Singular Value Decomposition
Algorithms
• Clustering
  – Canopy
  – K-Means
  – Fuzzy K-Means
  – Hierarchical Clustering
  – Minhash Clustering

  (and much more…)
Algorithms
• Classification
   –   Logistic Regression
   –   Bayes
   –   Support Vector Machines
   –   Perceptrons
   –   Neural Networks
   –   Restricted Boltzmann Machines
   –   Hidden Markov Models

   (and much more…)
Mahout in the
Apache Software Foundation
Focus
• In this tutorial we will focus just on:
   – Recommendation
Recommendation
• Mahout implements a Collaborative Filtering
  framework
   – Popularized by Amazon and others
   – Uses hystorical data (ratings, clicks, and purchases) to
     provide recommendations
      • User-based: Recommend items by finding similar users. This is
        often harder to scale because of the dynamic nature of users.
      • Item-based: Calculate similarity between items and make
        recommendations. Items usually don't change much, so this often
        can be computed offline.
      • Slope-One: A very fast and simple item-based recommendation
        approach applicable when users have given ratings (and not just
        boolean preferences).
Recommendation - Architecture
Recommendation - Architecture




                 Physical Storage
Recommendation - Architecture




                 Data Model




                 Physical Storage
Recommendation - Architecture


                 Recommender




                 Data Model




                 Physical Storage
Recommendation - Architecture
                 External Application



                 Recommender




                 Data Model




                 Physical Storage
Recommendation in Mahout
• Input: raw data (user preferences)
• Output: preferences estimation

• Step 1
  – Mapping raw data into a DataModel Mahout-
    compliant
• Step 2
  – Tuning recommender components
     • Similarity measure, neighborhood, etc.
Recommendation Components
• Mahout implements interfaces to these key
  abstractions:
   – DataModel
      • Methods for mapping raw data to a Mahout-compliant form
   – UserSimilarity
      • Methods to calculate the degree of correlation between two users
   – ItemSimilarity
      • Methods to calculate the degree of correlation between two items
   – UserNeighborhood
      • Methods to define the concept of ‘neighborhood’
   – Recommender
      • Methods to implement the recommendation step itself
Components: DataModel
• A DataModel is the interface to draw information
  about user preferences.

• Which sources is it possible to draw?
  – Database
     • MySQLJDBCDataModel
  – External Files
     • FileDataModel
  – Generic (preferences directly feed through Java code)
     • GenericDataModel
Components: DataModel
• GenericDataModel
  – Feed through Java calls
• FileDataModel
  – CSV (Comma Separated Values)
• JDBCDataModel
  – JDBC Driver
  – Standard database structure
FileDataModel – CSV input
Components: DataModel
• Regardless the source, they all share a
  common implementation.
• Basic object: Preference
  – Preference is a triple (user,item,score)
  – Stored in UserPreferenceArray
Components: DataModel
• Basic object: Preference
  – Preference is a triple (user,item,score)
  – Stored in UserPreferenceArray


• Two implementations
  – GenericUserPreferenceArray
     • It stores numerical preference, as well.
  – BooleanUserPreferenceArray
     • It skips numerical preference values.
Components: UserSimilarity
• UserSimilarity defines a notion of similarity between two
  Users.
   – (respectively) ItemSimilarity defines a notion of similarity
     between two Items.

• Which definition of similarity are available?
   –   Pearson Correlation
   –   Spearman Correlation
   –   Euclidean Distance
   –   Tanimoto Coefficient
   –   LogLikelihood Similarity

   – Already implemented!
Pearson’s vs. Euclidean distance
Components: UserNeighborhood
• UserSimilarity defines a notion of similarity between two
  Users.
   – (respectively) ItemSimilarity defines a notion of similarity
     between two Items.

• Which definition of neighborhood are available?
   – Nearest N users
       • The first N users with the highest similarity are labeled as ‘neighbors’
   – Tresholds
       • Users whose similarity is above a threshold are labeled as ‘neighbors’

   – Already implemented!
Components: Recommender
• Given a DataModel, a definition of similarity between
  users (items) and a definition of neighborhood, a
  recommender produces as output an estimation of
  relevance for each unseen item

• Which recommendation algorithms are implemented?
   –   User-based CF
   –   Item-based CF
   –   SVD-based CF
   –   SlopeOne CF

   (and much more…)
Recommendation Engines
Recap
• Hundreds of possible implementations of a CF-based
  recommender!
   – 6 different recommendation algorithms
   – 2 different neighborhood definitions
   – 5 different similarity definitions

• Evaluation fo the different implementations is actually very
  time-consuming
   – The strength of Mahout lies in that it is possible to save time in
     the evaluation of the different combinations of the parameters!
   – Standard interface for the evaluation of a Recommender
     System
Evaluation
• Mahout provides classes for the evaluation of a
  recommender system
  – Prediction-based measures
     • Mean Average Error
     • RMSE (Root Mean Square Error)
  – IR-based measures
     •   Precision
     •   Recall
     •   F1-measure
     •   F1@n
     •   NDCG (ranking measure)
Evaluation
• Prediction-based Measures
  – Class: AverageAbsoluteDifferenceEvaluator
  – Method: evaluate()
  – Parameters:
     •   Recommender implementation
     •   DataModel implementation
     •   TrainingSet size (e.g. 70%)
     •   % of the data to use in the evaluation (smaller % for
         fast prototyping)
Evaluation
• IR-based Measures
  – Class: GenericRecommenderIRStatsEvaluator
  – Method: evaluate()
  – Parameters:
    •   Recommender implementation
    •   DataModel implementation
    •   Relevance Threshold (mean+standard deviation)
    •   % of the data to use in the evaluation (smaller % for
        fast prototyping)
Part 2



How to use Mahout?
Download Mahout
• Official Release
  – The latest Mahout release is 0.7
  – Available at:
    http://www.apache.org/dyn/closer.cgi/mahout


• Java 1.6.x or greater.
• Hadoop is not mandatory!
Example 1: preferences
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;

class CreatePreferenceArray {

    private CreatePreferenceArray() {
    }

    public static void main(String[] args) {
      PreferenceArray user1Prefs = new GenericUserPreferenceArray(2);

        user1Prefs.setUserID(0, 1L);

        user1Prefs.setItemID(0, 101L);
        user1Prefs.setValue(0, 2.0f);

        user1Prefs.setItemID(1, 102L);
        user1Prefs.setValue(1, 3.0f);

        Preference pref = user1Prefs.get(1);
        System.out.println(pref);
    }
}
Example 1: preferences
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;

class CreatePreferenceArray {

    private CreatePreferenceArray() {
    }

    public static void main(String[] args) {
      PreferenceArray user1Prefs = new GenericUserPreferenceArray(2);

        user1Prefs.setUserID(0, 1L);

        user1Prefs.setItemID(0, 101L);
        user1Prefs.setValue(0, 2.0f);              Score 2 for Item 101

        user1Prefs.setItemID(1, 102L);
        user1Prefs.setValue(1, 3.0f);

        Preference pref = user1Prefs.get(1);
        System.out.println(pref);
    }
}
Example 2: data model

• PreferenceArray stores the preferences of a
  single user

• Where do the preferences of all the users are
  stored?
  – An HashMap? No.
  – Mahout introduces data structures optimized for
    recommendation tasks
     • HashMap are replaced by FastByIDMap
Example 2: data model
import     org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import     org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import     org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import     org.apache.mahout.cf.taste.model.DataModel;
import     org.apache.mahout.cf.taste.model.PreferenceArray;

class CreateGenericDataModel {

    private CreateGenericDataModel() {
    }

    public static void main(String[] args) {
      FastByIDMap<PreferenceArray> preferences = new FastByIDMap<PreferenceArray>();
      PreferenceArray prefsForUser1 = new GenericUserPreferenceArray(10);
      prefsForUser1.setUserID(0, 1L);
      prefsForUser1.setItemID(0, 101L);
      prefsForUser1.setValue(0, 3.0f);
      prefsForUser1.setItemID(1, 102L);
      prefsForUser1.setValue(1, 4.5f);

        preferences.put(1L, prefsForUser1);

        DataModel model = new GenericDataModel(preferences);
        System.out.println(model);
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));         FileDataModel
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);
                                                                               2 neighbours
        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }                                                                  Top-1 Recommendation
}                                                                      for User 1
Example 4: MovieLens Recommender

• Download the GroupLens dataset (100k)
  – Its format is already Mahout compliant


• Now we can run the recommendation
  framework against a state-of-the-art dataset
Example 4: MovieLens Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("ua.base"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 20);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 4: MovieLens Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("ua.base"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(10, 50);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }                                                                       We can play with
}                                                                            parameters!
Example 5: evaluation
class EvaluatorIntro {
                                              Ensures the consistency between
    private EvaluatorIntro() {
    }                                            different evaluation runs.
    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                        70%training
                                                  (evaluation on the whole
                                                          dataset)
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                           Recommendation Engine
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
        RecommenderEvaluator rmse = new RMSEEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                          We can add more measures
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
        RecommenderEvaluator rmse = new RMSEEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        double rmse = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);

        System.out.println(score);
        System.out.println(rmse);
    }
}
Example 6: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                      Precision@5 , Recall@5, etc.
}
Example 6: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                      Precision@5 , Recall@5, etc.
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Neighborhood Size
Example 6b: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(200, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                        Set Neighborhood to 200
}
Example 6b: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                        Set Neighborhood to 500
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Similarity Measure (e.g. Euclidean One)
Example 6c: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new EuclideanDistanceSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                            Set Euclidean Distance
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Recommendation Engine (e.g. SlopeOne)
Example 6d: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
          @Override
          public Recommender buildRecommender(DataModel model) throws TasteException {
                  return new SlopeOneRecommender(model);
             }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                    Change Recommendation
}                                                          Algorithm
Example 7: item-based recommender
• Mahout provides Java classes for building an
  item-based recommender system
  – Amazon-like
  – Recommendations are based on similarities
    among items (generally pre-computed offline)
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
                                                                                 ItemSimilarity
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
                                                             No Neighborhood definition for item-
                                                                    based recommenders
End. Do you want more?
Do you want more?
• Recommendation
  – Deploy of a Mahout-based Web Recommender
  – Integration with Hadoop
  – Integration of content-based information
  – Custom similarities, Custom recommenders, Re-
    scoring functions


• Classification, Clustering and Pattern Mining

Mais conteúdo relacionado

Mais procurados

Handling Imbalanced Data: SMOTE vs. Random Undersampling
Handling Imbalanced Data: SMOTE vs. Random UndersamplingHandling Imbalanced Data: SMOTE vs. Random Undersampling
Handling Imbalanced Data: SMOTE vs. Random UndersamplingIRJET Journal
 
Intro to web scraping with Python
Intro to web scraping with PythonIntro to web scraping with Python
Intro to web scraping with PythonMaris Lemba
 
Remote Method Invocation (RMI)
Remote Method Invocation (RMI)Remote Method Invocation (RMI)
Remote Method Invocation (RMI)Peter R. Egli
 
Grid search (parameter tuning)
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)Akhilesh Joshi
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDataminingTools Inc
 
Genetic algorithms in Data Mining
Genetic algorithms in Data MiningGenetic algorithms in Data Mining
Genetic algorithms in Data MiningAtul Khanna
 
2 mobile development frameworks and tools dark temp
2   mobile development frameworks and tools dark temp2   mobile development frameworks and tools dark temp
2 mobile development frameworks and tools dark tempShahid Riaz
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai UniversityMadhav Mishra
 
Data mining-primitives-languages-and-system-architectures2641
Data mining-primitives-languages-and-system-architectures2641Data mining-primitives-languages-and-system-architectures2641
Data mining-primitives-languages-and-system-architectures2641Aiswaryadevi Jaganmohan
 
Automated Machine Learning (Auto ML)
Automated Machine Learning (Auto ML)Automated Machine Learning (Auto ML)
Automated Machine Learning (Auto ML)Hayim Makabee
 
Machine learning ppt
Machine learning ppt Machine learning ppt
Machine learning ppt Poojamanic
 

Mais procurados (20)

Handling Imbalanced Data: SMOTE vs. Random Undersampling
Handling Imbalanced Data: SMOTE vs. Random UndersamplingHandling Imbalanced Data: SMOTE vs. Random Undersampling
Handling Imbalanced Data: SMOTE vs. Random Undersampling
 
Mahout
MahoutMahout
Mahout
 
Intro to web scraping with Python
Intro to web scraping with PythonIntro to web scraping with Python
Intro to web scraping with Python
 
Remote Method Invocation (RMI)
Remote Method Invocation (RMI)Remote Method Invocation (RMI)
Remote Method Invocation (RMI)
 
Amqp Basic
Amqp BasicAmqp Basic
Amqp Basic
 
Grid search (parameter tuning)
Grid search (parameter tuning)Grid search (parameter tuning)
Grid search (parameter tuning)
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 
Genetic algorithms in Data Mining
Genetic algorithms in Data MiningGenetic algorithms in Data Mining
Genetic algorithms in Data Mining
 
Java rmi
Java rmiJava rmi
Java rmi
 
2 mobile development frameworks and tools dark temp
2   mobile development frameworks and tools dark temp2   mobile development frameworks and tools dark temp
2 mobile development frameworks and tools dark temp
 
Regularization
RegularizationRegularization
Regularization
 
Elastic search mind mapping
Elastic search mind mappingElastic search mind mapping
Elastic search mind mapping
 
Grid search, pipeline, featureunion
Grid search, pipeline, featureunionGrid search, pipeline, featureunion
Grid search, pipeline, featureunion
 
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 1 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 1 Semester 3 MSc IT Part 2 Mumbai University
 
Web Scraping
Web ScrapingWeb Scraping
Web Scraping
 
Clustering in Data Mining
Clustering in Data MiningClustering in Data Mining
Clustering in Data Mining
 
Ajax presentation
Ajax presentationAjax presentation
Ajax presentation
 
Data mining-primitives-languages-and-system-architectures2641
Data mining-primitives-languages-and-system-architectures2641Data mining-primitives-languages-and-system-architectures2641
Data mining-primitives-languages-and-system-architectures2641
 
Automated Machine Learning (Auto ML)
Automated Machine Learning (Auto ML)Automated Machine Learning (Auto ML)
Automated Machine Learning (Auto ML)
 
Machine learning ppt
Machine learning ppt Machine learning ppt
Machine learning ppt
 

Destaque

Introduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache MahoutIntroduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache Mahoutsscdotopen
 
SDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whySDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whyKorea Sdec
 
Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Cataldo Musto
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutTed Dunning
 
Machine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionMachine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionVarad Meru
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningVarad Meru
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle CompetitionsDataRobot
 

Destaque (8)

Introduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache MahoutIntroduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache Mahout
 
SDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whySDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the why
 
Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)
 
Intro to Apache Mahout
Intro to Apache MahoutIntro to Apache Mahout
Intro to Apache Mahout
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache Mahout
 
Machine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionMachine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An Introduction
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions
 

Semelhante a Tutorial Mahout - Recommendation

Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender systemKaren Li
 
Sparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSpark Summit
 
Sparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsSparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsMaya Hristakeva
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engineKeeyong Han
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
Beyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignBeyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignData Science London
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
 
Teacher training material
Teacher training materialTeacher training material
Teacher training materialVikram Parmar
 
Apache Mahout
Apache MahoutApache Mahout
Apache MahoutAjit Koti
 
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Sonya Liberman
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsCollaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsNavisro Analytics
 
IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.ASHISH JAGTAP
 
Running with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightRunning with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightChris Price
 
SDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutSDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutKorea Sdec
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Lucidworks
 

Semelhante a Tutorial Mahout - Recommendation (20)

Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender system
 
Sparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya Hristakeva
 
Sparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsSparking Science up with Research Recommendations
Sparking Science up with Research Recommendations
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engine
 
Machine Learning & Apache Mahout
Machine Learning & Apache MahoutMachine Learning & Apache Mahout
Machine Learning & Apache Mahout
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Beyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignBeyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems Design
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Teacher training material
Teacher training materialTeacher training material
Teacher training material
 
Apache Mahout
Apache MahoutApache Mahout
Apache Mahout
 
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsCollaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro Analytics
 
IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.
 
Running with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightRunning with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsight
 
SDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutSDEC2011 Essentials of Mahout
SDEC2011 Essentials of Mahout
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 

Mais de Cataldo Musto

MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...Cataldo Musto
 
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationFairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationCataldo Musto
 
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Cataldo Musto
 
Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Cataldo Musto
 
Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Cataldo Musto
 
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Cataldo Musto
 
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Cataldo Musto
 
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsHybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsCataldo Musto
 
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Cataldo Musto
 
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeL'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeCataldo Musto
 
Explanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemExplanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemCataldo Musto
 
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Cataldo Musto
 
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...Cataldo Musto
 
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfMyrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfCataldo Musto
 
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Cataldo Musto
 
Holistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesHolistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesCataldo Musto
 
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsA Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsCataldo Musto
 
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?Cataldo Musto
 
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Cataldo Musto
 
Il Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkIl Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkCataldo Musto
 

Mais de Cataldo Musto (20)

MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
 
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationFairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
 
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
 
Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...
 
Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...
 
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
 
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
 
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsHybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
 
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
 
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeL'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
 
Explanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemExplanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender System
 
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
 
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
 
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfMyrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
 
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
 
Holistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesHolistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart Cities
 
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsA Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
 
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
 
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
 
Il Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkIl Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social Network
 

Último

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 

Último (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 

Tutorial Mahout - Recommendation

  • 1. Apache Mahout - Tutorial Cataldo Musto, Ph.D. Corso di Accesso Intelligente all’Informazione ed Elaborazione del Linguaggio Naturale Università degli Studi di Bari – Dipartimento di Informatica – A.A. 2012/2013 11/01/2013
  • 2. Outline • What is Mahout ? – Overview • How to use Mahout ? – Java snippets
  • 3. Part 1 What is Mahout?
  • 4. Goal • Mahout is a Java library – Implementing Machine Learning techniques
  • 5. Goal • Mahout is a Java library – Implementing Machine Learning techniques • Clustering • Classification • Recommendation • Frequent ItemSet
  • 6. Goal • Mahout is a Java library – Implementing Machine Learning technique • Clustering • Classification • Recommendation • Frequent ItemSet – Scalable
  • 7. What can we do? • Currently Mahout supports mainly four use cases: – Recommendation - takes users' behavior and from that tries to find items users might like. – Clustering - takes e.g. text documents and groups them into groups of topically related documents. – Classification - learns from exisiting categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. – Frequent itemset mining - takes a set of item groups (terms in a query session, shopping cart content) and identifies, which individual items usually appear together.
  • 8. Algorithms • Recommendation – User-based Collaborative Filtering – Item-based Collaborative Filering – SlopeOne Recommenders – Singular Value Decomposition
  • 9. Algorithms • Clustering – Canopy – K-Means – Fuzzy K-Means – Hierarchical Clustering – Minhash Clustering (and much more…)
  • 10. Algorithms • Classification – Logistic Regression – Bayes – Support Vector Machines – Perceptrons – Neural Networks – Restricted Boltzmann Machines – Hidden Markov Models (and much more…)
  • 11. Mahout in the Apache Software Foundation
  • 12. Focus • In this tutorial we will focus just on: – Recommendation
  • 13. Recommendation • Mahout implements a Collaborative Filtering framework – Popularized by Amazon and others – Uses hystorical data (ratings, clicks, and purchases) to provide recommendations • User-based: Recommend items by finding similar users. This is often harder to scale because of the dynamic nature of users. • Item-based: Calculate similarity between items and make recommendations. Items usually don't change much, so this often can be computed offline. • Slope-One: A very fast and simple item-based recommendation approach applicable when users have given ratings (and not just boolean preferences).
  • 15. Recommendation - Architecture Physical Storage
  • 16. Recommendation - Architecture Data Model Physical Storage
  • 17. Recommendation - Architecture Recommender Data Model Physical Storage
  • 18. Recommendation - Architecture External Application Recommender Data Model Physical Storage
  • 19. Recommendation in Mahout • Input: raw data (user preferences) • Output: preferences estimation • Step 1 – Mapping raw data into a DataModel Mahout- compliant • Step 2 – Tuning recommender components • Similarity measure, neighborhood, etc.
  • 20. Recommendation Components • Mahout implements interfaces to these key abstractions: – DataModel • Methods for mapping raw data to a Mahout-compliant form – UserSimilarity • Methods to calculate the degree of correlation between two users – ItemSimilarity • Methods to calculate the degree of correlation between two items – UserNeighborhood • Methods to define the concept of ‘neighborhood’ – Recommender • Methods to implement the recommendation step itself
  • 21. Components: DataModel • A DataModel is the interface to draw information about user preferences. • Which sources is it possible to draw? – Database • MySQLJDBCDataModel – External Files • FileDataModel – Generic (preferences directly feed through Java code) • GenericDataModel
  • 22. Components: DataModel • GenericDataModel – Feed through Java calls • FileDataModel – CSV (Comma Separated Values) • JDBCDataModel – JDBC Driver – Standard database structure
  • 24. Components: DataModel • Regardless the source, they all share a common implementation. • Basic object: Preference – Preference is a triple (user,item,score) – Stored in UserPreferenceArray
  • 25. Components: DataModel • Basic object: Preference – Preference is a triple (user,item,score) – Stored in UserPreferenceArray • Two implementations – GenericUserPreferenceArray • It stores numerical preference, as well. – BooleanUserPreferenceArray • It skips numerical preference values.
  • 26. Components: UserSimilarity • UserSimilarity defines a notion of similarity between two Users. – (respectively) ItemSimilarity defines a notion of similarity between two Items. • Which definition of similarity are available? – Pearson Correlation – Spearman Correlation – Euclidean Distance – Tanimoto Coefficient – LogLikelihood Similarity – Already implemented!
  • 28. Components: UserNeighborhood • UserSimilarity defines a notion of similarity between two Users. – (respectively) ItemSimilarity defines a notion of similarity between two Items. • Which definition of neighborhood are available? – Nearest N users • The first N users with the highest similarity are labeled as ‘neighbors’ – Tresholds • Users whose similarity is above a threshold are labeled as ‘neighbors’ – Already implemented!
  • 29. Components: Recommender • Given a DataModel, a definition of similarity between users (items) and a definition of neighborhood, a recommender produces as output an estimation of relevance for each unseen item • Which recommendation algorithms are implemented? – User-based CF – Item-based CF – SVD-based CF – SlopeOne CF (and much more…)
  • 31. Recap • Hundreds of possible implementations of a CF-based recommender! – 6 different recommendation algorithms – 2 different neighborhood definitions – 5 different similarity definitions • Evaluation fo the different implementations is actually very time-consuming – The strength of Mahout lies in that it is possible to save time in the evaluation of the different combinations of the parameters! – Standard interface for the evaluation of a Recommender System
  • 32. Evaluation • Mahout provides classes for the evaluation of a recommender system – Prediction-based measures • Mean Average Error • RMSE (Root Mean Square Error) – IR-based measures • Precision • Recall • F1-measure • F1@n • NDCG (ranking measure)
  • 33. Evaluation • Prediction-based Measures – Class: AverageAbsoluteDifferenceEvaluator – Method: evaluate() – Parameters: • Recommender implementation • DataModel implementation • TrainingSet size (e.g. 70%) • % of the data to use in the evaluation (smaller % for fast prototyping)
  • 34. Evaluation • IR-based Measures – Class: GenericRecommenderIRStatsEvaluator – Method: evaluate() – Parameters: • Recommender implementation • DataModel implementation • Relevance Threshold (mean+standard deviation) • % of the data to use in the evaluation (smaller % for fast prototyping)
  • 35. Part 2 How to use Mahout?
  • 36. Download Mahout • Official Release – The latest Mahout release is 0.7 – Available at: http://www.apache.org/dyn/closer.cgi/mahout • Java 1.6.x or greater. • Hadoop is not mandatory!
  • 37. Example 1: preferences import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.Preference; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreatePreferenceArray { private CreatePreferenceArray() { } public static void main(String[] args) { PreferenceArray user1Prefs = new GenericUserPreferenceArray(2); user1Prefs.setUserID(0, 1L); user1Prefs.setItemID(0, 101L); user1Prefs.setValue(0, 2.0f); user1Prefs.setItemID(1, 102L); user1Prefs.setValue(1, 3.0f); Preference pref = user1Prefs.get(1); System.out.println(pref); } }
  • 38. Example 1: preferences import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.Preference; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreatePreferenceArray { private CreatePreferenceArray() { } public static void main(String[] args) { PreferenceArray user1Prefs = new GenericUserPreferenceArray(2); user1Prefs.setUserID(0, 1L); user1Prefs.setItemID(0, 101L); user1Prefs.setValue(0, 2.0f); Score 2 for Item 101 user1Prefs.setItemID(1, 102L); user1Prefs.setValue(1, 3.0f); Preference pref = user1Prefs.get(1); System.out.println(pref); } }
  • 39. Example 2: data model • PreferenceArray stores the preferences of a single user • Where do the preferences of all the users are stored? – An HashMap? No. – Mahout introduces data structures optimized for recommendation tasks • HashMap are replaced by FastByIDMap
  • 40. Example 2: data model import org.apache.mahout.cf.taste.impl.common.FastByIDMap; import org.apache.mahout.cf.taste.impl.model.GenericDataModel; import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.DataModel; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreateGenericDataModel { private CreateGenericDataModel() { } public static void main(String[] args) { FastByIDMap<PreferenceArray> preferences = new FastByIDMap<PreferenceArray>(); PreferenceArray prefsForUser1 = new GenericUserPreferenceArray(10); prefsForUser1.setUserID(0, 1L); prefsForUser1.setItemID(0, 101L); prefsForUser1.setValue(0, 3.0f); prefsForUser1.setItemID(1, 102L); prefsForUser1.setValue(1, 4.5f); preferences.put(1L, prefsForUser1); DataModel model = new GenericDataModel(preferences); System.out.println(model); } }
  • 41. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 42. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); FileDataModel UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 43. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); 2 neighbours List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 44. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } Top-1 Recommendation } for User 1
  • 45. Example 4: MovieLens Recommender • Download the GroupLens dataset (100k) – Its format is already Mahout compliant • Now we can run the recommendation framework against a state-of-the-art dataset
  • 46. Example 4: MovieLens Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("ua.base")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 20); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 47. Example 4: MovieLens Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("ua.base")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(10, 50); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } We can play with } parameters!
  • 48. Example 5: evaluation class EvaluatorIntro { Ensures the consistency between private EvaluatorIntro() { } different evaluation runs. public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } }
  • 49. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } }
  • 50. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } 70%training (evaluation on the whole dataset)
  • 51. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } Recommendation Engine
  • 52. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); RecommenderEvaluator rmse = new RMSEEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } We can add more measures
  • 53. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); RecommenderEvaluator rmse = new RMSEEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); double rmse = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); System.out.println(rmse); } }
  • 54. Example 6: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Precision@5 , Recall@5, etc. }
  • 55. Example 6: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Precision@5 , Recall@5, etc. }
  • 56. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Neighborhood Size
  • 57. Example 6b: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(200, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Neighborhood to 200 }
  • 58. Example 6b: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Neighborhood to 500 }
  • 59. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Similarity Measure (e.g. Euclidean One)
  • 60. Example 6c: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new EuclideanDistanceSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Euclidean Distance }
  • 61. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Recommendation Engine (e.g. SlopeOne)
  • 62. Example 6d: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { return new SlopeOneRecommender(model); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Change Recommendation } Algorithm
  • 63. Example 7: item-based recommender • Mahout provides Java classes for building an item-based recommender system – Amazon-like – Recommendations are based on similarities among items (generally pre-computed offline)
  • 64. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } }
  • 65. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } } ItemSimilarity
  • 66. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } } No Neighborhood definition for item- based recommenders
  • 67. End. Do you want more?
  • 68. Do you want more? • Recommendation – Deploy of a Mahout-based Web Recommender – Integration with Hadoop – Integration of content-based information – Custom similarities, Custom recommenders, Re- scoring functions • Classification, Clustering and Pattern Mining