Exploration / Exploitation Trade-off in Mobile Context-Aware Recommender Systems

            Djallel Bouneffouf, Amel Bouzeghoub & Alda Lopes Gançarski

     Department of Computer Science, Télécom SudParis, UMR CNRS Samovar, 91011 Evry Cedex, France

{Djallel.Bouneffouf, Amel.Bouzeghoub, Alda.Gancarski}@it-sudparis.eu



        Abstract. Most existing approaches in Mobile Context-Aware Recommender
        Systems focus on recommending relevant items to users taking into account
        contextual information such as time, location, or social aspects. However, none
        of them has considered the problem of the dynamicity of the user's content. In
        this paper, we introduce an algorithm that tackles this dynamicity. It is based
        on dynamic exploration/exploitation and adaptively balances the two aspects by
        automatically learning the optimal tradeoff. We also include a metric to decide
        which user situations are most suited to exploration or exploitation. Within a
        deliberately designed offline simulation framework, we conduct evaluations with
        real online event log data. The experimental results demonstrate that our
        algorithm outperforms the surveyed algorithms.

        Keywords: recommender system; machine learning; exploration/exploitation
        dilemma


1       Introduction

Mobile technologies have made a huge collection of information accessible anywhere
and anytime. In particular, most professional mobile users acquire and maintain a
large amount of content in their repositories. Moreover, the content of such a
repository changes dynamically and undergoes frequent updates. Recommender systems
must therefore promptly identify the importance of new documents while adapting to
the fading value of old ones. In such a setting, it is crucial to identify and
recommend interesting content to users.
A considerable amount of research has been done on recommending interesting content
to mobile users. Earlier techniques in Context-Aware Recommender Systems (CRS)
[3, 6, 12, 5, 22, 27, 26] are based solely on the computational behavior of the
user to model his interests with respect to his surrounding environment, such as
location, time and nearby people (the user's situation). The main limitation of
such approaches is that they do not take into account the dynamicity of the user's
content. This gives rise to another category of recommendation techniques that try
to tackle this limitation by using collaborative, content-based or hybrid filtering.
Collaborative filtering, which finds similarities through the users' histories,
gives interesting recommendations only if the overlap between users' histories is
high and the user's content is static [18]. Content-based filtering identifies new
documents that match an existing user's profile; however, the recommended documents
are always similar to the documents previously selected by the user [15]. Hybrid
approaches combine these two techniques, so that the inability of collaborative
filtering to recommend new documents is reduced by content-based filtering [13].
However, the user's content on mobile devices undergoes frequent changes. These
issues make content-based and collaborative filtering approaches difficult to
apply [8].
Bandit algorithms are a well-known solution that has the advantage of following the
evolution of the user's content; they address this problem as the need to balance
the exploration/exploitation (exr/exp) tradeoff. A bandit algorithm B exploits its
past experience to select the documents that appear to be optimal. However, these
seemingly optimal documents may in fact be suboptimal, because of imprecision in
B's knowledge. In order to avoid this undesired situation, B has to explore by
choosing seemingly suboptimal documents so as to gather more information about
them. Exploration can decrease short-term user satisfaction, since some suboptimal
documents may be chosen. However, obtaining information about the documents'
average rewards (i.e., exploration) refines B's estimates of the documents' rewards
and in turn increases long-term user satisfaction. Clearly, neither a purely
exploring nor a purely exploiting algorithm works well, and a good tradeoff is
needed. One classical solution to the multi-armed bandit problem is the ε-greedy
strategy [9]. With probability 1−ε, this algorithm chooses the best document based
on current knowledge; with probability ε, it uniformly chooses any other document.
The ε parameter essentially controls the tradeoff between exploitation and
exploration. One drawback of this algorithm is that it is difficult to decide the
optimal value of ε in advance. Instead, we introduce an algorithm named
contextual-ε-greedy that involves two main ideas to balance the exr/exp tradeoff
adaptively. First, we extend the ε-greedy strategy by updating the ε value
dynamically, performing an exr/exp tradeoff over a finite set of ε candidates.
Second, we take into account the user's situation to define the exploration value,
and select suitable situations for either exploration or exploitation.
The rest of the paper is organized as follows. Section 2 gives the key notions used
throughout this paper. Section 3 reviews some related works. Section 4 presents our
MCRS model and Section 5 describes the algorithms involved in the proposed ap-
proach. The experimental evaluation is illustrated in Section 6. The last section con-
cludes the paper and points out possible directions for future work.


2      Key Notions

In this section, we sketch briefly the key notions that will be of use in the remainder
of this paper.
    The user's model: The user's model is structured as a case base, composed of a
set of situations with their corresponding user's preferences, denoted U =
{(Si, UPi)}, where Si is a user's situation and UPi its corresponding user's
preferences.
    The user's preferences: The user's preferences are deduced from the user's
navigation activities. A navigation activity expresses the following sequence of
events: (i) the user logs in to the system and navigates across documents to get
the desired information; (ii) the user expresses his preferences on the visited
documents. We assume that a preference is information extracted from the user's
interaction with the system, for example the number of clicks on the visited
documents or the time spent on a document. Let UP be the preferences submitted by a
specific user in the system in a given situation. Each document in UP is
represented as a single vector d = (c1, …, cn), where ci (i = 1, …, n) is the value
of a component characterizing the preferences for d. We consider the following
components: the total number of clicks on d, the total time spent reading d, and
the number of times d was recommended.
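As a concrete illustration, one possible encoding of a UP entry with the three components above could look like the following sketch; the class and field names are ours, not prescribed by the paper.

```python
from dataclasses import dataclass

# Hypothetical sketch of one document entry in UP. The three components
# follow the list above: total clicks, total reading time, and the number
# of times the document was recommended.
@dataclass
class DocPreference:
    doc_id: str
    clicks: int        # total number of clicks on d
    time_spent: float  # total time spent reading d (e.g., in minutes)
    shown: int         # number of times d was recommended

    def vector(self):
        """Return d = (c1, ..., cn) as a plain tuple."""
        return (self.clicks, self.time_spent, self.shown)

d = DocPreference("doc42", clicks=4, time_spent=3.0, shown=9)
print(d.vector())  # (4, 3.0, 9)
```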
    Context: A user's context C is a multi-ontology representation where each
ontology corresponds to a context dimension: C = (OLocation, OTime, OSocial). Each
dimension models and manages one type of context information. We focus on these
three dimensions since they cover all the needed information. These ontologies are
described in [1, 24, 25] and are not developed further in this paper.
    Situation: A situation is an instantiation of the user's context. We consider a
situation as a triple S = (OLocation.xi, OTime.xj, OSocial.xk), where xi, xj and xk
are ontology concepts or instances. Suppose the following data are sensed from the
user's mobile phone: the GPS shows the latitude and longitude of a point,
"48.8925349, 2.2367939"; the local time is "Mon Oct 3 12:10:00 2011"; and the
calendar states "meeting with Paul Gerard". The corresponding situation is:
S = (OLocation."48.89, 2.23", OTime."Mon_Oct_3_12:10:00_2011", OSocial."Paul_Gerard").
To build a more abstract situation, we interpret the user's behavior from this
low-level multimodal sensor data using ontology reasoning. For example, from S we
obtain the following situation:
MeetingAtRestaurant = (OLocation.Restaurant, OTime.Work_day, OSocial.Financial_client).
For simplicity, in the rest of the paper we adopt the notation S = (xi, xj, xk).
The previous example situation thus becomes:
MeetingAtRestaurant = (Restaurant, Work_day, Financial_client).
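The abstraction step described above can be sketched as follows; the lookup tables standing in for the ontology reasoning are hypothetical stand-ins, not the ontologies of [1, 24, 25].

```python
from collections import namedtuple

# Minimal sketch: a situation as a (location, time, social) triple, and an
# illustrative abstraction step from raw sensed values to ontology concepts.
# The dictionaries are hypothetical stand-ins for the ontology reasoning.
Situation = namedtuple("Situation", ["location", "time", "social"])

LOCATION_CONCEPTS = {"48.89, 2.23": "Restaurant"}          # GPS point -> place concept
TIME_CONCEPTS = {"Mon_Oct_3_12:10:00_2011": "Work_day"}    # timestamp -> period concept
SOCIAL_CONCEPTS = {"Paul_Gerard": "Financial_client"}      # contact -> role concept

def abstract(raw: Situation) -> Situation:
    # Unknown raw values are kept as-is.
    return Situation(
        LOCATION_CONCEPTS.get(raw.location, raw.location),
        TIME_CONCEPTS.get(raw.time, raw.time),
        SOCIAL_CONCEPTS.get(raw.social, raw.social),
    )

raw = Situation("48.89, 2.23", "Mon_Oct_3_12:10:00_2011", "Paul_Gerard")
print(abstract(raw))  # Situation(location='Restaurant', time='Work_day', social='Financial_client')
```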
Among the set of captured situations, some are characterized as high-level critical
situations.
    High-Level Critical Situation (HLCS): A HLCS is a situation in which the user
needs the best information the system can recommend, for instance when the user is
in a professional meeting. In such a situation, the system should not recommend
information that does not consider the user's preferences; in other words, it must
perform exclusively exploitation rather than exploration-oriented learning. In
other cases, for instance when the user is using his information system at home or
at night on vacation, S = (home, vacation, friends), the system can perform some
exploration by recommending information that ignores the user's interests. The
HLCS situations are, for the moment, predefined by a domain expert. In our case, we
conduct the study with professional mobile users (described in detail in Section
6); their HLCS are, for example, S1 = (company, Monday morning, colleague),
S2 = (restaurant, midday, client) or S3 = (company, morning, manager).
3      Related Work

In the following, we review recent recommendation techniques that tackle both
dynamic exr/exp (bandit algorithms) and consideration of the user's situation in
recommendation.


3.1    Bandit Algorithms Overview (ε-greedy)
The exr/exp tradeoff was first studied in reinforcement learning in the 1980s, and
later flourished in other fields of machine learning [16, 19]. Very frequently used
in reinforcement learning to study this tradeoff, the multi-armed bandit problem
was originally described by Robbins [16]. ε-greedy is one of the most widely used
strategies to solve the bandit problem and was first described in [20]. The
ε-greedy strategy chooses a random document with frequency ε, and otherwise chooses
the document with the highest estimated mean, where the estimate is based on the
rewards observed so far. ε must be in the interval ]0, 1] and its choice is left to
the user.
    The first variant of the ε-greedy strategy is what [9, 14] refer to as the
ε-beginning strategy. This strategy performs all the exploration at once, at the
beginning. For a given number I ∈ N of iterations, documents are randomly pulled
during the εI first iterations; during the remaining (1−ε)I iterations, the
document with the highest estimated mean is pulled. Another variant of the ε-greedy
strategy is what Cesa-Bianchi and Fischer [14] call the ε-decreasing strategy. In
this strategy, the document with the highest estimated mean is always pulled,
except that a random document is pulled instead with frequency εi, where i is the
index of the current round. The value of the decreasing εi is given by εi = ε0/i,
where ε0 ∈ ]0, 1]. Besides ε-decreasing, four other strategies are presented in
[4]. We do not describe them here because the experiments in [4] seem to show that,
with carefully chosen parameters, ε-decreasing is always as good as the other
strategies. Compared to the standard multi-armed bandit problem with a fixed set of
possible actions, in CRS old documents may expire and new documents may frequently
emerge. Therefore, it may not be desirable to perform all the exploration at the
beginning, as in [9], or to monotonically decrease the exploration effort, as in
the decreasing strategy of [14].
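For illustration, the two variants can be sketched as follows. This is a minimal sketch: `means` is an assumed list of the current estimated mean rewards of the documents (maintained elsewhere), and the function names are ours.

```python
import random

# Sketch of the two ε-greedy variants discussed above.
# epsilon_beginning explores uniformly during the first ε·I iterations;
# epsilon_decreasing uses ε_i = ε0 / i at round i.
def epsilon_beginning(means, i, total_iters, eps=0.2):
    if i < eps * total_iters:                       # exploration phase
        return random.randrange(len(means))
    return max(range(len(means)), key=means.__getitem__)   # then pure greedy

def epsilon_decreasing(means, i, eps0=1.0):
    eps_i = eps0 / i                                # decreasing exploration rate
    if random.random() < eps_i:
        return random.randrange(len(means))         # explore
    return max(range(len(means)), key=means.__getitem__)   # exploit
```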
    Few research works are dedicated to studying the contextual bandit problem in
recommender systems, where the user's behavior is considered as the context of the
bandit problem. In [10], the authors extend the ε-greedy strategy by updating the
exploration value ε dynamically. At each iteration, they run a sampling procedure
to select a new ε from a finite set of candidates. The probabilities associated
with the candidates are uniformly initialized and updated with the Exponentiated
Gradient (EG) method [10]. This updating rule increases the probability of a
candidate ε if it leads to a user click. Compared to both the ε-beginning and
ε-decreasing strategies, this technique improves the results. In [13], the authors
model recommendation as a contextual bandit problem. They propose an approach in
which a learning algorithm sequentially selects documents to serve users based on
contextual information about the users and the documents. To maximize the total
number of user clicks, this work proposes the LinUCB algorithm, which is
computationally efficient.
    The authors of [4, 9, 13, 14, 21] describe smart ways to balance exploration
and exploitation; however, none of them considers the user's situation during
recommendation.
3.2    Managing the User’s Situation
Few research works are dedicated to managing the user's situation in
recommendation. In [7, 17], the authors propose a method that consists of building
a dynamic user profile based on time and the user's experience. The user's
preferences in the profile are weighted according to the situation (time, location)
and the user's behavior. To model the evolution of the user's preferences according
to his temporal situation in different periods (like workdays or vacations), the
weighted associations of the concepts in the user's profile are established for
every new experience of the user. The user's activity combined with the user's
profile is then used to filter and recommend relevant content. Another work [12]
describes a CRS operating on three dimensions of context that complement each other
to produce highly targeted recommendations. First, the CRS analyzes information
such as clients' address books to estimate the level of social affinity among the
users. Second, it combines social affinity with the spatiotemporal dimensions and
the user's history in order to improve the quality of the recommendations. In [3],
the authors present a technique to perform user-based collaborative filtering. Each
user's mobile device stores all explicit ratings made by its owner as well as
ratings received from other users; only users in spatiotemporal proximity are able
to exchange ratings, and the authors show how this provides a natural filtering
based on social contexts.
    Each work cited above tries to recommend interesting information to users based
on their contextual situation; however, they do not consider the evolution of the
user's content. As shown above, none of the mentioned works tackles both the
problem of exr/exp dynamicity and the consideration of the user's situation in the
exr/exp strategy. This is precisely what we intend to do with our approach,
exploiting the following new features: (1) we propose a calibration of the exr/exp
tradeoff, which consists in updating the value of ε dynamically: at each iteration,
we run the ε-decreasing algorithm to select a new ε from a finite set of
candidates, and the weight associated with a candidate increases if it leads to a
user click; (2) our intuition is that considering whether the situation is an HLCS
when managing the exr/exp tradeoff improves the long-term rewards of the
recommender system, which, as far as we know, is not considered in the surveyed
approaches.


4      MCRS Model

In our recommender system, the recommendation of documents is modeled as a con-
textual bandit problem including the user’s situation information. Formally, a bandit
algorithm proceeds in discrete trials t = 1…T. For each trial t, the algorithm performs
the following tasks:
    Task 1: Let St be the current user’s situation, and PS the set of past situations. The
system compares St with the situations in PS in order to choose the most similar Sp
using the following semantic similarity metric:
          Sp = argmax_{Sc ∈ PS} Σj αj · simj(xjt, xjc)        (1)
    In Eq. 1, simj is the similarity metric related to dimension j between two
concepts xjt and xjc, and αj is the weight associated with dimension j. This
similarity depends on how closely xjt and xjc are related in the corresponding
ontology (location, time or social). We use the same similarity measure as [24],
defined by Eq. 2:
          simj(xjt, xjc) = 2 · depth(LCS) / (depth(xjt) + depth(xjc))        (2)
In Eq. 2, LCS is the Least Common Subsumer of xjt and xjc, and depth(x) is the
number of nodes on the path from x to the ontology root.
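A minimal sketch of Eq. 2 on a toy tree-shaped concept hierarchy; the hierarchy here is a hypothetical example, not the ontology of [24].

```python
# Toy location ontology encoded as a child -> parent map (a tree rooted at
# "Place"). depth() counts the nodes from a concept up to the root, and the
# LCS is the deepest common ancestor of the two concepts.
PARENT = {"Restaurant": "Building", "Office": "Building", "Building": "Place"}

def path_to_root(c):
    path = [c]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path  # ancestors ordered from c (deepest) up to the root

def depth(c):
    return len(path_to_root(c))

def sim(x_t, x_c):
    # In a tree, the first ancestor of x_t that is also an ancestor of x_c
    # is the Least Common Subsumer.
    anc_c = set(path_to_root(x_c))
    lcs = next(a for a in path_to_root(x_t) if a in anc_c)
    return 2 * depth(lcs) / (depth(x_t) + depth(x_c))

print(sim("Restaurant", "Office"))  # 2·2/(3+3) = 0.6666666666666666
```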
    Task 2: Let D be the document collection and Dp ⊆ D the set of documents
recommended in situation Sp. After retrieving Sp, the system observes the user's
behavior when reading each document dt ∈ Dp. Based on the observed rewards, the
algorithm chooses the document dp with the greatest reward rt.
    Task 3: The algorithm improves its arm-selection strategy with the new
observation: in situation St, document dp obtained reward rt.
    In the field of document recommendation, when a document is presented to the
user and the user selects it with a click, a reward of 1 is incurred; otherwise,
the reward is 0. With this definition, the expected reward of a document is
precisely its Click-Through Rate (CTR): the average number of clicks per
recommendation, computed by dividing the total number of clicks on the document by
the number of times it was recommended.
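As a quick sketch of this definition:

```python
# CTR as defined above: total clicks divided by the number of times the
# document was recommended. An unseen document has no estimate yet; we
# return 0.0 in that case (a convention of this sketch, not of the paper).
def ctr(total_clicks: int, times_recommended: int) -> float:
    if times_recommended == 0:
        return 0.0
    return total_clicks / times_recommended

print(ctr(8, 40))  # 0.2
```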


5      Exploration/Exploitation Tradeoff
   In this section, we first present the ε-greedy algorithm (Section 5.1) and our
proposed exr/exp-tradeoff algorithm named decreasing-ε-greedy (Section 5.2). In
Section 5.3, we describe the contextual-ε-greedy algorithm, which extends
decreasing-ε-greedy in order to take HLCS situations into account.


5.1    The ε-greedy() Algorithm
    The ε-greedy strategy is sketched in Algorithm 1. For a given user's situation,
the algorithm recommends a predefined number of documents, specified by the
parameter N. In this algorithm, UP = {d1, …, dP} is the set of documents
corresponding to the current user's preferences; D = {d1, …, dN} is the set of
documents to recommend; getCTR is the function that estimates the CTR of a given
document; Random is the function returning a random element from a given set; q is
a random value uniformly distributed over [0, 1] that decides between exploration
and exploitation; and ε is the probability of recommending a random (exploratory)
document.
Algorithm 1: The ε-greedy() algorithm
Input: ε, UP, D = Ø, N    Output: D
For i = 1 to N do
      q = Random([0, 1])
      di = argmaxd∈UP(getCTR(d))    if q > ε
      di = Random(UP)               otherwise
      D = D ∪ {di}
Endfor
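A minimal Python sketch of Algorithm 1 (an assumption-laden illustration: we require N ≤ |UP|, and we remove each chosen document from the candidate set so the greedy branch does not return the same document N times, which the set union D = D ∪ {di} implies).

```python
import random

# Sketch of Algorithm 1. `get_ctr` stands in for getCTR, and `up` is the
# list of candidate documents (the user's preferences UP).
def epsilon_greedy(eps, up, n, get_ctr):
    d_out = []
    candidates = list(up)
    for _ in range(n):
        q = random.random()                      # uniform over [0, 1]
        if q > eps:
            d_i = max(candidates, key=get_ctr)   # exploit: best estimated CTR
        else:
            d_i = random.choice(candidates)      # explore: random document
        d_out.append(d_i)
        candidates.remove(d_i)                   # no duplicate recommendations
    return d_out
```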


5.2    The decreasing-ε-greedy
To improve the exploitation of the ε-greedy algorithm, we propose to extend it by
iteratively learning the optimal tradeoff value ε with the decreasing-ε-greedy
method (Alg. 2). First, we assume a finite number of candidate values for ε,
denoted Hε = (ε1, …, εT). Our goal is to select the optimal ε from Hε. To this end,
we apply the ε-decreasing strategy to propose an εi, and we use a set of weights
w = (w1, …, wT) to keep track of the feedback for each εi: wi is increased when we
receive li clicks from the user while using εi.
Algorithm 2: decreasing-ε-greedy()
Input: Hε, N, St, UP, n    Output: D
εε = Ø; τ = 0; wi = 1, i = 1, …, T
For t = 1 to n do
      τ = 0.01 · t
      q = Random([0, 1])
      εi = argmaxi(wi)         if q ≤ τ
      εi = Random(Hε − εε)     otherwise
      D = ε-greedy(εi, UP, D = Ø, N)
      Receive click feedback li from the user
      wi = wi + li; εε = εε ∪ {εi}
Endfor
    In Alg. 2, εε is the set of ε values that have previously been selected, n is
the number of iterations of the learning algorithm, i is the identifier of ε, and
τ is the probability of proposing argmaxi(wi); this parameter starts at a low value
and iteratively increases until the end of the learning (τ = 1).
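A Python sketch of Algorithm 2. Here `run_epsilon_greedy` is an assumed callback that performs one ε-greedy recommendation round with the given ε and returns the observed click count li; we also fall back to the full candidate set once every ε has been tried, a case Alg. 2 leaves unspecified.

```python
import random

# Sketch of Algorithm 2: learn which candidate epsilon earns the most clicks.
def decreasing_epsilon_greedy(h_eps, n_rounds, run_epsilon_greedy):
    w = [1.0] * len(h_eps)           # one weight per candidate epsilon
    used = set()                     # indices already tried (the set εε)
    for t in range(1, n_rounds + 1):
        tau = 0.01 * t               # probability of proposing argmax_i(w_i)
        q = random.random()
        if q <= tau:
            i = max(range(len(h_eps)), key=w.__getitem__)
        else:
            # Random(Hε − εε); fall back to all candidates once all are tried.
            fresh = [j for j in range(len(h_eps)) if j not in used] or list(range(len(h_eps)))
            i = random.choice(fresh)
        clicks = run_epsilon_greedy(h_eps[i])   # l_i: feedback for this epsilon
        w[i] += clicks
        used.add(i)
    return h_eps[max(range(len(h_eps)), key=w.__getitem__)]   # learned epsilon
```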


5.3    Contextual-ε-greedy
    To improve the adaptation of the ε-greedy algorithm to HLCS situations, we
compare the current user's situation St with the HLCS class of situations.
Depending on the similarity between St and its most similar situation Sv ∈ HLCS,
two scenarios are possible: if sim(St, Sv) ≥ B, where B is a similarity threshold
and sim(St, Sv) = Σj αj · simj(xjt, xjv) (Eq. 1), the system recommends documents
with a greedy strategy and inserts St into the HLCS class of situations; otherwise,
the system uses the decreasing-ε-greedy() method described in Alg. 2. Alg. 3
summarizes the functional steps of the contextual-ε-greedy() method.
Algorithm 3: contextual-ε-greedy()
Input: HLCS, N, n, St, Hε, UP, D = Ø    Output: D
Sv = argmaxSc∈HLCS Σj αj · simj(xjt, xjc)
If sim(St, Sv) ≥ B then
      For i = 1 to N do
            di = argmaxd∈(UP−D)(getCTR(d))
            D = D ∪ {di}
      Endfor
      HLCS = HLCS ∪ {St}
else
      D = decreasing-ε-greedy(Hε, N, St, UP, n)
Endif
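Algorithm 3 can be sketched in Python as follows; `situation_sim` stands in for the Eq. 1 similarity and `fallback` for decreasing-ε-greedy(), both assumptions made for illustration.

```python
# Sketch of Algorithm 3: pure exploitation in (near-)critical situations,
# decreasing-ε-greedy otherwise.
def contextual_epsilon_greedy(s_t, hlcs, b, up, n, get_ctr, situation_sim, fallback):
    s_v = max(hlcs, key=lambda s: situation_sim(s_t, s))   # most similar HLCS
    if situation_sim(s_t, s_v) >= b:
        # Critical situation: greedy on estimated CTR, no exploration.
        d = []
        candidates = list(up)
        for _ in range(n):
            d_i = max(candidates, key=get_ctr)
            candidates.remove(d_i)
            d.append(d_i)
        hlcs.append(s_t)          # learn St as a new critical situation
        return d
    return fallback()             # non-critical: decreasing-ε-greedy
```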
6           Experimental Evaluation

In order to evaluate the performance of our approach empirically, and in the
absence of a standard evaluation framework, we propose an evaluation framework
based on a set of diary study entries. The main objectives of the experimental
evaluation are: (1) to find the optimal threshold value B (Section 5.3) and (2) to
evaluate the performance of the proposed algorithms (Alg. 2 and Alg. 3) w.r.t. the
ε variation and the dataset size. In the following, we describe our experimental
datasets, then present and discuss the obtained results.


6.1         Evaluation Framework

We conducted a diary study in collaboration with the French software company
Nomalys1. This company provides a history application that records the time,
current location, and social and navigation information of its users during
application use. The diary study took 18 months and generated 178 369 diary
situation entries. Each entry represents the capture of contextual time, location
and social information. For each entry, the captured data are replaced with more
abstract information using time, spatial and social ontologies. Table 1
illustrates three examples of such transformations.

                                 Table 1. Semantic diary situation

      IDS              Users            Time             Place                   Client
      1                Paul             Workday          Paris                Finance client
      2                Fabrice          Workday          Roubaix              Social client
      3                Jhon             Holiday          Paris                Telecom client

     From the diary study, we obtained a total of 2 759 283 entries concerning the
user's navigation, with an average of 15.47 entries per situation. Table 2
illustrates examples of such diary navigation entries, where Click is the number of
clicks on a document, Time is the time spent reading a document, and Interest is
the direct interest expressed with stars (five at most).

                                 Table 2. Diary navigation entries

       IdDoc           IDS              Click            Time                    Interest
       1               1                2                2’                      **
       2               1                4                3’                      ***
       3               1                8                5’                      *****




1
    Nomalys is a company that provides a graphical application on Smartphones allowing users to
     access their company’s data.
6.2    Finding the Optimal B Threshold Value
In order to set the similarity threshold value, we use a manual classification as a
baseline and compare it with the results obtained by our technique. We take a
random sample of 10% of the situation entries and manually group similar
situations; we then compare the constructed groups with the results obtained by our
similarity algorithm for different threshold values.




               Fig. 1. Effect of B threshold value on the similarity accuracy

Figure 1 shows the effect of varying the situation similarity threshold B in the
interval [0, 3] on the overall precision. The results show that the best
performance is obtained at B = 2.4, with a precision of 0.849. Consequently, we use
this optimal threshold value (B = 2.4) of the situation similarity measure for
testing our MCRS.


6.3    Experimental Results
In our experiments, we first collected the 3000 situations with an occurrence count
greater than 100, so as to be statistically meaningful. We then sampled 10 000
documents that were shown in any of these situations. The testing step consists of
evaluating the algorithms for each testing situation using the average expected
CTR. The expected CTR for a particular iteration is the ratio between the total
number of expected clicks and the total number of displays; we then average the
expected CTR over every 100 iterations. We ran the simulation until the number of
iterations reached 1000, with the number of documents (N) returned by the
recommender system for each situation set to 10. In the first experiment, we
evaluated the two exr/exp algorithms described in Section 5. In addition to a
pure-exploitation baseline, we compared our algorithms (decreasing-ε-greedy and
contextual-ε-greedy) to the algorithms described in Section 3: ε-greedy,
ε-beginning, ε-decreasing and EG. In Fig. 2, the horizontal axis is the number of
iterations and the vertical axis is the performance metric.
Fig. 2. Average expected CTR for different exr/exp algorithms

We parameterized the different algorithms as follows: for the ε-greedy algorithm,
we experimented with two parameter values, 0.5 and 0.9. The ε-decreasing, EG,
decreasing-ε-greedy and contextual-ε-greedy algorithms share the same set of
candidates {εi = 1 − 0.1·i, i = 1, …, 10}. The ε-decreasing algorithm starts with
the largest value and reduces it by 0.1 every 100 iterations until ε reaches the
smallest value. Overall, the exr/exp algorithms perform better than the baseline.
However, for the first 200 iterations, the pure-exploitation baseline converges
faster; in the long run, all exr/exp algorithms catch up and improve the average
CTR at convergence. We make several interesting observations regarding the behavior
of the different exr/exp algorithms. Even though the decreasing-ε-greedy algorithm
does not consider HLCS during recommendation, it has a convergence threshold
similar to that of contextual-ε-greedy. The contextual-ε-greedy method effectively
learns the optimal ε, and its convergence rate is close to the best setting;
decreasing-ε-greedy, however, only begins to show its advantage after 500
iterations. For the ε-decreasing algorithm, the converged CTR increases as ε
decreases (less exploration). Even after convergence, the 0.9-greedy and 0.5-greedy
algorithms still give, respectively, 90% and 50% of the opportunities to documents
with low CTR, which significantly decreases their average expected CTR. While the
ε-decreasing algorithm converges to a higher CTR, its overall performance is not as
good as that of ε-beginning: its CTR drops a lot at the early stage because of more
exploration, but it does not converge faster.
The contextual-ε-greedy algorithm has the best convergence rate, increases the CTR
by a factor of 2 over the baseline, and outperforms all other exr/exp algorithms.
The improvement comes from a dynamic exr/exp tradeoff, controlled by the
critical-situation estimation. At the early stage, this algorithm takes full
advantage of exploration without wasting opportunities to establish a good CTR.


6.4    Size of data
To compare the algorithms when data is sparse, we reduced the data size to 30%,
20%, 10%, 5% and 1%, respectively. To better visualize the comparison, Figure 3
shows the algorithms' average CTR at the aforementioned data sparseness levels. Our
first observation is that all algorithms are useful at all sparseness levels. Note
that decreasing the data size does not significantly affect the performance of
ε-greedy (0.5-greedy, 0.9-greedy) on sparse data. Although very different from the
EG algorithm, decreasing-ε-greedy tends to give very similar results. Except for
the exploitation baseline, ε-beginning seems to have rather poor performance: its
results are worse than those of any other strategy, independently of the chosen
parameters. The reason probably lies in the fact that ε-beginning performs
exploration only at the beginning. The contextual-ε-greedy algorithm outperforms
all other strategies on the dataset, by a factor of 2.




                   Fig. 3. Average expected CTR for different data size


7      Conclusion

In this paper, we study the problem of exploitation and exploration in mobile
context-aware recommender systems and propose a novel approach that adaptively
balances exr/exp with respect to the user's situation. We have presented an
evaluation protocol based on real mobile navigation contexts obtained from a diary
study conducted in collaboration with the French company Nomalys. We have evaluated
our approach according to the proposed evaluation protocol and shown that it is
effective. In order to evaluate the performance of the proposed algorithm, we
compared it with other standard exr/exp strategies. The experimental results
demonstrate that our algorithm performs better on CTR in various configurations. In
the future, we plan to evaluate the scalability of the algorithm on board a mobile
device, and to investigate other public benchmarks, integrating some of their
information as context to increase the reproducibility of the study.


References
 1. D. Bouneffouf, A. Bouzeghoub & A. L. Gançarski: Following the User’s Interests in Mo-
    bile Context-Aware recommender systems. AINA Workshops, pp. 657-662. Fukuoka, Ja-
    pan (2012).
2. G. Adomavicius, B. Mobasher, F. Ricci, Alexander Tuzhilin. Context-Aware Recom-
    mender Systems. AI Magazine. 32(3): 67-80, 2011.
 3. S. Alexandre, C. Moira Norrie, M. Grossniklaus and B. Signer. Spatio-Temporal Proximi-
    ty as a Basis for Collaborative Filtering in Mobile Environments. CAiSE, 2006.
 4. P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite Time Analysis of the Multiarmed Bandit
    Problem. Machine Learning, 2, 235–256, 2002.
 5. R. Bader , E. Neufeld , W. Woerndl , V. Prinz, Context-aware POI recommendations in an
    automotive scenario using multi-criteria decision making methods, Proceedings of the
    2011 Workshop on Context-awareness in Retrieval and Recommendation, p.23-30, 2011.
 6. L. Baltrunas, B. Ludwig, S. Peer, and F. Ricci. Context relevance assessment and exploita-
    tion in mobile recommender systems. Personal and Ubiquitous Computing, 2011.
 7. V. Bellotti, B. Begole, E.H. Chi, N. Ducheneaut, J. Fang, E. Isaacs, Activity-Based Seren-
    dipitous Recommendations. Proceedings On the Move, 2008.
 8. W. Chu and S. Park. Personalized recommendation on dynamic content using predictive
    bilinear models. In Proc. of World Wide Web, pages 691–700, 2009.
 9. E. Even-Dar, S. Mannor, and Y. Mansour. PAC Bounds for Multi-Armed Bandit and Mar-
    kov Decision Processes. Computational Learning Theory, 255–270, 2002.
10. J. Kivinen and M. K. Warmuth. Exponentiated gradient versus gradient descent for
    linear predictors. Information and Computation, 132(1), 1–63, 1997.
11. J. Langford and T. Zhang. The epoch-greedy algorithm for contextual multi-armed bandits.
    In Advances in Neural Information Processing Systems 20, 2008.
12. R. Lakshmish, P. Deepak, P. Ramana, G. Kutila, D. Garg, V. Karthik, K. Shivkumar.
    CAESAR: A Mobile Context-Aware, Social Recommender System for Low-End Mobile
    Devices. Mobile Data Management: Systems, Services and Middleware, 2009.
13. L. Li, W. Chu, J. Langford, R. E. Schapire. A Contextual-Bandit Approach to Personalized
    News Article Recommendation. In Proc. of the 19th International Conference on World
    Wide Web, Raleigh, 2010. CoRR abs/1002.4058.
14. S. Mannor and J. Tsitsiklis. The Sample Complexity of Exploration in the Multi-Armed
    Bandit Problem. Computational Learning Theory, 2003.
15. D. Mladenic. Text-learning and related intelligent agents: A survey. IEEE Intelligent
    Agents, pages 44–54, 1999.
16. H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the Ameri-
    can Mathematical Society, 1952.
17. G. Samaras, C. Panayiotou. Personalized portals for the wireless user based on mobile
    agents. Proc. 2nd Int'l workshop on Mobile Commerce, pp. 70-74, 2002.
18. J. B. Schafer, J. Konstan, and J. Riedi. Recommender systems in e-commerce. In Proc. of
    the 1st ACM Conf. on Electronic Commerce, 1999.
19. R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
20. C. J.Watkins, Learning from Delayed Rewards. Ph.D. thesis. Cambridge University, 1989.
21. W. Li, X. Wang, R. Zhang, Y. Cui, J. Mao, R. Jin. Exploitation and Exploration in a
    Performance-based Contextual Advertising System. KDD, 2010.
22. W. Woerndl, Florian Schulze: Capturing, Analyzing and Utilizing Context-Based Infor-
    mation About User Activities on Smartphones. Activity Context Representation, 2011.
23. Z. Wu and M. Palmer. Verb Semantics and Lexical Selection. In Proceedings of the 32nd
    Annual Meeting of the Association for Computational Linguistics, 133-138, 1994.
24. D. Bouneffouf, A. Bouzeghoub & A. L. Gançarski: Hybrid-ε-greedy for Mobile Context-
    Aware Recommender System. PAKDD (1) 2012: 468-479.
25. D. Bouneffouf, A. Bouzeghoub & A. L. Gançarski: Considering the High Level Critical
    Situations in Context-Aware Recommender Systems. IMMoA 2012: 26-32.
26. D. Bouneffouf: L'apprentissage automatique, une étape importante dans l'adaptation des
    systèmes d'information à l'utilisateur. INFORSID 2011: 427-428.
27. D. Bouneffouf: Applying machine learning techniques to improve user acceptance on
    ubiquitous environment. CAiSE Doctoral Consortium, 2011.

  • 1. Exploration / Exploitation Trade-off in Mobile Context- Aware Recommender Systems Djallel Bouneffouf, Amel Bouzeghoub & Alda Lopes Gançarski Department of Computer Science, Télécom SudParis, UMR CNRS Samovar, 91011 Evry Cedex, France {Djallel.Bouneffouf, Amel.Bouzeghoub, Alda.Gancarski}@it- sudparis.eu Abstract. Most existing approaches in Mobile Context-Aware Recommender Systems focus on recommending relevant items to users taking into account contextual information, such as time, location, or social aspects. However, none of them has considered the problem of user’s content dynamicity. We introduce in this paper an algorithm that tackles this dynamicity. It is based on dynamic exploration/exploitation and can adaptively balance the two aspects by learning automatically the optimal tradeoff. We also include a metric to decide which user’s situation is most relevant to exploration or exploitation. Within a deliber- ately designed offline simulation framework we conduct evaluations with real online event log data. The experimental results demonstrate that our algorithm outperforms surveyed algorithms. Keywords: recommender system; machine learning; exploration/exploitation dilemma 1 Introduction Mobile technologies have made access to a huge collection of information, anywhere and anytime. In particular, most professional mobile users acquire and maintain a large amount of content in their repository. Moreover, the content of such repository changes dynamically, undergoes frequent updates. In this sense, recommender sys- tems must promptly identify the importance of new documents, while adapting to the fading value of old documents. In such a setting, it is crucial to identify and recom- mend interesting content for users. A considerable amount of research has been done in recommending interesting con- tent for mobile users. 
Earlier techniques in Context-Aware Recommender Systems (CRS) [3, 6, 12, 5, 22, 27, 26] are based solely on the computational behavior of the user to model his interests regarding his surrounding environment like location, time and near people (the user’s situation). The main limitation of such approaches is that they do not take into account the dynamicity of the user’s content. This gives rise to another category of recommendation techniques that try to tackle this limitation by using collaborative, content-based or hybrid filtering techniques. Collaborative filter- adfa, p. 1, 2011. © Springer-Verlag Berlin Heidelberg 2011
  • 2. ing, by finding similarities through the users’ history, gives an interesting recommen- dation only if the overlap between users’ history is high and the user’s content is stat- ic[18]. Content-based filtering, identify new documents which match with an existing user’s profile, however, the recommended documents are always similar to the docu- ments previously selected by the user [15]. Hybrid approaches have been developed by combining the two latest techniques; so that, the inability of collaborative filtering to recommend new documents is reduced by combining it with content-based filtering [13]. However, the user’s content in mobile undergoes frequent changes. These issues make content-based and collaborative filtering approaches difficult to apply [8]. The bandit algorithm is a well-known solution that has the advantage of following the evolution of the user’s content and addresses this problem as a need for balancing exploration/exploitation (exr/exp) tradeoff. A bandit algorithm B exploits its past experience to select documents that appear more frequently. Besides, these seemingly optimal documents may in fact be suboptimal, because of the imprecision in B’s knowledge. In order to avoid this undesired situation, B has to explore documents by choosing seemingly suboptimal documents so as to gather more information about them. Exploitation can decrease short-term user’s satisfaction since some suboptimal documents may be chosen. However, obtaining information about the documents’ average rewards (i.e., exploration) can refine B’s estimate of the documents’ rewards and in turn increases long-term user’s satisfaction. Clearly, neither a purely exploring nor a purely exploiting algorithm works well, and a good tradeoff is needed. One classical solution to the multi-armed bandit problem is the ε-greedy strategy [9]. 
With probability 1-ε, the algorithm chooses the best documents based on current knowledge; with probability ε, it uniformly chooses any other document. The parameter ε essentially controls the exr/exp tradeoff. One drawback of this algorithm is that it is difficult to decide in advance the optimal value of ε. Instead, we introduce an algorithm named contextual-ε-greedy that involves two main ideas to balance the exr/exp tradeoff adaptively. First, we extend the ε-greedy strategy with an update of the ε value, performing an exr/exp tradeoff over a finite set of ε candidates. Second, we take into account the user's situation to define the exploration value, selecting which situations are suitable for exploration and which for exploitation.

The rest of the paper is organized as follows. Section 2 gives the key notions used throughout this paper. Section 3 reviews some related works. Section 4 presents our MCRS model and Section 5 describes the algorithms involved in the proposed approach. The experimental evaluation is illustrated in Section 6. The last section concludes the paper and points out possible directions for future work.

2 Key Notions

In this section, we briefly sketch the key notions that will be of use in the remainder of this paper.

The user's model: The user's model is structured as a case base, composed of a set of situations with their corresponding user's preferences, denoted U =
{(Si; UPi)}, where Si is the user's situation and UPi its corresponding user's preferences.

The user's preferences: The user's preferences are deduced during the user's navigation activities. A navigation activity expresses the following sequence of events: (i) the user logs into the system and navigates across documents to get the desired information; (ii) the user expresses his preferences on the visited documents. We assume that a preference is information extracted from the user's interaction with the system, for example the number of clicks on the visited documents or the time spent on a document. Let UP be the preferences submitted by a specific user in the system in a given situation. Each document in UP is represented as a single vector d = (c1, ..., cn), where ci (i = 1, ..., n) is the value of a component characterizing the preferences for d. We consider the following components: the total number of clicks on d, the total time spent reading d, and the number of times d was recommended.

Context: A user's context C is a multi-ontology representation where each ontology corresponds to a context dimension: C = (OLocation, OTime, OSocial). Each dimension models and manages one type of context information. We focus on these three dimensions since they cover all needed information. These ontologies are described in [1, 24, 25] and are not developed in this paper.

Situation: A situation is an instantiation of the user's context. We consider a situation as a triple S = (OLocation.xi, OTime.xj, OSocial.xk), where xi, xj and xk are ontology concepts or instances. Suppose the following data are sensed from the user's mobile phone: the GPS shows the latitude and longitude of a point "48.8925349, 2.2367939"; the local time is "Mon Oct 3 12:10:00 2011" and the calendar states "meeting with Paul Gerard". The corresponding situation is: S = (OLocation."48.89, 2.23", OTime."Mon_Oct_3_12:10:00_2011", OSocial."Paul_Gerard").
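The structures above (situation triples, per-document preference vectors, and the case base U) can be sketched as plain Python records. The class and field names below are illustrative assumptions for the sketch, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Situation:
    """A situation triple (OLocation.xi, OTime.xj, OSocial.xk);
    frozen so it can be used as a dictionary key."""
    location: str
    time: str
    social: str

@dataclass
class DocPreference:
    """Preference vector d = (c1, ..., cn) for one document, with the
    components named in the text: clicks, reading time, and the number
    of times the document was recommended."""
    clicks: int
    time_spent: float        # e.g. minutes spent reading
    times_recommended: int

# The case base U = {(Si, UPi)} maps each situation to its preferences.
meeting = Situation("Restaurant", "Work_day", "Financial_client")
case_base = {meeting: {"doc_42": DocPreference(clicks=8, time_spent=5.0,
                                               times_recommended=3)}}
```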
To build a more abstract situation, we interpret the user's behavior from this low-level multimodal sensor data using ontology reasoning. For example, from S, we obtain the following situation: MeetingAtRestaurant = (OLocation.Restaurant, OTime.Work_day, OSocial.Financial_client). For simplicity, we adopt in the rest of the paper the notation S = (xi, xj, xk). The previous example thus becomes: MeetingAtRestaurant = (Restaurant, Work_day, Financial_client). Among the set of captured situations, some are characterized as high-level critical situations.

High-Level Critical Situation (HLCS): A HLCS is a situation in which the user needs the best information the system can recommend, for instance when the user is in a professional meeting. In such a situation, the system must not recommend information that disregards the user's preferences; in other words, it must exclusively perform exploitation rather than exploration-oriented learning. In other cases, for instance when the user is using his information system at home or at night during vacation, e.g., S = (home, vacation, friends), the system can perform some exploration by recommending information that does not necessarily match the user's interests. The HLCS situations are, for the moment, predefined by a domain expert. In our case, we conducted the study with professional mobile users (described in detail in Section 6); their HLCS are, for example, S1 = (company, Monday_morning, colleague), S2 = (restaurant, midday, client) and S3 = (company, morning, manager).
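Deciding whether a captured situation is critical amounts to a nearest-neighbour test against the predefined HLCS set. A minimal sketch, under the assumption of a simplified binary per-dimension similarity (a stand-in for the ontology-based measure the paper formalizes later; the 2.4 threshold is borrowed from the experiments of Section 6.2):

```python
# Predefined HLCS examples from the text, as (location, time, social) triples.
HLCS = {
    ("company", "monday_morning", "colleague"),
    ("restaurant", "midday", "client"),
    ("company", "morning", "manager"),
}

def sim(s, t, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-dimension similarities.
    Binary equality is an illustrative stand-in for ontology similarity."""
    return sum(w * (a == b) for w, a, b in zip(weights, s, t))

def is_critical(situation, threshold=2.4):
    """True if the situation is close enough to a predefined HLCS,
    i.e. the system should exploit only (no exploration)."""
    return any(sim(situation, h) >= threshold for h in HLCS)
```

Here an exact HLCS match scores 3.0, above the threshold, while (home, vacation, friends) scores 0 and leaves the system free to explore.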
3 Related Work

We review, in the following, recent recommendation techniques that tackle both making the exr/exp tradeoff dynamic (bandit algorithms) and considering the user's situation in the recommendation.

3.1 Bandit Algorithms Overview (ε-greedy)

The exr/exp tradeoff was first studied in reinforcement learning in the 1980s, and later flourished in other fields of machine learning [16, 19]. Very frequently used in reinforcement learning to study this tradeoff, the multi-armed bandit problem was originally described by Robbins [16]. The ε-greedy strategy is one of the most widely used solutions to the bandit problem and was first described in [20]. It chooses a random document with frequency ε, and otherwise chooses the document with the highest estimated mean, the estimation being based on the rewards observed so far. ε must lie in the open interval ]0, 1[ and its choice is left to the user.

The first variant of the ε-greedy strategy is what [9, 14] refer to as the ε-beginning strategy, which performs all the exploration at once at the beginning. For a given number I ∈ N of iterations, documents are randomly pulled during the εI first iterations; during the remaining (1-ε)I iterations, the document with the highest estimated mean is pulled. Another variant is what Cesa-Bianchi and Fischer [14] call the ε-decreasing strategy: the document with the highest estimated mean is always pulled, except that a random document is pulled instead with frequency εi, where i is the index of the current round. The decreasing value εi is given by εi = ε0 / i, where ε0 ∈ ]0, 1]. Besides ε-decreasing, four other strategies are presented in [4]. Those strategies are not described here because the experiments done in [4] seem to show that, with carefully chosen parameters, ε-decreasing is always as good as the other strategies.
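The ε-decreasing schedule εi = ε0 / i can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation; `estimated_ctr` stands for whatever per-document reward estimates are available:

```python
import random

def eps_decreasing_pick(estimated_ctr, eps0=0.5, i=1):
    """One round of the ε-decreasing strategy: with probability
    eps_i = eps0 / i pull a random document (exploration), otherwise
    pull the document with the highest estimated mean (exploitation)."""
    eps_i = min(1.0, eps0 / i)                  # eps0 in ]0, 1]
    docs = list(estimated_ctr)
    if random.random() < eps_i:
        return random.choice(docs)              # explore
    return max(docs, key=estimated_ctr.get)     # exploit

ctr = {"d1": 0.30, "d2": 0.10, "d3": 0.05}
picks = [eps_decreasing_pick(ctr, eps0=0.5, i=t) for t in range(1, 1001)]
# As i grows, eps_i shrinks, so late rounds are almost purely greedy
# and the highest-CTR document "d1" dominates the picks.
```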
Compared to the standard multi-armed bandit problem with a fixed set of possible actions, in CRS old documents may expire and new documents may frequently emerge. It may therefore not be desirable to perform the exploration all at once at the beginning, as in [9], or to decrease the exploration effort monotonically, as in the decreasing strategy of [14].

Few research works study the contextual bandit problem in recommender systems, where the user's behavior is considered as the context of the bandit problem. In [10], the authors extend the ε-greedy strategy by updating the exploration value ε dynamically. At each iteration, they run a sampling procedure to select a new ε from a finite set of candidates. The probabilities associated with the candidates are uniformly initialized and updated with the Exponentiated Gradient (EG) [10]; this updating rule increases the probability of a candidate ε if it leads to a user click. Compared to both the ε-beginning and the decreasing strategy, this technique improves the results. In [13], the authors model the recommendation as a contextual bandit problem. They propose an approach in which a learning algorithm sequentially selects documents to serve users based on contextual information about the users and the documents; to maximize the total number of user clicks, this work proposes LinUCB, a computationally efficient algorithm.

The authors of [4, 9, 13, 14, 21] describe smart ways to balance exploration and exploitation. However, none of them considers the user's situation during the recommendation.
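The dynamic-ε idea of [10] can be sketched as follows. This is a deliberately simplified, hypothetical version of the update (the actual Exponentiated Gradient rule is more involved); `sample_eps` and `eg_update` are names introduced for this sketch:

```python
import math
import random

def sample_eps(weights):
    """Sample the index of an ε candidate, proportionally to its weight."""
    return random.choices(range(len(weights)), weights=weights, k=1)[0]

def eg_update(weights, idx, clicked, eta=0.1):
    """EG-style multiplicative update: increase the weight of the
    candidate ε that led to a user click (simplified sketch)."""
    weights[idx] *= math.exp(eta * (1.0 if clicked else 0.0))
    return weights

candidates = [0.1 * i for i in range(1, 11)]   # finite set of ε candidates
weights = [1.0] * len(candidates)              # uniform initialization
idx = sample_eps(weights)                      # pick an ε for this round
weights = eg_update(weights, idx, clicked=True)
```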
3.2 Managing the User's Situation

Few research works are dedicated to managing the user's situation in recommendation. In [7, 17], the authors propose a method that consists of building a dynamic user profile based on time and user experience. The user's preferences in the profile are weighted according to the situation (time, location) and the user's behavior. To model the evolution of the user's preferences according to his temporal situation in different periods (like workdays or vacations), the weighted association for the concepts in the user's profile is established for every new experience of the user. The user's activity combined with the user's profile is then used to filter and recommend relevant content. Another work [12] describes a CRS operating on three dimensions of context that complement each other to get highly targeted recommendations. First, the CRS analyzes information such as clients' address books to estimate the level of social affinity among the users. Second, it combines social affinity with the spatiotemporal dimensions and the user's history in order to improve the quality of the recommendations. In [3], the authors present a technique to perform user-based collaborative filtering: each user's mobile device stores all explicit ratings made by its owner as well as ratings received from other users; only users in spatiotemporal proximity are able to exchange ratings, and the authors show how this provides a natural filtering based on social contexts.

Each work cited above tries to recommend interesting information to users based on their contextual situation; however, they do not consider the evolution of the user's content. In summary, none of the mentioned works tackles both the dynamicity of the exr/exp tradeoff and the consideration of the user's situation in the exr/exp strategy.
This is precisely what we intend to do with our approach, which exploits the following new features: (1) we propose a calibration of the exr/exp tradeoff, which consists in updating the value of ε dynamically: at each iteration, we run the ε-decreasing algorithm to select a new ε from a finite set of candidates, and the weight associated with a candidate is increased whenever it leads to a user click; (2) our intuition is that taking the HLCS character of the situation into account when managing the exr/exp tradeoff improves the long-term rewards of the recommender system, which, as far as we know, is not considered in the surveyed approaches.

4 MCRS Model

In our recommender system, the recommendation of documents is modeled as a contextual bandit problem including the user's situation information. Formally, a bandit algorithm proceeds in discrete trials t = 1, ..., T. For each trial t, the algorithm performs the following tasks:

Task 1: Let St be the current user's situation and PS the set of past situations. The system compares St with the situations in PS in order to choose the most similar one, Sp, using the following semantic similarity metric:

    Sp = argmax_{Sc ∈ PS} Σj αj · simj(xjt, xjc)    (1)

In Eq. 1, simj is the similarity metric related to dimension j between two concepts xjt and xjc, and αj is the weight associated with dimension j. This similarity depends on how closely xjt and xjc are related in the corresponding ontology (location, time or social). We use the same similarity measure as [24], defined by Eq. 2:

    simj(xjt, xjc) = 2 · depth(LCS) / (depth(xjc) + depth(xjt))    (2)
In Eq. 2, LCS is the Least Common Subsumer of xjt and xjc, and depth(x) is the number of nodes on the path from x to the ontology root.

Task 2: Let D be the document collection and Dp ⊆ D the set of documents recommended in situation Sp. After retrieving Sp, the system observes the user's behavior when reading each document dt ∈ Dp. Based on the observed rewards, the algorithm chooses the document dp with the greatest reward rt.

Task 3: The algorithm improves its arm-selection strategy with the new observation: in situation St, document dp obtained a reward rt.

In the field of document recommendation, when a document is presented to the user and the user selects it by a click, a reward of 1 is incurred; otherwise, the reward is 0. With this definition of reward, the expected reward of a document is precisely its Click-Through Rate (CTR). The CTR is the average number of clicks on a recommended document, computed by dividing the total number of clicks on it by the number of times it was recommended.

5 Exploration/Exploitation Tradeoff

In this section, we first present the ε-greedy algorithm (Section 5.1) and our proposed exr/exp-tradeoff algorithm named decreasing-ε-greedy (Section 5.2). In Section 5.3, we describe the contextual-ε-greedy algorithm, which extends decreasing-ε-greedy in order to consider HLCS situations.

5.1 The ε-greedy() Algorithm

The ε-greedy strategy is sketched in Algorithm 1. For a given user's situation, the algorithm recommends a predefined number of documents, specified by the parameter N. In this algorithm, UP = {d1, ..., dP} is the set of documents corresponding to the current user's preferences; D = {d1, ..., dN} is the set of documents to recommend; getCTR is the function which estimates the CTR of a given document; Random is the function returning a random element from a given set; q is a random value uniformly distributed over [0, 1] which defines the exploration/exploitation tradeoff; ε is the probability of recommending a random exploratory document.

Algorithm 1: The ε-greedy algorithm
  Input: ε, UP, D = Ø, N
  Output: D
  For i = 1 to N do
    q = Random([0, 1])
    di = argmax_{d ∈ UP} (getCTR(d))   if q > ε
    di = Random(UP)                    otherwise
    D = D ∪ {di}
  Endfor

5.2 The decreasing-ε-greedy Algorithm

To improve the exploitation of the ε-greedy algorithm, we propose to extend it by setting the optimal tradeoff value ε iteratively, using the decreasing-ε-greedy method (Alg. 2). First, we assume a finite number of candidate values for ε, denoted by Hε = (ε1, ..., εT). Our goal is to select the optimal ε from Hε. To this end, we apply the ε-decreasing strategy to propose an εi, and we use a set of weights w = (w1, ..., wT) to keep track of the feedback on each εi: wi is increased by the number of clicks li received from the user when εi is used.

Algorithm 2: The decreasing-ε-greedy algorithm
  Input: Hε, N, St, UP, n
  Output: D
  εε = Ø; τ = 0; wi = 1 for i = 1, ..., T
  For t = 1 to n do
    τ = 0.01 · t
    q = Random([0, 1])
    εi = argmax_i (wi)        if q ≤ τ
    εi = Random(Hε − εε)      otherwise
    D = ε-greedy(εi, UP, Ø, N)
    Receive the click feedback li from the user
    wi = wi + li
    εε = εε ∪ {εi}
  Endfor

In Alg. 2, εε is the set of ε values that have been previously selected, n is the number of iterations of the learning algorithm, i is the identifier of ε, and τ is the probability of proposing argmax_i (wi); this parameter starts at a low value and iteratively increases until the end of the learning (τ = 1).

5.3 The contextual-ε-greedy Algorithm

To improve the adaptation of the ε-greedy algorithm to HLCS situations, we compare the current user's situation St with the HLCS class of situations. Depending on the similarity between St and its most similar situation Sv ∈ HLCS, two scenarios are possible. If sim(St, Sv) ≥ B, where B is a similarity threshold value and sim(St, Sv) = Σj αj · simj(xjt, xjv) (Eq. 1), the system recommends documents with a purely greedy strategy and inserts St into the HLCS class of situations; otherwise, the system uses the decreasing-ε-greedy() method described in Alg. 2. Alg. 3 summarizes the functional steps of the contextual-ε-greedy() method.

Algorithm 3: The contextual-ε-greedy algorithm
  Input: HLCS, N, n, St, Hε, D = Ø
  Output: D
  Sv = argmax_{Sc ∈ HLCS} Σj αj · simj(xjc, xjt)
  If sim(St, Sv) ≥ B then
    For i = 1 to N do
      di = argmax_{d ∈ UP − D} (getCTR(d))
      D = D ∪ {di}
    Endfor
    HLCS = HLCS ∪ {St}
  Else
    D = decreasing-ε-greedy(Hε, N, St, UPv, n)
  Endif
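Putting the three algorithms together, the overall control flow can be sketched in Python. This is a condensed, hypothetical sketch: `sim` and the CTR estimates are simplified stand-ins, and `eps_picker` abstracts away the full weight-tracking of Algorithm 2:

```python
import random

B = 2.4  # similarity threshold (the value found empirically in Section 6.2)

def eps_greedy(eps, up_ctr, n_docs):
    """Algorithm 1 sketch: pick n_docs documents, greedily with
    probability 1 - eps, randomly otherwise."""
    chosen, pool = [], dict(up_ctr)
    for _ in range(n_docs):
        if random.random() >= eps:
            d = max(pool, key=pool.get)       # exploit: highest estimated CTR
        else:
            d = random.choice(list(pool))     # explore: random document
        chosen.append(d)
        pool.pop(d)                           # do not recommend d twice
    return chosen

def contextual_eps_greedy(s_t, hlcs, sim, up_ctr, n_docs, eps_picker):
    """Algorithm 3 sketch: pure exploitation in (near-)critical
    situations, decreasing-ε-greedy otherwise; eps_picker plays the
    role of Algorithm 2 and returns the current ε candidate."""
    s_v = max(hlcs, key=lambda s: sim(s_t, s))
    if sim(s_t, s_v) >= B:
        hlcs.add(s_t)                          # St joins the HLCS class
        return eps_greedy(0.0, up_ctr, n_docs) # greedy: exploitation only
    return eps_greedy(eps_picker(), up_ctr, n_docs)

ctr = {"d1": 0.5, "d2": 0.4, "d3": 0.1}
hlcs = {("company", "morning", "manager")}     # predefined HLCS (Section 2)
def triple_sim(a, b):
    return float(sum(x == y for x, y in zip(a, b)))
recs = contextual_eps_greedy(("company", "morning", "manager"), hlcs,
                             triple_sim, ctr, n_docs=2,
                             eps_picker=lambda: 0.5)
# Critical situation → pure exploitation: the two highest-CTR documents.
```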
6 Experimental Evaluation

In order to evaluate the performance of our approach empirically, and in the absence of a standard evaluation framework, we propose an evaluation framework based on a diary set of study entries. The main objectives of the experimental evaluation are: (1) to find the optimal threshold value B described in Section 5.3 and (2) to evaluate the performance of the proposed algorithms (Alg. 2 and Alg. 3) w.r.t. the ε variation and the dataset size. In the following, we describe our experimental datasets and then present and discuss the obtained results.

6.1 Evaluation Framework

We have conducted a diary study in collaboration with the French software company Nomalys1. This company provides a history application, which records the time, current location, and social and navigation information of its users during their application use. The diary study lasted 18 months and generated 178 369 diary situation entries. Each diary situation entry represents the capture of contextual time, location and social information. For each entry, the captured data are replaced with more abstract information using the time, spatial and social ontologies. Table 1 illustrates three examples of such transformations.

Table 1. Semantic diary situations

  IDS | User    | Time    | Place   | Client
  1   | Paul    | Workday | Paris   | Finance client
  2   | Fabrice | Workday | Roubaix | Social client
  3   | Jhon    | Holiday | Paris   | Telecom client

From the diary study, we have obtained a total of 2 759 283 entries concerning the user's navigation, with an average of 15.47 entries per situation. Table 2 illustrates examples of such diary navigation entries, where Click is the number of clicks on a document, Time is the time spent reading a document, and Interest is the direct interest expressed by stars (the maximum number of stars is five).

Table 2. Diary navigation entries

  IdDoc | IDS | Click | Time | Interest
  1     | 1   | 2     | 2'   | **
  2     | 1   | 4     | 3'   | ***
  3     | 1   | 8     | 5'   | *****

1 Nomalys is a company that provides a graphical application on smartphones allowing users to access their company's data.
6.2 Finding the Optimal Threshold Value B

In order to set the situation-similarity threshold value, we use a manual classification as a baseline and compare it with the results obtained by our technique. We take a random sample of 10% of the situation entries and manually group similar situations; then we compare the constructed groups with the results obtained by our similarity algorithm for different threshold values.

Fig. 1. Effect of the threshold value B on the similarity accuracy

Figure 1 shows the effect of varying the situation-similarity threshold parameter B in the interval [0, 3] on the overall precision. Results show that the best performance is obtained when B = 2.4, achieving a precision of 0.849. Consequently, we use this optimal threshold value (B = 2.4) of the situation similarity measure for testing our MCRS.

6.3 Experimental Results

In our experiments, we first collected the 3 000 situations with an occurrence greater than 100, to be statistically meaningful. Then, we sampled 10 000 documents that have been shown in any of these situations. The testing step consists of evaluating the algorithms for each testing situation using the average expected CTR. The expected CTR for a particular iteration is the ratio between the total number of expected clicks and the total number of displays; we then calculate the average expected CTR over every 100 iterations. We have run the simulation until the number of iterations reaches 1 000, and the number of documents returned by the recommender system for each situation is N = 10. In the first experiment, we evaluated the two exr/exp algorithms described in Section 5. In addition to a pure-exploitation baseline, we compared our algorithms (decreasing-ε-greedy and contextual-ε-greedy) with the algorithms described in Section 3: ε-greedy, ε-beginning, ε-decreasing and EG. In Fig. 2, the horizontal axis is the number of iterations and the vertical axis is the performance metric.
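The evaluation metric, the average expected CTR per 100-iteration window, can be computed as follows. An illustrative sketch (function name and input layout are assumptions for this example):

```python
def windowed_avg_ctr(clicks, displays, window=100):
    """Average expected CTR over consecutive windows of iterations:
    the CTR of a window is its total clicks over its total displays."""
    assert len(clicks) == len(displays)
    out = []
    for start in range(0, len(clicks), window):
        c = sum(clicks[start:start + window])
        d = sum(displays[start:start + window])
        out.append(c / d if d else 0.0)
    return out

# 200 iterations, 10 recommended documents displayed per iteration.
clicks = [3] * 100 + [5] * 100
displays = [10] * 200
print(windowed_avg_ctr(clicks, displays))   # → [0.3, 0.5]
```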
Fig. 2. Average expected CTR for the different exr/exp algorithms

We have parameterized the different algorithms as follows. For the ε-greedy algorithm, we experiment with two parameter values: 0.5 and 0.9. The ε-decreasing, EG, decreasing-ε-greedy and contextual-ε-greedy algorithms share the same set of candidates {εi = 1 − 0.1 · i, i = 1, ..., 10}. The ε-decreasing algorithm starts with the largest value and reduces it by 0.1 every 100 iterations until ε reaches the smallest value.

Overall, the exr/exp algorithms perform better than the baseline. For the first 200 iterations, the pure-exploitation baseline achieves faster convergence; in the long run, however, all exr/exp algorithms catch up and improve the average CTR at convergence. We make several interesting observations regarding the behaviors of the different exr/exp algorithms. Even though the decreasing-ε-greedy algorithm does not consider HLCS in the recommendation, it has a convergence threshold similar to that of contextual-ε-greedy. The contextual-ε-greedy method effectively learns the optimal ε, and its convergence rate is close to the best setting; for decreasing-ε-greedy, it is only after 500 iterations that the algorithm begins to show its advantage. For the ε-decreasing algorithm, the converged CTR increases as ε decreases (less exploration). Even after convergence, the 0.9-greedy and 0.5-greedy algorithms still give, respectively, 90% and 50% of the opportunities to documents with low CTR, which significantly decreases their average expected CTR. While the ε-decreasing algorithm converges to a higher CTR, its overall performance is not as good as that of ε-beginning: its CTR drops a lot at the early stage because of more exploration, but it does not converge faster. The contextual-ε-greedy algorithm has the best convergence rate, increases the CTR by a factor of 2 over the baseline, and outperforms all the other exr/exp algorithms.
The improvement comes from a dynamic exr/exp tradeoff, controlled by the critical-situation estimation. At the early stage, this algorithm takes full advantage of exploration without wasting opportunities to establish a good CTR.

6.4 Impact of the Data Size

To compare the algorithms when data is sparse, we reduce the dataset to 30%, 20%, 10%, 5% and 1% of its original size, respectively. To better visualize the comparison, Figure 3 shows the algorithms' average CTR graphs at the aforementioned data sparseness levels. Our first observation is that, at all data sparseness levels, all algorithms remain useful. Note that decreasing the data size does not significantly change the performance of ε-greedy (0.5-greedy, 0.9-greedy) on sparse data. Although quite different in design from the EG algorithm, decreasing-ε-greedy tends to very similar results. Except for the exploitation baseline, ε-beginning has the poorest performance: its results are worse than those of any other strategy, independently of the chosen parameters. The reason probably lies in the fact that ε-beginning performs its exploration only at the beginning. The contextual-ε-greedy algorithm outperforms all the other strategies on this dataset, by a factor of 2.

Fig. 3. Average expected CTR for different data sizes

7 Conclusion

In this paper, we have studied the problem of exploitation and exploration in mobile context-aware recommender systems and proposed a novel approach that adaptively balances exr/exp with regard to the user's situation. We have presented an evaluation protocol based on real mobile navigation contexts obtained from a diary study conducted in collaboration with the French company Nomalys. We have evaluated our approach according to the proposed evaluation protocol and shown that it is effective. In order to evaluate the performance of the proposed algorithm, we compared it with other standard exr/exp strategies. The experimental results demonstrate that our algorithm performs better on CTR in various configurations. In the future, we plan to evaluate the scalability of the algorithm on-board a mobile device, to investigate other public benchmarks and to integrate additional contextual information to increase the reproducibility of the study.

References

1. D. Bouneffouf, A. Bouzeghoub and A. L. Gançarski. Following the User's Interests in Mobile Context-Aware Recommender Systems.
AINA Workshops, pp. 657-662, Fukuoka, Japan (2012).
2. G. Adomavicius, B. Mobasher, F. Ricci and A. Tuzhilin. Context-Aware Recommender Systems. AI Magazine, 32(3): 67-80, 2011.
3. S. Alexandre, C. Moira Norrie, M. Grossniklaus and B. Signer. Spatio-Temporal Proximity as a Basis for Collaborative Filtering in Mobile Environments. CAiSE, 2006.
4. P. Auer, N. Cesa-Bianchi and P. Fischer. Finite-Time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47, 235-256, 2002.
5. R. Bader, E. Neufeld, W. Woerndl and V. Prinz. Context-aware POI recommendations in an automotive scenario using multi-criteria decision making methods. Proceedings of the 2011 Workshop on Context-awareness in Retrieval and Recommendation, pp. 23-30, 2011.
6. L. Baltrunas, B. Ludwig, S. Peer and F. Ricci. Context relevance assessment and exploitation in mobile recommender systems. Personal and Ubiquitous Computing, 2011.
7. V. Bellotti, B. Begole, E. H. Chi, N. Ducheneaut, J. Fang and E. Isaacs. Activity-Based Serendipitous Recommendations. Proceedings On the Move, 2008.
8. W. Chu and S. Park. Personalized recommendation on dynamic content using predictive bilinear models. In Proc. of World Wide Web, pp. 691-700, 2009.
9. E. Even-Dar, S. Mannor and Y. Mansour. PAC Bounds for Multi-Armed Bandit and Markov Decision Processes. Computational Learning Theory, 255-270, 2002.
10. J. Kivinen and M. K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 1997.
11. J. Langford and T. Zhang. The epoch-greedy algorithm for contextual multi-armed bandits. In Advances in Neural Information Processing Systems 20, 2008.
12. R. Lakshmish, P. Deepak, P. Ramana, G. Kutila, D. Garg, V. Karthik and K. Shivkumar. CAESAR: A Mobile Context-Aware, Social Recommender System for Low-End Mobile Devices. Mobile Data Management: Systems, Services and Middleware, 2009.
13. L. Li, W. Chu, J. Langford and R. E. Schapire. A Contextual-Bandit Approach to Personalized News Article Recommendation. Presented at the Nineteenth International Conference on World Wide Web, Raleigh; CoRR, Vol. abs/1002.4058, 2010.
14. S. Mannor and J. Tsitsiklis. The Sample Complexity of Exploration in the Multi-Armed Bandit Problem. Computational Learning Theory, 2003.
15. D. Mladenic. Text-learning and related intelligent agents: A survey. IEEE Intelligent Systems, pp. 44-54, 1999.
16. H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 1952.
17. G. Samaras and C. Panayiotou. Personalized portals for the wireless user based on mobile agents. Proc. 2nd Int'l Workshop on Mobile Commerce, pp. 70-74, 2002.
18. J. B. Schafer, J. Konstan and J. Riedl. Recommender systems in e-commerce. In Proc. of the 1st ACM Conf. on Electronic Commerce, 1999.
19. R. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
20. C. J. Watkins. Learning from Delayed Rewards. Ph.D. thesis, Cambridge University, 1989.
21. W. Li, X. Wang, R. Zhang, Y. Cui, J. Mao and R. Jin. Exploitation and Exploration in a Performance-based Contextual Advertising System. KDD, 2010.
22. W. Woerndl and F. Schulze. Capturing, Analyzing and Utilizing Context-Based Information About User Activities on Smartphones. Activity Context Representation, 2011.
23. Z. Wu and M. Palmer. Verb Semantics and Lexical Selection. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, pp. 133-138, 1994.
24. D. Bouneffouf, A. Bouzeghoub and A. L. Gançarski. Hybrid-ε-greedy for Mobile Context-Aware Recommender System. PAKDD (1), pp. 468-479, 2012.
25. D. Bouneffouf, A. Bouzeghoub and A. L. Gançarski. Considering the High Level Critical Situations in Context-Aware Recommender Systems. IMMoA 2012: 26-32.
26. D. Bouneffouf. L'apprentissage automatique, une étape importante dans l'adaptation des systèmes d'information à l'utilisateur. INFORSID 2011: 427-428.
27. D. Bouneffouf. Applying machine learning techniques to improve user acceptance on ubiquitous environment. CAiSE Doctoral Consortium, 2011.