UFAL - Universidade Federal de Alagoas
UFAL - Instituto de Computação

Naive Bayes

Jonathas Magalhães
jonathas@ic.ufal.br

Magalhães, J.J.    IA – 2013
Naive Bayes



   Based on Bayes' theorem:

   \[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}. \tag{1} \]

   Let X(A_1, A_2, ..., A_n, C) be a training data set;
   where C_1, C_2, ..., C_k are the classes, i.e. the possible values of C;
   R is a new record to be classified;
   the values that R takes on the attributes of X are a_1, a_2, ..., a_n.
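
Equation (1) can be checked numerically. The sketch below is only an illustration: the function name `bayes` and the input probabilities are hypothetical and do not come from the slides.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A) and P(B), as in Eq. (1)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only to exercise the formula.
posterior = bayes(p_b_given_a=0.9, p_a=0.2, p_b=0.3)
print(round(posterior, 2))
```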




Naive Bayes



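Steps of the algorithm:
  1   Compute the probability P(C = C_i | R), for i = 1, 2, ..., k;
  2   The output is the class C_j for which P(C = C_j | R) is maximal.
Assuming the attributes are conditionally independent given the class, the probability that an instance belongs to a class is proportional to the product below. The denominator P(R) is the same for every class, so it can be dropped when comparing classes:

\[ P(C = C_i \mid A_1 = a_1, A_2 = a_2, \ldots, A_n = a_n) \propto P(A_1 = a_1 \mid C = C_i) \, P(A_2 = a_2 \mid C = C_i) \cdots P(A_n = a_n \mid C = C_i) \, P(C = C_i). \tag{2} \]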
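
The two steps above can be sketched for categorical attributes as follows. This is a minimal illustration, not the author's implementation: the function names `train`, `score` and `classify` and the toy rows are assumptions, probabilities are estimated as plain relative frequencies, and no smoothing is applied.

```python
from collections import Counter


def train(rows):
    """rows: list of (attribute_tuple, class_label). Returns the counts used in Eq. (2)."""
    class_counts = Counter(label for _, label in rows)
    # cond_counts[(attribute_index, value, label)] = number of matching rows
    cond_counts = Counter()
    for attrs, label in rows:
        for i, value in enumerate(attrs):
            cond_counts[(i, value, label)] += 1
    return class_counts, cond_counts


def score(record, label, class_counts, cond_counts):
    """Right-hand side of Eq. (2): prior times the per-attribute conditionals."""
    total = sum(class_counts.values())
    s = class_counts[label] / total
    for i, value in enumerate(record):
        s *= cond_counts[(i, value, label)] / class_counts[label]
    return s


def classify(record, class_counts, cond_counts):
    """Step 2: return the class with the largest score."""
    return max(class_counts, key=lambda c: score(record, c, class_counts, cond_counts))


# Hypothetical toy data: two attributes, two classes.
toy_rows = [(("a", "x"), "pos"), (("a", "y"), "pos"),
            (("b", "x"), "neg"), (("b", "y"), "neg")]
class_counts, cond_counts = train(toy_rows)
print(classify(("a", "x"), class_counts, cond_counts))
```

Because the estimates are raw frequencies, an attribute value never seen with a class drives that class's score to zero; practical implementations usually add a smoothing term.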




Naive Bayes – Example

Consider the following data:


 X1: Usage time    X2: Number of posts    Y: Passed the course
        2                   4                     No
        3                   6                     No
        4                   8                     No
        4                   4                     No
        5                   7                     No
        6                   5                     No
        6                   6                     Yes
        6                   5                     Yes
        7                   7                     Yes
        8                   5                     Yes
        8                   6                     Yes
       10                  10                     Yes




Naive Bayes




Discretizing the values:
    Low: {0, 1, 2, 3}
    Medium: {4, 5, 6, 7}
    High: {8, 9, 10, 11}
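
The binning above can be applied mechanically to the example table. The sketch below is one way to do it; the names `discretize`, `DATA` and `DISCRETIZED` are assumptions introduced for illustration, and the rows are copied from the example data.

```python
def discretize(value):
    """Map a raw value to the three bins listed above."""
    if 0 <= value <= 3:
        return "low"
    if 4 <= value <= 7:
        return "medium"
    if 8 <= value <= 11:
        return "high"
    raise ValueError(f"value {value} is outside the bins 0..11")


# (usage time, number of posts, passed) rows from the example table
DATA = [
    (2, 4, "no"), (3, 6, "no"), (4, 8, "no"),
    (4, 4, "no"), (5, 7, "no"), (6, 5, "no"),
    (6, 6, "yes"), (6, 5, "yes"), (7, 7, "yes"),
    (8, 5, "yes"), (8, 6, "yes"), (10, 10, "yes"),
]

DISCRETIZED = [(discretize(t), discretize(p), y) for t, p, y in DATA]

for row in DISCRETIZED:
    print(row)
```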




Naive Bayes


Discretized data: [table not preserved in this transcript]




Naive Bayes




We want to predict whether a student (instance R) with...
    a medium number of posts (posts = medium), and
    a medium usage time (time = medium)
...will pass the course or not (passed = ?).




Naive Bayes




Computing P(passed = yes | R):
    P(passed = yes | time = medium, posts = medium) ∝
    P(posts = medium | passed = yes) * P(time = medium | passed = yes) * P(passed = yes).
         P(posts = medium | passed = yes) = ?
         P(time = medium | passed = yes) = ?
         P(passed = yes) = ?




Naive Bayes

Five of the six students who passed have a medium number of posts:

P(posts = medium | passed = yes) = 5/6 ≈ 0.83
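
This conditional probability is a relative frequency inside the class "yes". The sketch below counts it from the example table after discretization; the helper names `level` and `cond_prob` are assumptions introduced for illustration.

```python
from fractions import Fraction


def level(v):
    """Discretize with the bins 0-3 low, 4-7 medium, 8-11 high."""
    return "low" if v <= 3 else "medium" if v <= 7 else "high"


# (usage time, number of posts, passed) rows from the example table
RAW = [(2, 4, "no"), (3, 6, "no"), (4, 8, "no"), (4, 4, "no"),
       (5, 7, "no"), (6, 5, "no"), (6, 6, "yes"), (6, 5, "yes"),
       (7, 7, "yes"), (8, 5, "yes"), (8, 6, "yes"), (10, 10, "yes")]
ROWS = [(level(t), level(p), y) for t, p, y in RAW]


def cond_prob(rows, attr_index, value, label):
    """Estimate P(attribute = value | class = label) as a relative frequency."""
    in_class = [r for r in rows if r[2] == label]
    matches = [r for r in in_class if r[attr_index] == value]
    return Fraction(len(matches), len(in_class))


# Attribute index 1 is the number of posts.
p_posts = cond_prob(ROWS, 1, "medium", "yes")
print(p_posts, round(float(p_posts), 2))  # 5/6 0.83
```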




Naive Bayes




Computing P(passed = yes | R):
    P(passed = yes | time = medium, posts = medium) ∝
    0.83 * P(time = medium | passed = yes) * P(passed = yes).
         P(posts = medium | passed = yes) = 0.83
         P(time = medium | passed = yes) = ?
         P(passed = yes) = ?




Naive Bayes

Three of the six students who passed have a medium usage time:

P(time = medium | passed = yes) = 3/6 = 0.5




Naive Bayes




Computing P(passed = yes | R):
    P(passed = yes | time = medium, posts = medium) ∝
    0.83 * 0.5 * P(passed = yes).
         P(posts = medium | passed = yes) = 0.83
         P(time = medium | passed = yes) = 0.5
         P(passed = yes) = ?




Naive Bayes

Six of the twelve students passed:

P(passed = yes) = 6/12 = 0.5




Naive Bayes




Computing P(passed = yes | R):
    P(passed = yes | time = medium, posts = medium) ∝
    0.83 * 0.5 * 0.5 ≈ 0.21.
         P(posts = medium | passed = yes) = 0.83;
         P(time = medium | passed = yes) = 0.5;
         P(passed = yes) = 0.5.
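
The product can be reproduced with exact fractions to confirm that rounding the factors first does not change the two-decimal result. The variable names below are assumptions; the three values are the ones derived on the preceding slides.

```python
from fractions import Fraction

p_posts_given_yes = Fraction(5, 6)   # P(posts = medium | passed = yes)
p_time_given_yes = Fraction(3, 6)    # P(time = medium | passed = yes)
p_yes = Fraction(6, 12)              # P(passed = yes)

# Unnormalized score for the class "yes".
score_yes = p_posts_given_yes * p_time_given_yes * p_yes
print(score_yes, round(float(score_yes), 2))  # 5/24 0.21
```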




Naive Bayes




Computing P(passed = no | R):
    P(passed = no | time = medium, posts = medium) ∝
    0.5 * 0.67 * 0.5 ≈ 0.17.
         P(posts = medium | passed = no) = 0.5;
         P(time = medium | passed = no) = 0.67;
         P(passed = no) = 0.5.




Naive Bayes




Classifying the instance:
    P(passed = no | R) ∝ 0.17;
    P(passed = yes | R) ∝ 0.21;
    Since the value for passed = yes is larger than the value for passed = no,
    the prediction is that the student will pass the course.
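
The final decision is an argmax over the two values. The sketch below takes the numbers stated on the slides as given (it does not recompute them) and also shows how dividing by their sum would turn them into posterior probabilities; the variable names are assumptions.

```python
# Unnormalized values as stated on the slides (taken as given here).
scores = {"no": 0.17, "yes": 0.21}

# Step 2 of the algorithm: pick the class with the largest value.
prediction = max(scores, key=scores.get)

# Dividing by the sum gives posterior probabilities that add up to 1.
total = sum(scores.values())
posteriors = {label: s / total for label, s in scores.items()}

print(prediction)
print({label: round(p, 2) for label, p in posteriors.items()})
```

Normalizing does not change which class wins, which is why the comparison can be made on the unnormalized values directly.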




Questions?



