Multiple classifier systems under attack
Battista Biggio, Giorgio Fumera, Fabio Roli
Dept. of Electrical and Electronic Eng., Univ. of Cagliari
http://prag.diee.unica.it

9th International Workshop on Multiple Classifier Systems (MCS 2010)
Outline
●   Adversarial classification
●   MCSs in adversarial classification tasks
●   Some experimental results
Adversarial classification
Two pattern classes: legitimate, malicious

Examples:
●   biometric verification and recognition
●   intrusion detection in computer networks
●   spam filtering
●   network traffic identification
●   ...

[Figure: biometric verification — a genuine user claims "I am John Smith" and is matched against the template database (J. Smith, B. Brown); an impostor claims "I am Bob Brown".]

[Figure: spam filtering — a legitimate e-mail:
  Subject: MCS2010 Suggested tours
  Dear MCS 2010 Participant, attached please find the offers we negotiated with the travel agency ...
vs. a spam e-mail:
  Subject: Need affordable Drugs??
  Order from Canadian Pharmacy & Save You Money. We are having Specials Hot Promotion this week! ...]
Adversarial classification
Attack: fingerprint spoofing
[Figure: the impostor B. Brown claims "I am Bob Brown" and submits a spoofed fingerprint, which is matched against the template database (J. Smith, B. Brown).]

Attack: bad word obfuscation, good word insertion
Original spam:
  Subject: Need affordable Drugs??
  Order from Canadian Pharmacy & Save You Money
  We are having Specials Hot Promotion this week! ...
Attacked spam (bad words obfuscated, good words inserted):
  Subject: Need affordab1e D r u g s??
  Order from (anadian Ph@rmacy & S@ve You Money
  We are having Specials H0t Promotion this week! ...
  "Don't you guys ever read a paper? Moyer's a
  gentleman now. He knows t
  "Well I'm sure I can't help what you think,"
  she said tartly. "After a
Adversarial classification
Main issues:
●   vulnerabilities of pattern recognition systems
●   performance evaluation under attack
●   design of pattern recognition systems robust to attacks




Multiple classifier systems in adversarial environments
[Figure: a multimodal biometric system — the impostor claims "I am Bob Brown"; fingerprint and face matchers score the claim against the template database (J. Smith, B. Brown), and a fusion rule outputs Accepted/Rejected.]

Multimodal biometric systems: more accurate than unimodal ones
Multiple classifier systems in adversarial environments
[Figure: the same multimodal biometric system as above.]

Multimodal biometric systems: more accurate than unimodal ones
And also more robust to attacks (?)

Analogous claims in other applications (spam filtering, network intrusion detection, etc.)
Aim of our work
Main issues in adversarial classification:
●   vulnerabilities of pattern recognition systems
●   performance evaluation under attack
●   design of pattern recognition systems robust to attacks

Our goal: to investigate whether and how MCSs can improve the robustness of PR systems under attack
Linear classifiers under attack
The adversary exploits some knowledge of
●   the features
●   the classifier's decision function

An example: spam filtering with linear classifiers
f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }
xi ∈ {0,1}; f(x) = +1: spam; f(x) = -1: legitimate

Original spam: "Buy viagra!"
  x  = [ 1 0 1 0 0 0 0 0 …]
Attacked spam: "Buy vi4gr4! Did you ever play that game when you were a kid where the little plastic hippo tries to gobble up all your marbles?"
  x’ = [ 1 0 0 0 1 0 0 1 …]
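To make the feature change concrete, here is a minimal Python sketch of binary bag-of-words featurization under this attack; the vocabulary and the tokenizer are toy choices of ours, not the ones used in the paper:

```python
# Toy bag-of-words featurization: obfuscating "viagra" clears its feature,
# while inserted "good" words switch on other features.
# The vocabulary below is invented for illustration.
vocabulary = ["buy", "cheap", "viagra", "pill", "game", "hippo", "kid", "marbles"]

def featurize(text, vocabulary):
    """Binary word-occurrence features: x_i = 1 iff the i-th word occurs."""
    words = set(text.lower().replace("!", " ").replace("?", " ").split())
    return [1 if w in words else 0 for w in vocabulary]

x = featurize("Buy viagra!", vocabulary)
x_attacked = featurize(
    "Buy vi4gr4! Did you ever play that game when you were a kid where "
    "the little plastic hippo tries to gobble up all your marbles?",
    vocabulary)
print(x)           # [1, 0, 1, 0, 0, 0, 0, 0]
print(x_attacked)  # [1, 0, 0, 0, 1, 1, 1, 1] -- "viagra" off, good words on
```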
Linear classifiers under attack
The adversary exploits some knowledge of
●   the features
●   the classifier's decision function

f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }
Weights: ωbuy = 0.5, ωviagra = 2.0, ωkid = -0.5, ωgame = -2.0, ω0 = -0.9

Buy viagra!        0.5 + 2.0 - 0.9 = 1.6 > 0: spam
Buy vi4gr4!        0.5 - 0.9 = -0.4 < 0: legitimate
Buy viagra! game   0.5 + 2.0 - 2.0 - 0.9 = -0.4 < 0: legitimate
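The slide's arithmetic, reproduced as a runnable sketch (the weights are the toy values above; words missing from the dictionary count as zero):

```python
# Toy linear spam filter with the slide's weights.
weights = {"buy": 0.5, "viagra": 2.0, "kid": -0.5, "game": -2.0}
bias = -0.9  # ω0

def score(words):
    """Linear discriminant: sum of the weights of the present words plus ω0."""
    return sum(weights.get(w, 0.0) for w in words) + bias

for msg in (["buy", "viagra"],           # 0.5 + 2.0 - 0.9 =  1.6 -> spam
            ["buy", "vi4gr4"],           # 0.5 - 0.9       = -0.4 -> legitimate
            ["buy", "viagra", "game"]):  # 0.5 + 2.0 - 2.0 - 0.9 = -0.4 -> legitimate
    s = score(msg)
    print(" ".join(msg), "->", s, "spam" if s > 0 else "legitimate")
```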
Linear classifiers under attack
A possible strategy to improve the robustness of linear classifiers: keep the weights as uniform as possible (Kolcz and Teo, 6th Conf. on Email and Anti-Spam, CEAS 2009)

f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 }
More uniform weights: ωbuy = 1.0, ωviagra = 1.5, ωkid = -1.0, ωgame = -1.5, ω0 = -0.9

Buy viagra!            1.0 + 1.5 - 0.9 = 1.6 > 0: spam
Buy vi4gr4!            1.0 - 0.9 = 0.1 > 0: spam
Buy viagra! game       1.0 + 1.5 - 1.5 - 0.9 = 0.1 > 0: spam
Buy viagra! kid game   1.0 + 1.5 - 1.0 - 1.5 - 0.9 = -0.9 < 0: legitimate
Ensembles of linear classifiers under attack
Do randomisation-based MCS techniques result in more uniform weights of the linear base classifiers? (See the sketch below.)
●   bagging
●   random subspace method (RSM)
●   ...
(accuracy-robustness trade-off)
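One way to probe this question: the average of linear base classifiers is itself a linear classifier, so the ensemble's effective weight vector can be inspected directly. A minimal sketch on synthetic data, with scikit-learn's LinearSVC assumed as a stand-in for the base learners (bagging is analogous, resampling examples instead of features):

```python
import numpy as np
from sklearn.svm import LinearSVC  # assumed base learner; any linear model works

rng = np.random.default_rng(0)
X = (rng.random((500, 50)) < 0.3).astype(float)   # synthetic binary features
w_true = rng.normal(size=50)
y = (X @ w_true + rng.normal(scale=0.5, size=500) > 0).astype(int)

def rsm_weights(X, y, n_clf=10, subset=0.5):
    """Random subspace method: train each base classifier on a random feature
    subset, embed its weights at the right positions, and average them.
    The averaged ensemble is again a linear discriminant."""
    N = X.shape[1]
    k = int(subset * N)
    W = np.zeros((n_clf, N))
    for i in range(n_clf):
        idx = rng.choice(N, size=k, replace=False)
        clf = LinearSVC(max_iter=10000).fit(X[:, idx], y)
        W[i, idx] = clf.coef_.ravel()
    return W.mean(axis=0)

w_single = LinearSVC(max_iter=10000).fit(X, y).coef_.ravel()
w_rsm = rsm_weights(X, y)
# Crude evenness check: fraction of total |ω| mass carried by the top 5 weights.
for name, w in (("single", w_single), ("RSM avg", w_rsm)):
    a = np.sort(np.abs(w))[::-1]
    print(name, round(a[:5].sum() / a.sum(), 3))
```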
Experimental setting (1)
●   Spam filtering task
●   TREC 2007 data set (20,000 out of > 75,000 e-mails, 2/3 spam)
●   Features: bag of words (binary word occurrence), > 360,000 features
●   Base linear classifiers: SVM, logistic regression
●   MCS
    ●   ensemble size: 3, 5, 10
    ●   bagging: 20%, 100% training samples
    ●   RSM: 20%, 50%, 80% feature subset sizes
●   5 runs
●   Evaluation of performance under attack: worst-case BWO/GWI attack, for m obfuscated/added words (m = “attack strength”); see the sketch below
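The worst-case BWO/GWI attack of strength m can be read as a greedy feature flip: knowing the weights, the adversary obfuscates present words with the largest positive weights and inserts absent words with the most negative weights, choosing at each step the flip that lowers the score most. A sketch of this reading, not the authors' exact implementation:

```python
import numpy as np

def worst_case_attack(x, w, m):
    """Flip at most m binary features of a spam vector x so as to minimise
    w.x: clear a present feature with w_i > 0 (bad word obfuscation) or set
    an absent feature with w_i < 0 (good word insertion)."""
    x = x.copy()
    # Score decrease obtained by flipping feature i.
    gain = np.where(x == 1, w, -w)
    gain[gain <= 0] = 0.0            # flips that would not lower the score
    for i in np.argsort(gain)[::-1][:m]:
        if gain[i] <= 0:
            break                    # no useful flips left
        x[i] = 1 - x[i]
    return x

w = np.array([0.5, 2.0, -0.5, -2.0])    # toy weights from the slides
x = np.array([1, 1, 0, 0])              # "buy viagra"
for m in range(3):
    xa = worst_case_attack(x, w, m)
    print(m, xa, float(xa @ w - 0.9))   # score including ω0 = -0.9
```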
Performance measure
[Figure: Receiver Operating Characteristic (ROC) curve, TP vs. FP; AUC10% is the area under the curve for FP ∈ [0, 0.1].]

TP = Prob [f(X) = Malicious | Y = Malicious]
FP = Prob [f(X) = Malicious | Y = Legitimate]
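AUC10% can be computed as the area under the ROC curve restricted to FP ≤ 0.1. A sketch with scikit-learn's roc_curve; the normalisation by 0.1 (so that a perfect detector scores 1) is our assumption, since the slide does not state it:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

def auc_10pct(y_true, scores, fp_max=0.1):
    """Partial AUC over FP in [0, fp_max], normalised by fp_max (assumption)."""
    fp, tp, _ = roc_curve(y_true, scores)   # fp is non-decreasing
    keep = fp <= fp_max
    fp_part = np.append(fp[keep], fp_max)
    tp_part = np.append(tp[keep], np.interp(fp_max, fp, tp))
    return auc(fp_part, tp_part) / fp_max

# Toy usage: 1 = malicious, higher score = more malicious.
y = np.array([0, 0, 1, 1, 0, 1])
s = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])
print(auc_10pct(y, s))
```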
Measure of weights uniformity
(Kolcz and Teo, 6th Conf. on Email and Anti-Spam, CEAS 2009)

F(K) = (sum of the absolute values of the K largest weights) / (sum of the absolute values of all weights), K = 1, ..., N

[Figure: F(K) vs. K — F(K) rises steeply towards 1 for the least uniform weights, and follows the diagonal F(K) = K/N for the most uniform ones.]
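As a runnable counterpart, a minimal numpy sketch of F(K) over all K:

```python
import numpy as np

def weight_evenness(w):
    """F(K) = (sum of the K largest |ω_i|) / (sum of all |ω_i|), K = 1..N.
    F rises steeply for peaked weight vectors; F(K) = K/N when uniform."""
    a = np.sort(np.abs(w))[::-1]
    return np.cumsum(a) / a.sum()

print(weight_evenness(np.array([0.5, 2.0, -0.5, -2.0])))  # peaked:  [0.4 0.8 0.9 1.]
print(weight_evenness(np.ones(4)))                        # uniform: [0.25 0.5 0.75 1.]
```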
Results (1)
[Plot: performance under attack vs. number of obfuscated/added words.]
Experimental setting (2)
●   SpamAssassin
●   About N = 900 Boolean “tests” x1, x2, ..., xN, xi ∈ {0,1}
●   Decision function:
    f(x) = sign { ω1x1 + ω2x2 + ... + ωNxN + ω0 },
    f(x) = +1: spam; f(x) = -1: legitimate
●   Default weights: machine learning + manual tuning
●   Evaluation of performance under attack: evasion of the worst m tests (m = “attack strength”); see the sketch below
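One plausible reading of this evasion, sketched below with invented test weights: the spammer rewrites the message so that the m triggered tests with the largest positive weights no longer fire (the bias ω0 is omitted from the printed score for brevity):

```python
import numpy as np

def evade_worst_tests(x, w, m):
    """Turn off the m active tests with the largest positive weights,
    simulating a spammer rewriting the message to evade exactly those tests."""
    x = x.copy()
    active = np.where((x == 1) & (w > 0))[0]           # tests that hurt the spammer
    worst = active[np.argsort(w[active])[::-1]][:m]    # most damaging first
    x[worst] = 0
    return x

w = np.array([3.0, 1.2, -0.5, 2.1, 0.4])   # invented SpamAssassin-like weights
x = np.array([1, 1, 1, 1, 0])              # tests fired by a spam message
for m in range(4):
    xa = evade_worst_tests(x, w, m)
    print(m, xa, float(xa @ w))
```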
Results (2)
[Plot: performance under attack vs. number of evaded tests.]
Conclusions
●   Adversarial classification: which roles can MCSs play?

●   This work:
    ●   linear classifiers
    ●   attacks based on some knowledge of the features and the decision function (case study: spam filtering)

●   Future work: investigating MCSs on different applications, base classifiers, kinds of attacks, ...
