Higher Order Learning William M. Pottenger, Ph.D. Rutgers University ARO Workshop
Outline
IID Assumption in Machine Learning
Statistical Relational Learning (SRL)
Some Related Work in SRL
Reasoning by Abductive Inference
Gathering Evidence. [Figure: migraine is linked to stress, CCB, PA and SCD, and each of these is linked to magnesium.] Slide reused with permission of Marti Hearst @ UCB
A Higher Order Co-Occurrence Relation! [Figure: migraine and magnesium connected through stress, CCB, PA and SCD.] No single author knew/wrote about this connection… this distinguishes Text Mining from Information Retrieval. Slide reused with permission of Marti Hearst @ UCB
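The migraine and magnesium slides above can be read as a search for linking terms: terms that co-occur with both endpoints even though the endpoints never co-occur directly. The following is a minimal sketch of that idea, not the procedure used in the original study. The `titles` list is invented toy data, and the helper names `cooccurrence_index` and `linking_terms` are hypothetical.

```python
from collections import defaultdict

# Toy "article titles" as sets of terms. Invented for illustration only;
# this is not Medline data.
titles = [
    {"migraine", "stress"},
    {"migraine", "CCB"},
    {"migraine", "PA"},
    {"migraine", "SCD"},
    {"stress", "magnesium"},
    {"CCB", "magnesium"},
    {"PA", "magnesium"},
    {"SCD", "magnesium"},
]


def cooccurrence_index(contexts):
    """Map each term to the set of terms it shares at least one context with."""
    index = defaultdict(set)
    for terms in contexts:
        for term in terms:
            index[term] |= terms - {term}
    return index


def linking_terms(contexts, a, c):
    """Return terms that co-occur with both a and c when a and c never co-occur."""
    index = cooccurrence_index(contexts)
    if c in index[a]:
        return set()  # a and c already co-occur directly (first order)
    return index[a] & index[c]


print(sorted(linking_terms(titles, "migraine", "magnesium")))
# ['CCB', 'PA', 'SCD', 'stress']
```

No single toy title mentions both migraine and magnesium, yet the shared linking terms connect them, which is the second-order relation the slide illustrates.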
Uses of Higher-order Co-occurrence Relations
Is there a theoretical basis for the use of higher order co-occurrence relations?
Is there a theoretical basis for the use of higher order co-occurrence relations in LSI?
What role do higher-order relations play in supervised machine learning? [Figure: An example of a fourth-order path between e1 and e5, as well as several shorter paths]
Preliminary Results – Supervised ML dataset

Fold  t Stat   P(T<=t) one-tail  t Critical one-tail  P(T<=t) two-tail  t Critical two-tail
0     -2.684   0.0037            1.6471               0.0074            1.9634
1     -1.357   0.0875            1.6467               0.1751            1.9629
2     -1.554   0.0603            1.6468               0.1205            1.9629
3     -2.924   0.0018            1.6472               0.0036            1.9636
4     -1.908   0.0284            1.6469               0.0568            1.9631
5     -2.047   0.0205            1.6469               0.041             1.9631
6     -1.455   0.073             1.6467               0.146             1.9629
7     -2.023   0.0217            1.6469               0.0434            1.9631
8     -2.795   0.0027            1.6471               0.0053            1.9635
9     -2.71    0.0034            1.647                0.0069            1.9633

Ganiz, M., Pottenger, W.M. and Yang, X. (2006). Link Analysis of Higher-Order Paths in Supervised Learning Datasets. In Proceedings of the Workshop on Link Analysis, Counterterrorism and Security, 2006 SIAM Conference on Data Mining, Bethesda, MD, April
What role do higher-order relations play in supervised machine learning?

ID  Attribute        Definition
1   Announce         # of BGP announcements
2   Withdrawal       # of BGP withdrawals
3   Update           # of BGP updates (= Announce + Withdrawal)
4   Announce Prefix  # of announced prefixes
5   Withdraw Prefix  # of withdrawn prefixes
6   Updated Prefix   # of updated prefixes (= Announce Prefix + Withdraw Prefix)
Preliminary Results – BGP dataset

Event 1   Event 2   t-test results
Slammer   Witty     0.00023
Blackout  Witty     0.00016
Slammer   Blackout  0.018

Ganiz, M., Pottenger, W.M., Kanitkar, S., Chuah, M.C. (2006b). Detection of Interdomain Routing Anomalies Based on Higher-Order Path Analysis. Proceedings of the Sixth IEEE International Conference on Data Mining (ICDM’06), December 2006, Hong Kong, China
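The editor's note on the BGP results describes sliding a window over the traffic and watching a t-test against an event model. The sketch below illustrates that general idea under loud assumptions: it uses Welch's t statistic with a fixed cutoff as a stand-in for the t-test probability and its 5% level, all numbers are invented, and the names `welch_t` and `matching_windows` are hypothetical. It is not the procedure from the cited paper.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (variances not assumed equal)."""
    se2 = variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    diff = mean(sample_a) - mean(sample_b)
    if se2 == 0:
        # Both samples are constant: identical means give 0, otherwise +/- infinity.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(se2)


def matching_windows(stream, event_model, size, critical=2.0):
    """Start indices of windows that the statistic cannot separate from the model.

    `critical` is an assumed cutoff playing the role of the 5% level.
    """
    hits = []
    for start in range(len(stream) - size + 1):
        window = stream[start:start + size]
        if abs(welch_t(window, event_model)) < critical:
            hits.append(start)
    return hits


# Invented per-interval update counts: a quiet period followed by an event.
normal = [10, 12, 11, 9, 10, 11, 12, 10]
event = [50, 55, 48, 52, 51, 49, 53, 50]
event_model = [51, 49, 54, 50, 52, 48]  # reference sample for the known event

print(matching_windows(normal + event, event_model, size=6))
```

Windows that lie entirely in the quiet period are rejected, while windows that contain enough event instances start to match the event model.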
Preliminary Results – Naïve Bayes on Higher-order Paths
What role do higher-order relations play in unsupervised machine learning?
Higher Order Apriori: Approach
Higher Order Apriori: Results

Itemsets discovered
  Apriori: {AY, X} {X, K} {K, Q}
  Indirect: {AY, K} {X, Q} + Apriori itemsets
  Higher Order Apriori: {AY, Q} + Indirect itemsets + Apriori itemsets
Itemsets undiscovered: {AY, K} {X, Q} {AY, Q}

Shaver : Women’s Pantyhose relationship
  Apriori: (Donna Karan’s Extra Thin Pantyhose, Wet/Dry Shaver)
  Indirect: (Berkshire’s Ultra Nudes Pantyhose, Epilady Wet/Dry Shaver)
  Higher-order Apriori: (Donna Karan’s Pantyhose, Epilady Wet/Dry Shaver)

Shaver : Lotion/Cream
  Apriori: No
  Indirect: No
  Higher Order Apriori: (Pedicure Care Kit, Toning Lotion), (wet/dry shaver, Herb Lotion), (Pedicure Care Kit, Leg Cream)

foot cream – women’s socks
  Apriori: No
  Indirect: No
  Higher-order Apriori: (foot cream, Women's Ultra Sheer Knee High), (foot cream, women’s cotton dog sock)
Conclusions
Thanks
Higher-order Co-occurrence: 3rd order co-occurrence as a chain of co-occurrences (Kontostathis & Pottenger, 2006). [Figure: example chain; legend: context (document, instance, record, …); entity, term, AVP, item, …]
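A chain of co-occurrences like the one on this slide can be enumerated with a small search. The sketch below is illustrative only. The contexts D1 to D3 and entities A to D follow the example described in the editor's notes, while the rule that a path may not reuse an entity or a context is an assumption, and the names `link_table` and `paths_of_order` are hypothetical.

```python
from itertools import combinations

# Each context (document, record, instance) holds a set of entities.
contexts = {"D1": {"A", "B"}, "D2": {"B", "C"}, "D3": {"C", "D"}}


def link_table(contexts):
    """Map each ordered entity pair to the contexts in which the pair co-occurs."""
    links = {}
    for name, entities in contexts.items():
        for a, b in combinations(sorted(entities), 2):
            links.setdefault((a, b), set()).add(name)
            links.setdefault((b, a), set()).add(name)
    return links


def paths_of_order(contexts, start, end, order):
    """Find paths from start to end that use exactly `order` co-occurrence links.

    Assumption: a path never revisits an entity or reuses a context.
    """
    links = link_table(contexts)
    found = []

    def extend(entities, used):
        current = entities[-1]
        if len(used) == order:
            if current == end:
                found.append((tuple(entities), tuple(used)))
            return
        for (a, b), names in links.items():
            if a != current or b in entities:
                continue
            for name in sorted(names):
                if name not in used:
                    extend(entities + [b], used + [name])

    extend([start], [])
    return found


print(paths_of_order(contexts, "A", "D", 3))
# [(('A', 'B', 'C', 'D'), ('D1', 'D2', 'D3'))]
```

Entities A and D share no context, but three links (A to B in D1, B to C in D2, C to D in D3) connect them, giving a third-order path.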

Editor's Notes

  1. IID (Taskar et al., 2002): test instances are related to each other and their labels are not independent! (Lu & Getoor, 2003; Jensen, 1999): traditional statistical inference assumes that instances are independent, which can lead to inappropriate conclusions. (Lu & Getoor, 2003): “traditional data mining tasks such as association rule mining, market basket analysis, and cluster analysis commonly attempt to find patterns in a dataset characterized by a collection of independent instances of a single relation. This is consistent with the classical statistical inference problem of trying to identify a model given a random sample from a common underlying distribution.” Latent semantics are important for Information Retrieval (IR) and Text Mining applications. For example, the latent aspects of term similarity that LSI reveals depend on the higher-order paths between terms.
  2. Explicit links: e.g., hyperlinks between web pages or citation links between scientific papers. In the collective classification phase, the model is applied to a separate network (a set of unlabeled test instances with links). Collective inference: making inferences about multiple data instances simultaneously. Collective inference can significantly reduce classification error (Jensen et al., 2004). The basic idea in these iterative algorithms is to start with a labeling of reasonable quality (from a content-only classifier) and refine it using a coupled distribution of content and labels of neighbors.
  3. Several simple link attributes are constructed from statistics computed over the categories of the different sets of linked objects: mode-link, a single attribute computed from the in-links, out-links, and co-citation links; count-link, which is the frequency of the classes of linked instances; and binary-link, a simple binary feature vector in which, for each class label, the corresponding feature is 1 if a link to an instance of that class occurs at least once. To determine a label (dependent variable) c in {-1,+1} given an input vector (explanatory variable) x, P(c=1|w,x), find the optimal w for the discriminative function. CORA: 4187 machine learning papers, 7 classes, dictionary of 1400 words after stemming and stopword removal. WEBKB: web pages from four computer science departments; 4 topics plus “others”, 5 in total; 700 pages without “others”.
  4. (Edmond, 1997) The application described selects the most appropriate term when a context (such as a sentence) is provided. (Schütze, 1998) Second-order co-occurrence of the terms in the training set is used to create context vectors that represent a specific sense of a word to be discriminated. (Xu & Croft, 1998) A strong correlation between terms A and B, and also between terms B and C, will result in the placement of terms A, B, and C into the same equivalence class. The result is a transitive semantic relationship between A and C. Orders of co-occurrence higher than two are also possible in this application. (LSI) Latent Semantic Indexing, a well-known approach to information retrieval, implicitly depends on higher-order co-occurrences. (LSI) Previous work demonstrated empirically that higher-order co-occurrences play a key role in the effectiveness of systems based on LSI. (Zhang, 2000) A related effort used second-order co-occurrences to improve the runtime performance of LSI. (LBD) Literature-based discovery employs second-order co-occurrence to discover connections between concepts (entities). (LBD) A well-known example is the discovery of a novel migraine-magnesium connection in the medical domain. The researchers found that in the Medline database some terms co-occur frequently with “migraine” in article titles, e.g. “stress” and “calcium channel blockers.” They also discovered that “stress” co-occurs frequently with “magnesium” in other titles. As a result, they hypothesized a link between “migraine” and “magnesium,” and some clinical evidence has been obtained that supports this hypothesis.
  5. Preliminary conclusion: frequency distributions of higher-order itemsets capture distinguishing characteristics of the classes in supervised machine learning datasets.
  6. Cybersecurity: Abnormal BGP events often affect the global routing infrastructure. For example, in January 2003, the Slammer worm caused a surge of BGP updates. Since BGP anomaly events often cause major disruptions in the Internet, the ability to detect and categorize BGP events is extremely useful. Our aim is to distinguish whether Border Gateway Protocol (BGP) traffic is caused by an anomalous event such as a power failure, a worm attack or a node/link failure. This differs from the mushroom dataset because the attributes are integer valued.
  7. For the Slammer worm and Blackout events, the t-test probability starts increasing as the sliding window approaches the 25th window. When the number of abnormal event instances inside the current window exceeds a certain threshold (around the 21st to 23rd window), we observe a sharp increase. After the 25th window, the probability stays above 5%, revealing that we are in the event period and have detected and distinguished both the Slammer and Blackout events using their respective event models. These events can be detected and distinguished in 360 seconds or less. Results are similar for the Witty worm event in figure 6, although the detection takes slightly longer.
  8. This figure depicts three documents, D1, D2 and D3, each containing two terms, or entities, represented by the letters A, B, C and D. Below, the three documents form a higher-order path that links entity A with entity D through B and C. This is a third-order path since three links, or “hops,” connect A and D. D1, D2 and D3 are not always documents; they might be records in a database or instances in a labeled training dataset. Likewise, the entities A, B, C etc. need not be terms; they may be values in a database record, or items (attribute-value pairs) in an instance. In fact, co-occurrence relations can be extracted as long as there is a meaningful context for the entities.
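Note 3 lists three link attributes (mode-link, count-link, binary-link) and a logistic model for P(c=1|w,x). The sketch below shows one way these could be computed for a single set of linked objects. The class names in `CLASSES`, the tie-breaking rule in `mode_link`, and all function names are assumptions made for illustration; the cited work computes such attributes separately for in-links, out-links and co-citation links.

```python
import math
from collections import Counter

# Hypothetical class labels; the real datasets use their own categories.
CLASSES = ["theory", "neural_nets", "rule_learning"]


def count_link(neighbor_labels):
    """count-link: how many linked objects fall in each class."""
    counts = Counter(neighbor_labels)
    return [counts[c] for c in CLASSES]


def binary_link(neighbor_labels):
    """binary-link: 1 if at least one linked object has the class, else 0."""
    return [1 if n > 0 else 0 for n in count_link(neighbor_labels)]


def mode_link(neighbor_labels):
    """mode-link: the most frequent class among linked objects.

    Assumption: ties go to the class listed first in CLASSES.
    """
    counts = count_link(neighbor_labels)
    if sum(counts) == 0:
        return None
    return CLASSES[counts.index(max(counts))]


def prob_positive(w, x):
    """Logistic model P(c=1 | w, x) = 1 / (1 + exp(-w.x))."""
    return 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))


labels = ["neural_nets", "theory", "neural_nets"]
print(count_link(labels))   # [1, 2, 0]
print(binary_link(labels))  # [1, 1, 0]
print(mode_link(labels))    # neural_nets
```

A weight vector w fitted on such features would then be passed to `prob_positive` together with the feature vector x of an unlabeled instance.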