Much prior work has shown the practical value of modeling random variables as IID in order to simplify statistical inference, yet prior work has also shown this assumption to be suboptimal in terms of model performance. Occam’s razor prompts us to simplify explanations, and this talk will present how a very simple transform has been leveraged to improve performance of both generative and discriminative learners, as well as unsupervised learning, in a number of application domains including differentially private community discovery.
Intuidex - To be or not to be iid by William M. Pottenger (NYC Machine Learning Group)
1. William M. Pottenger, Ph.D.
All Rights Reserved
To be or not to be IID:
That is the Question
Higher Order Learning
William M. Pottenger, Ph.D.
Rutgers University and Intuidex, Inc.
DrWMP@rci.rutgers.edu; www.dimacs.rutgers.edu/~billp
DrWMPottenger@Intuidex.com; www.intuidex.com
2. William M. Pottenger, Ph.D.
Dr. William M. Pottenger
www.dimacs.rutgers.edu/~billp
www.intuidex.com
• Example Application Areas
– Homeland Security/Law
Enforcement/Criminal Justice
Information Systems
– Decision Support Systems
– Information Retrieval Systems
– High Performance Computing
• Research Funded by
– National Science Foundation
– National Institute of Justice
– Department of Homeland Security
– Army Research Lab
– Commonwealth of Pennsylvania
– Corporate Partners
– E.g., Lockheed-Martin, Kodak,
PNNL, Boeing, etc.
• Associate Research Professor
@ Rutgers University
– DIMACS & Computer Science
• CEO of Intuidex, Inc.
• Director of Transition for
DHS S&T CCI Center
• Research Scientist @ NCSA
• M.S., Ph.D. in CS at UIUC
• Research Interests
– Statistical Relational Learning
– Leveraging higher-order
relations in graphs of data
– Parallel and Distributed Visual
& Data Analytics
– Analytics in a parallel and/or
distributed environment
– Information Extraction
– Automatic extraction of
keywords/features from text
3. William M. Pottenger, Ph.D.
What is Higher Order Information?
• Swanson (‘91) posed problem: Migraine headaches (M)
– stress associated with M
– stress leads to loss of magnesium
– calcium channel blockers prevent some M
– magnesium is a natural calcium channel blocker
– spreading cortical depression (SCD) implicated in M
– high levels of magnesium inhibit SCD
– M patients have high platelet aggregability
– magnesium can suppress platelet aggregability
• All extracted from medical journal titles
Slide reused with permission of Marti Hearst @ UCB
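The chain of evidence listed on this slide can be sketched in code. This is a minimal sketch, assuming each title-derived statement is encoded as an undirected link; the `links` list and the `second_order_paths` helper are illustrative names, not part of the talk. It lists the intermediate concepts that connect migraine to magnesium even though no single statement links the two directly.

```python
from collections import defaultdict

# Undirected links paraphrased from the slide's title-derived statements.
links = [
    ("stress", "migraine"),
    ("stress", "magnesium"),
    ("CCB", "migraine"),
    ("magnesium", "CCB"),
    ("SCD", "migraine"),
    ("magnesium", "SCD"),
    ("PA", "migraine"),
    ("magnesium", "PA"),
]

def second_order_paths(links, source, target):
    """Return the intermediate nodes b such that source-b and b-target are both links."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(neighbors[source] & neighbors[target])

# Every returned node is the middle of one second-order path migraine - node - magnesium.
bridges = second_order_paths(links, "migraine", "magnesium")
print(bridges)
```

Each returned node is the midpoint of one second-order path, which is the kind of indirect evidence the talk calls higher order information.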
4. William M. Pottenger, Ph.D.
Gathering Evidence
[Diagram: migraine is linked to stress, CCB (calcium channel blockers), PA (platelet aggregability) and SCD; each of these is in turn linked to magnesium]
Slide reused with permission of Marti Hearst @ UCB
5. William M. Pottenger, Ph.D.
Higher Order Paths!
[Diagram: migraine and magnesium connected through the intermediate nodes stress, CCB, PA and SCD]
Slide reused with permission of Marti Hearst @ UCB
6. William M. Pottenger, Ph.D.
Related Work:
Link Mining and Collective Classification
Link-based approaches (Taskar et al., 2001; Getoor and
Diehl, 2005; Lu and Getoor, 2003; Neville and Jensen
2004) to collective classification use explicit link
information within networked data
Studies (Chakrabarti et al., 1998; Neville and Jensen,
2000; Taskar et al., 2001) have shown that collective
classifiers can achieve significant reductions in
classification errors by performing inference about
multiple data instances simultaneously
Collective classifiers are context-dependent and are not
designed to classify stand-alone data instances
We propose classification methods that leverage implicit
links between features in small training sets, and that
maintain the ability for “context-free” classification of
individual data instances
7. William M. Pottenger, Ph.D.
Is there a theoretical basis for the use
of higher order co-occurrence relations?
• Research agenda: study machine learning
algorithms in search of a theoretical foundation
for the use of higher order relations
• First algorithm: Latent Semantic Indexing (LSI)
– Widely used technique in text mining and IR based on
the Singular Value Decomposition (SVD) matrix
factoring algorithm
– Research question: Does LSI use higher order term
co-occurrence?
– First step: study SVD
April Kontostathis
Associate Professor
@ Ursinus College
8. William M. Pottenger, Ph.D.
Is there a theoretical basis for the use of
higher order co-occurrence relations in LSI?
Singular Value Decomposition
A (m x n) = T (m x r) S (r x r) D^T (r x n)
A: term by doc; T: term by dimension; S: diagonal matrix of singular values s1, s2, s3, ..., sr; D^T: dimension by document
s1 >= s2 >= s3 >= . . . >= sr
r = rank of A, m = num terms, n = number docs
9. William M. Pottenger, Ph.D.
Is there a theoretical basis for the use of
higher order co-occurrence relations in LSI?
LSI: Truncation of Singular Values
Reduced A (m x n) = T (m x k) S (k x k) D^T (k x n)
Reduced A: reduced term by doc; T: term by dimension; S: diagonal matrix of the k largest singular values; D^T: dimension by document
s1 >= s2 >= s3 >= . . . >= sr
r = rank of A, m = num terms, n = number docs
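The truncation step can be illustrated with NumPy. This is a minimal sketch under stated assumptions: the 4-term by 3-document matrix is made-up toy data and `truncate_svd` is a hypothetical helper name. It keeps the k largest singular values and rebuilds the reduced term by document matrix.

```python
import numpy as np

def truncate_svd(A, k):
    """Return the rank-k approximation of A and the full list of singular values."""
    # numpy returns the singular values in descending order.
    T, s, Dt = np.linalg.svd(A, full_matrices=False)
    A_k = T[:, :k] @ np.diag(s[:k]) @ Dt[:k, :]
    return A_k, s

# Toy 4-term by 3-document count matrix (illustrative values only).
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

A_k, s = truncate_svd(A, k=2)
print(np.round(s, 3))               # singular values, largest first
print(np.linalg.matrix_rank(A_k))   # rank of the truncated matrix
print(np.round(A_k, 2))             # entries that were 0 in A may now be non-zero
```

The truncated matrix generally has non-zero entries where the original had zeros, which is the effect the talk attributes to higher order co-occurrence.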
10. William M. Pottenger, Ph.D.
Is there a theoretical basis for the use of
higher order co-occurrence relations in LSI?
Rows and columns, in order: human, interface, computer, user, system, response, time, EPS, Survey, trees, graph, minors
human x 1 1 0 2 0 0 1 0 0 0 0
interface 1 x 1 1 1 0 0 1 0 0 0 0
computer 1 1 x 1 1 1 1 0 1 0 0 0
user 0 1 1 x 2 2 2 1 1 0 0 0
system 2 1 1 2 x 1 1 3 1 0 0 0
response 0 0 1 2 1 x 2 0 1 0 0 0
time 0 0 1 2 1 2 x 0 1 0 0 0
EPS 1 1 0 1 3 0 0 x 0 0 0 0
Survey 0 0 1 1 1 1 1 0 x 0 1 1
trees 0 0 0 0 0 0 0 0 0 x 2 1
graph 0 0 0 0 0 0 0 0 1 2 x 2
minors 0 0 0 0 0 0 0 0 1 1 2 x
Deerwester Term by Term Matrix
Rows and columns, in order: human, interface, computer, user, system, response, time, EPS, Survey, trees, graph, minors
human x 0.54 0.56 0.94 1.69 0.58 0.58 0.84 0.32 -0.32 -0.34 -0.25
interface 0.54 x 0.52 0.87 1.50 0.55 0.55 0.73 0.35 -0.20 -0.19 -0.14
computer 0.56 0.52 x 1.09 1.67 0.75 0.75 0.77 0.63 0.15 0.27 0.20
user 0.94 0.87 1.09 x 2.79 1.25 1.25 1.28 1.04 0.23 0.42 0.31
system 1.69 1.50 1.67 2.79 x 1.81 1.81 2.30 1.20 -0.47 -0.39 -0.28
response 0.58 0.55 0.75 1.25 1.81 x 0.89 0.80 0.82 0.38 0.56 0.41
time 0.58 0.55 0.75 1.25 1.81 0.89 x 0.80 0.82 0.38 0.56 0.41
EPS 0.84 0.73 0.77 1.28 2.30 0.80 0.80 x 0.46 -0.41 -0.43 -0.31
Survey 0.32 0.35 0.63 1.04 1.20 0.82 0.82 0.46 x 0.88 1.17 0.85
trees -0.32 -0.20 0.15 0.23 -0.47 0.38 0.38 -0.41 0.88 x 1.96 1.43
graph -0.34 -0.19 0.27 0.42 -0.39 0.56 0.56 -0.43 1.17 1.96 x 1.81
minors -0.25 -0.14 0.20 0.31 -0.28 0.41 0.41 -0.31 0.85 1.43 1.81 x
Deerwester Term by Term Matrix, truncated to two dimensions
11. William M. Pottenger, Ph.D.
• Answer is in the following theorem we proved:
If the ijth element of the truncated term by
term matrix, Y, is non-zero, then there exists a
co-occurrence path of order ≥ 1 between terms
i and j.
– Kontostathis, A. and Pottenger, W. M. (2006) A
Framework for Understanding LSI Performance.
Information Processing & Management, volume 42,
issue 1, pages 56-73.
• We have both proven mathematically and
demonstrated empirically that LSI is based on
the use of higher order co-occurrence relations.
• Next step?
Is there a theoretical basis for the use of
higher order co-occurrence relations in LSI?
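The statement above can be checked numerically on a toy example. This is a sketch, not the construction used in the cited paper: the 5-term by 4-document matrix is invented, and Y is assumed to be the term by term matrix obtained from the k largest singular values. Terms b and c never share a document but both co-occur with a, while terms d and e live in a disconnected set of documents.

```python
import numpy as np

# Terms a, b, c occur only in documents 0-1; terms d, e occur only in documents 2-3.
# b and c never share a document, but both co-occur with a (a second-order path).
A = np.array([[1.0, 1.0, 0.0, 0.0],   # a
              [1.0, 0.0, 0.0, 0.0],   # b
              [0.0, 1.0, 0.0, 0.0],   # c
              [0.0, 0.0, 2.0, 1.0],   # d
              [0.0, 0.0, 0.0, 1.0]])  # e

T, s, _ = np.linalg.svd(A, full_matrices=False)
k = 2
# Truncated term by term matrix built from the k largest singular values.
Y = T[:, :k] @ np.diag(s[:k] ** 2) @ T[:, :k].T

first_order = A @ A.T            # direct co-occurrence counts
print(first_order[1, 2])         # b and c: no direct co-occurrence
print(round(Y[1, 2], 6))         # b and c: related in the truncated matrix
print(round(abs(Y[1, 3]), 6))    # b and d: no path of any order
```

In this example the pair with only a second-order path receives a non-zero value in Y, and the pair with no path at all stays at zero up to floating-point error, which is consistent with the theorem.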
12. William M. Pottenger, Ph.D.
Using Higher Order Information in both
Generative and Discriminative Learning
• Extend the theoretical foundation that April and
I developed by studying characteristics of
higher-order information in other machine
learning approaches including both generative and
discriminative supervised learning as well as
unsupervised approaches
– Ganiz, M. C., Lytkin, N. I. and Pottenger, W. M.
(2009) Leveraging Higher Order Dependencies Between
Features for Text Classification. In the Proceedings
of the European Conference on Machine Learning and
Principles and Practice of Knowledge Discovery in
Databases (ECML PKDD). Bled, Slovenia, September.
Nikita Lytkin
Research Scientist @
NYU Medical Center
Murat Ganiz
Assistant Professor
@ Dogus University
13. William M. Pottenger, Ph.D.
Representation of Boolean
Data by a Bipartite Graph
14. William M. Pottenger, Ph.D.
Multinomial vs. Multivariate Event Model
McCallum & Nigam (1998)
15. William M. Pottenger, Ph.D.
First Order Paths in a Data Graph
16. William M. Pottenger, Ph.D.
Second Order Paths in a Data Graph
17. William M. Pottenger, Ph.D.
Patterns of Connectivity between Features
18. William M. Pottenger, Ph.D.
Probabilistic Characterization of Features by
Second Order Paths
19. William M. Pottenger, Ph.D.
Higher Order Naïve Bayes:
A Generative Learner
Murat Ganiz
Assistant Professor
@ Dogus University
20. William M. Pottenger, Ph.D.
Slonim & Tishby (2001) vs. HONB
Ganiz, M. C., Pottenger, W. M. and George, C. (2010) Higher Order
Naïve Bayes: A Novel Non-IID Approach to Text Classification. IEEE
Transactions on Knowledge and Data Engineering (TKDE).
               multinomial features              binary features
Dataset        NB      NB_wc   improvement %     NB      HONB    improvement %
COMP (5)       0.473   0.508   7.4               0.51    0.65    26.5
SCIENCE (4)    0.65    0.725   11.5              0.6     0.84    41.6
POLITICS (3)   0.62    0.67    8.1               0.68    0.83    22.8
RELIGION (3)   0.525   0.553   5.3               0.64    0.74    15.7
Average                        8.075                             26.65
HONB achieves statistically significantly better performance
than NB for four datasets based on t-test results
(Slonim & Tishby, 2001) did not report std dev or t-test results
21. William M. Pottenger, Ph.D.
Supervised Second Order Transformation
for Discriminative Learning
Nikita Lytkin Research
Scientist @ NYU Medical
Center
22. William M. Pottenger, Ph.D.
Influence of Higher-Order Paths
23. William M. Pottenger, Ph.D.
Experimental Setup
Support Vector Machine (Vapnik 1998) was
used to evaluate the Supervised Second Order
Transformation
Multi-class classification by SVM was
performed using the “one-against-one”
scheme
Used RBF and linear kernels in SVM and
varied soft margin cost from 10^-4 to 10^4
Training set size varied from 5% to 60%
Eight experiments performed at each sample
size
24. William M. Pottenger, Ph.D.
Six benchmark text corpora were selected
Stop words were removed, others were stemmed
For the RELIGION, POLITICS, SCIENCE and COMP
subsets of the 20 Newsgroups dataset, the top 2000
terms ranked by Information Gain were selected;
500 documents per class were sampled at random for
comparison with Slonim and Tishby (2001)
Experimental Setup (continued)
Dataset # classes total # docs # terms
RELIGION 3 1500 2000
POLITICS 3 1500 2000
SCIENCE 4 2000 2000
COMP 5 2500 2000
Citeseer 6 3312 3703
Cora 6 2708 1433
25. William M. Pottenger, Ph.D.
Scalability Across Training Set Sizes
26. William M. Pottenger, Ph.D.
Results for Naïve Bayes, SVM, HONB and
HOSVM on 20NG REL & SCI Datasets
27. William M. Pottenger, Ph.D.
Results for Naïve Bayes, SVM, HONB and
HOSVM on Citeseer & Cora Datasets
28. William M. Pottenger, Ph.D.
Significance of Results for Naïve Bayes,
SVM, HONB and HOSVM on All Datasets
HONB consistently and statistically significantly outperformed NB
on all datasets (significant at <= 5% p-value)
HOSVM outperformed SVM on the RELIGION, POLITICS and
SCIENCE datasets (significant at <= 5% p-value)
Although the difference between HOSVM and SVM on the COMP
dataset was significant only at the 0.158 level, HOSVM outperformed
SVM on seven out of eight trials by an average of 3%
29. William M. Pottenger, Ph.D.
What role do higher-order relations play in
supervised machine learning?
• Higher-Order Collective Classification (HOCC)
– Classifies a set of instances simultaneously and thus exploits the
relationships between them; Based on a record-relation graph
– Capable of both supervised event detection as well as
unsupervised anomaly detection
• Application: Classification and Anomaly Detection of
Interdomain Routing Events
– Goal: detect and categorize such events
– Menon, V. and Pottenger, W. M. (2009) A Higher Order
Collective Classifier for Detecting and Classifying Network
Events. In the Proceedings of the IEEE International Conference
on Intelligence and Security Informatics 2009 (ISI 2009)
Vikas Menon
Software Developer @
Bridgewater Associates
30. William M. Pottenger, Ph.D.
HOCC Results
• Detection of Interdomain Routing Events and Anomalies Based on
Higher-Order Path Analysis
– Slammer worm attack, Witty worm attack, 2003 East Coast Blackout
• Real Time Classification of Abnormal Events
– Sliding window samples of 120 three-second instances
– 180th window = start of event
– HOCC detects events and distinguishes anomalies
Witty (Supervised) Witty (Unsupervised)
31. William M. Pottenger, Ph.D.
What role do higher-order relations play in
unsupervised machine learning?
• Next step? Consider unsupervised learning…
– Association Rule Mining (ARM)
• ARM is one of the most widely used algorithms in
data mining
– Extend ARM to higher order… Higher Order
Apriori
• LHOIM (Latent Higher-Order Information Mining)
• Experiments confirm the value of Higher Order
Apriori on real world e-marketplace data
Shenzhi Li
Senior Software Engineer
@ Ask (Ask.com)
32. William M. Pottenger, Ph.D.
LHOIM Results on 20NG Computer Dataset
• Average error rate for 1st-order (top left) and 2nd-order (top right)
• Average stdev for 1st-order (bottom left) and 2nd-order (bottom right)
Li, S. Z., Wu, T., and Pottenger, W. M. (2005) Distributed Higher
Order Association Rule Mining Using Information Extracted from
Textual Data. SIGKDD Explorations, volume 7, issue 1, pages 26-35.
33. Higher Order Graph Sampling on Reuters
[Figure: four panels plotting accuracy in % (0 to 70) against training sample % (1 to 10) for three methods: Naïve Bayes with random sampling, Higher Order Naïve Bayes with random sampling, and Higher Order Naïve Bayes with higher order sampling]
Higher Order Naïve Bayes improves the accuracy by at least 10%
Higher Order Naïve Bayes with Higher Order Sampling gives even better results
Patterns can be discovered using a much smaller sample, which is important for online learning
Cibin George
M.S. in CS @ Rutgers
34. William M. Pottenger, Ph.D.
Higher Order (Online)
Latent Dirichlet Allocation
Intuitively, this formula can be interpreted as
a word being assigned to a topic proportional
to its frequency of occurrence in that topic.
This is in fact, our guiding intuition and we
simply replace these term frequencies with
higher order frequencies.
Nir Grinberg
Ph.D. in CS
@ Rutgers
Kashyap Kolipaka
Ph.D. in CS @
Rutgers
Christie Nelson
Ph.D. at RUTCOR
@ Rutgers
35. William M. Pottenger, Ph.D.
Modeling Social Media for Emergency
Response in Port-au-Prince, Haiti
Cluster Geolocation
36. William M. Pottenger, Ph.D.
Modeling Social Media for Emergency
Response in Port-au-Prince, Haiti
Cluster Geolocation with predicted resource
37. William M. Pottenger, Ph.D.
Research Futures: Privacy-Enhanced
Higher Order Community Partitioning
Let I(i,j) be 1 if vertices i and j are in the same
community (social network), and 0 otherwise, then
Newman’s Q-Modularity is defined as:
$$ Q = \sum_{i,j} \left( A_{ij} - \frac{d_i d_j}{2m} \right) I(i,j) = \sum_{i,j} \left( A_{ij} - P_{ij} \right) I(i,j) $$
Generalization
$$ Q_l = \frac{1}{l} \sum_{k=1}^{l} \frac{1}{n^{k-1}} \sum_{i,j} \left( A^{k}_{ij} - P^{k}_{ij} \right) I(i,j) $$
Q-Modularity counts edges inside each community and
subtracts the expected number of edges inside the same
community. Higher-order Ql counts number of paths inside
each community and subtracts the expected number of
paths. We propose Ql as a measure of a community split
and consider a combinatorial optimization approach.
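The generalized measure can be sketched in plain Python. This is a sketch under stated assumptions rather than the authors' implementation: A^k and P^k are taken to be k-th matrix powers (so A^k counts walks of length k), P_ij is taken as d_i d_j / 2m, Ql is averaged over k = 1..l with the k-th term scaled by 1/n^(k-1), and `higher_order_modularity` and the two-triangle graph are illustrative.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def higher_order_modularity(A, community, l):
    """Q_l = (1/l) * sum_{k=1..l} n^-(k-1) * sum_{i,j} (A^k - P^k)_ij * I(i,j)."""
    n = len(A)
    deg = [sum(row) for row in A]
    two_m = sum(deg)
    # Assumed null model: P_ij = d_i * d_j / (2m).
    P = [[deg[i] * deg[j] / two_m for j in range(n)] for i in range(n)]
    Ak, Pk, total = A, P, 0.0
    for k in range(1, l + 1):
        inside = sum(Ak[i][j] - Pk[i][j]
                     for i in range(n) for j in range(n)
                     if community[i] == community[j])
        total += inside / n ** (k - 1)
        Ak, Pk = matmul(Ak, A), matmul(Pk, P)
    return total / l

# Two triangles joined by a single edge (toy graph).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
community = [0, 0, 0, 1, 1, 1]

q1 = higher_order_modularity(A, community, 1)
q2 = higher_order_modularity(A, community, 2)
print(q1, q2)
```

With l = 1 the function reduces to the unnormalized Q-Modularity sum written above.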
Alex Nikolov, Ph.D.
in CS @ Rutgers
38. William M. Pottenger, Ph.D.
Results on Ground Truth Data
• We optimized Ql using an LP rounding based
approximation algorithm for correlation
clustering.
• We ran our experiments on networks with
known communities, and compared the known
communities to our clustering using the
Adjusted Rand Index.
Dataset           l = 1    l = 2    l = 3    l = 4
Karate            0.5414   0.5669   0.5669   0.5669
Political Books   0.6250   0.6463   0.6463   0.6463
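For readers unfamiliar with the Adjusted Rand Index used in this comparison, the standard pair-counting definition can be sketched as follows. The function name and the toy label lists are illustrative and are not the Karate or Political Books data.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two partitions of the same items."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    # Undefined (division by zero) when both partitions put every item in one cluster.
    return (sum_cells - expected) / (max_index - expected)

known = [0, 0, 1, 1]
same_up_to_renaming = [1, 1, 0, 0]
crossed = [0, 1, 0, 1]
print(adjusted_rand_index(known, same_up_to_renaming))
print(adjusted_rand_index(known, crossed))
```

A value of 1 means the clustering matches the known communities exactly, up to a renaming of cluster labels; values near 0 indicate agreement no better than chance.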
39. William M. Pottenger, Ph.D.
Is Ql easier to approximate?
• We approximated Ql on random G(n,p) graphs for
different values of l and p.
• We used the ratio of the value of the found
solution to the value of an LP relaxation as an
estimate of the approximation factor.
• It seems that Ql is harder for denser graphs (p
high) but easier for higher l.
           l = 1     l = 2     l = 3     l = 4     l = 5
p = 0.03   0.9678    0.9840    1.0000    1.0000    0.9986
p = 0.12   0.1828    0.4542    -0.1179   0.8447    1.0000
p = 0.60   -0.1130   0.3975    1.0000    1.0000    1.0000
40. William M. Pottenger, Ph.D.
Differential Privacy
• Differential Privacy [DMNS]: A randomized
function K gives ε-differential privacy if for all
graphs G1, G2 differing in a single edge and all
subsets S of Range(K):
$$ \Pr[K(G_1) \in S] \le e^{\varepsilon} \cdot \Pr[K(G_2) \in S] $$
• The global sensitivity of a real valued function f
is:
$$ GS_f = \max_{G_1, G_2} \left| f(G_1) - f(G_2) \right| $$
where G1, G2 differ in a single edge.
41. William M. Pottenger, Ph.D.
Sensitivity of Ql
The global sensitivity of Ql is at most 5(2l – 1)/l
for any fixed clustering.
By [DMNS], given a community split, outputting Ql
+ Lap(5(2l – 1)/lε) satisfies ε-differential privacy.
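The noise-adding step can be sketched with the Python standard library. This is a generic Laplace mechanism sketch, not the authors' code: the value 0.42, the sensitivity 1.0 and ε = 0.5 are placeholders and are not the Ql bound stated on this slide, and `laplace_noise` and `private_release` are hypothetical helper names.

```python
import random

def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_release(value, sensitivity, epsilon, rng=random):
    """Laplace mechanism: add noise with scale sensitivity / epsilon to a real value."""
    return value + laplace_noise(sensitivity / epsilon, rng)

# Placeholder inputs: a made-up measure value, sensitivity and privacy budget.
rng = random.Random(0)
samples = [private_release(0.42, sensitivity=1.0, epsilon=0.5, rng=rng)
           for _ in range(20000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))   # the noise is zero-mean, so the average stays near 0.42
```

A smaller ε or a larger sensitivity gives a larger noise scale, which is why a low-sensitivity measure is attractive for private release.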
42. William M. Pottenger, Ph.D.
Differentially Private Community Discovery
• The measure of community split Ql is insensitive.
– We can output the value of a community split
differentially privately
• But we would like to design an algorithm Alg,
such that:
– Alg outputs a community partition with high Ql ;
– Alg satisfies ε-differential privacy
• Considered in Differentially Private Combinatorial
Optimization (Gupta et al. 2009), but there is
no general method.
43. William M. Pottenger, Ph.D.
Higher Order Q-Learning (HOQL)
REU Ashley Edwards
In HOQL, we classify states as being in a high reward class or a low reward class. States are
added to a class based on a threshold. We use HONB classification for action selection. We
combine our method with greedy action selection based on the formula:
$$ \varepsilon = 1 - \varepsilon_{start} \left( 1 - \frac{episode_{current}}{episode_{total}} \right) $$
Q-values are updated based on the traditional formula:
$$ Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right] $$
where α is the learning rate and γ is the discount factor. In these results, α = 0.91, γ =
1, and ε_start = 0.8.
Ashley Edwards, Applicant for Ph.D. in CS @ Rutgers
Edwards, A. and Pottenger, W. M. 2011. Higher Order Q-Learning. IEEE Symposium on
Adaptive Dynamic Programming and Reinforcement Learning. Paris, France.
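The two formulas on this slide can be sketched as plain Python functions. This is a minimal tabular sketch: the dictionary-based Q table, the state and action names and the function names are assumptions for illustration, and the HONB-based classification step of HOQL is not modeled here.

```python
def exploration_rate(episode, total_episodes, eps_start=0.8):
    """Schedule from the slide: eps = 1 - eps_start * (1 - episode / total_episodes)."""
    return 1.0 - eps_start * (1.0 - episode / total_episodes)

def q_update(Q, state, action, reward, next_state, actions, alpha=0.91, gamma=1.0):
    """One tabular Q-learning step; Q is a dict keyed by (state, action), default 0."""
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return Q[(state, action)]

# Hypothetical two-action example with an empty table.
Q = {}
actions = ["left", "right"]
first = q_update(Q, "s0", "right", 1.0, "s1", actions)
print(first)
print(exploration_rate(0, 100), exploration_rate(100, 100))
```

With the slide's settings, ε rises linearly from 1 - ε_start at the first episode to 1 at the last one.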
44. William M. Pottenger, Ph.D.
Anomaly detection through
machine-learning exposed that the
Chinese government is capable of
“line rate” MITM attacks.
Due to pipelining in modern
browser implementations,
“censorware” is forced to remember
a 5-tuple for every attempt a user
makes to view censored content.
<ipSrc, ipDst, srcPort, dstPort,
proto>
Chinese government routers use
fiber-optics to do censorship at
“line rate.”
They lose the ability to drop
packets, so every censorware
router in the path must store a 5-
tuple and block responses.
This begs the question: “What
kinds of computational complexity
bottlenecks in ‘censorware’ can we
exploit?”
For example, how large of a
“botnet” would be required to cause
Chinese censorware routers to run
out of memory?
[Diagram labels: A, B, MITM]
User attempts to restart the
connection.
Government servers use SEQ-1460
attack on TCP.
Government servers get user to
establish new, fake connection
User accepts new, fake connection
and retransmits.
Government rejects data
transmission with RST packet.
Server doesn’t understand new,
fake connection. Sends RSTs.
User rejects attempt to restart
the connection.
Server assumes user is adversarial.
Sends RSTs and kills connection.
REU Becker Polverini
Using Clustering to Detect Censorware
Polverini, A. B. and Pottenger, W. M. 2011. Using Clustering to Detect
Chinese Censorware. CSIIRW ’11 Oak Ridge National Labs, TN USA
45. William M. Pottenger, Ph.D.
CCICADA technology transfer efforts
• Goal: Technology transfer to DHS users and
customers
• Several Tech Transfer programs @ DHS S&T:
– E2E – Engage to Excel
– Tech Solutions
– SECURE
• CCICADA is committed to support these existing
programs and to innovate new approaches – what can
you do?
– Publish your open-source software!
– Commercialize your software!
– Start your own company… and sell to DHS!
69. William M. Pottenger, Ph.D.
Acknowledgements
• I am very grateful to my hardworking, intelligent and creative
(current and former) students and postdocs without whom none of
this would have been possible: Kunikazu Yoda, Christie Nelson,
Aleksandar Nikolov, Nir Grinberg, Cibin George, Christopher
Janneck, Nikita Lytkin, Shenzhi Li, Murat Ganiz, Chirag Pandya,
Kashyap Kolipaka, Vikas Menon, April Kontostathis, Tianhao Wu,
Jirada Kuntraruk, Jason Perry, Mark Dilsizian (and >> others).
• I also thank Rutgers University, the National Science Foundation,
the Department of Homeland Security and the National Institute of
Justice. This material is based upon work partially supported by the
National Science Foundation under Grant Numbers 0703698 and
0712139. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not
necessarily reflect the views of the National Science Foundation or
Rutgers University.
• I also gratefully acknowledge the continuing help of my Lord and
Savior, Yeshua the Messiah (Jesus the Christ) in my life and work.
70. William M. Pottenger, Ph.D.
Thank you!
Q&A
71. William M. Pottenger, Ph.D.
References
Soumen Chakrabarti, Byron Dom, and Piotr Indyk. Enhanced hypertext
categorization using hyperlinks. SIGMOD Rec., 27(2):307–318, 1998.
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K.
Landauer, and Richard Harshman.
Indexing by latent semantic analysis. Journal of the American Society for
Information Science, 41:391–407, 1990.
Lise Getoor and Christopher P. Diehl. Link mining: a survey. SIGKDD Explor.
Newsl., 7(2):3–12, 2005.
Murat Can Ganiz, Sudhan Kanitkar, Mooi Choo Chuah, and William M.
Pottenger. Detection of interdomain routing anomalies based on higher-order
path analysis. In ICDM ’06: Proceedings of the Sixth International
Conference on Data Mining, pages 874–879, Washington, DC, USA, 2006.
IEEE Computer Society.
Leo Katz. A new status index derived from sociometric analysis.
Psychometrika, 18(1):39–43, March 1953.
April Kontostathis and William M. Pottenger. A framework for understanding
latent semantic indexing (LSI) Performance. Inf. Process. Manage.,
42(1):56–73, 2006.
72. William M. Pottenger, Ph.D.
Qing Lu and Lise Getoor. Link-based classification. In Tom Fawcett and
Nina Mishra, editors, ICML, pages 496–503. AAAI Press, 2003.
Shenzhi Li, Tianhao Wu, and William M. Pottenger. Distributed higher order
association rule mining using information extracted from textual data.
SIGKDD Explorations Newsl., 7(1):26–35, 2005.
J. Neville and D. Jensen. Iterative classification in relational data. In Proc.
AAAI, pages 13–20. AAAI Press, 2000.
J. Neville and D. Jensen. Dependency networks for relational data. Data
Mining, 2004. ICDM ’04. Fourth IEEE International Conference, pages 170–
177, Nov. 2004.
Noam Slonim and Naftali Tishby. The power of word clusters for text
classification. In 23rd European Colloquium on Information Retrieval
Research, 2001.
Ben Taskar, Eran Segal, and Daphne Koller. Probabilistic classification and
clustering in relational data. In Proceedings of the Seventeenth
International Joint Conference on Artificial Intelligence, pages 870–878,
2001.
Vladimir Vapnik. Statistical Learning Theory. John Wiley, 1998.