The document discusses text mining and its applications. It provides an introduction to text mining and describes some common text mining tasks such as finding structures and relations in text data, text classification, entity extraction, and sentiment analysis. It also lists several application areas for text mining such as digital humanities, knowledge management, and customer relationship management. Finally, it discusses some examples of text mining projects including analyzing museum visitor feedback, studying patterns in Shakespeare's sonnets, and analyzing religious texts.
1. Timo Honkela, Modeling Meaning and Knowledge, 25.4.2016
Timo Honkela
Modeling Meaning and Knowledge
25 Apr 2016
timo.honkela@helsinki.fi
An introduction to text mining
Data mining tasks
(Hand, Mannila & Smyth 2001)
● Exploratory data analysis
● Descriptive modeling
● Predictive modeling: classification and regression
● Discovering patterns and rules
● Retrieval by content
Text mining
http://www.intechopen.com/books/theory-and-applications-for-advanced-text-mining
Text mining
● Finding structures and relations at different levels of abstraction
● Study of distributions, trends and correlations
● Text classification and clustering
● Entity extraction
● Authorship analysis
● Sentiment analysis
● etc.
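Several of the tasks listed above (clustering, classification, finding relations between texts) are commonly built on a vector representation of documents. The following is a minimal sketch of that idea, not a method taken from the slides: it weights terms with a smoothed TF-IDF formula and compares texts by cosine similarity. The tokenizer, the function names, and the example texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter (naive on purpose)."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return (vectors, idf): one sparse TF-IDF dict per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed weighting choice; other variants exist).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_similarity(query, docs):
    """Return (score, document index) pairs, most similar document first."""
    vectors, idf = tfidf_vectors(docs)
    # Terms that never occur in the collection are dropped from the query.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    return sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)


if __name__ == "__main__":
    # Hypothetical mini-collection, invented for this example.
    docs = [
        "Shakespeare wrote sonnets and plays",
        "Museum visitors leave feedback",
        "Newspapers from the nineteenth century",
    ]
    for score, index in rank_by_similarity("sonnets by Shakespeare", docs):
        print(f"{score:.3f}  {docs[index]}")
```

The same vectors could be fed to a clustering or classification method; cosine similarity is used here only because it is a simple way to show which texts are related.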
Application areas of text mining
● Digital humanities
– Sociology
– History
– Literature
– Law
– Archeology
– Linguistics
– Religion
– Philosophy
● Knowledge management
● Customer relationship management (CRM)
● Competence management
● Remember also
– Medicine
– Psychology
– Geology
– etc.
Examples using the SOM
● Art museum visitors
Pockets full of memories: an interactive museum installation
G Legrady, T Honkela
Visual Communication 1 (2), 163-169
● Poetry
In search for volta: Statistical analysis of word patterns in Shakespeare's sonnets
O Kohonen, S Katajamäki, T Honkela
Proceedings of AMKLC'05, International Symposium on Adaptive Models of Knowledge, Language and Cognition, pages 44–47, Finland
● Religious cognition
Counterintuitiveness as the hallmark of religiosity
I Pyysiäinen, M Lindeman, T Honkela
Religion 33 (4), 341-355
● Competence
Document maps for competence management
T Honkela, R Nordfors, R Tuuli
Proceedings of the Symposium on Professional Practice in AI, 31-39
Dimensionality reduction
Visualization
Abstraction
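The SOM (self-organizing map) named in the slide title maps high-dimensional vectors onto a small grid so that similar inputs land on nearby units, which is where the dimensionality reduction and visualization come from. Below is a minimal sketch of the basic online SOM training loop, assuming a rectangular grid, a Gaussian neighborhood, and linearly decaying learning rate and radius. It is not the implementation used in the cited studies; all function and parameter names are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinates of the unit whose model vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map on data (a list of equal-length float lists)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # weights[(r, c)] is the model vector of the map unit at row r, column c.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # assumed floor on the radius
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Squared distance on the grid, not in the input space.
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


if __name__ == "__main__":
    # Toy vectors forming two groups; in text mining these would typically be
    # document vectors such as term weights (an assumption for this example).
    data = [
        [0.0, 0.0, 0.1], [0.1, 0.0, 0.0],
        [1.0, 0.9, 1.0], [0.9, 1.0, 1.0],
    ]
    som = train_som(data, rows=3, cols=3)
    for x in data:
        print(x, "->", best_matching_unit(som, x))
```

Plotting each document at the grid position of its best-matching unit gives the kind of two-dimensional document map that the examples above refer to.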
New projects
● Digital Mindscapes: Mining social media
(Jussi Pakkasvirta, Krista Lagus, Mika Pantzar, Minna Ruckenstein, etc.)
http://www.aka.fi/globalassets/32akatemiaohjelmat/digihum/citizen-mindscapes-digihum-starts_3-vain-luku.pdf
● Computational History 1640–1910: Mining newspapers
(Mikko Tolonen, Kimmo Kettunen, Hannu Salmi, Tapio Salakoski, etc.)
http://www.aka.fi/globalassets/32akatemiaohjelmat/digihum/comhis-presentation-logomo-22-march-2016.pdf
In many cases, a supporting infrastructure is FIN-CLARIN