Timo Honkela, Modeling Meaning and Knowledge, 25.4.2016
Timo Honkela
Modeling Meaning and Knowledge
25 Apr 2016
timo.honkela@helsinki.fi
An introduction to text mining
Data mining
Data mining tasks
(Hand, Mannila & Smyth 2001)
● Exploratory data analysis
● Descriptive modeling
● Predictive modeling: classification and regression
● Discovering patterns and rules
● Retrieval by content
Text mining
http://www.intechopen.com/books/theory-and-applications-for-advanced-text-mining
Text mining
● Finding structures and relations at different levels of abstraction
● Study of distributions, trends and correlations
● Text classification and clustering
● Entity extraction
● Authorship analysis
● Sentiment analysis
● etc.
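
The tasks listed above mostly start from turning each text into a vector and comparing vectors. A minimal sketch of that step, assuming a bag-of-words representation with cosine similarity; the toy corpus, the stopword list, and all function names are hypothetical and not taken from the slides:

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "a", "an", "and", "in", "on", "of"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def term_frequencies(text):
    """Bag-of-words representation: map each token to its count."""
    return Counter(tokenize(text))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query, documents):
    """Return the index of the document most similar to the query text."""
    q = term_frequencies(query)
    scores = [cosine_similarity(q, term_frequencies(d)) for d in documents]
    return max(range(len(documents)), key=scores.__getitem__)

# Toy corpus, invented for illustration only.
DOCS = [
    "the museum exhibition shows paintings and sculpture",
    "the court ruled on the contract law dispute",
    "patients received a new medicine in the clinical trial",
]

if __name__ == "__main__":
    print(most_similar("a painting in the art museum", DOCS))
```

The same vectors could feed a classifier or a clustering method; real systems usually add weighting such as TF-IDF and stemming or lemmatization, which this sketch leaves out.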
Application areas of text mining
● Digital humanities
– Sociology
– History
– Literature
– Law
– Archeology
– Linguistics
– Religion
– Philosophy
● Knowledge management
● Customer relationship management (CRM)
● Competence management
● Remember also
– Medicine
– Psychology
– Geology
– etc.
Examples using the self-organizing map (SOM)
● Art museum visitors
Pockets full of memories: an interactive museum installation
G Legrady, T Honkela
Visual Communication 1 (2), 163-169
● Poetry
In search for volta: Statistical analysis of word patterns in Shakespeare's sonnets
O Kohonen, S Katajamäki, T Honkela.
Proceedings of AMKLC'05, International Symposium on Adaptive Models of Knowledge, Language and Cognition, pages 44–47, Finland
● Religious cognition
Counterintuitiveness as the hallmark of religiosity
I Pyysiäinen, M Lindeman, T Honkela
Religion 33 (4), 341-355
● Competence
Document maps for competence management
T Honkela, R Nordfors, R Tuuli
Proceedings of the Symposium on Professional Practice in AI, 31-39
Dimensionality reduction
Visualization
Abstraction
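
To illustrate how a self-organizing map supports dimensionality reduction and visualization, here is a minimal sketch of SOM training in plain Python. It is not the implementation used in the cited papers. The grid size, learning-rate schedule, neighbourhood schedule, and toy data are all assumptions chosen for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the grid position of the node whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((wi - xi) ** 2 for wi, xi in zip(weights[node], x)),
    )

def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rows x cols map on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per map node, initialised randomly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighbourhood shrinks to a floor
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])  # pull the node toward the sample
            step += 1
    return weights

# Toy data, invented for illustration: two groups of 3-dimensional vectors.
DATA = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1],
    [1.0, 0.9, 1.0], [0.9, 1.0, 0.9], [1.0, 1.0, 0.9],
]

if __name__ == "__main__":
    som = train_som(DATA)
    for x in DATA:
        print(x, "->", best_matching_unit(som, x))
```

Each input vector ends up represented by a position on the two-dimensional grid, which is what makes the map usable as a visualization. For text, the input vectors would typically be document or word representations rather than the toy points used here.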
New projects
● Digital Mindscapes: Mining social media
(Jussi Pakkasvirta, Krista Lagus, Mika Pantzar, Minna Ruckenstein, etc.)
http://www.aka.fi/globalassets/32akatemiaohjelmat/digihum/citizen-mindscapes-digihum-starts_3-vain-luku.pdf
● Computational History 1640–1910: Mining newspapers
(Mikko Tolonen, Kimmo Kettunen, Hannu Salmi, Tapio Salakoski, etc.)
http://www.aka.fi/globalassets/32akatemiaohjelmat/digihum/comhis-presentation-logomo-22-march-2016.pdf
In many cases a supporting infrastructure is FIN-CLARIN
