Change is Key!
An introduction to lexical semantic change
Nina Tahmasebi, Associate Professor & Simon Hengchen, PhD
University of Gothenburg
October 2022, KBR
Digital Heritage Seminar Series: Lexical Semantic Change
Some facts
• 6 years
• 6 partner universities
• Members from 4 countries
• With advisors, 6 countries
• 13 people including PM and SE
October, 2022 |
Nina Tahmasebi | KBR DH seminar
Word meaning change
Over time
He was an awesome leader! (the same sentence shown at two points on a time axis)
In different contexts (at the same time)
St. Petersburg → Petrograd → Leningrad → St. Petersburg (along a time axis)
Main challenges for computational models of meaning and change
• Handle languages with smaller amounts of data
• Sense-aware models
• Find out WHAT changed, HOW and WHEN
• Generalize to multiple languages
Our Research Questions (computational models of meaning and change)
Language level change
• Historical Linguistics
• Lexicography
Societal level change
• Analytical Sociology
• Gender Studies
• Literary Studies
Our societal contribution
Meaning for everyone
clams muslim
clams muslim
Changeiskey.org
Methods for computational semantic change
Explicit, count-based vector representations
MOUSE
2 3 1 5 5 4 8
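As a rough illustration of an explicit, count-based vector, the sketch below counts how often each vocabulary word appears within a small window around a target word. The sentences, the vocabulary and the `window` parameter are made up for this example and do not come from the slides.

```python
from collections import Counter


def cooccurrence_vector(target, sentences, vocabulary, window=2):
    """Count vocabulary words occurring within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            # Context = up to `window` tokens on each side of the target.
            context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
            counts.update(context)
    # One dimension per vocabulary word, so every dimension is interpretable.
    return [counts[word] for word in vocabulary]


# Hypothetical toy corpus and vocabulary.
sentences = [
    "the mouse ate the cheese",
    "click the mouse button twice",
    "a mouse ran under the table",
]
vocabulary = ["cheese", "button", "click", "ate", "the"]
mouse_vector = cooccurrence_vector("mouse", sentences, vocabulary)
```

Each position in `mouse_vector` corresponds to a named context word, which is what makes this kind of representation explicit.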
Statistical, learned vector representations
MOUSE
0.4 0.4 0.5 0.02 0.005 0.1 0.9
? ? ? ? ? ? ? ? ? ? ? ?
Word embeddings shown in 2D instead of 50 to 100,000 dimensions
Image: Nieto Pina and Johansson, RANLP’15
Pipeline
Text: individual text, collective text
Signal: topic, cluster, vector…
Signal change
Single-sense: word
Sense-differentiated: Rock → Stone, Music, Lifestyle
Change type
Novel related word sense
Novel unrelated word sense
Broadening
Join
Narrowing
Split
Death
Novel word sense
Novel word
Change
Single-sense
Sense-differentiated
Difficulty:
What does a word mean?
When are two meanings the same?
Timeline of methods, 2008 to 2020 (figure)
Single-sense: count-based embeddings, neural embeddings, dynamic embeddings
Sense-differentiated: topic models, word sense induction, contextual embeddings
Works placed on the timeline: Tahmasebi et al. 2008; Sagi et al. 2009; Wijaya & Yeniterzi 2011; Lau et al. 2012; Kim et al. 2014; Kulkarni et al. 2015; Mitra et al. 2015; Hamilton et al. 2016; Basile et al. 2016; Frermann & Lapata 2016; Tahmasebi & Risse 2017; Bamler & Mandt 2018; Hu et al. 2019; Giulianelli et al. 2020
Word-level semantic change
• embeddings / context-based methods
• dynamic embeddings
• neural embeddings
Context-based method
Sagi et al., GEMS 2009
Context vectors of a word w at time points ti and tj
• Broadening of sense
• Narrowing of sense
With grouping (data set split into appropriate sets):
• Added/removed sense
BUT:
1. No alignment of senses over time!
2. No discrimination between senses
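A minimal sketch of the context-vector idea: the density of a word's context vectors in a period is taken as their average pairwise cosine similarity, and a drop in density between ti and tj is read as a hint of broadening (a rise as narrowing). The two-dimensional vectors below are invented toy values, and the function names are chosen for this example only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def density(context_vectors):
    """Average pairwise cosine similarity of one word's context vectors."""
    pairs = list(combinations(context_vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical context vectors of w in two periods (toy values).
contexts_ti = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]   # similar contexts
contexts_tj = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]   # more varied contexts

density_ti = density(contexts_ti)
density_tj = density(contexts_tj)
# Lower density at tj than at ti: contexts became more varied (broadening).
broadened = density_tj < density_ti
```

Because density is a single number per word and period, it says nothing about which sense was added or lost, which matches the two limitations listed above.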
Word embedding-based models
Kulkarni et al. WWW’15
Project a word onto a vector/point
(POS, frequency and embeddings)
Track vectors over time
Kim et al. LACSS 2014
Basile et al. CLiC-it 2016
Hamilton et al. ACL 2016
Image: Kulkarni et al. WWW’15
LSC – individually trained embedding spaces
One embedding space per time point ti, for multiple time points
Track an individual word w over time
1. Embedding space
2. Alignment
3. Change degree/point detection
Vector space image: Nieto Pina and Johansson, RANLP’15
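The three steps can be sketched as follows, assuming the embedding spaces have already been trained separately. Orthogonal Procrustes is used here as one common choice of alignment; the slides do not say which alignment is meant. The toy 2D vectors and the word labels are hypothetical.

```python
import numpy as np


def procrustes_align(source, target):
    """Rotate `source` onto `target`; rows are the shared vocabulary."""
    # Orthogonal Procrustes: the orthogonal R minimising
    # ||source @ R - target|| is U @ Vt, with U, S, Vt = SVD(source.T @ target).
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def cosine_distance(a, b):
    return 1.0 - float(a @ b) / float(np.linalg.norm(a) * np.linalg.norm(b))


# Step 1: two independently trained spaces (toy values, same row order).
vocabulary = ["a", "b", "c", "d", "w"]  # hypothetical words; "w" is the target
space_t1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0], [1.0, 0.5]])
# In the second space the first four words are rotated by 90 degrees,
# while "w" has moved somewhere else.
space_t2 = np.array([[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0], [-1.0, -1.0], [-0.5, -1.0]])

# Step 2: alignment.
aligned_t1 = procrustes_align(space_t1, space_t2)

# Step 3: change degree per word as cosine distance after alignment.
change = {
    word: cosine_distance(aligned_t1[i], space_t2[i])
    for i, word in enumerate(vocabulary)
}
most_changed = max(change, key=change.get)
```

The rotation is fitted on all words, including the changed one, so the stable words keep a small residual distance after alignment.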
LSC – dynamic embedding spaces
Align while training
Track an individual word w over time
Change point/degree detection
Dynamic Embeddings
Sharing data is highly beneficial!
• Bamler & Mandt: Bayesian Skip-gram
• Yao et al.: PPMI embeddings
• Rudolph & Blei: Exponential family embeddings (Bernoulli embeddings)
Share data across all time points
Avoids aligning
Temporal Referencing
Sharing data is highly beneficial!
Share contexts across all time points
Individual vectors for words for each bin
Avoids aligning
Dubossarsky et al.
• SGNS
• PPMI embeddings
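A small sketch of the preprocessing behind temporal referencing: a target word is tagged with its time bin when it is the centre word, but stays untagged when it appears as a context word, so all bins share one context space. The function name, the tag format `word_bin` and the example sentence are assumptions for this illustration; the resulting pairs would then feed an SGNS or PPMI model.

```python
def temporal_reference_pairs(tokens, period, targets, window=2):
    """Build (centre, context) pairs; only target centres get a time tag."""
    pairs = []
    for i, token in enumerate(tokens):
        # Tag the centre word with its bin if it is one of the targets.
        centre = f"{token}_{period}" if token in targets else token
        lo = max(0, i - window)
        # Context words are never tagged, so contexts are shared over time.
        for context in tokens[lo:i] + tokens[i + 1:i + 1 + window]:
            pairs.append((centre, context))
    return pairs


# Hypothetical sentence from a text dated to the bin "1990".
tokens = "the mouse button broke".split()
pairs = temporal_reference_pairs(tokens, "1990", {"mouse"}, window=1)
```

Training one model on the pairs from all bins yields a separate vector per target and bin (for example `mouse_1990`) in a single space, which is why no alignment step is needed.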
Sense-differentiated semantic change
• topic models
• word sense induction
• contextual embeddings
Topic-based methods
Wijaya & Yeniterzi, DETECT '11
Lau et al., EACL 2014
Cook et al., Coling 2014
Frermann & Lapata, TACL 2016
Figure: two text collections, A and B (BNC, ukWaC)
1. Topic model (HDP)
2. Assign topics to all instances of a word.
3. If a word sense WSi is assigned to collection 2 but not 1, then WSi is a novel word sense.
BUT:
• Only two time points (typically there is much noise!)
• No alignment of senses over time!
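The novelty rule in step 3 can be sketched as below, assuming a topic model has already assigned one topic id to each instance of the word. The `min_count` threshold is an assumption added here to dampen the noise mentioned above, and the topic ids are invented.

```python
from collections import Counter


def novel_senses(senses_collection_1, senses_collection_2, min_count=2):
    """Senses seen in collection 2 (at least `min_count` times) but never in 1."""
    counts_1 = Counter(senses_collection_1)
    counts_2 = Counter(senses_collection_2)
    return sorted(
        sense
        for sense, count in counts_2.items()
        if count >= min_count and counts_1[sense] == 0
    )


# Hypothetical topic ids assigned to each instance of one word.
instances_in_collection_1 = ["t1", "t1", "t2", "t1"]
instances_in_collection_2 = ["t1", "t3", "t3", "t2", "t4"]
novel = novel_senses(instances_in_collection_1, instances_in_collection_2)
```

Here `t3` counts as novel, while `t4` occurs only once in collection 2 and falls below the assumed threshold.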
Downsides of topic models
Topic change
Sense change
Word sense induction
Word sense induction (curvature clustering) on individual time slices
Tahmasebi & Risse, RANLP 2017
Example: Rock → Stone, Music, Lifestyle
Step 1: Detecting stable senses → units
Step 2: Relating units
Step 3: Paths
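A simplified sketch of curvature clustering on a single time slice, assuming a co-occurrence graph is already available. The curvature of a node is the fraction of its neighbour pairs that are themselves linked; nodes below a threshold are dropped and the remaining connected components are read as sense clusters. The graph, the threshold of 0.5 and the omission of any later re-attachment of dropped nodes are assumptions of this example, not details taken from the slides.

```python
from itertools import combinations


def curvature(graph, node):
    """Fraction of neighbour pairs of `node` that are linked to each other."""
    neighbours = graph[node]
    if len(neighbours) < 2:
        return 0.0
    pairs = list(combinations(sorted(neighbours), 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


def curvature_clusters(graph, threshold=0.5):
    """Drop low-curvature nodes, then return the connected components."""
    kept = {node for node in graph if curvature(graph, node) >= threshold}
    clusters, seen = [], set()
    for start in sorted(kept):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend((graph[node] & kept) - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical undirected co-occurrence graph around the ambiguous word "rock".
graph = {
    "rock": {"stone", "boulder", "granite", "music", "band", "guitar"},
    "stone": {"rock", "boulder", "granite"},
    "boulder": {"rock", "stone", "granite"},
    "granite": {"rock", "stone", "boulder"},
    "music": {"rock", "band", "guitar"},
    "band": {"rock", "music", "guitar"},
    "guitar": {"rock", "music", "band"},
}
clusters = curvature_clusters(graph)
```

The ambiguous hub "rock" has low curvature because its stone-related and music-related neighbours are not linked to each other, so removing it separates the two groups.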
Complexity
O(|S|T)
Type-based embedding methods
One representation of w for all occurrences:
• Sentence with w and more
• Different sentence with w and more
• Last sentence with w and more
Token-based embedding methods
One representation of w per occurrence:
• Sentence with w and more
• Different sentence with w and more
• Last sentence with w and more
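With one vector per occurrence, change between two periods can be scored directly on the sets of usage vectors. The sketch below uses the average pairwise cosine distance between usages from two periods, which is one measure used with contextual embeddings; it assumes the token vectors have already been extracted from some contextual model, and the 2D values are invented.

```python
import math


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_pairwise_distance(usages_t1, usages_t2):
    """Mean cosine distance over all cross-period pairs of usage vectors."""
    total = sum(cosine_distance(u, v) for u in usages_t1 for v in usages_t2)
    return total / (len(usages_t1) * len(usages_t2))


# Hypothetical token vectors: one per sentence containing w.
stable_t1 = [[1.0, 0.0], [0.9, 0.1]]
stable_t2 = [[1.0, 0.1], [0.95, 0.0]]
changed_t1 = [[1.0, 0.0], [0.9, 0.1]]
changed_t2 = [[0.0, 1.0], [0.1, 0.9]]

apd_stable = average_pairwise_distance(stable_t1, stable_t2)
apd_changed = average_pairwise_distance(changed_t1, changed_t2)
```

A score like this measures the degree of change only; telling which sense changed still requires grouping the usage vectors into senses.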
Evaluation
Pipeline figure: individual text, collective text; signal (topic, cluster, vector…); signal change
Figure labels: minimum, medium, optimum
Evaluation
3 ways:
• Top/bottom results
• Pre-determined list of: positive examples, negative examples, pairs
• Controlled data
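As one possible way to combine a ranked output with a pre-determined list, the sketch below ranks words by a change score and checks how many of the top k are known positive examples. The scores, the word lists and the choice of precision at k are assumptions for illustration; the slides do not prescribe a metric.

```python
def precision_at_k(change_scores, positives, k):
    """Share of the k highest-scoring words that are known changed words."""
    ranked = sorted(change_scores, key=change_scores.get, reverse=True)
    top = ranked[:k]
    return sum(1 for word in top if word in positives) / k


# Hypothetical model output and hand-picked evaluation lists.
change_scores = {"mouse": 0.81, "awesome": 0.77, "table": 0.12, "rock": 0.64, "tree": 0.09}
positive_examples = {"mouse", "awesome", "rock"}
negative_examples = {"table", "tree"}

p_at_2 = precision_at_k(change_scores, positive_examples, k=2)
# Negative examples should not show up near the top of the ranking.
false_alarms = [w for w in sorted(change_scores, key=change_scores.get, reverse=True)[:3]
                if w in negative_examples]
```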
Summary of methods
• Most co-occurrence methods are outperformed by type-embeddings
• Type-embeddings are average embeddings, need alignment across corpora, and need very large amounts of data
• Dynamic embeddings ‘remember’ too much historical information
• Topic-based methods have little correspondence to senses (and run badly on too large datasets)
• WSI-based methods typically have too low coverage
• Contextual embeddings need to be clustered into senses
Thank you!
Nina.tahmasebi@gu.se
nina@tahmasebi.se
