1. Human Interface Laboratory
Pay Attention to Categories:
Syntax-Based Sentence Modeling with
Metadata Projection Matrix
2020. 10. 24, @PACLIC 34
Won Ik Cho, Nam Soo Kim (SNU ECE & INMC)
5. Introduction
• Brief overview on sentence modeling
Self-attentive models (which advance self-attention)
• Still a useful approach to sentence classification
6. Introduction
• Motivation
What if we need to pay more attention to some syntactic categories and
want to decide the intensity automatically?
• e.g., Oxymoron detection (Cho et al., 2017)
– Drink a sugar-free sweet tea
– When I’m low, I get high
The attention mechanism is useful, but offers little control over specific
syntactic categories
• Rarely a main consideration, since the necessity is underestimated
• Minimal information (e.g., POS) can be of help for some tasks!
• How can we take into account that information beyond just attaching it to each
token?
7. Related Work
• Word embedding
Dense embedding of words, in view of distributional semantics
Projecting words into a low-dimensional space, with the objective of learning
a distribution that enables prediction of the probable surrounding words
Representative models
• Word2Vec (Mikolov et al., 2013)
• GloVe (Pennington et al., 2014)
• fastText (Bojanowski et al., 2016)
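The objective above ("predicting the probable surrounding words") can be illustrated with a toy full-softmax skip-gram in numpy. This is a minimal sketch for intuition only, not any of the cited toolkits; the corpus, dimensions, and learning rate are illustrative choices.

```python
import numpy as np

# Toy skip-gram sketch: learn embeddings so that a center word
# predicts its surrounding words within a small window.
rng = np.random.default_rng(0)
corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8

W_in = rng.normal(scale=0.1, size=(V, D))   # center-word embeddings
W_out = rng.normal(scale=0.1, size=(V, D))  # context-word embeddings

def pairs(sent, window=2):
    """Yield (center, context) index pairs within the window."""
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                yield idx[w], idx[sent[j]]

def step(lr=0.1):
    """One full-softmax SGD pass over the corpus; returns the total loss."""
    global W_in, W_out
    loss = 0.0
    for c, o in pairs(corpus):
        scores = W_out @ W_in[c]            # (V,) logits over the vocabulary
        p = np.exp(scores - scores.max())
        p /= p.sum()
        loss -= np.log(p[o])
        grad = p.copy()
        grad[o] -= 1.0                      # d(-log p[o]) / d scores
        g_in = W_out.T @ grad
        W_out -= lr * np.outer(grad, W_in[c])
        W_in[c] -= lr * g_in
    return loss
```

Real implementations replace the full softmax with negative sampling or hierarchical softmax for efficiency; the gradient structure stays the same.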
8. Related Work
• Deep learning techniques
Convolutional neural networks
• Primarily applied to the vision area
• Used in Kim (2014) as a 1D convolution that represents the sentence as a
sequence of dense word vectors
Recurrent neural networks
• Adopted to model sequential data
• Long short-term memory is used to cope with the vanishing gradient
• Bidirectional LSTM to consider directionality
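The Kim (2014)-style 1D convolution can be sketched in numpy: the sentence is a stack of word vectors, each filter spans a few consecutive words, and max-over-time pooling keeps one feature per filter. All shapes and values below are illustrative, not the paper's settings.

```python
import numpy as np

# Sketch of a 1D convolution over a sentence matrix.
rng = np.random.default_rng(0)
L, D, K, F = 7, 10, 3, 4                 # length, word dim, window, filters
sent = rng.normal(size=(L, D))           # pretend-embedded sentence
filters = rng.normal(size=(F, K, D))     # each filter covers K word vectors
bias = np.zeros(F)

# Slide each filter over all K-word windows -> (F, L-K+1) feature maps.
conv = np.array([[np.sum(sent[i:i + K] * filters[f]) + bias[f]
                  for i in range(L - K + 1)]
                 for f in range(F)])

# ReLU, then max-over-time pooling: one feature per filter.
feat = np.maximum(conv, 0).max(axis=1)   # shape (F,)
```

The pooled vector `feat` is what a classifier layer would consume; multiple window sizes are typically concatenated.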
9. Related Work
• Deep learning techniques
Attention models
• First used to handle word-order sensitivity and matching in machine
translation
• Evolved into various formats, such as location-based attention
Self-attentive sentence embedding (Lin et al., 2017)
• Related to self-attention, but more applicable to the BiLSTM format
10. Related Work
• Deep learning techniques
Self-attentive sentence embedding
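Lin et al. (2017)'s structured self-attention takes only a few lines: the annotation matrix A is computed from the BiLSTM hidden states H as A = softmax(W_s2 tanh(W_s1 Hᵀ)), and the sentence embedding is M = A H. The dimensions below are illustrative, not the paper's hyperparameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Numpy sketch of the self-attentive sentence embedding.
rng = np.random.default_rng(0)
n, two_u, d_a, r = 6, 8, 5, 3        # tokens, 2*hidden, attention dim, hops
H = rng.normal(size=(n, two_u))      # stand-in for BiLSTM hidden states
W_s1 = rng.normal(size=(d_a, two_u))
W_s2 = rng.normal(size=(r, d_a))

A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=1)  # (r, n), each row sums to 1
M = A @ H                                        # (r, 2u) sentence embedding
```

Each of the r rows of A is one attention "hop" over the tokens; the paper adds a penalty term encouraging the hops to differ.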
11. Proposed Method
• Overall description
Sequential word embedding and BiLSTM
Feature extraction for the attention source
• TF-IDF? BiLSTM?
Attention source activated with ReLU
PAC structure with:
• Weight layer with category-wise info
• Projection matrix
• Multiplication (𝛼_1 … 𝛼_𝐿)
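One plausible reading of the PAC structure above, sketched in numpy: a one-hot projection matrix maps each token position to its syntactic (POS) category, a learned category-wise weight modulates the ReLU-activated attention source, and the result is normalized into token attention weights. The variable names, shapes, and the exact way the projection enters the computation are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

# Hedged sketch of category-wise attention modulation.
rng = np.random.default_rng(0)
L, n_p = 5, 3                         # sentence length, number of POS tags
pos = np.array([0, 1, 1, 2, 0])       # POS tag index per token (given prior)
P = np.eye(n_p)[pos]                  # (L, n_p) one-hot projection matrix
w = rng.normal(size=n_p)              # learned category-wise weights

# ReLU-activated attention source (e.g., from TF-IDF or a BiLSTM).
source = np.maximum(rng.normal(size=L), 0)

logits = source * (P @ w)             # category-modulated attention logits
alpha = np.exp(logits) / np.exp(logits).sum()   # weights over tokens
```

The key point is that `w` is learned, so the model decides automatically how much attention each syntactic category deserves, while the projection `P` is fixed by the tagger output.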
15. Experiment and Result
• Implementation
Baseline features
• TF-IDFs and bigrams
– Dictionary size set to 3,000 (=30 * 100)
• GloVe pretrained with Twitter 27B
– Word vector dim. 100
– Padding max length 30
Baseline classifiers
• SVM for TF-IDF
• NN for GloVe (averaged) with Adam(0.0005) and batch size 16
• CNN (32 filters, window 3) and BiLSTM (hidden dim. 64) for GloVe (padded)
Baseline attention model
• Lin et al. (2017) with context vector dim. 64, along with the above BiLSTM
The proposed
𝑛_𝑝 follows the NLTK POS tagging result
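The TF-IDF-with-bigrams baseline features can be approximated with a hand-rolled featurizer. This is a sketch of the general recipe, not the exact pipeline used in the experiments; only the 3,000-term dictionary cap is taken from the slide.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """Yield all unigrams and bigrams of a token list."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def tfidf_features(docs, max_terms=3000):
    """TF-IDF vectors over a dictionary capped at max_terms terms."""
    toks = [list(ngrams(d.lower().split())) for d in docs]
    df = Counter(t for ts in toks for t in set(ts))   # document frequency
    vocab = [t for t, _ in df.most_common(max_terms)]
    index = {t: i for i, t in enumerate(vocab)}
    N = len(docs)
    feats = []
    for ts in toks:
        tf = Counter(ts)
        vec = [0.0] * len(vocab)
        for t, c in tf.items():
            if t in index:
                vec[index[t]] = (c / len(ts)) * math.log(N / df[t])
        feats.append(vec)
    return feats, vocab
```

These vectors are what the SVM baseline would be trained on; terms occurring in every document get zero IDF weight and contribute nothing.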
16. Experiment and Result
• Dataset
Metalanguage detection (2,393)
• Investigates whether a sentence contains explicit mention terms (‘title’ or ‘name’)
• Contains 629 mentioned and 1,764 not-mentioned instances excerpted from
Wikipedia
Irony detection (4,618)
• Distributed in SemEval 2018 Task 3 for ironic tweet detection (Van Hee et al., 2018).
• The binary-label case was considered; 2,222 tweets contain irony and 2,396 do not
Subjectivity detection (10,000)
• Refers to Pang and Lee (2004); checks if the movie review contains a subjective
judgment
• Incorporates equally 5,000 instances for each of the subjective and objective reviews
Stance classification (3,835)
• Part of distributed dataset from SemEval 2016 Task 6 (Mohammad et al., 2016)
• Labels correspond to target, stance, opinion-towards, and sentiment information
• 1,205, 2,409, and 221 instances for favor, against, and none, respectively
Sentiment classification (20,632)
• Utilizes the test data released in SemEval 2017 Task 4 (Rosenthal et al., 2017)
• Consists of 7,059 positive, 3,231 negative and 10,342 neutral tweets
17. Experiment and Result
• Result
The proposed model surpasses the baseline results on META, IRONY, and SUBJ
• These tasks are expected to benefit from identifying specific syntactic categories
• Relatively weak at discerning latent information such as STANCE and SENT
Dependency on the source information
• META highly prefers the word-level attention source
– Concerns explicit existence of certain lexical terms
18. Experiment and Result
• Result
STANCE and IRONY both require contextual information, but the model works
better on IRONY?
• IRONY incorporates hashtagged information which matters in the prediction
Expected suitable application
• Bitstream or symbolic music analysis (where the formatted/syntactic information
plays an important role)
19. Visualization
• Two excerpts from SUBJ
Enables seeing which syntactic category is the most important
Will be effective if the constituency tagging is more reliable!
20. Conclusion
• Sentence modeling builds on recent language modeling and deep
learning techniques
• In attention-based approaches, how the model pays attention is
determined automatically, but they hardly consider the syntactic
categories that are given as prior information
• Incorporating such information into the attention weights via a
projection matrix brings an advantage in tasks that necessitate
attention to lexical features