4. Convolutional Layer
• Narrow convolution: the filter stays entirely inside the sentence, giving an output of length s − m + 1 (e.g., 7 − 5 + 1 = 3).
• Wide convolution: the filter is also applied at the sentence edges (with zero-padding), giving an output of length s + m − 1 (e.g., 7 + 5 − 1 = 11).
The red connections all have the same weight (weight sharing).
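The two output lengths above can be checked directly with NumPy's 1-D convolution, using a hypothetical sentence of length s = 7 and a filter of width m = 5:

```python
import numpy as np

s, m = 7, 5
sentence = np.arange(1.0, s + 1)   # stand-in for 7 per-word feature values
filt = np.ones(m) / m              # stand-in filter weights

# "valid" keeps only positions where the filter fits inside the sentence
narrow = np.convolve(sentence, filt, mode="valid")  # length s - m + 1
# "full" also slides the filter over the zero-padded edges
wide = np.convolve(sentence, filt, mode="full")     # length s + m - 1

print(len(narrow))  # 3
print(len(wide))    # 11
```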
5. Pooling Layer
• Max pooling: the idea is to capture the most important feature, the one with the highest value, from each feature map.
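Max-over-time pooling can be sketched in a few lines; the feature-map values below are hypothetical:

```python
import numpy as np

# Hypothetical feature maps: 3 filters applied over a sentence,
# each producing a sequence of 5 activations.
feature_maps = np.array([
    [0.1, 0.9, 0.3, 0.2, 0.4],
    [0.5, 0.2, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.2, 0.9, 0.3],
])

# Keep only the highest activation per feature map, yielding one
# fixed-size value per filter regardless of sentence length.
pooled = feature_maps.max(axis=1)
print(pooled)  # [0.9 0.8 0.9]
```

This is what makes CNNs applicable to variable-length sentences: the pooled vector always has one entry per filter.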
6. Dropout: A Simple Way to Prevent Neural
Networks from Overfitting
• Consider a neural net with one hidden layer.
• Each time we present a training example, we
randomly omit each hidden unit with probability
0.5.
• So we are randomly sampling from 2^H different architectures, where H is the number of hidden units.
• All architectures share weights.
• Dropout prevents units from co-adapting too much.
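A minimal sketch of the idea, using the common "inverted dropout" convention (scaling survivors by 1/(1−p) so expected activations match at test time; the scaling detail is an assumption, not stated on the slide):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p=0.5, train=True):
    """Randomly omit each unit of h with probability p during training.

    Survivors are scaled by 1/(1-p) ("inverted dropout") so that no
    rescaling is needed at test time, where h is returned unchanged.
    """
    if not train:
        return h
    mask = rng.random(h.shape) >= p   # keep each unit with probability 1-p
    return h * mask / (1.0 - p)

h = np.ones(8)
print(dropout(h))  # roughly half the entries zeroed, the rest scaled to 2.0
```

Each call draws a fresh mask, so every training example sees a different sampled sub-network, while all samples share the same underlying weights.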
17. MVCNN (Multichannel Variable-Size Convolution): Training
• Pretraining
• Unsupervised training
• The average of the context word vectors is used as the predicted representation of the middle word
• To produce good initial parameter values
• Training
• Logistic regression
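The pretraining signal above can be sketched as follows. The window size, word ids, and squared-error objective are illustrative assumptions, not necessarily the exact objective used in MVCNN:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, dim, window = 10, 4, 2
E = rng.normal(size=(vocab_size, dim))   # embedding table being pretrained

sentence = [3, 1, 7, 2, 5]               # hypothetical word ids
i = 2                                    # position of the middle word

# Context = `window` words on each side of the middle word.
context = sentence[i - window:i] + sentence[i + 1:i + 1 + window]

predicted = E[context].mean(axis=0)      # average of context word vectors
target = E[sentence[i]]                  # actual middle-word vector

# Illustrative squared-error loss between prediction and target;
# gradients of this loss would update E during pretraining.
loss = np.sum((predicted - target) ** 2)
```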
18. References
[1] Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).
[2] Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
[3] Wang, P., Xu, J., Xu, B., Liu, C. L., Zhang, H., Wang, F., & Hao, H. (2015). Semantic Clustering and Convolutional Neural Network for Short Text Categorization. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Vol. 2, pp. 352-357).
[4] Johnson, R., & Zhang, T. (2015). Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).
[5] dos Santos, C. N., & Gatti, M. (2014). Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts. In Proceedings of the 25th International Conference on Computational Linguistics (COLING), Dublin, Ireland.
[6] Tang, D., Wei, F., Qin, B., Liu, T., & Zhou, M. (2014). Coooolll: A Deep Learning System for Twitter Sentiment Classification. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 208-212).
[7] Yin, W., & Schütze, H. (2015). Multichannel Variable-Size Convolution for Sentence Classification. In Proceedings of the 19th SIGNLL Conference on Computational Natural Language Learning (CoNLL 2015), Beijing, China.