Lectures by
Session 1 : SeungWoo Kim tmddno1@gmail.com
Session 2 : SuSang Kim healess1@gmail.com
AI Chatbot Development with Python and TensorFlow
1. Docker runtime environment
https://github.com/TensorMSA/tensormsa_docker.git
./tensormsa_docker/docker_compose_cpu
2. Source code walkthrough - Jupyter
git clone https://github.com/TensorMSA/tensormsa_jupyter.git
Session 1 : chap05_nlp
Session 2 : chap13_chatbot_lecture
Hands-on environment setup before we start
●ML&DL Engineer (2014 ~ 2017)
○ POSCO Smart Factory Machine Learning Based Scheduling (2014~2015)
○ POSCO AI ChatBot (2016 ~ 2017)
○ Deep Learning Open Source Framework - TensorMSA (2016~2017)
●Android Developer - POSCO Mobile system (2010 ~ 2014)
○ LBS, IPS Vehicle & Navigation System
○ IPS with Deep Learning - Patent (2016)
●Awards
○ OSS World Challenge 2017 (in top 12, in progress now)
○ Employee of the Year 2015, 2017 at POSCO ICT
●Woori Bank AI (‘17.11.1 ~)
Session 1 : SeungWoo Kim tmddno1@gmail.com
Session 1 - Understand NLP
Session 1 - Lecture Goals
Understand the overall chatbot architecture and the background knowledge
needed to build the service, so that the hands-on chatbot development in
Session 2 is easier to follow.
The focus is on understanding how chatbots, natural language processing, deep learning, and implementation relate to one another!
Session 1 - Understand NLP
About ChatBot
Session 1 - Understand NLP
Natural Language Understanding / Natural Language Generation
User <-> System
Understanding : natural language in, Semantic Frame out
Generation : Semantic Frame in, natural language out
Why do we need NLP in a chatbot system?
About ChatBot
Session 1 - Understand NLP
Types of Chatbot
Easy Hard
Retrieval-based model Generative model
Traditional algorithms Deep Learning algorithms
Short Conversation Long Conversation
Closed Domain Open Domain
About ChatBot
Session 1 - Understand NLP
Retrieval-Based vs Generative Models
Retrieval-based models (easier)
use a repository of predefined responses and some kind of heuristic to pick an
appropriate response based on the input and context. The heuristic could be as
simple as a rule-based expression match, or as complex as an ensemble of Machine
Learning classifiers. These systems don’t generate any new text, they just pick a
response from a fixed set.
Generative models (harder)
don’t rely on pre-defined responses. They generate new responses from scratch.
Generative models are typically based on Machine Translation techniques, but
instead of translating from one language to another, we “translate” from an input to
an output (response).
About ChatBot
Session 1 - Understand NLP
Use Deep Learning or Not
Using Deep Learning
Using deep learning does not always guarantee better performance
than traditional techniques.
It is more expensive to gather enough data and train a heavy model.
Using traditional algorithms
Most current chatbot systems are based on those traditional algorithms,
and they have their own strengths compared with DL algorithms.
Morphological analysis
POS tagging
Pattern matching
Syntactic analysis
Semantic analysis
Sentiment analysis
Dialog processing
CharCNN
BiLSTMCrf
Seq2Seq
Word2Vec
RNN
DMN
E2E MMN
Attention
DNN
TFIDF
SVM
Dictionary
Bayesian
Logistic
LSA
HMM
USE
BOTH
About ChatBot
Session 1 - Understand NLP
Long Conversation vs Short Conversation
Short Conversation
the goal is to create a single response to a single input. For example, you
may receive a specific question from a user and reply with an appropriate
answer.
Long Conversation
goes through multiple turns and needs to keep track of what has been said. Customer
support conversations are typically long conversational threads with multiple
questions.
About ChatBot
Session 1 - Understand NLP
Open Domain vs Closed Domain
“Closed Domain
You can ask a limited set of questions on specific topics.
(Easier). What is the Weather in Miami?”
“Open Domain
I can ask a question about any topic… and expect a relevant response.
(Harder) Think of a long conversation around refinancing my mortgage
where I could ask anything.” Mark Clark
Overview | Session 1 - Understand NLP
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning BasicNLU Server
(Understand)
NLG Server
(Generate)
DM
Server
Messaging
Platform
BackEnd
Service Servers
SyntaxNet
Scenario
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory | [Retrieval Based] Chat-Bot System
ChatBot
Server
Numpy
Pandas
Tensorflow
Pipeline | Data processing | ML & DL Library
Scikit Learn
Konlpy
Development
Data collection
Data preprocessing
Model training
Model evaluation
Model serving
BackEnd
Service Servers
message
intent & slot
information
message
message
Semantic Frame
Semantic Frame
connect services
message
1 2
3
Basic theory
Related deep learning theory
Implementation walkthrough with examples
Session 1 - Understand NLP
Memory Network
Seq2Seq | Response Generation
Ontology
DM
Legacy Data Base
[AI Based] Chat-Bot Research Environment
Data Mart | Monitoring
Summary Result
Train Data
AI Model
Pipe Line
Session 1 - Contents
1. NLP theory
> The linguistic background generally needed to process natural language
2. Deep learning theory
> Deep learning theory for the problems raised in the NLP theory part
3. Implementation
> Implementing the theory with deep learning and supporting libraries
About NLP (Natural Language Process)
Session 1 - Understand NLP
Mostly Solved Making Good Progress Still Really Hard
Spam Detection
Text Categorization
Part of Speech Tagging
Named Entity Recognition
Information Extraction
Sentiment Analysis
Coreference Resolution
Word Sense Disambiguation
Syntactic Parsing
Machine Translation
Semantic Search
Question & Answer
Textual Inference
Summarization
Discourse & Dialog
About NLP (Natural Language Process)
Session 1 - Understand NLP
Text Categorization
Text classification assigns one or more classes to a document according to its content. Classes are
selected from a previously established taxonomy (a hierarchy of categories or classes).
Spam Detection
Spam detection is also a text classification problem.
Part of Speech Tagging
Grammatical tagging, or word-category disambiguation, is the process of marking up a word in a
text (corpus) as corresponding to a particular part of speech, based on both its definition and its
context.
About NLP (Natural Language Process)
Session 1 - Understand NLP
Low Level Information Extraction
About NLP (Natural Language Process)
Session 1 - Understand NLP
Information Extraction on Broader view
https://web.stanford.edu/class/cs124/lec/Information_Extraction_and_Named_Entity_Recognition.pptx
Rule Based
Extraction
Named Entity
recognition
Syntax Analysis
Relation Search
Ontology
Information
Extraction
About NLP (Natural Language Process)
Session 1 - Understand NLP
Coreference Resolution
I did not vote for the Donald Trump because I think he is too reckless
Coreference resolution is the task of finding all expressions that refer to the same entity in a
text. It is an important step for a lot of higher level NLP tasks that involve natural language
understanding such as document summarization, question answering, and information
extraction.
Deep Reinforcement Learning for Mention-Ranking Coreference Models
Improving Coreference Resolution by Learning Entity-Level Distributed Representations
https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30
About NLP (Natural Language Process)
Session 1 - Understand NLP
Word Sense Disambiguation
[Example]
1. a type of fish
2. tones of low frequency
and the sentences:
1. I went fishing for some sea bass.
2. The bass line of the song is too
weak.
http://www.cs.cornell.edu/courses/cs4740/2014sp/lectures/wsd-1.pdf
Supervised approach : labeled data example
Semi-supervised approach
About NLP (Natural Language Process)
Session 1 - Understand NLP
Syntactic Parsing
Syntactic parsing finds structural relationships between words in a sentence.
https://web.stanford.edu/~jurafsky/slp3/12.pdf
About NLP (Natural Language Process)
Session 1 - Understand NLP
Machine translation (MT) is automated translation. It is the process by which computer software is
used to translate a text from one natural language (such as English) to another (such as Spanish).
Machine Translation
About NLP (Natural Language Process)
Session 1 - Understand NLP
Semantic Search
Semantic search seeks to improve search accuracy by understanding a searcher’s intent through
contextual meaning.
Question and Answer
Able to answer questions in natural language based on Knowledge data (usually ontology)
ex) Best example is IBM Watson
Textual Inference
Recognize, generate, or extract pairs <T,H> of natural language
expressions, such that a human who reads (and trusts) T would infer that H is most likely also true.
Summarization
Extract the interesting parts of a text, create a summary from those parts, and
allow rephrasing to make the summary more grammatically correct.
Discourse & Dialog
Hold a conversation while understanding the whole dialog history and the speaker's intended meaning.
Standard Natural Language Process
Session 1 - Understand NLP
Speech Recognition : Spoken Utterance -> Written Utterance
Lexical Analysis : Word Structure (Morphemes, Words)
Syntactic Analysis : Sentence Structure (Sentences)
Semantic Analysis : Meaning of Words & Sentences
Discourse Analysis : Relationships between sentences (context beyond the sentence)
Lexical Analysis
Syntactic Analysis
Semantic Analysis
NLU Server
(Understand)
NLG Server
(Generate)
Voice Recognition
Discourse Analysis
NLP Theory
Basic theory
Session 1 - Understand NLP
Session 1 - Now We are Here!
Response Generation
Session 1 - Understand NLP
AI Speaker Alexa Alexa Microphone System
NLP - Voice Recognition
Session 1 - Understand NLP
Deep Learning for Classification Hidden Markov Model for Language Model
NLP - Voice Recognition
Lexical Analysis
Syntactic Analysis
Semantic Analysis
NLU Server
(Understand)
NLG Server
(Generate)
Voice Recognition
Discourse Analysis
NLP Theory
Basic theory
Session 1 - Understand NLP
Session 1 - Now We are Here!
Response Generation
Session 1 - Understand NLP
NLP - Lexical Analysis
Main Factors on Lexical Analysis
1. Sentence Splitting
2. Tokenizing
3. Morphological
4. Part of speech Tagging
Session 1 - Understand NLP
NLP - Lexical Analysis
Lexical Analysis
What if there is no line-break character ('\n')? Where is the end-of-sentence (EOS) point?
What if a sentence is not separated into words properly with spaces?
[Examples]
[Problems]
Session 1 - Understand NLP
NLP - Lexical Analysis
Word stemming lemmatization
Love Lov Love
Loves Lov Love
Loved Lov Love
Loving Lov Love
Innovation Innovat Innovation
Innovations Innovat Innovation
Innovate Innovat Innovate
Innovates Innovat Innovate
Innovative Innovat Innovative
Morphing Examples Stemming & lemmatization
Morphology is the process of finding morphemes, the smallest "meaningful units" (carrying lexical meaning
or grammatical function), and other features such as the stem of a word.
Lexical Analysis
Session 1 - Understand NLP
NLP - Lexical Analysis
Lexical Analysis
Ambiguity
“that” can be a subordinating conjunction or a relative pronoun
- The fact that/IN you’re here
- A man that/WDT I know
“Around” can be a preposition, particle, or adverb
- I bought it at the shop around/IN the corner.
- I never got around/RP to getting a car.
- A new Toyota Prius costs around/RB $25K.
Degree of ambiguity (in Brown corpus)
- 11.5% of word types (40% of word tokens) are ambiguous
# of Tags 1 2 3 4 5 6 7
# of Words 35340 3760 264 61 12 2 1
The ambiguity problem is much more serious in Korean.
Part-of-speech tagging is one of the most important text analysis tasks: it classifies words into
their parts of speech and labels them according to a tagset, the collection of tags used for POS
tagging. Parts of speech are also known as word classes or lexical categories.
Session 1 - Understand NLP
NLP - Lexical Analysis
Lexical Analysis
Hannanum Kkma Komoran Mecab Twitter
하늘 / N 하늘 / NNG 하늘 / NNG 하늘 / NNG 하늘 / Noun
을 / J 을 / JKO 을 / JKO 을 / JKO 을 / Josa
나 / N 날 / VV 나 / NP 나 / NP 나 / Noun
는 / J 는 / ETD 는 / JX 는 / JX 는 / Josa
자동차 / N 자동차 / NNG 자동차 / NNG 자동차 / NNG 자동차 / Noun
Analysis Result Comparison | Library Performance Comparison
Session 1 - Understand NLP
NLP - Lexical Analysis
Lexical Analysis
[Code]
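As a minimal sketch of this lexical analysis step, assuming KoNLPy (from the deck's library stack) and its Java backend are installed, tokenizing and POS-tagging the sample sentence from the comparison table could look like this:

    # Minimal KoNLPy sketch: morphological analysis + POS tagging.
    # Kkma is one of the analyzers compared in the table above.
    from konlpy.tag import Kkma

    kkma = Kkma()
    sentence = u"하늘을 나는 자동차"

    print(kkma.morphs(sentence))  # morphemes, e.g. ['하늘', '을', '날', '는', '자동차']
    print(kkma.pos(sentence))     # pairs, e.g. [('하늘', 'NNG'), ('을', 'JKO'), ...]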
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Lexical Analysis
(1) Word Segmentation
(2) POS Tagging
(3) Chunking
(4) Clause Identification
(5) Named Entity Recognition
(6) Semantic Role Labeling
(7) Information Extraction
What we can do with sequence labeling What’s sequence labeling
Sequence Labeling
Session 1 - Understand NLP
NLP - Lexical Analysis
Word POS Chunk NE
West NNP B-NP B-MISC
Indian NNP I-NP I-MISC
all-around NN I-NP O
Phil NNP I-NP B-PER
Simons NNP I-NP I-PER
took VBD B-VP O
four CD B-NP O
for IN B-PP O
38 CD B-NP O
on IN B-PP O
Friday NNP B-NP O
<iob data set example>
POS tag meanings
https://docs.google.com/spreadsheet/ccc?key=0ApcJghR6UMXxdEdURGY2YzIwb3dSZ290RFpSaUkzZ0E&usp=sharing
Chunk tag meanings
B : Begin of chunk
I : Continuation of chunk
E : End of chunk
NP : Noun phrase
VP : Verb phrase
NER BIO tag meanings
B : Start of a new chunk
I : Word inside a chunk
O : Outside of any chunk
Sequence Labeling
Session 1 - Understand NLP
NLP - Lexical Analysis
BiLSTM-CRF Description
Sequence Labeling with Deep Learning
Deep Learning Basic
Word Embedding
DL FrameWorks
Prerequisite
Session 1 - Understand NLP
NLP - Lexical Analysis
Deep Learning Basic
Session 1 - Understand NLP
New Algorithms
Back Propagation
CNN, RNN .. etc
Big Data
HDFS
MapReduce
Hardware
GPU Parallel Execution
Cloud Service
NLP - Lexical Analysis
Deep Learning Basic
Session 1 - Understand NLP
3
5
7
9
(1) Problem (2) Algorithm (3) Programming
Y = 2 * X + 1
function(x)
{
return x*2 + 1
}
NLP - Lexical Analysis
Deep Learning Basic
Session 1 - Understand NLP
3
5
7
9
(1) Problem (2) Algorithm (3) Programming
Y = w * X + b
3
5
7
9
initial
optimized
NLP - Lexical Analysis
Deep Learning Basic
Session 1 - Understand NLP
Supervised Learning Unsupervised Learning Reinforcement Learning
CAT
CAT
CAT
DOG
DOG
DOG
Deep Learning Basic
NLP - Lexical Analysis
Session 1 - Understand NLP
1. Perceptron
2. Activation Function
3. Cost
4. Gradient Descent
5. Back Propagation
6. Optimizers
Deep Learning Basic
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Perceptron
wX + b
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Perceptron
wX + b Activation Function
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Activation Function
Logistic Regression Nonlinear Problems
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Activation Function
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Loss (Error)
Initial
Optimized
LOSS
x y y~
0 3 7
1 5 9
2 7 11
3 9 13
4 11 15
5 13 17
6 15 19
Y
X0 1 2 3
Y = wX + b
NLP - Lexical Analysis
Session 1 - Understand NLP
x y init opt
0 3 7 3
1 5 9 5
2 7 11 7
init : ((7-3)^2 + (9-5)^2 + (11-7)^2) / 3 = 16
opt : ((3-3)^2 + (5-5)^2 + (7-7)^2) / 3 = 0
HOW?
Deep Learning Basic - Loss (Error)
W, b
Cost(W, b)
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Gradient Descent
weight := weight - learning_rate * gradient
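A plain NumPy sketch of the update rule above, applied to the running Y = wX + b example (the data and the "init" guess match the loss table on the previous slide):

    # Gradient descent on Y = wX + b with an MSE loss (NumPy sketch).
    import numpy as np

    x = np.arange(7, dtype=float)   # x = 0..6
    y = 2 * x + 3                   # the true line from the loss table

    w, b = 2.0, 7.0                 # the "init" guess (y~ = 2x + 7, loss = 16)
    lr = 0.01                       # learning rate

    for step in range(5000):
        y_hat = w * x + b
        grad_w = np.mean(2 * (y_hat - y) * x)   # d(loss)/dw
        grad_b = np.mean(2 * (y_hat - y))       # d(loss)/db
        w -= lr * grad_w                        # weight := weight - lr * gradient
        b -= lr * grad_b

    print(w, b)   # converges toward w = 2, b = 3, where the loss is 0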
NLP - Lexical Analysis
Session 1 - Understand NLP
Output Hidden Input
Train Data
Forward Propagation
y-y~
(Error)
Back Propagation
Update
Each Weight
partial derivative
chain rule
Deep Learning Basic - BackPropagation
NLP - Lexical Analysis
Session 1 - Understand NLP
Deep Learning Basic - Optimizer
NLP - Lexical Analysis
https://www.youtube.com/watch?v=hMLUgM6kTp8
Session 1 - Understand NLP
NLP - Lexical Analysis
SGD
Adagrad
RMS
Momentum
Nag
Adadelta
Adam
Adaptive family of algorithms
Momentum : keeps the previous update direction (applies a notion of acceleration)
NAG : similar to Momentum, but the gradient is taken at the shifted (look-ahead) position
Adagrad : uses accumulated squared gradients; slow dimensions move faster, fast dimensions move more carefully
RMSProp / Adadelta : replace the accumulated gradient sum with an exponential average,
preventing G from growing without bound; Adadelta also uses the squared step-size changes
Adam : applies both sets of characteristics, Adadelta/RMSProp and Momentum
http://shuuki4.github.io/deep%20learning/2016/05/20/Gradient-Descent-Algorithm-Overview.html
Session 1 - Understand NLP
NLP - Lexical Analysis
https://arxiv.org/pdf/1705.08292.pdf
"Solutions found with gradient descent (GD) or stochastic gradient descent (SGD)
generalize far better than solutions found with adaptive methods
(e.g. AdaGrad, RMSProp, and Adam)."
The Marginal Value of Adaptive Gradient Methods in Machine Learning. Ashia C. Wilson, Rebecca Roelofs,
Mitchell Stern, Nathan Srebro, and Benjamin Recht. University of California, Berkeley / Toyota
Technological Institute at Chicago, May 24, 2017
There is no optimizer best for all cases!!
When to use adaptive optimizer?
If input embedding vectors are sparse, it’s better to use adaptive optimizer!
Deep Learning Basic - Optimizer
Session 1 - Understand NLP
# tf Graph input
x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])
# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([784, 256])),
    'h2': tf.Variable(tf.random_normal([256, 256])),
    'out': tf.Variable(tf.random_normal([256, 10]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([256])),
    'b2': tf.Variable(tf.random_normal([256])),
    'out': tf.Variable(tf.random_normal([10]))
}
# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
layer_1 = tf.nn.relu(layer_1)
# Hidden layer with RELU activation
layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
layer_2 = tf.nn.relu(layer_2)
# Output layer with linear activation, then softmax
pred = tf.matmul(layer_2, weights['out']) + biases['out']
hypothesis = tf.nn.softmax(pred)
# Define loss (cross entropy) and optimizer
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(hypothesis),
                                     reduction_indices=1))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
[Diagram: a 784-256-256-10 fully connected network; each layer computes Y = Activation(W*x + b), the output layer applies softmax, and the error is cross entropy]
Deep Learning Basic
NLP - Lexical Analysis
Session 1 - Understand NLP
START 오늘 날씨 는 ? PAD PAD END
START 오늘 날씨 는 어때 ? PAD END
START 오늘 비가 오 려 나 ? END
In the case of long sentences, the vanishing gradient problem happens,
and padding data of various lengths wastes computing power;
here we have the concept of the Dynamic RNN.
A bidirectional LSTM learns the given data from the backward direction as well.
Long Short Term Memory cell : cell state with forget / update / output gates
https://brunch.co.kr/@chris-song/9
https://blog.altoros.com/the-magic-behind-google-translate-sequence-to-sequence-models-and-tensorflow.html
NLP - Lexical Analysis
Deep Learning Basic
Session 1 - Understand NLP
NLP - Lexical Analysis
Deep Learning Basic
Overfitting
Fine Tuning
Multi Tasking
Ensemble
Data Preprocessing
Drop Out
Batch Normalization
Network Compression
https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdfhttps://arxiv.org/pdf/1510.00149.pdf
Adam+SGD
Learning Rate
Decaying
Fully Convolutional
1by1 Convolutional Filter
Quantize Neural
Networks
AutoML
Hyper Parameter
Random Search
Grid
Search
Genetic
Algorithm
Session 1 - Understand NLP
Session 1 - Now We are Here !
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Numpy
Pandas
Tensorflow
Data processing | ML & DL Library
Scikit Learn
Konlpy
Development
Implementation
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software#cite_note-29
NLP - Lexical Analysis - Implementation
Deep Learning Framework comparison
pytorch
Session 1 - Understand NLP
NLP - Lexical Analysis - Implementation
Deep Learning Framework comparison
 dynamic vs static graph definition
Debugging Visualization
Deployment
VS
Session 1 - Understand NLP
NLP - Lexical Analysis - Implementation
Deep Learning Framework - Tensorflow
with tf.Graph().as_default():
    X = tf.placeholder("float")
    Y = tf.placeholder("float")
    W = tf.Variable(rng.randn(), name="weight")
    b = tf.Variable(rng.randn(), name="bias")
    pred = tf.add(tf.multiply(X, W), b)
    cost = tf.reduce_sum(tf.pow(pred - Y, 2)) / (2 * n_samples)
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
        # Fit all training data
        for epoch in range(training_epochs):
            for (x, y) in zip(train_X, train_Y):
                sess.run(optimizer, feed_dict={X: x, Y: y})
Tensorflow : static graph definition Pytorch : dynamic graph definition
Session 1 - Understand NLP
https://medium.com/@karpathy/a-peek-at-trends-in-machine-learning-ab8a1085a106
NLP - Lexical Analysis - Implementation
Deep Learning Framework comparison
https://blog.paperspace.com/which-ml-framework-should-i-use/
Session 1 - Understand NLP
NLP - Lexical Analysis - Implementation
Deep Learning Framework - Tensorflow
Graph (Edge + Node)
+
Session
Session 1 - Understand NLP
NLP - Lexical Analysis - Implementation
Deep Learning Framework - Tensorflow
https://github.com/TensorMSA/tensormsa_jupyter/blob/master/chap03_basic_models/linear_regressions.ipynb
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here!
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
What is Word Embedding?
A way of representing the units that make up text (phonemes, syllables,
words, sentences, documents) as numeric vectors.
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
Word Representation
Discrete Representation
WordNet OneHot Vector
Distributed Representation
Direct Prediction
Word2Vec
Count Based
Full Document Windows
LSA SVD of x Glove
FastText
Session 1 - Understand NLP
WordNet
NLP - Lexical Analysis - Word Embedding
In the past, approaches like WordNet were used. WordNet is a tree-structured graph encoding the relationships between words (hypernyms, synonyms).
Of course, it was built entirely by hand, so it is subjective and takes a great deal of labor to maintain.
Session 1 - Understand NLP
OneHot Vector
NLP - Lexical Analysis - Word Embedding
Session 1 - Understand NLP
LSA (Latent Semantic Analysis) with SVD (Singular Value Decomposition)
NLP - Lexical Analysis - Word Embedding
https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/04/06/pcasvdlsa/
- doc1 doc2 doc3
나 1 0 0
는 1 1 2
학교 1 1 0
에 1 1 0
가 1 1 0
ㄴ 1 0 0
다 1 0 1
영희 0 1 1
좋 0 0 1
SVD -> truncated SVD
LSA (Latent Semantic Analysis)
Session 1 - Understand NLP
SVD of X
NLP - Lexical Analysis - Word Embedding
https://swalloow.github.io/cs224d-lecture2
This method slides a window (typically of length 5 - 10) symmetrically over the text and counts co-occurrences.
● I like deep learning.
● I like NLP.
● I enjoy flying
Given a corpus like the one above, it can be represented as the following matrix: simply counts of how often each word co-occurs.
Frequency counts within the window size | dimensionality reduction with SVD
Session 1 - Understand NLP
https://www.tensorflow.org/tutorials/word2vec
http://w.elnn.kr/search/
Word2Vector Demo Site
Strengths : dimensionality reduction, representation of semantic similarity
Weaknesses : homonym handling; weak training signal for the network when data is scarce
NLP - Lexical Analysis - Word Embedding
Word2Vec
Session 1 - Understand NLP
CBOW
the quick brown fox jumped over the lazy dog
([brown, jumped], fox)
window size : 1
brown
jumped
over
the
.
.
brown
jumped
over
fox
.
.
Input OutputHidden
Hidden Size Hidden Size
Vocab
Size
Data Set
Original
Text
NLP - Lexical Analysis - Word Embedding
Word2Vec
Session 1 - Understand NLP
the quick brown fox jumped over the lazy dog
(fox, brown), (fox, jumped)
window size : 1
brown
jumped
over
the
.
.
brown
jumped
over
fox
.
.
Input OutputHidden
Hidden Size Hidden Size
Vocab
Size
Data Set
Original
Text
Skip-Gram
NLP - Lexical Analysis - Word Embedding
Word2Vec
Session 1 - Understand NLP
(1)PV-DM (2)PV-DBOW
(3)DM + DBOW (Vector Concat)
W2V W2V W2V
(4)AVG(TF-IDF * W2V)
the quick brown fox jumped over the lazy dog
(paragraph, the)
(paragraph, quick)
(paragraph, brown)
(paragraph, fox)
(paragraph, jumped)
([paragraph, quick, brown,
fox, jumped], over)
([paragraph, quick, brown,
fox, jumped, over], the)
vector vector vector
TF-IDF TF-IDF TF-IDF
X X X
vector
AVG
NLP - Lexical Analysis - Word Embedding
Doc2Vec
Session 1 - Understand NLP
tfidf(t,d,D) = tf(t,d) x idf(t,D)
https://thinkwarelab.wordpress.com/2016/11/14/ir-tf-idf-%EC%97%90-%EB%8C%80%ED%95%B4-%EC%95%8C%EC%95%84%EB%B4%85%EC%8B%9C%EB%8B%A4/
http://www.popit.kr/bm25-elasticsearch-5-0%EC%97%90%EC%84%9C-%EA%B2%80%EC%83%89%ED%95%98%EB%8A%94-%EC%83%88%EB%A1%9C%EC%9A%B4-%EB%B0%A9%EB%B2%95/
Not exactly a word embedding, but used in NLP with deep learning pretty often:
- Document similarity
- Word importance within a document
- Used in search engines (like Elasticsearch, though it uses BM25 now)
NLP - Lexical Analysis - Word Embedding
TF-IDF
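As a quick sketch of the tfidf(t,d,D) = tf(t,d) x idf(t,D) formula above, using scikit-learn (already part of the deck's library stack); the documents here are illustrative:

    # Minimal TF-IDF sketch with scikit-learn (illustrative documents).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "the quick brown fox jumped over the lazy dog",
        "the lazy dog slept",
        "a fast brown fox",
    ]

    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(docs)   # tf(t,d) * idf(t,D) per term/document

    # Document similarity, one of the uses listed above
    print(cosine_similarity(tfidf[0], tfidf[1]))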
Session 1 - Understand NLP
- Several ways to embed characters as vectors
안 녕 하 세 요
1
가 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
나 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
다 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
라 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
마 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
바 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
사 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
아 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
자 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
An Neung Ha Se Yo (ㅇ ㅏ ㄴ) (ㄴ ㅕ ㅇ) . . . .
2
a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
b 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
c 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 d
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
e 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
f 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
g 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
i 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
3
ㄱ 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ㄴ 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ㄷ 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ㄹ 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
ㅁ 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
ㅂ 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
ㅅ 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
ㅇ 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
ㅈ 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
NLP - Lexical Analysis - Word Embedding
Char Embedding
Session 1 - Understand NLP
the quick brown fox jumped over the lazy dog
0.2 0.1 0.4 0.21 0 0 0
f o x fox
Word2Vector
0 1 0 0 0 0 1 0
OneHot
Encoding
OneHot
Encoding
OneHot
Encoding
1. Word2Vec-style embeddings capture semantic relatedness well
2. One-hot gives a strong, clean signal that is effective for training
3. Word-level embedding memorizes words well
4. Char-level embedding handles unseen (untrained) words well
NLP - Lexical Analysis - Word Embedding
+
Char +Word Concat
Session 1 - Understand NLP
Words that do not exactly match the pretrained dictionary return "UNKNOWN",
so FastText (by Facebook) uses character n-grams in its word embedding algorithm.
Comparing 에어컨 (air conditioner) with 에어조단 (Air Jordan):
에어컨
['$$에', '$에어', '에어컨', '어컨$', '컨$$'] => 5
에어조단
['$$에', '$에어', '에어조', '어조단', '조단$', '단$$'] => 6
Matches
['$$에', '$에어'] => 2
Score
2 matches / 9 distinct n-grams after deduplication (5 + 6 - 2) => 0.2222
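A plain-Python sketch of the n-gram overlap score computed above; note the '$' padding follows the slide's convention, whereas real FastText uses '<' and '>' boundary markers:

    # Character n-gram overlap sketch (slide's '$' padding, not FastText's markers).
    def char_ngrams(word, n=3, pad='$'):
        padded = pad * (n - 1) + word + pad * (n - 1)
        return {padded[i:i + n] for i in range(len(padded) - n + 1)}

    a = char_ngrams('에어컨')     # {'$$에', '$에어', '에어컨', '어컨$', '컨$$'}
    b = char_ngrams('에어조단')   # 6 trigrams

    score = len(a & b) / len(a | b)   # 2 shared / 9 distinct = 0.2222...
    print(round(score, 4))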
NLP - Lexical Analysis - Word Embedding
FastText
Session 1 - Understand NLP
Glove
NLP - Lexical Analysis - Word Embedding
(their dot product equals the logarithm of the words' probability of co-occurrence) GloVe's core goal can be summed up as: "make similarity between embedded word vectors easy to measure, while better reflecting the statistics of the whole corpus."
Co-occurrence probability
https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/04/09/glove/
GloVe embeds words so that, given a context word, the dot product of two embedded word vectors corresponds to the ratio of the two words' co-occurrence probabilities.
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory | Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here!
Numpy
Pandas
Tensorflow
Data processing | ML & DL Library
Scikit Learn
Konlpy
Development
Implementation
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
OneHot Encoding : simple test code showing the concept of one-hot
http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
[Code]
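A minimal NumPy sketch of the one-hot idea (the vocabulary here reuses the words from the BiLSTM-CRF example later in the deck):

    # One-hot encoding sketch with NumPy.
    import numpy as np

    vocab = ['김승우', '전화번호', '이메일', '검색']
    word_to_idx = {w: i for i, w in enumerate(vocab)}

    def one_hot(word):
        vec = np.zeros(len(vocab))
        vec[word_to_idx[word]] = 1.0   # 1 at the word's index, 0 elsewhere
        return vec

    print(one_hot('전화번호'))   # [0. 1. 0. 0.]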
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
Word2Vector : Using Gensim word2vec package
http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
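A hedged sketch of gensim Word2Vec usage; the corpus and hyperparameters are illustrative, not the notebook's actual ones, and the `size` argument is the gensim 3.x name (renamed `vector_size` in gensim 4):

    # Gensim Word2Vec sketch (illustrative corpus and hyperparameters).
    from gensim.models import Word2Vec

    corpus = [
        ['오늘', '날씨', '는', '어때'],
        ['오늘', '메뉴', '는', '뭐', '지'],
        ['내일', '날씨', '알려', '줘'],
    ]

    # sg=1 selects skip-gram, sg=0 selects CBOW (both architectures shown earlier)
    model = Word2Vec(corpus, size=50, window=1, min_count=1, sg=1)

    print(model.wv['날씨'])                        # the learned 50-d vector
    print(model.wv.most_similar('날씨', topn=2))   # nearest neighbours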
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
FastText : FaceBook fasttext with gensim wrapper
http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
FastText : possible to use a pretrained vector and do fine tuning on it
http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
N-grams are simply all combinations of adjacent words or letters of length n that you can
find in your source text.
Session 1 - Understand NLP
NLP - Lexical Analysis - Word Embedding
For training word2vec on large datasets, GPU acceleration is needed.
You can also consider using TensorFlow or Keras to train the model.
https://github.com/SimonPavlik/word2vec-keras-in-gensim/blob/keras106/word2veckeras/word2veckeras.py
https://github.com/tensorflow/models/blob/master/tutorials/embedding/word2vec.py
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Lexical Analysis - DL ALgorithms
Paper Model CoNLL 2003 (F1 %)
Collobert et al.(2011) MLP with word embeddings+gazetteer 89.59
Passos et al.(2014) Lexicon Infused Phrase Embeddings 90.90
Chiu and Nichols(2015) Bi-LSTM with word+char+lexicon embeddings 90.77
Luo et al.(2015) Semi-CRF jointly trained with linking 91.20
Lample et al.(2016) Bi-LSTM-CRF with word+char embeddings 90.94
Lample et al.(2016) Bi-LSTM with word+char embeddings 89.15
https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/
https://arxiv.org/pdf/1708.02709.pdf
NER (Named Entity Recognition) Algorithm Performance
NLP - Lexical Analysis - DL ALgorithms
what do we want to do with this algorithm?
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
김승우 B-PERSON
전화번호 B-TARGET
검색 O
김승우 B-PERSON
이메일 B-TARGET
검색 O
김승우 B-PERSON
이미지 B-TARGET
검색 O
IOB Data
김승우 전화번호 검색
김승우 이메일 검색
김승우 이미지 검색
Plain Data
Sentence
Splitting
Token Morphing
Part of
Speech
Tagging
Lexical Analysis
Word2Vector
OneHot Encoding
1 0 0 0
0 1 0 0
0 0 1 0
김승우
전화번호
이메일
검색
B-PERSON
B-TARGET
김
우
승
Index
List
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
김승우
전화번호
이메일
검색
B-PERSON
B-TARGET
김
우
승
Index
List
[Code]
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
김
우
승
김승우
전화번호
이메일
Concat Vector
[Code]
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
Concat Vector
김승우
전화번호
이메일
검색
B-PERSONB-TARGET
BiLstm
Fully Connected Layer
B-? B-? B-?
[Code]
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
Conditional Random Field Soft Max
[Code]
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
http://people.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf
Probabilistic Model for sequence data segmentation and labeling
https://www.slideshare.net/kanimozhiu/tdm-probabilistic-models-part-2
The first method makes local choices. In other words, even if we capture some information from the
context in our hidden states h thanks to the bi-LSTM, the tagging decision is still local. We don't make use of the
neighboring tagging decisions. For instance, in "New York", the fact that we are tagging York as a
location should help us decide that New corresponds to the beginning of a location. Given a
sequence of words w1,…,wm, a sequence of score vectors s1,…,sm and a
sequence of tags y1,…,ym, a linear-chain CRF defines a global score s ∈ R.
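A hedged TensorFlow 1.x sketch of a linear-chain CRF loss on top of the BiLSTM logits, using tf.contrib.crf; the placeholders and shapes are illustrative, not the notebook's actual code:

    # Sketch: CRF loss on top of BiLSTM logits (TensorFlow 1.x API).
    import tensorflow as tf

    num_tags = 5
    logits = tf.placeholder(tf.float32, [None, None, num_tags])  # BiLSTM scores
    labels = tf.placeholder(tf.int32, [None, None])              # gold IOB tag ids
    seq_lens = tf.placeholder(tf.int32, [None])                  # true sentence lengths

    # Global (sentence-level) score instead of a per-step softmax
    log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
        logits, labels, seq_lens)
    loss = tf.reduce_mean(-log_likelihood)
    train_op = tf.train.AdamOptimizer(0.001).minimize(loss)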
Session 1 - Understand NLP
NLP - Lexical Analysis - BiLstmCrf
Real-project BiLSTM result | Sample-code prediction test result
Even on test data not included in the train set,
the model predicts well
http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/sequence_tagging/
Lexical Analysis
Syntactic Analysis
Semantic Analysis
NLU Server
(Understand)
NLG Server
(Generate)
Voice Recognition
Discourse Analysis
NLP Theory
Basic theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
Syntactic parsing decomposes a sentence into its constituent
parts and analyzes the hierarchical relationships between them to
determine the structure of the sentence.
Graph-Based Models | Transition-Based Models
CYK-style parsing, MST-finding algorithms | Projective & non-projective models
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models
Sentence W
Repeat until all words have their head:
- Select two target words in the data structure
(one dependent & one head candidate)
- Deterministically predict the next parsing action with the parsing model
- Modify the structure according to the parsing action
C0 -> C1 -> C2 -> …….. C8 -> C9 -> C10 -> .… -> Cm D-tree
t1 t2 t3 t8 t9 t10 tm
Oracle
(Classifier)
predicts the best
transition
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Assume that we are given an oracle :
- for any non-terminal configuration, it can predict the correct transition
(for deterministic parsing)
- that is, it takes two words & magically gives us the dependency
relation between them if one exists
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Shift :
Move Economic from buffer B to stack S
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Left-arc :
Add left-arc (had, news, nsubj) to A
Remove news from stack (since it now has head in A)
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Right-arc :
Add right-arc (ROOT, had, root) to A
keep had in stack : because it can have other dependents on the right
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Left-arc :
Add left-arc (effect, little, amod) to A
Remove little from stack (since it now has head in A)
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Right-arc :
Add right-arc (had, effect, dobj) to A
Keep effect in stack : because it can have other dependents on right
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Right-arc :
Add right-arc (effect, on, prep) to A
Keep on in stack : because it can have other dependents on the right
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Shift :
Move financial from buffer B to stack S
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Left-arc :
Add left-arc (market, financial, amod) to A
Remove financial from stack (since it now has head in A)
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Right-arc :
Add right-arc (on, markets, pmod) to A
Keep markets in stack : because it can have other dependents on the right
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Reduce :
Remove markets, on, effect from stack (since they already have head in A)
※ All decisions like right-arc, left-arc, reduce, shift will be made by oracle
Session 1 - Understand NLP
NLP - Syntactic Analysis
Transition-Based Models - Arc Eager Transition System
Right-arc :
Add right-arc (had, period, p) to A
Keep period in stack
Done !
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
Parsing type Paper Model WSJ
Dependency
Parsing
Chen and
Manning(2014)
Fully-connected NN with features
including POS
91.8/89.6
(UAS/LAS)
Dependency
Parsing
Weiss et al.(2015) Deep fully-connected NN with features
including POS
94.3/92.4
(UAS/LAS)
Dependency
Parsing
Dyer et al.(2015) Stack LSTM 93.1/90.9
(UAS/LAS)
Constituency
Parsing
Petrov et al.(2006) Probabilistic context-free grammars
(PCFG)
91.8 (F1 Score)
Constituency
Parsing
Zhu et al.(2013) Feature-based transition parsing 91.3 (F1 Score)
Constituency
Parsing
Vinyals et
al.(2015b)
seq2seq learning with LSTM+Attention 93.5 (F1 Score)
Syntax Parsing Algorithm Performance
There are two types of parsing: dependency parsing, which links individual words according to the relations
between them, and constituency parsing, which recursively splits the text into sub-phrases.
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
We show this layout in the schematic below: the state of the system (a stack and a buffer, visualized
below for both the POS and the dependency parsing task) is used to extract sparse features, which
are fed into the network in groups. We show only a small subset of the features to simplify the
presentation in the schematic
Google SyntaxNet with Deep Learning - Pos Tagging
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
Google SyntaxNet with Deep Learning - A Fast and Accurate Dependency Parser using Neural Networks
https://arxiv.org/pdf/1603.06042.pdf
1 2 3
1 I _ PRP PRP _ 2 nsubj _ _
2 knew _ VBD VBD _ 0 ROOT _ _
3 I _ PRP PRP _ 5 nsubj _ _
4 could _ MD MD _ 5 aux _ _
5 do _ VB VB _ 2 ccomp _ _
6 it _ PRP PRP _ 5 dobj _ _
7 properly _ RB RB _ 5 advmod _ _
8 if _ IN IN _ 9 mark _ _
9 given _ VBN VBN _ 5 advcl _ _
10 the _ DT DT _ 12 det _ _
11 right _ JJ JJ _ 12 amod _ _
12 kind _ NN NN _ 9 dobj _ _
13 of _ IN IN _ 12 prep _ _
14 support _ NN NN _ 13 pobj _ _
15 . _ . . _ 2 punct _ _
18 units
(1),(2),(3)
18 units
(1),(2),(3)
12 units
(2),(3)
(1) The top 3 words on the stack and buffer: s1, s2, s3, b1, b2, b3; => 6
(2) The first and second leftmost / rightmost children of the top two words
on the stack: lc1(si), rc1(si), lc2(si), rc2(si), i = 1, 2. => 8
(3) The leftmost of leftmost / rightmost of rightmost children of the top two
words on the stack: lc1(lc1(si)), rc1(rc1(si)), i = 1, 2. => 4
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
Google SyntaxNet with Deep Learning - Local Parser
1. SHIFT: Push another word onto the top of the stack, i.e. shifting one token from the buffer to
the stack.
2. LEFT_ARC: Pop the top two words from the stack. Attach the second to the first, creating an
arc pointing to the left. Push the first word back on the stack.
3. RIGHT_ARC: Pop the top two words from the stack. Attach the second to the first, creating an
arc point to the right. Push the second word back on the stack.
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
As we describe in the paper, there are several problems with the locally normalized models we just
trained. The most important is the label-bias problem: the model doesn't learn what a good parse
looks like, only what action to take given a history of gold decisions. This is because the scores are
normalized locally using a softmax for each decision.
Google SyntaxNet with Deep Learning - Global Training
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
What’s Beam Search Algorithm on RNN ?
https://www.youtube.com/watch?v=UXW6Cs82UKo
Instead of keeping only the single best candidate at every iteration, expand several candidates to the end and choose the one whose total score is maximal.
But computing all cases would make the algorithm far too heavy, so keep only the best few at every
step and remove the others (pruning). This is how we find a globally better prediction.
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
What’s Beam Search Algorithm on RNN ?
Following only the best at every step may miss the chance to find the globally optimal case
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
What’s Beam Search Algorithm on RNN ?
Considering all cases would require too much computing power
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
What’s Beam Search Algorithm on RNN ?
Remove low-score cases at every step (pruning)
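A toy sketch of the beam search idea in plain Python; the expansion function and probabilities are illustrative, not SyntaxNet's:

    # Toy beam search sketch (illustrative expansion function, not SyntaxNet's).
    import math

    def beam_search(start, expand, beam_width=3, steps=5):
        """Keep only the beam_width best partial sequences at each step (pruning)."""
        beam = [([start], 0.0)]                      # (sequence, cumulative log-prob)
        for _ in range(steps):
            candidates = []
            for seq, total in beam:
                for token, prob in expand(seq):      # all one-step continuations
                    candidates.append((seq + [token], total + math.log(prob)))
            # pruning: keep only the top-k, drop the rest
            beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        return beam[0][0]

    # Illustrative expansion: two possible next tokens with fixed probabilities
    expand = lambda seq: [('a', 0.6), ('b', 0.4)]
    print(beam_search('<s>', expand))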
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
http://universaldependencies.org/
Google SyntaxNet does not support Korean as a default language.
But as we can see below, we can train the model with Sejong corpus data,
though we have to convert the format so that SyntaxNet can understand it.
Google SyntaxNet with Deep Learning - How about Korean
Session 1 - Understand NLP
NLP - Syntactic Analysis - SyntaxNet
Demo Site (we also use samples on this site)
http://sejongpsg.ddns.net/syntaxnet/psg_tree.htm
SyntaxNet Korean with Docker (We pretrained Korean corpus and set up webserver for service)
https://github.com/TensorMSA/tensormsa_syntax_docker
Google SyntaxNet with Deep Learning - Test it by yourself
Lexical Analysis
Syntactic Analysis
Semantic Analysis
NLU Server
(Understand)
NLG Server
(Generate)
Voice Recognition
Discourse Analysis
NLP Theory
Basic theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Session 1 - Understand NLP
NLP - Semantic Analysis
Sentential semantics
- Semantic role labeling (SRL)
- Phrase similarity (=paraphrase)
- Sentence classification, sentence emotion analysis, etc.
What is Semantic in study of language
Three perspectives on meaning
- Lexical semantics : individual words
- Sentential semantics : individual sentences
- Discourse or Pragmatics : longer piece of text or conversation
NLP Tasks for Semantics
Session 1 - Understand NLP
NLP - Semantic Analysis
What is Semantic Role Labeling (SRL)
SRL = Semantic roles express the abstract role that arguments of a predicate
can take in the event.
The police arrested the suspect in the park last night
Agent predicate Theme Location Time
Who did what to whom where when
Can we figure out that these sentences have the same meaning?
Can we figure out that bought, sold, and purchase are used in sentences with the same meaning?
XYZ corporation bought the stock.
They sold the stock to XYZ corporation.
The stock was bought by XYZ corporation.
The purchase of the stock by XYZ corporation.
Session 1 - Understand NLP
NLP - Semantic Analysis - Semantic Role Labeling
Common Semantic Role Labeling Architecture
http://naacl2013.naacl.org/Documents/semantic-role-labeling-part-1-naacl-2013-tutorial.pdf
Syntatic
Parse
Argument
Identification
Argument
Classification
Structural
Inference
Prune
Constituents
Candidates
Semantic
roles
Arguments
Step-1 Candidate Selection
- Parse the sentence
- Prune/filter the parse tree
(eliminate some tree constituents to speed up the execution)
Step-2 Argument Identification
- A binary classification of each node as Argument or NONE
- Local scoring
Step-3 Argument Classification
- A multi class (one-of-N) classification of all the argument candidates
- Global /joint scoring
ML
ML
ML
Session 1 - Understand NLP
Paper Model CoNLL2005 (F1
%)
CoNLL2012 (F1
%)
Collobert et
al.(2011)
CNN with parsing features 76.06
Tackstrom et
al.(2015)
Manual features with DP for inference 78.6 79.4
Zhou and
Xu(2015)
Bidirectional LSTM 81.07 81.27
He et al.(2017) Bidirectional LSTM with highway
connections
83.2 83.4
Semantic role labeling (SRL) aims to discover the predicate-argument structure of a sentence. For each target verb (predicate), every constituent of the
sentence that takes a semantic role of the verb is recognized. Typical semantic arguments are agent, patient, and instrument, plus adjuncts such as
location, time, manner, and cause (Zhou and Xu, 2015). Table 7 shows the performance of several models on the CoNLL 2005 and 2012 datasets.
Traditional SRL systems consist of several stages: producing a parse tree, deciding whether the tree's nodes represent the arguments of a given verb,
and then determining the corresponding SRL tags. Each classification step usually involves extracting many features and feeding them into a statistical
model (Collobert et al., 2011).
Tackstrom et al. (2015) scored constituent spans and their possible semantic roles for a given predicate with a series of parse-tree-based features, and
proposed a dynamic programming algorithm for efficient inference. Collobert et al. (2011) achieved comparable results with a CNN augmented by parse
information provided in the form of an additional lookup table. Zhou and Xu (2015) proposed a bidirectional LSTM to model arbitrarily long context, which
proved successful even without parse-tree information. He et al. (2017) extended this work further, introducing highway connections.
NLP - Semantic Analysis - Semantic Role Labeling
LSTM is effective for the SRL problem too!
Session 1 - Understand NLP
NLP - Semantic Analysis - Semantic Role Labeling
Bidirectional LSTM with highway connections
Stack more layers on RNN with highway technique !
https://homes.cs.washington.edu/~luheng/files/acl2017_hllz.pdf
Session 1 - Understand NLP
NLP - Semantic Analysis - Semantic Role Labeling
Semantic Role Labeling Applications
Information : Anna is a friend of mine.
http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/neo4j/neo4j_basic.ipynb
Who | What | Who
session.run("MATCH (you:Person {name:'You'}) "
            "FOREACH (name in ['Anna'] | "
            " CREATE (you)-[:FRIEND]->(:Person {name:name}))")
result = session.run("MATCH (you {name:'You'})-[:FRIEND]->(yourFriends) "
                     "RETURN you, yourFriends")
Neo4j Insert Query
Neo4j Jupyter example & visualize
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
What kind of problem do we want to solve?
Can we figure out whether these sentences are positive or negative?
돈이 아깝지 않다 (worth the money: positive)
다시는 오지 않을 거야 (I'll never come back: negative)
음식이 정말 맛이 없다 (the food is really bad: negative)
이 식당은 정말 맛있다 (this restaurant is really good: positive)
Analyzing positive vs negative with a dictionary:
the word 않다 ("not") is usually negative, but?
돈이 아깝지 않다 => Positive
다시는 오지 않을 거야 => Negative
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
There are many ways of doing text classification..
Traditional Rule based Machine Learning - Logistic & SVM
Deep Learning - CharCNN, RNN, Etc..
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
Paper Model SST-1 SST-2
Socher et al.(2013) Recursive Neural Tensor Network 45.7 85.4
Kim(2014) Multichannel CNN 47.4 88.1
Kalchbrenner et al.(2014) DCNN with k-max pooling 48.5 86.8
Tai et al.(2015) Bidirectional LSTM 48.5 87.2
Le and Mikolov(2014) Paragraph Vector 48.7 87.8
Tai et al.(2015) Constituency Tree-LSTM 51.0 88.0
Kumar et al.(2015) DMN 52.1 88.6
https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/
https://arxiv.org/pdf/1708.02709.pdf
Semantic Analysis - CharCNN
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb
Deep Learning Method CharCNN can be a solution for this kind of problem.
1 2
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb
Preparing data for embedding is pretty similar to other neural networks
1. Word embedding & one-hot didn't show that much difference.
2. Personally, I prefer to concat char one-hot + word2vec
오늘 메뉴 는 뭐 지? PAD PAD
1. Need to define a max sentence length
2. Need padding, like other NLP neural networks
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb
Using multiple convolution filter sizes
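A hedged TensorFlow 1.x sketch of the multi-filter-size convolution idea; all dimensions are illustrative, not the notebook's actual values:

    # Sketch: several convolution filter widths over an embedded sentence matrix
    # (TensorFlow 1.x; dimensions illustrative).
    import tensorflow as tf

    seq_len, embed_dim, num_filters = 20, 50, 16
    x = tf.placeholder(tf.float32, [None, seq_len, embed_dim, 1])

    pooled = []
    for filter_size in [2, 3, 4]:                     # several n-gram widths at once
        conv = tf.layers.conv2d(x, num_filters,
                                kernel_size=[filter_size, embed_dim],
                                activation=tf.nn.relu)
        pool = tf.reduce_max(conv, axis=1)            # max-pool over time
        pooled.append(tf.reshape(pool, [-1, num_filters]))

    features = tf.concat(pooled, axis=1)              # fed to FC -> softmax -> loss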
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb
The other steps are the same (fully connected > softmax > loss > optimizer)
Session 1 - Understand NLP
NLP - Semantic Analysis - CharCNN
You can see Char CNN can distinguish two sentences
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
NLP - Discourse Analysis
https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/
Paper Model bAbI (Mean accuracy
%)
Farbes (Accuracy
%)
Fader et al.(2013) Paraphrase-driven lexicon
learning
0.54
Bordes et al.(2014) Weakly supervised embedding 0.73
Weston et al.(2014) Memory Networks 93.3 0.83
Sukhbaatar et
al.(2015)
End-to-end Memory Networks 88.4
Kumar et al.(2015) DMN 93.6
Discourse Analysis - End2End Memory Network
Session 1 - Understand NLP
Discourse Analysis - End2End Memory Network
https://arxiv.org/pdf/1503.08895v4.pdf https://arxiv.org/pdf/1503.08895v4.pdf
NLP - Discourse Analysis
Session 1 - Understand NLP
Here is the network architecture of end2end memory network
https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/
https://www.slideshare.net/mobile/carpedm20/ss-63116251
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
(1) Feed data (“Sentences”, “Question”, “Target”)
1
2
3
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
Convert word index to embedding vector (Training target vector A,B,C)
1
3
Vocab
Size
2 Dim
Size
vocab size
Mem Size
NLP - Discourse Analysis - Memory Network
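A sketch of the word-index to embedding-vector step described above (TensorFlow 1.x; the vocab, memory, and dimension sizes are illustrative, not the notebook's):

    # Sketch: word indices -> embedding vectors with a trainable matrix A
    # (TensorFlow 1.x; sizes illustrative).
    import tensorflow as tf

    vocab_size, dim_size, mem_size = 20, 8, 10
    A = tf.Variable(tf.random_normal([vocab_size, dim_size]))   # embedding A

    story = tf.placeholder(tf.int32, [None, mem_size])          # word indices
    m = tf.nn.embedding_lookup(A, story)                        # [batch, mem, dim]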
Session 1 - Understand NLP
Embedding A of the given context sentences is multiplied by the input question embedding (using embedding B,
which is not defined in this code). ※ This holds for the first layer; otherwise it is the output of layer t-1.
1
2 1
2
multiply
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
NLP - Discourse Analysis - Memory Network
Set embedding C (in the code it's B); this is also a target variable for training
Session 1 - Understand NLP
Embedding C (in the code it's B) is multiplied by the softmax result
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
Finally, multiply the question and the output of the memory network once more
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
stack more memory layers
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
Set fully connected layer and calculate error with softmax cross entropy
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
In the given code I removed 90% of the data set because we are using CPUs for this class,
so the results may be poor.
NLP - Discourse Analysis - Memory Network
Session 1 - Understand NLP
https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/
https://github.com/YerevaNN/Dynamic-memory-networks-in-Theano
Dynamic Memory Networks Episodic Memory
Other types of memory networks ..
NLP - Discourse Analysis - Memory Network
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning Basic
NLU Server
(Understand)
NLG Server
(Generate)
SyntaxNet
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory
Basic theory
Related deep learning theory
Session 1 - Understand NLP
Session 1 - Now We are Here !
Response Generation
Memory Network
Seq2Seq
Session 1 - Understand NLP
The Seq2Seq model applies to any case where both input and output are sequence data, such as machine translation,
summarization, and simple question answering, and with a simple trick it can be used to generate chatbot responses.
- Input : 딥 러닝 재미 즐거운 일
- Output : 딥 러닝은 재미있고 즐거운 일이다 (deep learning is fun and enjoyable)
https://arxiv.org/pdf/1406.1078.pdf
https://www.slideshare.net/KeonKim/attention-mechanisms-with-tensorflow
NLP - Response Generator - Seq2Seq
https://nlp.stanford.edu/pubs/emnlp15_attn.pdf
Session 1 - Understand NLP
NLP - Response Generator - Attention Mechanism
Attention Mechanism on Machine Translation
https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
Session 1 - Understand NLP
NLP - Response Generator - Attention Mechanism
Attention Mechanism on Machine Translation
Bahdanau
http://aclweb.org/anthology/D15-1166
Luong
https://blog.heuritech.com/2016/01/20/attention-mechanism
LocalGlobal Input Feeding
Session 1 - Understand NLP
NLP - Response Generator - Bahdanau
https://blog.heuritech.com/2016/01/20/attention-mechanism/
Without Attention Mechanism With Attention Mechanism
Session 1 - Understand NLP
NLP - Response Generator - Bahdanau
1.embedding layer with inputs
○ embedded = embedding(last_rnn_output)
2.attention layer with inputs and outputs , normalized to create
○ attn_energies[j] = attn_layer(last_hidden, encoder_outputs[j])
○ attn_weights = normalize(attn_energies)
3.context vector as an attention-weighted average of encoder outputs
○ context = sum(attn_weights * encoder_outputs)
4.RNN layer(s) with inputs and internal hidden state, outputting
○ rnn_input = concat(embedded, context)
○ rnn_output, rnn_hidden = rnn(rnn_input, last_hidden)
5.an output layer with inputs , outputting
○ output = out(embedded, rnn_output, context)
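A NumPy sketch of steps 2-3 above, computing attention weights and the context vector; dot-product scoring is used for brevity (Luong-style), and all shapes are illustrative:

    # Sketch of attention steps 2-3: energies -> weights -> context (NumPy).
    import numpy as np

    hidden_dim, src_len = 8, 5
    last_hidden = np.random.randn(hidden_dim)            # decoder state
    encoder_outputs = np.random.randn(src_len, hidden_dim)

    attn_energies = encoder_outputs @ last_hidden        # score per source step
    attn_weights = np.exp(attn_energies)
    attn_weights /= attn_weights.sum()                   # softmax normalization

    # context vector: attention-weighted average of encoder outputs
    context = attn_weights @ encoder_outputs             # shape: (hidden_dim,)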
Session 1 - Understand NLP
NLP - Response Generator - Implementation
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
data_util
(1)Data Processing & Feed Data
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(2)Word Embedding
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(3)Encoder
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(4)Attention
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(5)Decoder & Attention
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(6)Loss & Optimization
Session 1 - Understand NLP
http://localhost:8888/tree/chap05_nlp/attention_seq2seq
NLP - Response Generator - Implementation
(7)Inference Task
Session 1 - Understand NLP
NLP - Response Generator - Seq2Seq
Pointer Network
https://medium.com/@devnag/pointer-networks-in-tensorflow-with-sample-code-14645063f264
The paper's authors propose a new neural net architecture called the
"pointer network": a seq2seq architecture with an attention
mechanism that outputs "indices" into the input. Because the output
vocabulary depends on the length of the input sequence, it can handle
inputs of varying size. (Note: the earlier seq2seq and Neural Turing
Machine models could only handle fixed lengths.) The attention
mechanism used here is a slight variation of the standard seq2seq
attention mechanism and has O(n^2) time complexity.
To evaluate the architecture, the authors used tasks whose answers are
positions (orderings) of the input: convex hull, Delaunay triangulation,
and the traveling salesman problem (TSP). The pointer network worked
well, even on sequences longer than those seen in training.
What else ?
Session 2 - Make ChatBot
Session 2 - Lecture Goals
Building on the understanding of NLP from Session 1, understand the overall
architecture with AI applied, and, using a pizza-ordering bot as the running
example, have each student build their own chatbot.
Session 2 - Make ChatBot
Session 2 : Susang Kim healess1@gmail.com
●Chatbot Developer
○ Released in POSCO (find people using NLP/AI)
○ Deep Learning MSA (ML, DNN, CNN, RNN)
●Agile Developer (worked at Pivotal Labs)
○ TDD, CI, pair programming, user stories
●iOS Developer (app ranked 100th in the App Store - 2011 Korea)
●Front-End Developer (React, D3, TypeScript and ES6)
●OSS World Challenge 2017 (in top 12, in progress now)
●POSCO MES ... (working at POSCO ICT for 10 years)
Facebook AI shut down after creating their own language
Paper: https://arxiv.org/abs/1706.05125
Remind of Session 1
Lexical Analysis
Syntactic Analysis
Semantic Analysis
Word Embedding
BilstmCrf
CharCNN
Deep Learning BasicNLU Server
(Understand)
NLG Server
(Generate)
DM
Server
Messaging
Platform
BackEnd
Service Servers
SyntaxNet
Scenario
Voice Recognition
Discourse Analysis
NLP Theory | ML & DL Theory | [Retrieval Based] Chat-Bot System
ChatBot
Server
Numpy
Pandas
Tensorflow
Pipeline | Data processing | ML & DL Library
Scikit Learn
Konlpy
Development
Data collection
Data preprocessing
Model training
Model evaluation
Model serving
BackEnd
Service Servers
message
intent & slot
information
message
message
Semantic Frame
Semantic Frame
connect services
message | Basic theory
Related deep learning theory
Implementation walkthrough with examples
Memory Network
Seq2Seq | Response Generation
Ontology
DM
Legacy Data Base
[AI Based] Chat-Bot Research Environment
Data Mart | Monitoring
Summary Result
Train Data
AI Model
Pipe Line
Session 2 - Make ChatBot
Session 2 - Make Chatbot
[Source: Deview 2016 - https://deview.kr/2016/schedule#session/176]
Why are chatbots so hot these days??
Intuitive UX
Consistent experience
Can connect to voice
No separate app install needed
Can connect to many services
Fast feedback
Platform independent
Characteristics of chatbots
• Requires many technologies (NLP, AI, F/W, text mining and various development skills)
• For someone studying deep learning, results come fast
- quick results with little compute (text-based)
• It's fun (less business dependency than micro data processing)
- less of a data-processing burden than images (CNN) or structured data (DNN)
(assuming preprocessing is easy with a morphological analyzer, etc.)
• Many application areas (connecting various API-based services, smart management)
- fill the intent and slots and you can connect to any service
• Few related open-source projects, so it's a blue ocean (for Korean you mostly have to build your own)
- fortunately, many language-independent deep-learning text algorithms are published and usable
• Bot services exist, but they are costly, handle Korean poorly, and cannot be customized
Session 2 - Make ChatBot
Session 2 - Understand Chatbot
What is a chatbot?
AI
(patterns, context)
Linguistics
(natural language processing)
Programming
(data processing - Python)
Bot F/W
(story/slot design)
Architecture
(response latency)
Text Mining
(data construction)
Chatbot
Implementing a chatbot requires a wide range of technologies from many fields
Session 2 - Make Chatbot
Various chatbot platforms already exist
Building a chatbot without code using API.AI: https://calyfactory.github.io/api.ai-chatbot/
Every chatbot has intent and entity recognition,
and for those, data matters most!!!
Signing up for api.ai and building a chatbot there
is a helpful way to grasp the principles.
Session 2 - Make Chatbot
Closed Domain vs Open Domain
[Quadrant diagram - axes: Closed <-> Open domain, and Retrieval (accuracy) <-> General (abstract). Rule Based sits in the closed/retrieval corner (Weak AI); a fully open/general bot is "Impossible Strong AI"; the level of difficulty rises toward the open domain.]
The typical path: start from a small business domain, raise accuracy, then add further business domains
Session 2 - Make Chatbot
Rule Based vs AI
Rule: the computer takes Input + Program -> Output
Rules must be registered one by one per condition (name, location, team, ...)
- accuracy rises, but can you really register every question??
(possible if you register a million rules)
AI (ML, DL): the computer takes Input + Output -> Program
With labeled data alone, you can build a model that produces the result
- and it tends to handle similar data well (Word2Vec, GloVe)
intent = "판교에 근무하는 김수상 찾아줘" (Find SuSang Kim who works in Pangyo) => Intent: find a person in a given location
NER = "판교에 근무하는 김수상 찾아줘" => NER: B-Loc O O B-Name O
Rules give exact results,
but cannot cover
every question.
AI handles similar question
types reasonably well; the more
data, the higher the accuracy
(learning effect).

if loc == '판교' and comp == '포스코ICT':
    person = '김수상'
elif loc == '판교' and comp == 'SK':
    person = '가나다'
else:
    person = '홍길동'
Make ChatBot Now
[The same overview diagram as in "Remind of Session 1", now marking this lesson's scope: the [Retrieval Based] Chat-Bot System with its NLU / DM / NLG servers and ChatBot Server, the NLP and ML & DL theory blocks, the library and pipeline column, and the [AI Based] Chat-Bot Research Environment.]
Session 2 - Make ChatBot
This Lesson
Session 2 - Make Chatbot
Let's build my own ChatBot
How do we build a pizza-ordering chatbot?
There are many pizza types, various sizes, a delivery place and
date, side menus and more - how can we turn all that into a ChatBot?
=> A story for the pizza order has to be composed
=> Let's build a pizza-order bot with deep learning plus a bit of logic
Session 2 - Make Chatbot
[Chat-Bot System diagram: Messaging Platform <-> ChatBot Server <-> NLU Server (Understand) / DM Server (Scenario) / NLG Server (Generate) <-> BackEnd Service Servers; messages flow in, intent & slot and Semantic Frames flow between components, and services are connected to fetch information.]
Session 2 - Make Chatbot
Q: 판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
A: Please choose a size
A: Please enter a location
A: Your pizza order has been completed.
Text (Message) exchange, step by step
Session 2 - Make Chatbot
Chatbot Interface Flow
User: 판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
NLP -> Intent: 피자주문 (pizza order) / Entity: location = 판교 포스코ICT
Context Analyzer -> menu = null, time = null
Decision Maker -> analyzes the delivery slots (knowledge base / scenario); Entity: menu: null, time: null
If a slot is missing, the Response Generator asks: 어떤 메뉴를 원하시나요? (What menu would you like?)
-> with tone generation: 어떤 메뉴를 원해?
Once the pizza-order slots are complete (Slot OK), the Service Manager responds:
피자주문 처리가 완료되었습니다. (Your pizza order has been completed.)
Session 2 - Make Chatbot
Composing the story slot (frame-based DM)
피자 주문하고 싶어 (I want to order a pizza) -> the pizza-order intent is detected
Pizza slot: Size / Type / Side menu
The pizza bot's story:
1) What size would you like?
2) What type would you like?
3) Do you need a side menu?
User answer:
- 페파로니 피자로 라지 사이즈에 콜라 추가해주세요 (Pepperoni pizza, large size, add a cola)
NER processing fills the slot:
Pizza slot - Size: Large / Type: Pepperoni / Side menu: cola
-> the service is connected (slot API call)
Showing the slots as selectable choices is
another way to handle the processing
(so UX skills are needed too??)
Session 2 - Make Chatbot
1. 맥북 프로 검색해줘 (Search for a MacBook Pro)
2. Preprocessing -> NER on 맥북 프로
3. 맥북프로 -> mapped to the canonical entity -> MacBook Pro API call
4. Search results are displayed
5. Slots are shown for drilling into the service details
6. For a new consultation, the user clicks "new consultation"
Displaying the slots on screen as selectable choices can sharply improve
a chatbot's accuracy
(the user can only choose within that frame...)
e.g., type "삼성 노트북" (Samsung laptop) and you select per slot
바로봇
http://www.11st.co.kr/toc/bridge.tmall?method=chatPage
Slot -> Trigger -> API
Session 2 - Make Chatbot
[The Chat-Bot System diagram again, with step 1 highlighted: the NLU server.]
Session 2 - Make Chatbot
판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
So how do we actually do NLU?
=> To apply AI, the text must be
converted into vectors
Session 2 - Make Chatbot
Defining the word representation (so the computer understands it well)
- One-hot gives each word a strong, distinct signal and trains effectively (when the scope is small - sparse)
- Word-level embeddings remember words well (but sparse) / W2V (similarity)
- GloVe distinguishes even fine-grained word types (caracal vs. cat)
- Char-level embeddings handle untrained words well (romanizing Korean to shrink the vector)
- Char-level embedding over romanized Korean shrinks the vector count while also covering English
Word representation for training
[Ref] 한국어에 적합한 단어 임베딩 모델 및 파라미터 튜닝에 관한 연구.pdf
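As a minimal, hedged sketch of the word-level embedding option above - using gensim (assumed available) and a toy corpus in place of a real one:

# Toy Word2Vec sketch; a real model needs a large tokenized corpus (e.g. Sejong-based).
from gensim.models import Word2Vec

sentences = [
    ['피자', '주문', '하고', '싶어'],
    ['페파로니', '피자', '주문'],
    ['여행', '정보', '알려줘'],
    ['호텔', '예약', '해줘'],
]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)
print(model.wv['피자'])               # 50-dimensional embedding for '피자'
print(model.wv.most_similar('피자'))  # nearest words by cosine similarity

(gensim 4.x names the parameter vector_size; older 3.x versions called it size.)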
Session 2 - Make Chatbot
How do we get data?
Business-specific text generally exists, but implementing deep learning requires
a very large amount of cleaned text that can be tagged.
For Korean corpora, the Sejong corpus is the usual base, and additional business vocabulary is trained on top (tedious manual work).
- Corpus (annotation): Sejong corpus (2007) https://ithub.korean.go.kr/user/main.do
- 물결21 (2001~2014), source not released: http://corpus.korea.ac.kr/
- Web crawling or dumps (Wikipedia, Namu Wiki)
- For domain-specific work, the text data has to be built by hand (augmentation)
Specialized words must be trained anew (ㅎㅇ?, 방가방가)
※ Whenever new vocabulary such as proper nouns appears, it must be registered.
Session 2 - Make Chatbot
The Ministry of Culture and the National Institute of Korean Language push the "2nd Sejong Plan"
One of the keys to artificial intelligence (AI), the foundation of
the 4th industrial revolution, is free communication between humans and machines.
For a computer to properly understand and respond to human
speech and writing, it needs a vast language database capable
of processing the natural language people speak and write.
Such a language database is called a corpus.
The accuracy of the voice-recognition AIs now spreading rapidly
depends on how richly and precisely such corpora
have been built.
The Ministry of Culture, Sports and Tourism and the National Institute of
Korean Language announced on the 9th that, to advance Korean AI technology,
they have drawn up a language-informatization plan to build a corpus of
15.47 billion eojeol in total over 2018-2022.
Session 2 - Make Chatbot
Composing the data to train on
After fixing the training vector, features must be extracted:
Cleansing -> Feature Engineering -> Train
(remove special characters per situation; surface meaningful words - tagging)
Extracting only the words relevant to the intent or entities
cuts training cost and improves the model's performance,
and also shrinks the embedding dimension (dense representation - SVD)
About 70 characters: a-z, 0-9, ?, !, (, ), quotes, whitespace, etc.
Splitting Korean characters into initial/medial/final jamo is difficult;
using .lower() is another way to shrink the vector space
Session 2 - Make Chatbot
판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
The data volume is small - how do we
obtain cleaned data?
[Diagram: the [AI Based] Chat-Bot Research Environment - Data Mart, Monitoring, AI Model, Pipe Line.]
Session 2 - Make Chatbot
Data Augmentation for AI (intent - tag)
판교에 오늘 피자 주문해줘 (Order a pizza to Pangyo today) -> Story Definition
Intent mapping: 주문해줘 (order) / Entity mapping: menu: 피자, location: 판교, date: 오늘
Intent: pizza order (주문)
Preprocessing: 판교 오늘 피자 주문 -> story key value (주문) -> tagloc tagdate tagmenu 주문
Pattern generation:
tagloc tagdate tagmenu 주문
tagloc tagdate 주문
tagdate tagmenu 주문
tagloc tagmenu 주문
Model train (Char-CNN), with 30% of the training data held out for evaluation
-> hyperparameter selection
Prediction: tagloc tagdate 주문 tagmenu -> intent = 주문 (order)
(a sketch of the pattern-generation step follows below)
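A hedged sketch of the pattern-generation step - the tag names mirror the slide, but the function itself is illustrative, not the actual pipeline code:

from itertools import combinations

# Template derived from the seed utterance '판교에 오늘 피자 주문해줘'.
slots = ['tagloc', 'tagdate', 'tagmenu']
intent = '주문'

# Emit every slot combination with the intent word kept, mirroring the
# generated patterns above (tagloc tagdate tagmenu 주문, tagloc tagdate 주문, ...).
patterns = [' '.join(list(combo) + [intent])
            for r in range(len(slots), 0, -1)
            for combo in combinations(slots, r)]

for p in patterns:
    print(p)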
Session 2 - Make Chatbot
Data flow for the NER model (BIO)
판교에 오늘 피자 주문해줘 -> Story Definition -> tagloc tagdate tagmenu 주문
BIO mapping: B_Loc / B_Date / B_menu
Text generator / pattern matching over the generated patterns:
tagloc tagdate tagmenu 주문
tagloc tagdate 주문
tagdate tagmenu 주문
tagloc tagmenu 주문
-> training labels:
B-loc B-date B-menu 주문
B-loc B-date 주문
B-date B-menu 주문
B-loc B-menu 주문
W2V features -> model train (Bi-LSTM), with 30% of the training data held out for evaluation
-> hyperparameter selection
Preprocessing + prediction on 판교 오늘 피자 주문
Entity recognition output: B_loc O B_Date B_menu 주문 O
(per-class scores, e.g. 피자: 0.12 / 장소: 0.7 / 메뉴: 0.3)
(a BIO-mapping sketch follows below)
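The BIO-mapping step can be sketched the same way - TAG2BIO is a hypothetical lookup, and unlike the slide (which leaves the intent word 주문 visible) plain words are labeled O here:

# Hypothetical mapping from template tags to BIO labels.
TAG2BIO = {'tagloc': 'B-loc', 'tagdate': 'B-date', 'tagmenu': 'B-menu'}

def to_bio(pattern):
    """Convert a tagged pattern such as 'tagloc tagdate 주문' to BIO labels."""
    return [TAG2BIO.get(token, 'O') for token in pattern.split()]

print(to_bio('tagloc tagdate tagmenu 주문'))  # ['B-loc', 'B-date', 'B-menu', 'O']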
Session 2 - Make Chatbot
[The Chat-Bot System diagram again; still on step 1, the NLU server.]
Session 2 - Make Chatbot
판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
We have the data - now how do we detect the intent?
How to detect intent (text classification)
피자주문 하고 싶어 / 여행 정보 알려줘 / 호텔 예약해줘
Three intents: order, information, reservation
You could scan the sentence for keywords one by one, but that has limits,
e.g. 피쟈 시켜먹고 싶어 / 여행 좋은데 알려줘….
Deep learning can solve these problems.
Let's classify with Char + CNN
(CNN features: order, information, reservation)
(word similarity: 피자 vs 피쟈 / 정보 vs 갈만한데)
How to detect intent (text classification - composing the data)
Words: 피자 / 주문 / 하고 / 싶어
If there are too many vectors, romanize the pronunciation:
PIJA / JUMUN / HAGO / SIPO
(digits, special characters, whitespace, etc. must all be accounted for)
W2V (pretrained):
피자 (0.12, 0.54, 0.72)
주문 (0.56, 0.65, 0.64)
하고 (0.67, 0.91, 0.13)
싶어 (0.89, 0.14, 0.11)
One-hot encoding (word level or character level):
(0100000000)
(0000010000)
(0010000000)
(0000000100)
One-hot encoding (A-Z vector):
(0100000000)
(0000010000)
(0010000000)
(0000000100)
Char CNN?
CNN is generally used to extract and
recognize image features, but an image
is ultimately a vector, and text is
a vector too - so CNN can also
extract features from text.
Text Classification - Char CNN
Input: 지금 피자 주문 하고 싶어 -> classes: 예약 (reservation) / 주문 (order) / 정보 (information)
[Paper: Convolutional Neural Networks for Sentence Classification - Yoon Kim - https://arxiv.org/abs/1408.5882]
Input vectors (W2V): length / dimension / window; static / non-static / random
Feature extraction: filters of size [3, 4, 5] (how many words each filter looks at)
Pooling: abstraction
Classification: the final class decision
Let's detect the intent with a Char-CNN
Why Char-CNN??
Char-CNN shows strong performance compared with other common algorithms.
Paper: Convolutional Neural Networks for Sentence Classification - Yoon Kim - https://arxiv.org/abs/1408.5882
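A minimal Keras sketch of the Kim-style CNN above - the vocabulary size, sequence length, and class count are placeholder assumptions, and a real run still needs tokenized, index-encoded training data:

import tensorflow as tf
from tensorflow.keras import layers

VOCAB = 70      # e.g. romanized character set: a-z, 0-9, punctuation, space
MAXLEN = 60     # maximum characters per utterance
CLASSES = 3     # 주문 / 정보 / 예약

inp = layers.Input(shape=(MAXLEN,))
emb = layers.Embedding(VOCAB, 128)(inp)  # non-static embedding
# Parallel convolutions with filter sizes 3, 4, 5 ("how many words to look at"),
# each followed by max pooling, as in Kim (2014).
pooled = []
for size in (3, 4, 5):
    conv = layers.Conv1D(100, size, activation='relu')(emb)
    pooled.append(layers.GlobalMaxPooling1D()(conv))
x = layers.Dropout(0.5)(layers.Concatenate()(pooled))
out = layers.Dense(CLASSES, activation='softmax')(x)

model = tf.keras.Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()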
Text Classification (multi-class SVM)
More simply than a Char-CNN, intent can also be detected with classic machine learning.
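A hedged scikit-learn sketch of this route - three toy utterances stand in for a real labeled set, and character n-grams sidestep Korean tokenization for a quick baseline:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

X = ['피자주문 하고 싶어', '여행 정보 알려줘', '호텔 예약해줘']
y = ['주문', '정보', '예약']

clf = make_pipeline(TfidfVectorizer(analyzer='char', ngram_range=(1, 3)),
                    LinearSVC())
clf.fit(X, y)
print(clf.predict(['피쟈 시켜먹고 싶어']))  # aims at '주문' once trained on real data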
Session 2 - Make Chatbot
[The Chat-Bot System diagram again; step 1 continues inside the NLU server.]
Session 2 - Make Chatbot
판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
And how do we extract the entities?
Understanding RNNs
Useful for modeling sequential data.
Because the input is a sequence,
backpropagation also runs through time (BPTT).
http://aikorea.org/blog/rnn-tutorial-3/
http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf
Understanding Seq2Seq (RNN + RNN)
In a chatbot it plays
the generator role.
Sentence generator:
it can be trained on movie subtitles or novels
(define input/output with a morphological analyzer).
http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf
Understanding LSTM
Cell state
https://brunch.co.kr/@chris-song/9
Gates - forget, update, output - acting on the cell state
http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf
ResNet's shortcut and the LSTM in RNNs are similar ideas
Extracting named entities (NER)
Bidirectional LSTM (a bidirectional layer)
- an RNN-based model
- useful for tagging a word at a specific position
An effective way to handle meaning that depends on a word's position in the sentence
[한국어 정보처리 학술대회 - https://sites.google.com/site/2016hclt/jalyosil]
Why Bi-LSTM CRF?
[ Bidirectional LSTM-CRF Models for Sequence Tagging - https://arxiv.org/pdf/1508.01991.pdf ]
피자 주문하고 싶어
B-Pizza B-Order O O
여행 정보 알려줘
B-Travel B-Information O
호텔 예약해줘
B-Hotel B-Reserve O
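A minimal Keras sketch of the tagger - the CRF output layer discussed above is replaced by a per-token softmax for brevity (a CRF layer from an add-on package can be swapped in), and all sizes are placeholder assumptions:

import tensorflow as tf
from tensorflow.keras import layers

VOCAB, MAXLEN, N_TAGS = 5000, 20, 7  # e.g. B-Pizza, B-Order, B-Travel, ..., O

inp = layers.Input(shape=(MAXLEN,))
x = layers.Embedding(VOCAB, 100, mask_zero=True)(inp)
# The bidirectional LSTM reads the sentence both ways for position-aware tags.
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
out = layers.TimeDistributed(layers.Dense(N_TAGS, activation='softmax'))(x)

model = tf.keras.Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()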
Extracting named entities (NER)
BIO tagging with brat:
B - the first token of an entity
I - a continuation token
O - not an entity / whitespace (OUT)
U - unknown
(when no word embedding exists)
※ e.g. New York?, 수상하다?
Brat - http://brat.nlplab.org/examples.html / https://wapiti.limsi.fr/
Strengthen the dictionary with the Bi-LSTM -> retrain the model
피자 주문하고 싶어 -> B-Pizza B-Order O O
여행 정보 알려줘 -> B-Travel B-Info O
호텔 예약해줘 -> B-Hotel B-Reserve O
New variants are resolved to canonical entities:
피이쟈 주문하고 싶어 -> 피자
놀러갈 정보 알려줘 -> 여행
숙소 예약해줘 -> 호텔
Use the Bi-LSTM to surface new vocabulary, feed it back into the
training data, and keep improving the model's performance.
Session 2 - Make Chatbot
[The Chat-Bot System diagram once more; steps 2 and 3: the DM server and the backend services.]
판교에 포스코ICT에 배달해줘 (Deliver to POSCO ICT in Pangyo)
We have the intent and the entities -
now let's build the service
Session 2 - Make Chatbot
Chatbot Architecture
Build an application layer - the ChatBot layer - on top of the deep-learning layer;
each application layer hooks the functions it needs into the DL layer.
ChatBot Layer: NLP, Context Analyzer, Decision Maker, Response Generator, Bot Builder;
backed by the Bot DB, Bot config, Dict File, and Log File.
DeepLearning Layer (GPU): Bi-LSTM, CRF, Char-CNN, SVM, Seq2Seq, Attention, Residual, VGG;
NAS file/model storage; separate Train and Predict paths for Intent / NER.
※ Models such as ResNet are used for image search.
Session 2 - Make Chatbot
NLP Architecture
Preprocessing: Python, Konlpy, Mecab (Sejong corpus) - sentence-length checks,
stripping special symbols (...), noun extraction
Models: Tensorflow (SVM, Char-CNN, Bi-LSTM CRF), Gensim (FastText, user dictionary, synonyms), voting
Serving: Python API service (Swagger)
Example: 판교 근무하는 포스코ICT에 김수상한테 피자 주문하고 싶어...
[Intent] 피자 주문 (pizza order)
[NER] 판교 - Loc / 포스코ICT - Loc / 김수상 - Name
Proper nouns: ('포스코ICT', 'NNP'), ('김수상', 'NNP')
※ Register proper nouns in Mecab's user dictionary (see link)
POS output:
[('판교', 'NNG'),
('근무', 'NNG'),
('하', 'XSV'),
('는', 'ETM'),
('포스코ICT', 'NNP'),
('에', 'JKB'),
('김수상', 'NNP'),
('한테', 'JKB'),
('피자', 'NNG'),
('주문', 'NNG'),
('하', 'XSV'),
('고', 'EC'),
('싶', 'VX'),
('어', 'EC')]
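A short Konlpy sketch of this preprocessing - it assumes Mecab and its Korean dictionary are installed, with proper nouns such as 포스코ICT already registered in the user dictionary:

from konlpy.tag import Mecab

mecab = Mecab()
text = '판교 근무하는 포스코ICT에 김수상한테 피자 주문하고 싶어'

print(mecab.pos(text))    # the full POS tuples, as listed above
print(mecab.nouns(text))  # noun-only extraction fed to the intent/NER models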
Comparing the intent slot and models
The NER results fill the pizza-order slot's entity values.
Meta frame:
{
  input_data = '판교 근무하는 포스코ICT에 김수상한테 피자 주문하고 싶어…',
  intent = '피자주문',
  intent_history = ['피자주문', ''],
  story_slot_entity = {
    '메뉴': '피자',
    '지역': '판교 포스코ICT',
    '이름': '김수상'
  },
  request_type = 'text',
  service_type = 'pizza order',
  output_data = ''
}
Session 2 - Make Chatbot
Web Service Architecture (MSA)
Infrastructure: Docker (Ubuntu) in AWS EC2 (c4.8xlarge / p2.xlarge GPU), NAS, load balancer (Nginx)
Chatbot Server (Django): Python, Tensorflow, Konlpy, Gensim, Celery + RabbitMQ; log files, model files, dict files
Data: PostgreSQL DB server, HBase
Bot Builder (analysis) front-end: React, Bootstrap, D3, SCSS
Services over REST: Java, Node, Python; a Java trigger service
GPU servers (HDF5) behind the LB and APs for training
Session 2 - Make Chatbot
Bot Builder and UX (Story)
Bot Builder DB tables:
ChatBot Definition, ChatBot Service, ChatBot Intent, ChatBot Intent Entity,
ChatBot Story, ChatBot Response, ChatBot Model, ChatBot Tagging,
ChatBot Entity Relation, ChatBot Synonym
Keep the schema as generic (common) as possible so the service can expand.
Session 2 - Make Chatbot
Chatbot API (REST)
Client request:
{
  input_data = '페파로니 피자 주문할께',
  intent = '',
  intent_history = ['', ''],
  story_slot_entity = { '메뉴': '', '사이즈': '', '사이드': '' },
  request_type = 'text',
  service_type = '',
  output_data = ''
}
Server response:
{
  input_data = '페파로니 피자 주문할께',
  intent = '피자주문',
  intent_history = ['피자주문', ''],
  story_slot_entity = { '메뉴': '피자', '사이즈': '라지', '사이드': '콜라' },
  request_type = 'text',
  service_type = '',
  output_data = '주문완료'
}
※ Only the required values travel as JSON; the rest is managed by the Dialog Manager (log).
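A hedged sketch of a client calling this API - the endpoint URL is hypothetical and the field names simply mirror the frame above:

import json
import requests

payload = {
    'input_data': '페파로니 피자 주문할께',
    'intent': '',
    'intent_history': ['', ''],
    'story_slot_entity': {'메뉴': '', '사이즈': '', '사이드': ''},
    'request_type': 'text',
    'service_type': '',
    'output_data': '',
}

# Hypothetical endpoint; only the required values travel as JSON.
resp = requests.post('http://localhost:8000/api/chatbot/', json=payload)
print(json.dumps(resp.json(), ensure_ascii=False, indent=2))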
Session 2 - Make Chatbot
Test codes for the chatbot
Implement test coverage per case:
1. Logic changes (unit tests)
2. Model changes (hyperparameters)
3. Data changes (slots, dictionaries, entities, synonyms)
4. Property changes (thresholds, rule criteria)
Unlike simple logic changes, data and model changes need a way to be verified continuously.
To keep accuracy up in production, continuous integration is essential (Jenkins / Travis CI, etc.).
For each story - 피자주문, 호텔예약, 여행정보 - check intent -> check NER -> check slots:
input 판교에 피자주문할께 -> intent: 피자주문
slot: {메뉴, 크기, 사이드 - extra}
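A pytest-style sketch of such case coverage - FakeBot is a stand-in for the real chatbot API client, so the test shape runs as-is while showing the assertions CI would pin:

class FakeBot:
    """Stand-in for the real chatbot client; returns a fixed frame."""
    def ask(self, text):
        return {'intent': '피자주문',
                'story_slot_entity': {'메뉴': '', '크기': '', '사이드': ''}}

def test_pizza_order_intent():
    result = FakeBot().ask('판교에 피자주문할께')
    assert result['intent'] == '피자주문'

def test_pizza_order_slots():
    result = FakeBot().ask('판교에 피자주문할께')
    assert set(result['story_slot_entity']) == {'메뉴', '크기', '사이드'}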
Tips for problems that come up in practice
To raise the model's consistency, complement it with multiple models and logic (scoring / voting).
When detecting intent, compare several models and take the closest value.
Raise accuracy with a combination of text mining and ensembling (fine-tuning).
Ensemble and voting, e.g. for 포스코ICT에 지금 피자 배달해줘:
Char-CNN, SVM (multi-class) and naive_bayes.MultinomialNB run in parallel
-> per-model weights -> voting -> result
-> compare the slots per intent (a delivery requires place and time)
candidates 여행정보 / 메뉴배달 / 메뉴배달 -> final: 피자 배달 (메뉴배달)
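A hedged scikit-learn sketch of the parallel-models-plus-voting idea - the toy data and the hard-voting weights are illustrative, and the production Char-CNN would join as another voter via its own wrapper:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import VotingClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression

X = ['포스코ICT에 지금 피자 배달해줘', '여행 정보 알려줘', '호텔 예약해줘']
y = ['메뉴배달', '여행정보', '호텔예약']

# Weighted hard voting stands in for the per-model weights in the diagram.
ensemble = VotingClassifier(
    estimators=[('svm', SVC()), ('nb', MultinomialNB()),
                ('lr', LogisticRegression())],
    voting='hard', weights=[2, 1, 1])

clf = make_pipeline(TfidfVectorizer(analyzer='char', ngram_range=(1, 3)), ensemble)
clf.fit(X, y)
print(clf.predict(['포스코ICT에 지금 피자 배달해줘']))  # expected: ['메뉴배달']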
Trigger handling (love, image search)
1. When the word 사랑 (love) appears
<Real production example>
Employee: Send a POSTalk to employee XXX saying I love them
Bot: Don't fall in love so easily.
Employee: Who are you to lecture me about my love
Bot: I'm still learning, so there's a lot I don't know yet.
Employee: ㅋㅋㅋㅋ
Bot: ㅋㅋㅋ
Apply triggers to words like [안녕 (hi), 사랑 (love), ㅋㅋㅋ (lol)], and train the data
collected this way into a Seq2Seq model used as an NLP preprocessing model.
https://www.youtube.com/watch?v=x9bvkXJ-JeQ
2. On image search, a ResNet model is called
Use a Tone Generator where needed
Vary the speaking style (by region, honorific, subordinate tone):
주문이 완료되었습니다 (plain)
주문이 완료되었단다 (gentle)
주문이 완료되었어요 (polite)
주문이 완료되었다니깐 (annoyed)
Uses a Seq2Seq model - the encoder takes the nouns etc.,
the decoder produces nouns + particles.
The response generator is an application of the morphological analyzer.
Synonym handling (N-gram)
페파로니 - Pepperoni, 폐파로니, 페파피자..... / Mac Book Pro - 맥프로, 맥북프로...
Customers use all sorts of words, but API calls must use the canonical value.
N-grams are used to look up learned synonym forms in a dictionary (typically trigrams).
Link: https://www.simplicity.be/article/throwing-dices-recognizing-west-flemish-and-other-languages/
Tune N and the threshold appropriately per entity.
※ threshold: the smaller it is, the closer the matches it finds.
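A minimal character-trigram sketch of this matching - the canonical dictionary and threshold are illustrative (note that here a higher Jaccard score means a closer match, whereas the slide's threshold behaves like a distance):

def ngrams(text, n=3):
    """Character n-grams with padding so short words still produce grams."""
    padded = '  ' + text + '  '
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard similarity over character trigram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb)

CANONICAL = {'페파로니': 'Pepperoni', '맥북프로': 'MacBook Pro'}  # illustrative

def resolve(word, threshold=0.3):
    best = max(CANONICAL, key=lambda k: similarity(word, k))
    return CANONICAL[best] if similarity(word, best) >= threshold else None

print(resolve('폐파로니'))  # -> 'Pepperoni' (large trigram overlap)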
Response Speed
Set up a load balancer (using Nginx).
Use a sensible number of threads and APs.
Cache data in memory (for API use).
Enforce the maximum response time the chatbot can tolerate.
Code for parallel processing when training:
pin computation to devices with tf.device,
and balance CPU and GPU sensibly -
more GPUs are not automatically faster...
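A tiny sketch of pinning work to devices with tf.device - the device strings depend on the machine, so the GPU is probed first:

import tensorflow as tf

with tf.device('/CPU:0'):          # light glue work stays on the CPU
    a = tf.random.uniform((1024, 1024))
    b = tf.random.uniform((1024, 1024))

gpu = '/GPU:0' if tf.config.list_physical_devices('GPU') else '/CPU:0'
with tf.device(gpu):               # heavy matmuls go to the GPU when present
    c = tf.matmul(a, b)

print(c.shape)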
Introducing TensorMSA
Wrap-up
● In building a chatbot, using hot technologies matters, but above all you must understand
what each domain's data means and make it easy for the computer to understand.
● Matching the patterns of the training data and the prediction data is important (consistency).
● Deep learning depends on securing large amounts of cleaned data.
● Deep learning can be a sufficient answer for improving performance.
When the singularity comes...
Google IO17 : https://www.youtube.com/watch?v=Y2VF8tmLFHw
Reference
모두를 위한 딥러닝 - http://hunkim.github.io/ml/
제28회 한글 및 한국어 정보처리 학술대회 - 한국어에 적합한 단어 임베딩 모델 및 파라미터 튜닝에 관한 연구 등
Stanford University CS231n - http://cs231n.stanford.edu/
Creating AI chat bot with Python 3 and Tensorflow [신정규] - https://speakerdeck.com/inureyes/building-ai-chat-bot-using-python-3-and-tensorflow
파이썬으로 챗봇 만들기 [김선동] - https://www.slideshare.net/KimSungdong1/20170227-72644192?next_slideshow=1
딥러닝을 이용한 지역 컨텍스트 검색 [김진호] - http://www.slideshare.net/deview/221-67605830
Developing Korean Chatbot 101 [조재민] - https://www.slideshare.net/JaeminCho6/developing-korean-chatbot-101-71013451
TensorFlow-Tutorials - https://github.com/golbin/TensorFlow-Tutorials
Mais conteúdo relacionado

Mais procurados

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and TransformerArvind Devaraj
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language ModelsLeon Dohmen
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understandinggohyunwoong
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with PythonBenjamin Bengfort
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Databricks
 
검색엔진에 적용된 ChatGPT
검색엔진에 적용된 ChatGPT검색엔진에 적용된 ChatGPT
검색엔진에 적용된 ChatGPTTae Young Lee
 
NLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in PythonNLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in Pythonshanbady
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
Introduction to natural language processing, history and origin
Introduction to natural language processing, history and originIntroduction to natural language processing, history and origin
Introduction to natural language processing, history and originShubhankar Mohan
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Yuriy Guts
 
Mother of Language`s Langchain
Mother of Language`s LangchainMother of Language`s Langchain
Mother of Language`s LangchainJun-hang Lee
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review Jayneel Vora
 
Gpt1 and 2 model review
Gpt1 and 2 model reviewGpt1 and 2 model review
Gpt1 and 2 model reviewSeoung-Ho Choi
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksNguyen Quang
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with TransformersJulien SIMON
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersJulien SIMON
 
GPT : Generative Pre-Training Model
GPT : Generative Pre-Training ModelGPT : Generative Pre-Training Model
GPT : Generative Pre-Training ModelZimin Park
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNHye-min Ahn
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 

Mais procurados (20)

Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hu...
 
검색엔진에 적용된 ChatGPT
검색엔진에 적용된 ChatGPT검색엔진에 적용된 ChatGPT
검색엔진에 적용된 ChatGPT
 
NLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in PythonNLTK - Natural Language Processing in Python
NLTK - Natural Language Processing in Python
 
Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
Introduction to natural language processing, history and origin
Introduction to natural language processing, history and originIntroduction to natural language processing, history and origin
Introduction to natural language processing, history and origin
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Mother of Language`s Langchain
Mother of Language`s LangchainMother of Language`s Langchain
Mother of Language`s Langchain
 
Natural Language Processing seminar review
Natural Language Processing seminar review Natural Language Processing seminar review
Natural Language Processing seminar review
 
Gpt1 and 2 model review
Gpt1 and 2 model reviewGpt1 and 2 model review
Gpt1 and 2 model review
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with Transformers
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
GPT : Generative Pre-Training Model
GPT : Generative Pre-Training ModelGPT : Generative Pre-Training Model
GPT : Generative Pre-Training Model
 
Bert
BertBert
Bert
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 

Semelhante a Sk t academy lecture note

NLP Deep Learning with Tensorflow
NLP Deep Learning with TensorflowNLP Deep Learning with Tensorflow
NLP Deep Learning with Tensorflowseungwoo kim
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsShreyas Suresh Rao
 
Pycon India 2018 Natural Language Processing Workshop
Pycon India 2018   Natural Language Processing WorkshopPycon India 2018   Natural Language Processing Workshop
Pycon India 2018 Natural Language Processing WorkshopLakshya Sivaramakrishnan
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyanrudolf eremyan
 
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this pptAI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this pptpavankalyanadroittec
 
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptx
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptxEXPLORING NATURAL LANGUAGE PROCESSING (1).pptx
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptxAtulKumarUpadhyay4
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingParrotAI
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extractionGabriel Hamilton
 
Devoxx traitement automatique du langage sur du texte en 2019
Devoxx   traitement automatique du langage sur du texte en 2019 Devoxx   traitement automatique du langage sur du texte en 2019
Devoxx traitement automatique du langage sur du texte en 2019 Alexis Agahi
 
Data Analytics using R with Yelp Dataset
Data Analytics using R with Yelp DatasetData Analytics using R with Yelp Dataset
Data Analytics using R with Yelp DatasetCédric Poottaren
 
Artificial inteIegence & Machine learning - Key Concepts
Artificial inteIegence & Machine learning - Key ConceptsArtificial inteIegence & Machine learning - Key Concepts
Artificial inteIegence & Machine learning - Key ConceptsHasibAhmadKhaliqi1
 
Sentiment Analysis Using Solr
Sentiment Analysis Using SolrSentiment Analysis Using Solr
Sentiment Analysis Using SolrPradeep Pujari
 
KiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialKiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialAlyona Medelyan
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLPVijay Ganti
 
Conversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData WorkshopConversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData WorkshopTom Bocklisch
 

Semelhante a Sk t academy lecture note (20)

NLP Deep Learning with Tensorflow
NLP Deep Learning with TensorflowNLP Deep Learning with Tensorflow
NLP Deep Learning with Tensorflow
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application Trends
 
Pycon India 2018 Natural Language Processing Workshop
Pycon India 2018   Natural Language Processing WorkshopPycon India 2018   Natural Language Processing Workshop
Pycon India 2018 Natural Language Processing Workshop
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
 
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this pptAI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
AI UNIT 3 - SRCAS JOC.pptx enjoy this ppt
 
Chatbot_Presentation
Chatbot_PresentationChatbot_Presentation
Chatbot_Presentation
 
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptx
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptxEXPLORING NATURAL LANGUAGE PROCESSING (1).pptx
EXPLORING NATURAL LANGUAGE PROCESSING (1).pptx
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extraction
 
DeepPavlov 2019
DeepPavlov 2019DeepPavlov 2019
DeepPavlov 2019
 
Devoxx traitement automatique du langage sur du texte en 2019
Devoxx   traitement automatique du langage sur du texte en 2019 Devoxx   traitement automatique du langage sur du texte en 2019
Devoxx traitement automatique du langage sur du texte en 2019
 
Data Analytics using R with Yelp Dataset
Data Analytics using R with Yelp DatasetData Analytics using R with Yelp Dataset
Data Analytics using R with Yelp Dataset
 
Artificial inteIegence & Machine learning - Key Concepts
Artificial inteIegence & Machine learning - Key ConceptsArtificial inteIegence & Machine learning - Key Concepts
Artificial inteIegence & Machine learning - Key Concepts
 
Nltk
NltkNltk
Nltk
 
Top 10 Must-Know NLP Techniques for Data Scientists
Top 10 Must-Know NLP Techniques for Data ScientistsTop 10 Must-Know NLP Techniques for Data Scientists
Top 10 Must-Know NLP Techniques for Data Scientists
 
Sentiment Analysis Using Solr
Sentiment Analysis Using SolrSentiment Analysis Using Solr
Sentiment Analysis Using Solr
 
KiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialKiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorial
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
 
Conversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData WorkshopConversational AI with Rasa - PyData Workshop
Conversational AI with Rasa - PyData Workshop
 
NLP
NLPNLP
NLP
 

Mais de Susang Kim

[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...Susang Kim
 
[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)Susang Kim
 
[Paper] dynamic routing between capsules
[Paper] dynamic routing between capsules[Paper] dynamic routing between capsules
[Paper] dynamic routing between capsulesSusang Kim
 
[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognitionSusang Kim
 
[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)Susang Kim
 
[Paper] shuffle net an extremely efficient convolutional neural network for ...
[Paper] shuffle net  an extremely efficient convolutional neural network for ...[Paper] shuffle net  an extremely efficient convolutional neural network for ...
[Paper] shuffle net an extremely efficient convolutional neural network for ...Susang Kim
 
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...Susang Kim
 
[Paper] auto ml part 1
[Paper] auto ml part 1[Paper] auto ml part 1
[Paper] auto ml part 1Susang Kim
 
[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer visionSusang Kim
 
[Paper] learning video representations from correspondence proposals
[Paper]  learning video representations from correspondence proposals[Paper]  learning video representations from correspondence proposals
[Paper] learning video representations from correspondence proposalsSusang Kim
 
[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object DetectionSusang Kim
 
Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)Susang Kim
 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)Susang Kim
 
GroupFace (Face Recognition)
GroupFace (Face Recognition)GroupFace (Face Recognition)
GroupFace (Face Recognition)Susang Kim
 
제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)Susang Kim
 
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용Susang Kim
 

Mais de Susang Kim (16)

[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
 
[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)[Paper] Multiscale Vision Transformers(MVit)
[Paper] Multiscale Vision Transformers(MVit)
 
[Paper] dynamic routing between capsules
[Paper] dynamic routing between capsules[Paper] dynamic routing between capsules
[Paper] dynamic routing between capsules
 
[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition[Paper] anti spoofing for face recognition
[Paper] anti spoofing for face recognition
 
[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)[Paper] attention mechanism(luong)
[Paper] attention mechanism(luong)
 
[Paper] shuffle net an extremely efficient convolutional neural network for ...
[Paper] shuffle net  an extremely efficient convolutional neural network for ...[Paper] shuffle net  an extremely efficient convolutional neural network for ...
[Paper] shuffle net an extremely efficient convolutional neural network for ...
 
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...[Paper] EDA : easy data augmentation techniques for boosting performance on t...
[Paper] EDA : easy data augmentation techniques for boosting performance on t...
 
[Paper] auto ml part 1
[Paper] auto ml part 1[Paper] auto ml part 1
[Paper] auto ml part 1
 
[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision[Paper] eXplainable ai(xai) in computer vision
[Paper] eXplainable ai(xai) in computer vision
 
[Paper] learning video representations from correspondence proposals
[Paper]  learning video representations from correspondence proposals[Paper]  learning video representations from correspondence proposals
[Paper] learning video representations from correspondence proposals
 
[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection[Paper] DetectoRS for Object Detection
[Paper] DetectoRS for Object Detection
 
Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)Long term feature banks for detailed video understanding (Action Recognition)
Long term feature banks for detailed video understanding (Action Recognition)
 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)
 
GroupFace (Face Recognition)
GroupFace (Face Recognition)GroupFace (Face Recognition)
GroupFace (Face Recognition)
 
제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)제11회공개sw개발자대회 금상 TensorMSA(소개)
제11회공개sw개발자대회 금상 TensorMSA(소개)
 
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용Python과 Tensorflow를 활용한  AI Chatbot 개발 및 실무 적용
Python과 Tensorflow를 활용한 AI Chatbot 개발 및 실무 적용
 

Último

A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 

Último (20)

A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 

Sk t academy lecture note

  • 1. Lecture BY Session 1 : SeungWoo Kim tmddno1@gmail.com Session 2 : SuSang Kim healess1@gmail.com Python과 Tensorflow를 활용한 AI Chatbot 개발
  • 2. 1. 도커실행환경 https://github.com/TensorMSA/tensormsa_docker.git ./tensormsa_docker/docker_compose_cpu 2. 소스설명코드 - jupyter git clone https://github.com/TensorMSA/tensormsa_jupyter.git Session 1 : chap05_nlp Session 2 : chap13_chatbot_lecture 시작 전 실습 환경 구성
  • 3. ●ML&DL Engineer (2014 ~ 2017) ○ POSCO Smart Factory Machine Learning Based Scheduling (2014~2015) ○ POSCO AI ChatBot (2016 ~ 2017) ○ Deep Learning Open Source Framework - TensorMSA (2016~2017) ●Android Developer - POSCO Mobile system (2010 ~ 2014) ○ LBS, IPS Vehicle & Navigation System ○ IPS with Deep Learning - Patent (2016) ●Awards ○ OSS world Challenge 2017 (on top 12 , on progress now) ○ Employee of the year 2015, 2017 on POSCO ICT ●Woori Bank AI (‘17.11.1 ~) Session 1 : SeungWoo Kim tmddno1@gmail.com
  • 4. Session 1 - Understand NLP
  • 5. Session 1 - 강의 목표 전체 ChatBot 아키텍처를 이해하고 서비스를 구성하기 위해 필요한 기반 지식에 대한 설명을 통해 Session 2 에서 실질적인 챗봇 개발에 대한 설명을 더 잘 이해 할 수 있도록 돕고자 함 . 챗봇 , 자연어 처리, 딥러닝 그리고 구현의 연관성을 이해하는 것에 중점 ! Session 1 - Understand NLP
  • 6. About ChatBot Session 1 - Understand NLP Natural Language Understanding Natural Language Generation User System 자연어 Semantic Frame자연어 Semantic Frame Why we need nlp on ChatBot system?
  • 7. About ChatBot Session 1 - Understand NLP Sort of Chatbot Easy Hard Retrieval-based model Generative model Traditional algorithms Deep Learning algorithms Short Conversation Long Conversation Closed Domain Open Domain
  • 8. About ChatBot Session 1 - Understand NLP Retrieval-Based vs Generative Models Retrieval-based models (easier) use a repository of predefined responses and some kind of heuristic to pick an appropriate response based on the input and context. The heuristic could be as simple as a rule-based expression match, or as complex as an ensemble of Machine Learning classifiers. These systems don’t generate any new text, they just pick a response from a fixed set. Generative models (harder) don’t rely on pre-defined responses. They generate new responses from scratch. Generative models are typically based on Machine Translation techniques, but instead of translating from one language to another, we “translate” from an input to an output (response).
  • 9. About ChatBot Session 1 - Understand NLP Use Deep Learning or Not Using Deep Learning Using Deep Learning do not guarantee better performance all the time to compared with using traditional techniques. It’s more expensive to gather enough data and train heavy model. Using traditional algorithms Most of current chatbot systems are based on those traditional algorithms and It has own strong points to compared with DL algorithms. 형태소 분석 품사 태깅 패턴 매칭 구문 분석 의미 분석 감성 분석 대화 처리 CharCNN BiLSTMCrf Seq2Seq Word2Vec RNN DMN E2E MMN Attention DNN TFIDF SVM Dictionary Bayesian Logistic LSA HMM USE BOTH
  • 10. About ChatBot Session 1 - Understand NLP Long Conversation vs Short Conversation Short Conversation the goal is to create a single response to a single input. For example, you may receive a specific question from a user and reply with an appropriate answer. Long conversation go through multiple turns and need to keep track of what has been said. Customer support conversations are typically long conversational threads with multiple questions.
  • 11. About ChatBot Session 1 - Understand NLP Open Domain vs Closed Domain “Closed Domain You can ask a limited set of questions on specific topics. (Easier). What is the Weather in Miami?” “Open Domain I can ask a question about any topic… and expect a relevant response. (Harder) Think of a long conversation around refinancing my mortgage where I could ask anything.” Mark Clark
  • 12. OverView Session 1 - Understand NLP Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning BasicNLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers SyntaxNet Scenario Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론[Retrieval Based] Chat-Bot System ChatBot Server Numpy Pandas Tensorflow 파이프 라인 데이터 처리 ML & DL Library Scikit Learn Konlpy 개발 관련 데이터 수집 데이터 전처리 모델 훈련 모델 평가 모델 서비스 BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message 1 2 3 기본 이론 관련 딥러닝 이론 설명 예제를 통한 구현 설명 Session 1 - Understand NLP Memory Network Seq2SeqResponse Generation Ontology DM Legacy Data Base [AI Based] Chat-Bot Research Environment Data MartMonitoring Summary Result Train Data AI Model Pipe Line
  • 13. Session 1 - Contents 1. 자연어 처리 이론 > 일반적으로 자연어를 처리하기 위해 필요한 언어학적 이론 설명 2. 딥러닝 이론 > 자연어 처리 이론에서 이야기하는 문제에 해당하는 딥러닝 이론 3. 구현 > 딥러닝 및 라이브러리 등을 사용한 이론의 구현
  • 14. About NLP (Natural Language Process) Session 1 - Understand NLP Mostly Solved Making Good Progress Still Really Hard Spam Detection (스팸분석) Text Categorization (텍스트 분류) Part of Speech Tagging (단어 분석) Named Entity Recognition (의미 구분 분석) Information Extraction (정보 추출) Sentiment Analysis (감정분석) Coreference Resolution (같은 단어 복수 참조) Word Sense Disambiguation (복수 의미 분류) Syntactic Parsing (구문해석) Machine Translation (기계번역) Semantic Search (의미 분석 검색) Question & Answer (질의 응답) Textual inference (문장 추론) Summarization (텍스트 요약) Discourse & Dialog (대화 & 토론)
  • 15. About NLP (Natural Language Process) Session 1 - Understand NLP Text Categorization Text Classification assigns one or more classes to a document according to their content. Classes are selected from a previously established taxonomy (a hierarchy of catergories or classes). Spam Detection Spam Detection is also the part of Text Classification problem. Part of Speech grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context
  • 16. About NLP (Natural Language Process) Session 1 - Understand NLP Low Level Information Extraction
  • 17. About NLP (Natural Language Process) Session 1 - Understand NLP Information Extraction on Broader view https://www.google.co.kr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0ahUKEwievZKlmMzVAhVCgrwKHbM_D88QFggyMAE&url=https%3A %2F%2Fweb.stanford.edu%2Fclass%2Fcs124%2Flec%2FInformation_Extraction_and_Named_Entity_Recognition.pptx&usg=AFQjCNFUT9ZjvrDrx F9su0J9KiWobVP4Kg Rule Based Extraction Named Entity recognition Syntax Anal Relation Search Ontology Information Extraction
  • 18. About NLP (Natural Language Process) Session 1 - Understand NLP Coreference Resolution I did not vote for the Donald Trump because I think he is too reckless Coreference resolution is the task of finding all expressions that refer to the same entity in a text. It is an important step for a lot of higher level NLP tasks that involve natural language understanding such as document summarization, question answering, and information extraction. Deep Reinforcement Learning for Mention-Ranking Coreference Models Improving Coreference Resolution by Learning Entity-Level Distributed Representations https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30
  • 19. About NLP (Natural Language Process) Session 1 - Understand NLP Word Sense Disambiguation [Example] 1. a type of fish 2. tones of low frequency and the sentences: 1. I went fishing for some sea bass. 2. The bass line of the song is too weak. http://www.cs.cornell.edu/courses/cs4740/2014sp/lectures/wsd-1.pdf supervised way lable data example simi-supervised way
  • 20. About NLP (Natural Language Process) Session 1 - Understand NLP Syntactic Parsing syntactic parsing is Find structural relationships between words in a sentence https://web.stanford.edu/~jurafsky/slp3/12.pdf
  • 21. About NLP (Natural Language Process) Session 1 - Understand NLP Machine translation (MT) is automated translation. It is the process by which computer software is used to translate a text from one natural language (such as English) to another (such as Spanish). Machine Translation
  • 22. About NLP (Natural Language Process) Session 1 - Understand NLP Semantic Search Semantic search seeks to improve search accuracy by understanding a searcher’s intent through contextual meaning. Question and Answer Able to answer questions in natural language based on Knowledge data (usually ontology) ex) Best example is IBM Watson Textural Inference Recognize, generate, or extract pairs <T,H> of natural language expressions, such that a human who reads (and trusts) T would infer that His most likely also true Summarization Extracting interesting parts of the text and create a summary by using these parts of the text and allow for rephrasings to make summary more grammatically correct. Discourse & Dialog Do conversation with understanding the whole history of dialog and semantic meaning of speaker.
  • 23. Standard Natural Language Process Session 1 - Understand NLP Spoken Utterance Lexical (어휘) Analysis : Word Structure Speech Recognition Written Utterance Syntactic (구문) Analysis : Sentence Structure Morphemes, Word Semantic (의미) Analysis : Meaning of Words & Sentence Sentence Discourse (대화) Analysis : Relationship between sentence Context beyond Sentence
  • 24. Lexical Analysis Syntactic Analysis Semantic Analysis NLU Server (Understand) NLG Server (Generate) Voice Recognition Discourse Analysis 자연어 처리 이론 기본 이론 Session 1 - Understand NLP Session 1 - Now We are Here! Response Generation
  • 25. Session 1 - Understand NLP AI Speaker Alexa Alexa Microphone System NLP - Voice Recognition
  • 26. Session 1 - Understand NLP Deep Learning for Classification Hidden Markov Model for Language Model NLP - Voice Recognition
  • 27. Lexical Analysis Syntactic Analysis Semantic Analysis NLU Server (Understand) NLG Server (Generate) Voice Recognition Discourse Analysis 자연어 처리 이론 기본 이론 Session 1 - Understand NLP Session 1 - Now We are Here! Response Generation
  • 28. Session 1 - Understand NLP NLP - Lexical Analysis Main Factors on Lexical Analysis 1. Sentence Splitting 2. Tokenizing 3. Morphological 4. Part of speech Tagging
  • 29. Session 1 - Understand NLP NLP - Lexical Analysis Lexical Analysis What if there is no line change char (‘n’) ? Where is the EOS point? What if sentence is not separated into words properly with space? [Examples] [Problems]
  • 30. Session 1 - Understand NLP NLP - Lexical Analysis Word stemming lemmatization Love Lov Love Loves Lov Love Loved Lov Love Loving Lov Love Innovation Innovat Innovation Innovations Innovat Innovation Innovate Innovat Innovate Innovates Innovat Innovate Innovative Innovat Innovative Morphing Examples Stemming & lemmatization Morphology is process of finding morpheme which is smallest“meaningful unit (Lexical meaning or grammatical function)” and other features like stem in a language that carries information. Lexical Analysis
  • 31. Session 1 - Understand NLP NLP - Lexical Analysis Lexical Analysis Ambiguity “that” can be a subordinating conjunction or a relative pronoun - The fact that/IN you’re here - A man that/WDT I know “Around” can be a preposition, particle, or adverb - I bought it at the shop around/IN the corner. - I never got around/RP to getting a car. - A new Toyota Prius costs around/RB $25K. Degree of ambiguity (in Brown corpus) - 11.5% of word types (40% of word tokens) are ambiguous # of Tags 1 2 3 4 5 6 7 # of Words 35340 3760 264 61 12 2 1 #Ambiguity Problem is much serious in Korean Part-of-speech tagging is one of the most important text analysis tasks used to classify words into their part-of-speech and label them according the tagset which is a collection of tags used for the pos tagging. Part-of-speech tagging also known as word classes or lexical categories
  • 32. Session 1 - Understand NLP NLP - Lexical Analysis Lexical Analysis Hannanum Kkma Komoran Mecab Twitter 하늘 / N 하늘 / NNG 하늘 / NNG 하늘 / NNG 하늘 / Noun 을 / J 을 / JKO 을 / JKO 을 / JKO 을 / Josa 나 / N 날 / VV 나 / NP 나 / NP 나 / Noun 는 / J 는 / ETD 는 / JX 는 / JX 는 / Josa 자동차 / N 자동차 / NNG 자동차 / NNG 자동차 / NNG 자동차 / Noun Analysis Result Comparison Library Performance Comparison
  • 33. Session 1 - Understand NLP NLP - Lexical Analysis Lexical Analysis [Code]
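  A minimal sketch of the morphological analysis the [Code] slide refers to, using KoNLPy (assuming KoNLPy and the Mecab dictionary are installed; swap in Kkma() or Komoran() to reproduce the comparison table above):

    from konlpy.tag import Mecab

    mecab = Mecab()
    # POS-tag the sample phrase from the comparison table
    print(mecab.pos("하늘을 나는 자동차"))
    # e.g. [('하늘', 'NNG'), ('을', 'JKO'), ('나', 'NP'), ('는', 'JX'), ('자동차', 'NNG')]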
  • 34. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 35. Session 1 - Understand NLP NLP - Lexical Analysis (1) Word Segmentation (2) POS Tagging (3) Chunking (4) Clause Identification (5) Named Entity Recognition (6) Semantic Role Labeling (7) Information Extraction What we can do with sequence labeling What’s sequence labeling Sequence Labeling
  • 36. Session 1 - Understand NLP NLP - Lexical Analysis Word POS Chunk NE West NNP B-NP B-MISC Indian NNP I-NP I-MISC all-around NN I-NP O Phil NNP I-NP B-PER Simons NNP I-NP I-PER took VBD B-VP O four CD B-NP O for IN B-PP O 38 CD B-NP O on IN B-PP O Friday NNP B-NP O <iob data set example> POS tag meanings https://docs.google.com/spreadsheet/ccc?key=0ApcJghR6UMXxdEdURGY2YzIwb3dSZ290RFpSaUkzZ0E&usp=sharing Chunk tag meanings B : Begin of Chunk I : Continuation of Chunk E : End of Chunk NP : Noun VP : Verb NER BIO tag meanings B : Start of a new Chunk I : word inside Chunk O : Outside of Chunk Sequence Labeling
  • 37. Session 1 - Understand NLP NLP - Lexical Analysis BiLSTM-CRF Description Sequence Labeling with Deep Learning Deep Learning Basic Word Embedding DL FrameWorks Prerequisite
  • 38. Session 1 - Understand NLP NLP - Lexical Analysis VIDEO Deep Learning Basic
  • 39. Session 1 - Understand NLP New Algorithms Back Propagation CNN, RNN .. etc Big Data HDFS MapReduce Hardware GPU Parallel Execution Cloud Service NLP - Lexical Analysis Deep Learning Basic
  • 40. Session 1 - Understand NLP 3 5 7 9 (1) Problem (2) Algorithm (3) Programming Y = 2 * X + 1 function(x) { return x*2 + 1 } NLP - Lexical Analysis Deep Learning Basic
  • 41. Session 1 - Understand NLP 3 5 7 9 (1) Problem (2) Algorithm (3) Programming Y = w * X + b 3 5 7 9 initial optimized NLP - Lexical Analysis Deep Learning Basic
  • 42. Session 1 - Understand NLP Supervised Learning Unsupervised Learning Reinforcement Learning CAT CAT CAT DOG DOG DOG Deep Learning Basic NLP - Lexical Analysis
  • 43. Session 1 - Understand NLP 1. Perceptron 2. Activation Function 3. Cost 4. Gradient Descent 5. Back Propagation 6. Optimizers Deep Learning Basic NLP - Lexical Analysis
  • 44. Session 1 - Understand NLP Deep Learning Basic - Perceptron wX + b NLP - Lexical Analysis
  • 45. Session 1 - Understand NLP Deep Learning Basic - Perceptron wX + b Activation Function NLP - Lexical Analysis
  • 46. Session 1 - Understand NLP Deep Learning Basic - Activation Function Logistic Regression Nonlinear Problems NLP - Lexical Analysis
  • 47. Session 1 - Understand NLP Deep Learning Basic - Activation Function NLP - Lexical Analysis
  • 48. Session 1 - Understand NLP Deep Learning Basic - Loss (Error) Initial Optimized LOSS x y y~ 0 3 7 1 5 9 2 7 11 3 9 13 4 11 15 5 13 17 6 15 19 Y X0 1 2 3 Y = wX + b NLP - Lexical Analysis
  • 49. Session 1 - Understand NLP x y init opt 0 3 7 3 1 5 9 5 2 7 11 7 init : ((7-3)^2 + (9-5)^2 + (11-7)^2) / 3 = 16 opt : ((3-3)^2 + (5-5)^2 + (7-7)^2) / 3 = 0 HOW? Deep Learning Basic - Loss (Error) W, b Cost(W, b) NLP - Lexical Analysis
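  The loss on this slide can be checked with a few lines of NumPy; a minimal sketch reproducing the init/opt numbers above (the target line is y = 2x + 3, the initial guess w=2, b=7):

    import numpy as np

    x = np.array([0.0, 1.0, 2.0])
    y = np.array([3.0, 5.0, 7.0])        # targets from the table (y = 2x + 3)

    def mse(w, b):                       # Cost(W, b)
        return np.mean((w * x + b - y) ** 2)

    print(mse(2, 7))  # initial guess -> 16.0
    print(mse(2, 3))  # optimized     -> 0.0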
  • 50. Session 1 - Understand NLP Deep Learning Basic - Gradient Descent weight Learning Rate gradient NLP - Lexical Analysis
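  Continuing that toy example, a minimal sketch of the update rule on this slide: at every step the weight moves against the gradient, scaled by the learning rate.

    import numpy as np

    x = np.array([0.0, 1.0, 2.0])
    y = np.array([3.0, 5.0, 7.0])
    w, b, lr = 2.0, 7.0, 0.1                      # start from the "initial" line

    for _ in range(500):
        residual = w * x + b - y
        w -= lr * np.mean(2 * residual * x)       # weight -= learning_rate * gradient
        b -= lr * np.mean(2 * residual)
    print(round(w, 2), round(b, 2))               # converges toward w=2.0, b=3.0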
  • 51. Session 1 - Understand NLP Output Hidden Input Train Data Forward Propagation y-y~ (Error) Back Propagation Update Each Weight partial derivative chain rule Deep Learning Basic - BackPropagation NLP - Lexical Analysis
  • 52. Session 1 - Understand NLP Deep Learning Basic - Optimizer NLP - Lexical Analysis https://www.youtube.com/watch?v=hMLUgM6kTp8
  • 53. Session 1 - Understand NLP NLP - Lexical Analysis SGD Adagrad RMS Momentum Nag Adadelta Adam Adaptive-family algorithms Deep Learning Basic - Optimizer Momentum: keeps the previous update direction (a notion of acceleration). NAG: similar to Momentum, but the gradient is taken at the look-ahead position. Adagrad: accumulates squared gradients so that slow-moving weights update faster and fast-moving ones more carefully. RMSProp: replaces the accumulated gradient sum with an exponential moving average so G cannot grow without bound. Adadelta: uses exponential averages plus the square of the step-size changes (approximating second-order information). Adam: combines the characteristics of both Adadelta/RMSProp and Momentum. http://shuuki4.github.io/deep%20learning/2016/05/20/Gradient-Descent-Algorithm-Overview.html
  • 54. Session 1 - Understand NLP NLP - Lexical Analysis https://arxiv.org/pdf/1705.08292.pdf "Solutions found with gradient descent (GD) or stochastic gradient descent (SGD) generalize far better than solutions found with adaptive methods (e.g. AdaGrad, RMSProp, and Adam)." The Marginal Value of Adaptive Gradient Methods in Machine Learning, Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, and Benjamin Recht; University of California, Berkeley / Toyota Technological Institute at Chicago, May 24, 2017. There is no optimizer best for all cases!! When to use an adaptive optimizer? If the input embedding vectors are sparse, it's better to use an adaptive optimizer! Deep Learning Basic - Optimizer
  • 55. Session 1 - Understand NLP # tf Graph input x = tf.placeholder("float", [None, 784]) y = tf.placeholder("float", [None, 10]) # Store layers weight & bias weights = { 'h1': tf.Variable(tf.random_normal([784, 256])), 'h2': tf.Variable(tf.random_normal([256, 256])), 'out': tf.Variable(tf.random_normal([256, 10])) } biases = { 'b1': tf.Variable(tf.random_normal([256])), 'b2': tf.Variable(tf.random_normal([256])), 'out': tf.Variable(tf.random_normal([10])) } # Hidden layer with RELU activation layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1']) layer_1 = tf.nn.relu(layer_1) # Hidden layer with RELU activation layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']) layer_2 = tf.nn.relu(layer_2) # Output layer with linear activation pred = tf.matmul(layer_2, weights['out']) + biases['out'] hypothesis = tf.nn.softmax(pred) # Define loss and optimizer cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(hypothesis), reduction_indices=1)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) input Hidden Out 784 256 10 Hidden 256 784 256 784 256 256 10 256 S O F T M A X Y=Activation(W*x + b) [Error] Cross Entropy W W1 A(W*x + b) b b A(W*x + b)x 2 1 3 4 5 256 784 1 Deep Learning Basic NLP - Lexical Analysis
  • 56. Session 1 - Understand NLP START 오늘 날씨 는 ? PAD PAD END START 오늘 날씨 는 어때 ? PAD END START 오늘 비가 오 려 나 ? END With long sentences, the vanishing-gradient problem occurs. Variable-length inputs also waste computing power, which is where the concept of Dynamic RNN comes in. A bidirectional LSTM additionally learns the given data backward. Long Short Term Memory Cell Cell State https://brunch.co.kr/@chris-song/9 update / forget / out / cell state https://blog.altoros.com/the-magic-behind-google-translate-sequence-to-sequence-models-and-tensorflow.html NLP - Lexical Analysis Deep Learning Basic
  • 57. Session 1 - Understand NLP NLP - Lexical Analysis Deep Learning Basic Overfitting Fine Tuning Multi Tasking Ensemble Data Preprocessing Drop Out Batch Normalization Network Compression https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf https://arxiv.org/pdf/1510.00149.pdf Adam+SGD Learning Rate Decaying Fully Convolutional 1by1 Convolutional Filter Quantize Neural Networks AutoML Hyper Parameter Random Search Grid Search Genetic Algorithm
  • 58. Session 1 - Understand NLP Session 1 - Now We are Here ! Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Numpy Pandas Tensorflow 데이터 처리 ML & DL Library Scikit Learn Konlpy 개발 관련 구현 Response Generation Memory Network Seq2Seq
  • 59. Session 1 - Understand NLP https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software#cite_note-29 NLP - Lexical Analysis - Implementation Deep Learning Framework comparison pytorch
  • 60. Session 1 - Understand NLP NLP - Lexical Analysis - Implementation Deep Learning Framework comparison  dynamic vs static graph definition Debugging Visualization Deployment VS
  • 61. Session 1 - Understand NLP NLP - Lexical Analysis - Implementation Deep Learning Framework - Tensorflow import tensorflow as tf import numpy as np rng = np.random # train_X, train_Y, learning_rate, n_samples, logs_path, training_epochs defined elsewhere with tf.Graph().as_default() : X = tf.placeholder("float") Y = tf.placeholder("float") W = tf.Variable(rng.randn(), name="weight") b = tf.Variable(rng.randn(), name="bias") pred = tf.add(tf.multiply(X, W), b) cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples) optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost) init = tf.global_variables_initializer() with tf.Session() as sess: sess.run(init) tf.summary.FileWriter(logs_path, graph=tf.get_default_graph()) # Fit all training data for epoch in range(training_epochs): for (x, y) in zip(train_X, train_Y): sess.run(optimizer, feed_dict={X: x, Y: y}) Tensorflow : static graph definition Pytorch : dynamic graph definition
  • 62. Session 1 - Understand NLP https://medium.com/@karpathy/a-peek-at-trends-in-machine-learning-ab8a1085a106 NLP - Lexical Analysis - Implementation Deep Learning Framework comparison https://blog.paperspace.com/which-ml-framework-should-i-use/
  • 63. Session 1 - Understand NLP NLP - Lexical Analysis - Implementation Deep Learning Framework - Tensorflow Graph (Edge + Node) + Session
  • 64. Session 1 - Understand NLP NLP - Lexical Analysis - Implementation Deep Learning Framework - Tensorflow https://github.com/TensorMSA/tensormsa_jupyter/blob/master/chap03_basic_models/linear_regressions.ipynb
  • 65. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here! Response Generation Memory Network Seq2Seq
  • 66. Session 1 - Understand NLP What is Word Embedding? A way of representing the units that make up text, phonemes, syllables, words, sentences or documents, as numeric vectors.
  • 67. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding Word Representation Discrete Representation WordNet OneHot Vector Distributed Representation Direct Prediction Word2Vec Count Based Full Document Windows LSA SVD of x Glove FastText
  • 68. Session 1 - Understand NLP WordNet NLP - Lexical Analysis - Word Embedding In the past, resources like WordNet were used. WordNet is a tree-structured graph model that encodes the relations between words (hypernyms, synonyms). Of course, all of it was built by hand, so it was subjective and took a great deal of labor to maintain, a clear limitation.
  • 69. Session 1 - Understand NLP OneHot Vector NLP - Lexical Analysis - Word Embedding
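  A minimal sketch of the one-hot idea, with a hypothetical four-word vocabulary: each word becomes a sparse vector with a single 1 at its index.

    import numpy as np

    vocab = {"피자": 0, "주문": 1, "하고": 2, "싶어": 3}   # illustrative vocabulary

    def one_hot(word):
        v = np.zeros(len(vocab))
        v[vocab[word]] = 1.0
        return v

    print(one_hot("주문"))  # [0. 1. 0. 0.]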
  • 70. Session 1 - Understand NLP LSA (Latent Semantic Analysis) with SVD (Singular Value Decomposition) NLP - Lexical Analysis - Word Embedding https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/04/06/pcasvdlsa/ - doc1 doc2 doc3 나 1 0 0 는 1 1 2 학교 1 1 0 에 1 1 0 가 1 1 0 ㄴ 1 0 0 다 1 0 1 영희 0 1 1 좋 0 0 1 truncated SVD / SVD / LSA (latent semantic analysis)
  • 71. Session 1 - Understand NLP SVD of X NLP - Lexical Analysis - Word Embedding https://swalloow.github.io/cs224d-lecture2 This method slides a window (typically of length 5-10) symmetrically over the text. Given the corpus ● I like deep learning. ● I like NLP. ● I enjoy flying it can be expressed as a matrix; simply put, it counts the co-occurrence frequency of each word. Count frequencies within the window size, then reduce dimensionality with SVD.
  • 72. Session 1 - Understand NLP https://www.tensorflow.org/tutorials/word2vec http://w.elnn.kr/search/ Word2Vector demo site. Pros: dimensionality reduction, expresses semantic similarity. Cons: homonyms are hard to handle; with little data, the training signal for the network is weak. NLP - Lexical Analysis - Word Embedding Word2Vec
  • 73. Session 1 - Understand NLP CBOW the quick brown fox jumped over the lazy dog ([brown, jumped], fox) window size : 1 brown jumped over the . . brown jumped over fox . . Input Hidden Output Hidden Size Hidden Size Vocab Size Data Set Original Text NLP - Lexical Analysis - Word Embedding Word2Vec
  • 74. Session 1 - Understand NLP the quick brown fox jumped over the lazy dog (fox, brown), (fox, jumped) window size : 1 brown jumped over the . . brown jumped over fox . . Input Hidden Output Hidden Size Hidden Size Vocab Size Data Set Original Text Skip-Gram NLP - Lexical Analysis - Word Embedding Word2Vec
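  A minimal gensim sketch of the skip-gram setup above (sg=1, window=1); the toy corpus is just the slide's sentence, and a real model needs far more data. (Note: gensim < 4.0 calls the dimension `size`; newer versions call it `vector_size`.)

    from gensim.models import Word2Vec

    sentences = [["the", "quick", "brown", "fox", "jumped",
                  "over", "the", "lazy", "dog"]]
    model = Word2Vec(sentences, size=10, window=1, min_count=1, sg=1)
    print(model.wv.most_similar("fox"))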
  • 75. Session 1 - Understand NLP (1)PV-DM (2)PV-DBOW (3)DM + DBOW (Vector Concat) W2V W2V W2V (4)AVG(TF-IDF * W2V) the quick brown fox jumped over the lazy dog (paragraph, the) (paragraph, quick) (paragraph, brown) (paragraph, fox) (paragraph, jumped) ([paragraph, quick, brown, fox, jumped], over) ([paragraph, quick, brown, fox, jumped, over], the) vector vector vector TF-IDF TF-IDF TF-IDF X X X vector AVG NLP - Lexical Analysis - Word Embedding Doc2Vec
  • 76. Session 1 - Understand NLP tfidf(t,d,D) = tf(t,d) x idf(t,D) https://thinkwarelab.wordpress.com/2016/11/14/ir-tf-idf-%EC%97%90-%EB%8C%80%ED%95%B4-%EC%95%8C%EC%95%84%EB%B4%85%EC%8B%9C%EB%8B%A4/ http://www.popit.kr/bm25-elasticsearch-5-0%EC%97%90%EC%84%9C-%EA%B2%80%EC%83%89%ED%95%98%EB%8A%94-%EC%83%88%EB%A1%9C%EC%9A%B4-%EB%B0%A9%EB%B2%95/ Not exactly a word embedding, but used very often in NLP with deep learning - Document similarity - Word importance within a document - Used in search engines (like Elasticsearch, though it uses BM25 now) NLP - Lexical Analysis - Word Embedding TF-IDF
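  A minimal scikit-learn sketch of tfidf(t,d,D) = tf(t,d) x idf(t,D): rows are documents, columns are terms, values are TF-IDF weights (the document strings are illustrative):

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["I like deep learning", "I like NLP", "I enjoy flying"]
    vec = TfidfVectorizer()
    X = vec.fit_transform(docs)          # document-term TF-IDF matrix
    print(vec.get_feature_names())
    print(X.toarray().round(2))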
  • 77. Session 1 - Understand NLP - Introduce several ways to embed char as vector 안 녕 하 세 요 1 가 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 나 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 다 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 라 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 마 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 바 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 사 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 아 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 자 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 An Neung Ha Se Yo (ㅇ ㅏ ㄴ) (ㄴ ㅕ ㅇ) . . . . 2 a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 b 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 c 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 d 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 e 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 f 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 g 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 h 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 i 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 3 ㄱ 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ㄴ 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ㄷ 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ㄹ 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 ㅁ 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 ㅂ 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 ㅅ 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 ㅇ 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 ㅈ 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 NLP - Lexical Analysis - Word Embedding Char Embeding
  • 78. Session 1 - Understand NLP the quick brown fox jumped over the lazy dog 0.2 0.1 0.4 0.21 0 0 0 f o x fox Word2Vector 0 1 0 0 0 0 1 0 OneHot Encoding OneHot Encoding OneHot Encoding 1. Word2Vec-family embeddings express semantic relatedness well 2. One-hot gives a strong, discrete signal that is effective for training 3. Word-level embeddings memorize words well 4. Character-level embeddings handle unseen (untrained) words well NLP - Lexical Analysis - Word Embedding + Char + Word Concat
  • 79. Session 1 - Understand NLP Words that do not exactly match the pretrained dictionary return "UNKNOWN", so FastText (by Facebook) uses character n-grams in its word-embedding algorithm. Comparing 에어컨 (air conditioner) with 에어조단 (Air Jordan): 에어컨 ['$$에', '$에어', '에어컨', '어컨$', '컨$$'] => 5 에어조단 ['$$에', '$에어', '에어조', '어조단', '조단$', '단$$'] => 6 matches ['$$에', '$에어'] => 2 score: 2 matches / 9 unique n-grams after deduplication => 0.2222 NLP - Lexical Analysis - Word Embedding FastText
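  A rough sketch of the character n-gram overlap score computed above, with '$' as the padding symbol. (The real FastText model learns a vector per n-gram; this only reproduces the counting example.)

    def char_ngrams(word, n=3, pad="$"):
        padded = pad * (n - 1) + word + pad * (n - 1)
        return {padded[i:i + n] for i in range(len(padded) - n + 1)}

    a, b = char_ngrams("에어컨"), char_ngrams("에어조단")
    print(len(a & b) / len(a | b))   # 2 matches / 9 unique n-grams = 0.2222...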
  • 80. Session 1 - Understand NLP Glove NLP - Lexical Analysis - Word Embedding (their dot product equals the logarithm of the words' probability of co-occurrence) GloVe's core goal can be summarized as: make similarity measurement between embedded word vectors easy while better reflecting the statistics of the whole corpus. Co-occurrence probability https://ratsgo.github.io/from%20frequency%20to%20semantics/2017/04/09/glove/ GloVe embeds words so that, given a context word, the dot product of two embedded word vectors corresponds to the ratio of the words' co-occurrence probabilities.
  • 81. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here! Numpy Pandas Tensorflow 데이터 처리 ML & DL Library Scikit Learn Konlpy 개발 관련 구현 Response Generation Memory Network Seq2Seq
  • 82. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding OneHot Encoding : simple test code showing the concept of one-hot http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/ [Code]
  • 83. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding Word2Vector : Using Gensim word2vec package http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
  • 84. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding FastText : FaceBook fasttext with gensim wrapper http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/
  • 85. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding FastText : it is possible to use a pretrained vector and do fine-tuning on it http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/wordembedding/ https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
  • 86. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding N-grams are simply all combinations of adjacent words or letters of length n that you can find in your source text.
  • 87. Session 1 - Understand NLP NLP - Lexical Analysis - Word Embedding For training word2vec on large datasets, GPU acceleration is needed; you can also consider using Tensorflow or Keras to train the model https://github.com/SimonPavlik/word2vec-keras-in-gensim/blob/keras106/word2veckeras/word2veckeras.py https://github.com/tensorflow/models/blob/master/tutorials/embedding/word2vec.py
  • 88. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 89. Session 1 - Understand NLP NLP - Lexical Analysis - DL ALgorithms Paper Model CoNLL 2003 (F1 %) Collobert et al.(2011) MLP with word embeddings+gazetteer 89.59 Passos et al.(2014) Lexicon Infused Phrase Embeddings 90.90 Chiu and Nichols(2015) Bi-LSTM with word+char+lexicon embeddings 90.77 Luo et al.(2015) Semi-CRF jointly trained with linking 91.20 Lample et al.(2016) Bi-LSTM-CRF with word+char embeddings 90.94 Lample et al.(2016) Bi-LSTM with word+char embeddings 89.15 https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/ https://arxiv.org/pdf/1708.02709.pdf NER (Named Entity Recognition) Algorithm Performance
  • 90. NLP - Lexical Analysis - DL ALgorithms what do we want to do with this algorithm?
  • 91. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf 김승우 B-PERSON 전화번호 B-TARGET 검색 O 김승우 B-PERSON 이메일 B-TARGET 검색 O 김승우 B-PERSON 이미지 B-TARGET 검색 O IOB Data 김승우 전화번호 검색 김승우 이메일 검색 김승우 이미지 검색 Plain Data Sentence Splitting Token Morphing Part of Speech Tagging Lexical Analysis Word2Vector OneHot Encoding 1 0 0 0 0 1 0 0 0 0 1 0 김승우 전화번호 이메일 검색 B-PERSON B-TARGET 김 우 승 Index List
  • 92. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf 김승우 전화번호 이메일 검색 B-PERSON B-TARGET 김 우 승 Index List [Code]
  • 93. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf 김 우 승 김승우 전화번호 이메일 Concat Vector [Code]
  • 94. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf Concat Vector 김승우 전화번호 이메일 검색 B-PERSONB-TARGET BiLstm Fully Connected Layer B-? B-? B-? [Code]
  • 95. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf Conditional Random Field Soft Max [Code]
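  A minimal TF 1.x sketch of the BiLSTM-CRF scoring path shown in these slides: a bidirectional LSTM produces per-token tag scores, and a linear-chain CRF scores the whole tag sequence globally. Shapes and names (embedding dim, num_tags, seq_lens) are illustrative assumptions:

    import tensorflow as tf

    num_tags, hidden = 5, 100
    embeddings = tf.placeholder(tf.float32, [None, None, 50])  # [batch, time, dim]
    labels     = tf.placeholder(tf.int32,   [None, None])      # gold tag ids
    seq_lens   = tf.placeholder(tf.int32,   [None])

    (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
        tf.contrib.rnn.LSTMCell(hidden), tf.contrib.rnn.LSTMCell(hidden),
        embeddings, sequence_length=seq_lens, dtype=tf.float32)
    context = tf.concat([out_fw, out_bw], axis=-1)             # [batch, time, 2*hidden]

    logits = tf.layers.dense(context, num_tags)                # per-token tag scores
    log_lik, trans = tf.contrib.crf.crf_log_likelihood(
        logits, labels, seq_lens)                              # global CRF score
    loss = tf.reduce_mean(-log_lik)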
  • 96. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf http://people.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf Probabilistic model for sequence data segmentation and labeling https://www.slideshare.net/kanimozhiu/tdm-probabilistic-models-part-2 The first method makes local choices. In other words, even if we capture some information from the context in our hidden states thanks to the bi-LSTM, the tagging decision is still local; we don't make use of the neighboring tagging decisions. For instance, in "New York", the fact that we are tagging "York" as a location should help us decide that "New" corresponds to the beginning of a location. Given a sequence of words w1,…,wm, a sequence of score vectors s1,…,sm and a sequence of tags y1,…,ym, a linear-chain CRF defines a global score s ∈ R
  • 97. Session 1 - Understand NLP NLP - Lexical Analysis - BiLstmCrf A BiLSTM result from a real project: in the sample code's prediction test, even test data not included in the training set is predicted well http://ip:8888/tree/tensormsa_jupyter/chap05_nlp/sequence_tagging/
  • 98. Lexical Analysis Syntactic Analysis Semantic Analysis NLU Server (Understand) NLG Server (Generate) Voice Recognition Discourse Analysis 자연어 처리 이론 기본 이론 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation
  • 99. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet Syntactic parsing decomposes a sentence into its constituents and analyzes the hierarchical relations among them to determine the structure of the sentence. Graph-Based Models Transition-Based Models CYK Style Parsing MST finding Algorithm Projective & Non Projective Model
  • 100. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models Sentence W Repeat until all words have their head - Select two target words in the data structure (one dependent & one head candidate) - Deterministically predict the next parsing action from the parsing model - Modify the structure according to the parsing action C0 -> C1 -> C2 -> ……..C8 -> C9 -> C10 -> .… -> Cm D-tree t1 t2 t3 t8 t9 t10 tm Oracle (Classifier) Predict the best transition
  • 101. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System
  • 102. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Assume that we are given an oracle : - for any non-terminal configuration, it can predict the correct transition (for deterministic parsing) - That is, it takes two words & magically gives us the dependency relation between them if one exists
  • 103. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Shift : Move Economic from buffer B to stack S
  • 104. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Left-arc : Add left-arc (had, news, nsubj) to A Remove news from stack (since it now has head in A)
  • 105. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Right-arc : Add right-arc (ROOT, had, root) to A keep had in stack : because it can have other dependents on the right
  • 106. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Left-arc : Add left-arc (effect, little, amod) to A Remove little from stack (since it now has head in A)
  • 107. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Right-arc : Add right-arc (had, effect, dobj) to A Keep effect in stack : because it can have other dependents on right
  • 108. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Right-arc : Add right-arc (effect, on, prep) to A Keep on in stack : because it can have other dependents on the right
  • 109. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Shift : Move financial from buffer B to stack S
  • 110. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Left-arc : Add left-arc (markets, financial, amod) to A Remove financial from stack (since it now has head in A)
  • 111. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Right-arc : Add right-arc (on, markets, pmod) to A Keep markets in stack : because it can have other dependents on the right
  • 112. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Reduce : Remove markets, on, effect from stack (since they already have head in A) ※ All decisions like right-arc, left-arc, reduce, shift will be made by oracle
  • 113. Session 1 - Understand NLP NLP - Syntactic Analysis Transition-Based Models - Arc Eager Transition System Right-arc : Add right-arc (had, period, p) to A Keep period in stack Done !
  • 114. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 115. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet Parsing type Paper Model WSJ Dependency Parsing Chen and Manning(2014) Fully-connected NN with features including POS 91.8/89.6 (UAS/LAS) Dependency Parsing Weiss et al.(2015) Deep fully-connected NN with features including POS 94.3/92.4 (UAS/LAS) Dependency Parsing Dyer et al.(2015) Stack LSTM 93.1/90.9 (UAS/LAS) Constituency Parsing Petrov et al.(2006) Probabilistic context-free grammars (PCFG) 91.8 (F1 Score) Constituency Parsing Zhu et al.(2013) Feature-based transition parsing 91.3 (F1 Score) Constituency Parsing Vinyals et al.(2015b) seq2seq learning with LSTM+Attention 93.5 (F1 Score) Syntax Parsing Algorithm Performance There are two types of parsing: dependency parsing, which connects individual words according to the relations between them, and constituency parsing, which recursively splits the text into sub-phrases.
  • 116. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet We show this layout in the schematic below: the state of the system (a stack and a buffer, visualized below for both the POS and the dependency parsing task) is used to extract sparse features, which are fed into the network in groups. We show only a small subset of the features to simplify the presentation in the schematic Google SyntaxNet with Deep Learning - Pos Tagging
  • 117. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet Google SyntaxNet with Deep Learning - A Fast and Accurate Dependency Parser using Neural Networks https://arxiv.org/pdf/1603.06042.pdf 1 2 3 1 I _ PRP PRP _ 2 nsubj _ _ 2 knew _ VBD VBD _ 0 ROOT _ _ 3 I _ PRP PRP _ 5 nsubj _ _ 4 could _ MD MD _ 5 aux _ _ 5 do _ VB VB _ 2 ccomp _ _ 6 it _ PRP PRP _ 5 dobj _ _ 7 properly _ RB RB _ 5 advmod _ _ 8 if _ IN IN _ 9 mark _ _ 9 given _ VBN VBN _ 5 advcl _ _ 10 the _ DT DT _ 12 det _ _ 11 right _ JJ JJ _ 12 amod _ _ 12 kind _ NN NN _ 9 dobj _ _ 13 of _ IN IN _ 12 prep _ _ 14 support _ NN NN _ 13 pobj _ _ 15 . _ . . _ 2 punct _ _ 18 units (1),(2),(3) 18 units (1),(2),(3) 12 units (2),(3) (1) The top 3 words on the stack and buffer: s1, s2, s3, b1, b2, b3; => 6 (2) The first and second leftmost / rightmost children of the top two words on the stack: lc1(si), rc1(si), lc2(si), rc2(si), i = 1, 2. => 8 (3) The leftmost of leftmost / rightmost of rightmost children of the top two words on the stack: lc1(lc1(si)), rc1(rc1(si)), i = 1, 2. => 4
  • 118. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet Google SyntaxNet with Deep Learning - Local Parser 1. SHIFT: Push another word onto the top of the stack, i.e. shifting one token from the buffer to the stack. 2. LEFT_ARC: Pop the top two words from the stack. Attach the second to the first, creating an arc pointing to the left. Push the first word back on the stack. 3. RIGHT_ARC: Pop the top two words from the stack. Attach the second to the first, creating an arc pointing to the right. Push the second word back on the stack.
  • 119. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet As we describe in the paper, there are several problems with the locally normalized models we just trained. The most important is the label-bias problem: the model doesn't learn what a good parse looks like, only what action to take given a history of gold decisions. This is because the scores are normalized locally using a softmax for each decision. Google SyntaxNet with Deep Learning - Global Training
  • 120. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet What's the Beam Search Algorithm on an RNN? https://www.youtube.com/watch?v=UXW6Cs82UKo Instead of taking only the best option at every iteration, follow candidate cases to the end and choose the one whose total score is maximal. Computing every case would be far too heavy, so keep only the best few at every step and remove the others (pruning). This is how a globally better prediction is found.
  • 121. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet What's the Beam Search Algorithm on an RNN? Greedily following the best option at every step can miss the globally optimal case
  • 122. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet What's the Beam Search Algorithm on an RNN? Considering every case would require too much computing power
  • 123. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet What's the Beam Search Algorithm on an RNN? Remove low-score cases at every step (pruning)
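  A toy sketch of the pruning idea above: keep only the best beam_width partial sequences at every step. The per-step log-probability table is an assumption; a real decoder would condition each step on the prefix.

    import heapq

    def beam_search(step_scores, beam_width=2):
        beams = [(0.0, [])]                        # (cumulative log-prob, tokens)
        for scores in step_scores:
            candidates = [(lp + s, seq + [tok])
                          for lp, seq in beams
                          for tok, s in scores.items()]
            beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        return beams[0]

    steps = [{"a": -0.1, "b": -2.0}, {"a": -1.5, "b": -0.2}]
    print(beam_search(steps))                      # (-0.3, ['a', 'b'])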
  • 124. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet http://universaldependencies.org/ Google SyntaxNet does not support Korean as a default language. But as we can see below, we can train the model with the Sejong corpus data, though we have to convert the format for SyntaxNet to understand. Google SyntaxNet with Deep Learning - How about Korean
  • 125. Session 1 - Understand NLP NLP - Syntactic Analysis - SyntaxNet Demo Site (we also use samples on this site) http://sejongpsg.ddns.net/syntaxnet/psg_tree.htm SyntaxNet Korean with Docker (We pretrained Korean corpus and set up webserver for service) https://github.com/TensorMSA/tensormsa_syntax_docker Google SyntaxNet with Deep Learning - Test it by yourself
  • 126. Lexical Analysis Syntactic Analysis Semantic Analysis NLU Server (Understand) NLG Server (Generate) Voice Recognition Discourse Analysis 자연어 처리 이론 기본 이론 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation
  • 127. Session 1 - Understand NLP NLP - Semantic Analysis Sentential semantics - Semantic role labeling (SRL) - Phrase similarity (=paraphrase) - Sentence classification, sentence emotion analysis, etc. What is semantics in the study of language? Three perspectives on meaning - Lexical semantics : individual words - Sentential semantics : individual sentences - Discourse or Pragmatics : longer pieces of text or conversation NLP Tasks for Semantics
  • 128. Session 1 - Understand NLP NLP - Semantic Analysis What is Semantic Role Labeling (SRL) SRL = Semantic roles express the abstract role that arguments of a predicate can take in the event. The police arrested the suspect in the park last night Agent predicate Theme Location Time Who did what to whom, where, when Can we figure out that these sentences have the same meaning? Can we figure out that bought, sold, and purchase are used in sentences with the same meaning? XYZ corporation bought the stock. They sold the stock to XYZ corporation. The stock was bought by XYZ corporation. The purchase of the stock by XYZ corporation.
  • 129. Session 1 - Understand NLP NLP - Semantic Analysis - Semantic Role Labeling Common Semantic Role Labeling Architecture http://naacl2013.naacl.org/Documents/semantic-role-labeling-part-1-naacl-2013-tutorial.pdf Syntatic Parse Argument Identification Argument Classification Structural Inference Prune Constituents Candidates Semantic roles Arguments Step-1 Candidate Selection - Parse the sentence - Prune/filter the parse tree (eliminate some tree constituents to speed up the execution) Step-2 Argument Identification - A binary classification of each node as Argument or NONE - Local scoring Step-3 Argument Classification - A multi class (one-of-N) classification of all the argument candidates - Global /joint scoring ML ML ML
  • 130. Session 1 - Understand NLP Paper Model CoNLL2005 (F1 %) CoNLL2012 (F1 %) Collobert et al.(2011) CNN with parsing features 76.06 Tackstrom et al.(2015) Manual features with DP for inference 78.6 79.4 Zhou and Xu(2015) Bidirectional LSTM 81.07 81.27 He et al.(2017) Bidirectional LSTM with highway connections 83.2 83.4 Semantic role labeling (SRL) aims to discover the predicate-argument structure of a sentence. For each target verb (predicate), every constituent of the sentence that fills a semantic role of the verb is recognized. Typical semantic arguments are agent, theme, instrument, etc., along with adjuncts such as location, time, manner and cause (Zhou and Xu, 2015). The table shows the performance of several models on the CoNLL 2005 and 2012 datasets. Traditional SRL systems run in several stages: build a parse tree, decide which tree nodes represent arguments of a given verb, then determine the corresponding SRL tags. Each classification step usually involves extracting many features and feeding them to a statistical model (Collobert et al., 2011). Tackstrom et al. (2015) scored constituent spans and their semantic-role candidates for a given predicate with a series of parse-tree features, and proposed a dynamic programming algorithm for efficient inference. Collobert et al. (2011) achieved similar results with a CNN augmented by parsing information supplied as additional lookup tables. Zhou and Xu (2015) proposed a bidirectional LSTM to model arbitrarily long context, which proved successful even without parse-tree information. He et al. (2017) extended this work further by introducing "highway connections". NLP - Semantic Analysis - Semantic Role Labeling LSTM is effective for the SRL problem too!
  • 131. Session 1 - Understand NLP NLP - Semantic Analysis - Semantic Role Labeling Bidirectional LSTM with highway connections Stack more layers on RNN with highway technique ! https://homes.cs.washington.edu/~luheng/files/acl2017_hllz.pdf
  • 132. Session 1 - Understand NLP NLP - Semantic Analysis - Semantic Role Labeling Semantic Role Labeling Applications Information : Anna is a friend of mine. http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/neo4j/neo4j_basic.ipynb Who Who What session.run("MATCH (you:Person {name:'You'})" "FOREACH (name in ['Anna'] |" " CREATE (you)-[:FRIEND]->(:Person {name:name}))") result = session.run("MATCH (you {name:'You'})-[:FRIEND]->(yourFriends)" "RETURN you, yourFriends") Neo4j Insert Query Neo4j Jupyter example & visualize
  • 133. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 134. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN What kind of problem do we want to solve? Can we figure out whether these sentences are positive or negative? 돈이 아깝지 않다 (positive) 다시는 오지 않을 거야 (negative) 음식이 정말 맛이 없다 (negative) 이 식당은 정말 맛있다 (positive) With a dictionary-based analysis the word "않다" (not) is usually negative, but: 돈이 아깝지 않다 => Positive 다시는 오지 않을 거야 => Negative
  • 135. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN There are many ways of doing text classification.. Traditional Rule based Machine Learning - Logistic & SVM Deep Learning - CharCNN, RNN, Etc..
  • 136. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN Paper Model SST-1 SST-2 Socher et al.(2013) Recursive Neural Tensor Network 45.7 85.4 Kim(2014) Multichannel CNN 47.4 88.1 Kalchbrenner et al.(2014) DCNN with k-max pooling 48.5 86.8 Tai et al.(2015) Bidirectional LSTM 48.5 87.2 Le and Mikolov(2014) Paragraph Vector 48.7 87.8 Tai et al.(2015) Constituency Tree-LSTM 51.0 88.0 Kumar et al.(2015) DMN 52.1 88.6 https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/ https://arxiv.org/pdf/1708.02709.pdf Semantic Analysis - CharCNN
  • 137. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb Deep Learning Method CharCNN can be a solution for this kind of problem. 1 2
  • 138. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb Preparing data for embedding is pretty similar to other neural networks. 1. Word embedding & one-hot didn't show that much difference. 2. Personally, I prefer to concatenate char one-hot + word2vec. 오늘 메뉴 는 뭐 지? PAD PAD 1. The maximum sentence length must be defined 2. Padding is needed, as in other NLP neural networks
  • 139. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb Using Multi Convolution Filter Size
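  A minimal TF 1.x sketch of the multi-filter-size convolution shown in the notebook above (Kim 2014 style): parallel branches with window sizes 3/4/5, max-pooled over time and concatenated. All dimensions are illustrative assumptions:

    import tensorflow as tf

    seq_len, embed_dim, num_filters = 20, 64, 32
    x = tf.placeholder(tf.float32, [None, seq_len, embed_dim, 1])

    pooled = []
    for size in [3, 4, 5]:                       # number of words each filter sees
        conv = tf.layers.conv2d(x, num_filters, [size, embed_dim],
                                activation=tf.nn.relu)
        pooled.append(tf.reduce_max(conv, axis=1))   # max-over-time pooling
    features = tf.reshape(tf.concat(pooled, axis=-1), [-1, 3 * num_filters])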
  • 140. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN http://localhost:8888/notebooks/tensormsa_jupyter/chap05_nlp/charcnn/charcnn.ipynb The other steps are the same (fully connected > softmax > loss > optimizer)
  • 141. Session 1 - Understand NLP NLP - Semantic Analysis - CharCNN You can see that the Char CNN can distinguish the two sentences
  • 142. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 143. Session 1 - Understand NLP NLP - Discourse Analysis https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/ Paper Model bAbI (Mean accuracy %) Farbes (Accuracy %) Fader et al.(2013) Paraphrase-driven lexicon learning 0.54 Bordes et al.(2014) Weakly supervised embedding 0.73 Weston et al.(2014) Memory Networks 93.3 0.83 Sukhbaatar et al.(2015) End-to-end Memory Networks 88.4 Kumar et al.(2015) DMN 93.6 Discourse Analysis - End2End Memory Network
  • 144. Session 1 - Understand NLP Discourse Analysis - End2End Memory Network https://arxiv.org/pdf/1503.08895v4.pdf NLP - Discourse Analysis
  • 145. Session 1 - Understand NLP Here is the network architecture of end2end memory network https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/ https://www.slideshare.net/mobile/carpedm20/ss-63116251 NLP - Discourse Analysis - Memory Network
  • 146. Session 1 - Understand NLP (1) Feed data (“Sentences”, “Question”, “Target”) 1 2 3 NLP - Discourse Analysis - Memory Network
  • 147. Session 1 - Understand NLP Convert word index to embedding vector (Training target vector A,B,C) 1 3 Vocab Size 2 Dim Size vocab size Mem Size NLP - Discourse Analysis - Memory Network
  • 148. Session 1 - Understand NLP Multiply embedding A of the given context sentences with the input question embedding (using embedding B, which is not defined in this code) ※ if it is the first layer; otherwise it would be the output of layer t-1 1 2 1 2 multiply NLP - Discourse Analysis - Memory Network
  • 149. Session 1 - Understand NLP NLP - Discourse Analysis - Memory Network Set embedding C (in the code it's B); this is also a target variable for training
  • 150. Session 1 - Understand NLP Multiply embedding C (in the code it's B) with the softmax result NLP - Discourse Analysis - Memory Network
  • 151. Session 1 - Understand NLP Finally, multiply the question with the output of the memory network once more NLP - Discourse Analysis - Memory Network
  • 152. Session 1 - Understand NLP Stack more memory layers NLP - Discourse Analysis - Memory Network
  • 153. Session 1 - Understand NLP Set a fully connected layer and compute the error with softmax cross-entropy NLP - Discourse Analysis - Memory Network
  • 154. Session 1 - Understand NLP In the given code I removed 90% of the data set because we are using a CPU for this class, so the results may be poor… NLP - Discourse Analysis - Memory Network
  • 155. Session 1 - Understand NLP https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/ https://github.com/YerevaNN/Dynamic-memory-networks-in-Theano Dynamic Memory Networks Episodic Memory Other types of memory networks .. NLP - Discourse Analysis - Memory Network
  • 156. Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning Basic NLU Server (Understand) NLG Server (Generate) SyntaxNet Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론 기본 이론 관련 딥러닝 이론 설명 Session 1 - Understand NLP Session 1 - Now We are Here ! Response Generation Memory Network Seq2Seq
  • 157. Session 1 - Understand NLP The Seq2Seq model applies wherever both input and output are sequence data: machine translation, summarization, simple question answering, and so on. With a simple trick it can also be used to generate responses. - Input : 딥 러닝 재미 즐거운 일 - Output : 딥 러닝은 재미있고 즐거운 일이다 https://arxiv.org/pdf/1406.1078.pdf https://www.slideshare.net/KeonKim/attention-mechanisms-with-tensorflow NLP - Response Generator - Seq2Seq https://nlp.stanford.edu/pubs/emnlp15_attn.pdf
  • 158. Session 1 - Understand NLP NLP - Response Generator - Attention Mechanism Attention Mechanism on Machine Translation https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
  • 159. Session 1 - Understand NLP NLP - Response Generator - Attention Mechanism Attention Mechanism on Machine Translation Bahdanau http://aclweb.org/anthology/D15-1166 Luong https://blog.heuritech.com/2016/01/20/attention-mechanism Local / Global Input Feeding
  • 160. Session 1 - Understand NLP NLP - Response Generator - Bahdanau https://blog.heuritech.com/2016/01/20/attention-mechanism/ Without Attention Mechanism With Attention Mechanism
  • 161. Session 1 - Understand NLP NLP - Response Generator - Bahdanau 1. embedding layer over the inputs ○ embedded = embedding(last_rnn_output) 2. attention layer over the decoder state and encoder outputs, normalized into attention weights ○ attn_energies[j] = attn_layer(last_hidden, encoder_outputs[j]) ○ attn_weights = normalize(attn_energies) 3. context vector as an attention-weighted average of encoder outputs ○ context = sum(attn_weights * encoder_outputs) 4. RNN layer(s) over the inputs and internal hidden state ○ rnn_input = concat(embedded, context) ○ rnn_output, rnn_hidden = rnn(rnn_input, last_hidden) 5. an output layer over the combined inputs ○ output = out(embedded, rnn_output, context)
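  A NumPy sketch of steps 2-3 above: score each encoder output against the decoder state, softmax-normalize into weights, and build the context vector. The plain dot product stands in for the learned attention layer (an assumption):

    import numpy as np

    def attention(last_hidden, encoder_outputs):
        energies = encoder_outputs @ last_hidden     # step 2: attn_energies
        weights = np.exp(energies - energies.max())
        weights /= weights.sum()                     # softmax -> attn_weights
        context = weights @ encoder_outputs          # step 3: weighted average
        return context, weights

    enc = np.random.randn(6, 8)                      # 6 encoder steps, hidden size 8
    ctx, w = attention(np.random.randn(8), enc)
    print(ctx.shape, w.sum())                        # (8,) 1.0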
  • 162. Session 1 - Understand NLP NLP - Response Generator - Implementation http://localhost:8888/tree/chap05_nlp/attention_seq2seq data_util (1)Data Processing & Feed Data
  • 163. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (2)Word Embedding
  • 164. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (3)Encoder
  • 165. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (4)Attention
  • 166. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (5)Decoder & Attention
  • 167. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (6)Loss & Optimization
  • 168. Session 1 - Understand NLP http://localhost:8888/tree/chap05_nlp/attention_seq2seq NLP - Response Generator - Implementation (7)Inference Task
  • 169. Session 1 - Understand NLP NLP - Response Generator - Seq2Seq Pointer Network https://medium.com/@devnag/pointer-networks-in-tensorflow-with-sample-code-14645063f264 The authors propose a new neural architecture called the "pointer network": a seq2seq architecture with an attention mechanism that outputs the "indices" of its input. Because the output vocabulary depends on the input sequence length, it can handle inputs of varying sizes. (Note: conventional seq2seq and neural Turing machines could only handle fixed lengths.) The attention mechanism used here is a slight modification of the standard seq2seq attention and has O(n^2) time complexity. To evaluate the architecture, the authors used tasks whose answers are positions (orderings) of the input: convex hull, Delaunay triangulation, and the traveling salesman problem (TSP). Pointer networks worked well, even on sequences longer than those seen during training. What else ?
  • 170. Session 2 - Make ChatBot
  • 171. Session 2 - Goal of the lecture: building on the understanding of NLP from Session 1, understand the overall architecture with AI applied, and, starting from a pizza-ordering bot, have each student work toward building a chatbot of their own. Session 2 - Make ChatBot
  • 172. Session 2 : Susang Kim healess1@gmail.com ●Chatbot Developer ○ Released in POSCO (find people using NLP/AI) ○ Deep Learning MSA (ML, DNN, CNN, RNN) ●Agile Developer (worked at Pivotal Labs) ○ TDD, CI, Pair programming, User Story ●iOS Developer (app ranked 100th in the Korean App Store - 2011) ●Front-End Developer (React, D3, Typescript and ES6) ●OSS world Challenge 2017 (in top 12, in progress now) ●POSCO MES ... (working at POSCO ICT for 10 years)
  • 173. Facebook AI shut down after creating their own language. Paper: https://arxiv.org/abs/1706.05125
  • 174. Remind of Session 1 Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning BasicNLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers SyntaxNet Scenario Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론[Retrieval Based] Chat-Bot System ChatBot Server Numpy Pandas Tensorflow 파이프 라인 데이터 처리 ML & DL Library Scikit Learn Konlpy 개발 관련 데이터 수집 데이터 전처리 모델 훈련 모델 평가 모델 서비스 BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message 기본 이론 관련 딥러닝 이론 설명 예제를 통한 구현 설명 Memory Network Seq2SeqResponse Generation Ontology DM Legacy Data Base [AI Based] Chat-Bot Research Environment Data MartMonitoring Summary Result Train Data AI Model Pipe Line Session 2 - Make ChatBot
  • 175. Session 2 - Make Chatbot [Source: Deview 2016 - https://deview.kr/2016/schedule#session/176] Why are chatbots taking off these days?? Intuitive UX, a consistent experience, can be connected to voice, no separate app install required, can connect to a variety of services, fast feedback, platform-independent
  • 176. Characteristics of chatbots • Many skills are required (NLP, AI, frameworks, text mining and various development skills) • For someone studying deep learning, results come quickly: fast feedback with little compute (text-based) • It's fun (relatively little business dependency compared with micro data processing) - less data-preparation burden than images (CNN) or structured data (DNN), assuming a morphological analyzer handles the easy preprocessing • Many application areas (smart management: connecting API-based services) - fill in the intent and slots and you can connect to any service • Few related open-source projects, so it's a blue ocean (for Korean you mostly have to build your own) - fortunately many language-independent deep-learning text algorithms are publicly available • Bot services exist, but they are costly, handle Korean poorly, and cannot be customized Session 2 - Make ChatBot
  • 177. Session 2 - Understand Chatbot What is a chatbot? AI (patterns, context) Linguistics (natural language processing) Programming (data handling - Python) Bot F/W (story/slot design) Architecture (response time) Text Mining (data composition) Chatbot Implementing a chatbot requires a wide range of skills from many different fields
  • 178. Session 2 - Make Chatbot Various chatbot platforms already exist, e.g. building a chatbot without code on API.AI https://calyfactory.github.io/api.ai-chatbot/ Every chatbot has intent and entity recognition, and for both of them data is what matters!!! Signing up for api.ai and building a bot is a helpful way to grasp the principles
  • 179. Session 2 - Make Chatbot Closed Domain vs Open Domain Rule Based General (abstract) Open Closed Retrieval (accuracy) Impossible Strong AI Weak AI level of difficulty The usual path: start from a small business domain, raise accuracy, then add more business domains
  • 180. Session 2 - Make Chatbot Rule Based vs AI Computer Input Program Output Rule: rules must be registered one by one for every condition (name, region, team, etc.) - accuracy goes up, but can you register every possible question?? (possible only if you register a million rules) Computer Input Output Program AI (ML, DL): with labeled data alone you can build a model that produces the answer - and it handles similar data reasonably well (Word2Vec, Glove) intent = 판교에 근무하는 김수상 찾아줘 => Intent : find a person in a specific location NER = 판교에 근무하는 김수상 찾아줘 => NER : B-Loc O O B-Name O Rules give exact answers but cannot cover every question; AI finds similar question types fairly well, and accuracy improves as data grows (learning effect) If (loc = 판교 and comp = 포스코ICT) person = 김수상 elif (loc = 판교 and comp = SK) person = 가나다 else person = 홍길동
  • 181. Make ChatBot Now Lexical Analysis Syntactic Analysis Semantic Analysis Word Embedding BilstmCrf CharCNN Deep Learning BasicNLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers SyntaxNet Scenario Voice Recognition Discourse Analysis 자연어 처리 이론 ML & DL 이론[Retrieval Based] Chat-Bot System ChatBot Server Numpy Pandas Tensorflow 파이프 라인 데이터 처리 ML & DL Library Scikit Learn Konlpy 개발 관련 데이터 수집 데이터 전처리 모델 훈련 모델 평가 모델 서비스 BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message 기본 이론 관련 딥러닝 이론 설명 예제를 통한 구현 설명 Memory Network Seq2SeqResponse Generation Ontology DM Legacy Data Base [AI Based] Chat-Bot Research Environment Data MartMonitoring Summary Result Train Data AI Model Pipe Line Session 2 - Make ChatBot This Lesson
  • 182. Session 2 - Make Chatbot Let's build our own chatbot. How would we build a pizza-ordering bot? There are many kinds of pizza, many sizes, plus delivery places, dates and side menus; how can a chatbot handle all that? => A story covering pizza ordering must be composed => Let's build a pizza-ordering bot with deep learning plus a reasonable amount of logic
  • 183. Session 2 - Make Chatbot NLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers Scenario Chat-Bot System ChatBot Server BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message Session 2 - Make Chatbot Question: 판교에 포스코ICT에 배달해줘 (deliver to POSCO ICT in Pangyo) Answer: please choose a size Answer: please enter a delivery location Answer: your pizza order has been completed. Text(Message) 1 3 4 2
  • 184. Session 2 - Make Chatbot Chatbot Interface Flow NLP Context Analyzer Decision Maker 판교에 포스코ICT에 배달해줘 Intent : pizza order Entity : location = 판교 포스코ICT Service Manager Response Generator menu=null time=null analyze delivery-related slots (knowledge base / scenario) Entity : menu:null, time:null 피자주문 처리가 완료되었습니다 (your pizza order has been completed) pizza-order slots complete 어떤 메뉴를 원하시나요? (which menu would you like?) 어떤 메뉴를 원해? (tone generation) Slot OK
  • 185. Session 2 - Make Chatbot Composing the story slots (frame-based DM) 피자 주문하고 싶어 Pizza Slot Size Type Side menu Detect the pizza-order intent, then run the pizza bot's story: 1) Which size would you like? 2) Which type would you like? 3) Do you need a side menu? User answer - 페파로니 피자로 라지 사이즈에 콜라 추가해주세요 (pepperoni pizza, large size, add a cola) NER processing fills the slots: Pizza Slot Size Large Type Pepperoni Side menu cola Then connect the service (slot API call). Showing the slots as selectable choices is also an option (does that require UX skills too??)
  • 186. Session 2 - Make Chatbot 1. 맥북 프로 검색해줘 (search for a MacBook Pro) 2. Preprocessing -> 맥북 프로 NER 3. 맥북프로 -> canonical-entity mapping -> MacBook Pro API call 4. Display the search results 5. Display the slots for drilling into the service 6. Click "new consultation" to start over Displaying the slots as on-screen choices can dramatically improve a chatbot's accuracy (the user can only choose within that frame…) ex) try typing "삼성 노트북" and picking per-slot choices in 바로봇 http://www.11st.co.kr/toc/bridge.tmall?method=chatPage Slot Trigger API
  • 187. Session 2 - Make Chatbot NLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers Scenario Chat-Bot System ChatBot Server BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message Session 2 - Make Chatbot 판교에 포스코ICT에 배달해줘 How do we actually do NLU? => To apply AI, the text must be converted into vectors 1
  • 188. Session 2 - Make Chatbot Defining the word representation (so the computer understands it well) - One-hot gives each word a strong, discrete signal that is effective for training (when the scope is small - sparse) - Word-level embeddings memorize words well (but sparse) / W2V (similarity) - GloVe distinguishes even fine-grained word categories (caracal - cat) - Character-level embeddings handle untrained words well (romanizing can shrink the vector space) - Character-level embedding of romanized Korean reduces the vector count while also covering English Word representation for training 15 한국어에 적합한 단어 임베딩 모델 및 파라미터 튜닝에 관한 연구.pdf
  • 189. Session 2 - Make Chatbot Business text usually exists, but implementing deep learning requires a very large amount of cleaned, taggable text. For Korean, the Sejong corpus is commonly used, and additional business vocabulary is taught by hand (tedious work). - Corpus (annotation): Sejong corpus (2007) https://ithub.korean.go.kr/user/main.do - Trends 21 corpus (2001~2014), source not released http://corpus.korea.ac.kr/ - Web crawling or downloads (Wiki, Namu Wiki) - For domain-specific cases the text data must be created yourself (augmentation); specialized expressions need extra training (ㅎㅇ?, 방가방가) ※ New vocabulary such as proper nouns must be registered as it appears. How do we get the data?
  • 190. Session 2 - Make Chatbot The Ministry of Culture, Sports and Tourism and the National Institute of the Korean Language push the "2nd Sejong Plan": one core of AI, the foundation of the 4th industrial revolution, is free communication between people and machines. For a computer to properly understand and respond to human speech or writing, it needs a massive language database covering natural language; such a database is called a corpus. The accuracy of the rapidly spreading voice-recognition AIs depends on how rich and precise these corpora are. The ministry and the institute announced on the 9th a language-informatization plan to build a corpus totaling 15.47 billion word units (eojeol) over 2018~2022 to advance Korean AI technology.
  • 191. Session 2 - Make Chatbot After choosing the training vectors, extract features: Cleansing -> Feature Engineering -> Train (remove special characters per situation, surface meaningful words - tagging). Extracting only the words relevant to the intent or entities improves performance (cuts training cost, improves the model) and also reduces the embedding dimension (dense representation - SVD). Shrinking the vectors: a-z, 0-9, ?, !, (, ), ', ', space - about 70 symbols; splitting Korean into initial/medial/final jamo is difficult, and using .lower() is another option. Composing the data to train on
  • 192. Session 2 - Make Chatbot 판교에 포스코ICT에 배달해줘 We have little data, so how do we obtain clean, refined data? [AI Based] Chat-Bot Research Environment Data Mart Monitoring AI Model Pipe Line Session 2 - Make ChatBot Session 2 - Make Chatbot 1
  • 193. Session 2 - Make Chatbot Data Augmentation for AI (Intent - tag) 판교에 오늘 피자 주문해줘 Story Definition Intent Mapping 주문 해줘 (order) Entity Mapping menu : 피자, location : 판교, date : 오늘 Pattern Generation 30% of Train Data intent : pizza order (주문) Preprocessing 판교 오늘 피자 주문 Story key value (주문) tagloc tagdate tagmenu 주문 Model Train (Char-CNN) Evaluation tagloc tagdate tagmenu 주문 tagloc tagdate 주문 tagdate tagmenu 주문 tagloc tagmenu 주문 Prediction tagloc tagdate 주문 tagmenu Hyper parameter Selection intent = 주문 (order)
  • 194. Session 2 - Make Chatbot Data flow for Model in AI (NER - BIO) 판교에 오늘 피자 주문해줘 Story Definition tagloc tagdate tagmenu 주문 BIO-Mapping Preprocessing 판교 오늘 피자 주문 B_Loc / B_Date / B_menu Model Train (Bi-LSTM) B-loc B-date B-menu 주문 B-loc B-date 주문 B-date B-menu 주문 B-loc B-menu 주문 Text Generator Pattern Matching tagloc tagdate tagmenu 주문 tagloc tagdate 주문 tagdate tagmenu 주문 tagloc tagmenu 주문 W2V 30% of Train Data Evaluation Prediction 판교 오늘 피자 주문 Hyper parameter Selection 피자 : 0.12 장소 : 0.7 메뉴 : 0.3 entity recognition B_loc O B_Date B_menu 주문 O
  • 195. Session 2 - Make Chatbot NLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers Scenario Chat-Bot System ChatBot Server BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message Session 2 - Make Chatbot 판교에 포스코ICT에 배달해줘 We have the data, so how do we detect the intent? 1
  • 196. How to detect intent (Text Classification) 피자주문 하고 싶어 (I want to order pizza) / 여행 정보 알려줘 (give me travel info) / 호텔 예약해줘 (book a hotel): three intents, order, information, reservation. You could search for keywords in the sentence one by one, but that approach has limits, e.g. 피쟈 시켜먹고 싶어 / 여행 좋은데 알려줘…. Deep learning can solve these problems. Let's classify with Char + CNN (CNN features: order, information, reservation) (word similarity: 피자 vs 피쟈 / 정보 vs 갈만한데)
  • 197. How to detect intent (Text Classification - data composition) Word 피자 주문 하고 싶어 If the vector space is too large, romanize the pronunciation: PIJA JUMUN HAGO SIPO (digits, special characters and whitespace must all be considered) W2V (pretrained) 피자 (0.12, 0.54, 0.72) 주문 (0.56, 0.65, 0.64) 하고 (0.67, 0.91, 0.13) 싶어 (0.89, 0.14, 0.11) One-hot encoding (word or character level) (0100000000) (0000010000) (0010000000) (0000000100) One-hot encoding (A~Z vector) (0100000000) (0000010000) (0010000000) (0000000100)
  • 198. Char CNN? CNNs are generally used to extract and recognize features from images, but an image is ultimately a vector, and so is text, so the same mechanism can extract features from text
  • 199. Text Classification - Char CNN 지금 피자 주문 하고 싶어 [Paper: Convolutional Neural Networks for Sentence Classification - Yoon Kim - https://arxiv.org/abs/1408.5882] reservation / order / information features; number of words each filter sees [3,4,5 filters]; Vector (W2V) length/dimension/window; Static / Non Static / Random; pooling = abstraction; classification. Let's detect the intent with a Char-CNN
  • 200. Why Char-CNN?? Char-CNN shows strong performance compared with other common algorithms. Paper: Convolutional Neural Networks for Sentence Classification - Yoon Kim - https://arxiv.org/abs/1408.5882
  • 201. Text Classification (Multi-class SVM) Intent can also be detected with classical machine learning, more simply than with a Char-CNN
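  A minimal scikit-learn sketch of such a multi-class SVM intent classifier: TF-IDF features feeding a LinearSVC. The three-sentence training set is illustrative only; in practice the morphological analyzer's tokens would be used.

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    X = ["피자 주문 하고 싶어", "여행 정보 알려줘", "호텔 예약 해줘"]
    y = ["order", "info", "reserve"]

    clf = make_pipeline(TfidfVectorizer(), LinearSVC())
    clf.fit(X, y)
    print(clf.predict(["피자 시켜줘"]))   # likely ['order'], via the shared 피자 token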
  • 202. Session 2 - Make Chatbot NLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers Scenario Chat-Bot System ChatBot Server BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services message Session 2 - Make Chatbot 판교에 포스코ICT에 배달해줘 How do we extract the entities? 1
  • 203. Understanding RNNs Useful for modeling sequential data; because it takes a sequence as input, backpropagation is also performed through time (BPTT) http://aikorea.org/blog/rnn-tutorial-3/
  • 204. http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf Understanding Seq2Seq (RNN+RNN): in a chatbot it plays the role of the generator. Sentence Generator: it can be trained on movie subtitles or novels (define input/output with a morphological analyzer)
  • 205. http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf Understanding LSTM Cell State https://brunch.co.kr/@chris-song/9 update / forget / out / cell state
  • 207. Named Entity Recognition: Bidirectional LSTM (bidirectional layers) - an RNN-based model, useful for tagging words at specific positions - an effective way to handle meaning that depends on a word's position within the sentence [한국어 정보처리 학술대회 - https://sites.google.com/site/2016hclt/jalyosil]
  • 208. Why Bi-LSTM CRF ? [ Bidirectional LSTM-CRF Models for Sequence Tagging - https://arxiv.org/pdf/1508.01991.pdf ]
  • 209. 피자 주문하고 싶어 B-Pizza B-Order O O 여행 정보 알려줘 B-Travel B-Information O 호텔 예약해줘 B-Hotel B-Reserve O Named Entity Recognition: BIO tagging with brat. B - first token of an entity, I - continuation token, O - not an entity / whitespace (OUT), U - Unknown (when no word embedding exists) ※ New York?, 수상하다? Brat - http://brat.nlplab.org/examples.html / https://wapiti.limsi.fr/
  • 210. Reinforce the dictionary with the Bi-LSTM -> train the model 피자 주문하고 싶어 B-Pizza B-Order O O 여행 정보 알려줘 B-Travel B-Info O 호텔 예약해줘 B-Hotel B-Reserve O 피이쟈 주문하고 싶어 놀러갈 정보 알려줘 숙소 예약해줘 피자 여행 호텔 The Bi-LSTM surfaces new vocabulary, which is fed back into the training data to continuously improve the model
  • 211. Session 2 - Make Chatbot NLU Server (Understand) NLG Server (Generate) DM Server Messaging Platform BackEnd Service Servers Scenario Chat-Bot System ChatBot Server BackEnd Service Servers message intent & slot information message message Semantic Frame Semantic Frame connect services 판교에 포스코ICT에 배달해줘 We've detected the intent and extracted the entities, so now let's build the service message Session 2 - Make Chatbot 12 3
  • 212. Session 2 - Make Chatbot ChatBot Layer Log File Chatbot Architecture: application layers such as the ChatBot Layer sit on top of the Deep Learning Layer, and each application layer calls into the DL Layer for the functions it needs. DeepLearning Layer Bi-LSTM CRF Char-CNN SVM Seq2Seq Attention NAS File Model Bot DB Residual Vgg NLP Context Analyzer Decision Maker Response Generator ※ Models such as Residual networks are used for image search. Bot Builder GPU Deeplearning Predict Dict File Bot config Train Train Intent / NER
  • 213. Session 2 - Make Chatbot. NLP Architecture: preprocessing with Python, KoNLPy, and Mecab (Sejong corpus), plus a user dictionary and synonym handling; models in TensorFlow (SVM, Char-CNN, Bi-LSTM CRF) and Gensim (FastText), combined by voting; exposed as a Python API service (Swagger). Example input: "판교 근무하는 포스코ICT에 김수상한테 피자 주문하고 싶어..." - check sentence length and drop special symbols (...), then extract morphemes: [('판교', 'NNG'), ('근무', 'NNG'), ('하', 'XSV'), ('는', 'ETM'), ('포스코ICT', 'NNP'), ('에', 'JKB'), ('김수상', 'NNP'), ('한테', 'JKB'), ('피자', 'NNG'), ('주문', 'NNG'), ('하', 'XSV'), ('고', 'EC'), ('싶', 'VX'), ('어', 'EC')]. Proper nouns: ('포스코ICT', 'NNP'), ('김수상', 'NNP') ※ see the link on registering proper nouns in Mecab. [Intent] 피자 주문; [NER] 판교 - Loc, 포스코ICT - Loc, 김수상 - Name; the NER results fill the slots of the 피자주문 intent, with the competing models compared. Resulting meta frame: Input Data='판교 근무하는 포스코ICT에 김수상한테 피자 주문하고 싶어…', Intent='피자주문', Intent_History=['피자주문', ''], story_slot_entity {'메뉴': '피자', '지역': '판교 포스코ICT', '이름': '김수상'}, request_type='text', service_type='pizza order', output_data=''.
  • 214. Session 2 - Make Chatbot. Web Service Architecture (MSA): Docker (Ubuntu) on AWS EC2 (c4.8xlarge / p2.xlarge GPU); front-end Bot Builder (analysis) in React with D3, SCSS, Bootstrap; Nginx as the LB in front of REST APs; Chatbot Server on Django (Python, TensorFlow, KoNLPy, Gensim) with Celery and RabbitMQ; GPU servers (HDF5 model files); NAS for log / model / dict files; DB servers on PostgreSQL and HBase; service integrations in Java (trigger), Node, and Python over REST.
  • 215. Session 2 - Make Chatbot Bot Builder and UX (Story)
  • 216. Session 2 - Make Chatbot. Bot Builder DB tables: ChatBot Definition, ChatBot Service, ChatBot Intent, ChatBot Intent Entity, ChatBot Story, ChatBot Response, ChatBot Model, ChatBot Tagging, ChatBot Entity Relation, ChatBot Synonym. Keep the schema as common (generic) as possible so the service can be extended.
  • 217. Session 2 - Make Chatbot. Chatbot REST API. Client request: Input Data='페파로니 피자 주문할께', Intent='', Intent_History=['', ''], story_slot_entity {메뉴: '', 사이즈: '', 사이드: ''}, request_type='text', service_type='', output_data=''. Server response: Input Data='페파로니 피자 주문할께', Intent='피자주문', Intent_History=['피자주문', ''], story_slot_entity {메뉴: 피자, 사이즈: 라지, 사이드: 콜라}, request_type='text', service_type='', output_data='주문완료'. ※ Only the required values travel as JSON; everything else is managed by the Dialog Manager (log).
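  A sketch of the client side of this exchange; the endpoint URL is hypothetical, and the field names follow the frames above.

    import requests

    payload = {
        "input_data": "페파로니 피자 주문할께",
        "intent": "",
        "intent_history": ["", ""],
        "story_slot_entity": {"메뉴": "", "사이즈": "", "사이드": ""},
        "request_type": "text",
        "service_type": "",
        "output_data": "",
    }
    # Hypothetical endpoint; the server fills intent, slots, and output_data.
    response = requests.post("http://chatbot-server/api/chat/", json=payload)
    print(response.json())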
  • 218. Session 2 - Make Chatbot. Test coverage per case. Code changes fall into: 1. logic changes (unit tests), 2. model changes (hyperparameters), 3. data changes (slots, dictionaries, entities, synonyms), 4. property changes (thresholds, rule criteria). Unlike plain logic changes, data and model changes need a way to be verified continuously; to keep accuracy up in production, continuous integration is essential (Jenkins / Travis CI, etc.). Test codes for the chatbot cover 피자주문, 호텔예약, 여행정보: check intent -> check NER -> check slots, e.g. input "판교에 피자주문할께" -> intent: 피자주문, slot: {메뉴, 크기, 사이드 - extra}.
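  A sketch of such regression tests with pytest; chatbot_client.chat is a hypothetical wrapper around the REST API above that returns the filled semantic frame, and the expected slots are illustrative.

    import pytest
    from chatbot_client import chat  # hypothetical API wrapper

    CASES = [
        ("판교에 피자주문할께", "피자주문", {"메뉴": "피자"}),
        ("호텔 예약해줘", "호텔예약", {}),
    ]

    @pytest.mark.parametrize("utterance,intent,slots", CASES)
    def test_intent_then_slots(utterance, intent, slots):
        frame = chat(utterance)
        assert frame["intent"] == intent                 # 1) intent check
        for key, value in slots.items():                 # 2) NER / slot check
            assert frame["story_slot_entity"].get(key) == value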
  • 220. Ensemble and Voting. To raise model accuracy, complement a single model with multiple models and extra logic (scoring / voting): when detecting intent, several models are compared and the closest answer wins, and combining text mining with an ensemble helps fine-tuning. Example: "포스코ICT에 지금 피자 배달해줘" runs in parallel through Char-CNN, SVM (multi-class), and naive_bayes.MultinomialNB; the votes are combined with per-model weights, then the winning intent's slots are checked (delivery requires a place and a time), distinguishing 여행정보 from 메뉴배달 -> result: 메뉴배달 (피자 배달).
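  A minimal weighted-voting sketch; each entry wraps one of the classifiers above behind a hypothetical predict function that maps an utterance to an intent label, and the weights are illustrative.

    from collections import Counter

    def vote_intent(utterance, predictors):
        """predictors: list of (predict_fn, weight) pairs; returns the
        intent label with the highest weighted vote."""
        scores = Counter()
        for predict_fn, weight in predictors:
            scores[predict_fn(utterance)] += weight
        return scores.most_common(1)[0][0]

    # e.g. vote_intent("포스코ICT에 지금 피자 배달해줘",
    #                  [(char_cnn_predict, 0.5), (svm_predict, 0.3), (nb_predict, 0.2)])
    # After voting, the winning intent's required slots (place, time for
    # delivery) are still checked before the service call.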
  • 221. Trigger handling (love, image search). 1. When the word 사랑 (love) appears - <real production example> Employee: "XXX 사원에게 사랑한다고 포스톡 보내줘" (send employee XXX a POSTALK saying I love them) / Chatbot: "너무 쉽게 사랑하지 마세요." (don't fall in love so easily) / Employee: "니가 먼제 내 사랑을 논해" (who are you to talk about my love) / Chatbot: "학습중이라 아직 잘 모르는게 많아요." (I'm still learning, there's a lot I don't know yet) / Employee: "ㅋㅋㅋㅋ" / Chatbot: "ㅋㅋㅋ". Triggers are applied to words like [안녕, 사랑, ㅋㅋㅋ], and the data collected this way is trained into the Seq2Seq model and used as an NLP preprocessing model. 2. On image search, call the ResNet model. https://www.youtube.com/watch?v=x9bvkXJ-JeQ
  • 222. Use a Tone Generator where needed. It varies the speaking style (by region, honorific level, subordinate tone): 주문이 완료되었습니다 (plain), 주문이 완료되었단다 (courteous), 주문이 완료되었어요 (honorific), 주문이 완료되었다니깐 (annoyed). It uses a Seq2Seq model - the encoder takes the content words (nouns, etc.) and the decoder produces nouns plus particles and endings; this response generator is an application of the morphological analyzer.
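  A sketch of the kind of parallel data such a tone generator could be trained on; the pairs below are illustrative, not the lecture's corpus.

    # Encoder input (plain response) paired with decoder target (styled response).
    tone_pairs = [
        ("주문이 완료되었습니다", "주문이 완료되었어요"),    # plain -> honorific
        ("주문이 완료되었습니다", "주문이 완료되었단다"),    # plain -> courteous
        ("주문이 완료되었습니다", "주문이 완료되었다니깐"),  # plain -> annoyed
    ]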
  • 223. Synonym handling (N-gram). 페파로니 - Pepperoni, 폐파로니, 페파피자... / Mac Book Pro - 맥프로, 맥북프로... Customers use all sorts of variants, but API calls must use a canonical value, so user tokens are matched against a dictionary built from N-gram similarity (typically trigrams). Tune N and the threshold per entity ※ threshold: the smaller it is, the looser the match. Link: https://www.simplicity.be/article/throwing-dices-recognizing-west-flemish-and-other-languages/
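  A minimal sketch of trigram-based synonym matching; the Jaccard scoring and threshold value here are illustrative, not the lecture's exact implementation.

    def char_ngrams(text, n=3):
        text = f" {text} "                      # pad so word edges form n-grams too
        return {text[i:i + n] for i in range(len(text) - n + 1)}

    def similarity(a, b, n=3):
        ga, gb = char_ngrams(a, n), char_ngrams(b, n)
        return len(ga & gb) / len(ga | gb)      # Jaccard over character n-grams

    canonical = ["페파로니", "맥북프로"]        # canonical entity values in the dict
    token = "폐파로니"                          # what the customer actually typed
    best = max(canonical, key=lambda w: similarity(token, w))
    if similarity(token, best) >= 0.3:          # per-entity threshold; lower = looser
        print("resolved to:", best)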
  • 224. Response Speed. Set up an LB with Nginx; use an appropriate number of threads and APs; cache data in memory at the API level; and enforce the maximum response time the chatbot can tolerate.
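  A sketch of the in-memory caching idea, assuming a hypothetical predict function; lru_cache simply memoizes repeated utterances so they stay within the response-time budget.

    from functools import lru_cache

    @lru_cache(maxsize=1024)
    def cached_reply(utterance: str) -> str:
        # The expensive call to the model AP happens only on a cache miss;
        # repeated utterances are served straight from memory.
        return predict_on_model_server(utterance)  # hypothetical REST call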
  • 225. Coding for parallelism during training. Use tf.device to pin each operation to a device and balance the work between CPU and GPU sensibly - having many GPUs does not automatically make training faster...
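  A minimal tf.device sketch (TF 2.x style); the device strings are the standard '/cpu:0' and '/gpu:0' names, and whether this actually speeds things up depends on the op mix and transfer costs.

    import tensorflow as tf

    with tf.device("/cpu:0"):
        a = tf.random.normal([1000, 1000])  # keep input preparation on the CPU
        b = tf.random.normal([1000, 1000])

    with tf.device("/gpu:0"):
        c = tf.matmul(a, b)                 # pin the heavy matmul to GPU 0

    print(c.shape)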
  • 227. Wrap-up ● When building a chatbot, using the hottest techniques matters less than understanding what the data in each domain means and presenting it so the machine can understand it. ● Keeping the patterns of training data and prediction data consistent is important. ● Deep learning depends on securing large amounts of clean, curated data. ● Deep learning can be a genuine solution for improving performance.
  • 228. When the singularity comes... Google I/O 17: https://www.youtube.com/watch?v=Y2VF8tmLFHw
  • 229. References
  모두를 위한 딥러닝 (Deep Learning for Everyone) - http://hunkim.github.io/ml/
  제28회 한글 및 한국어 정보처리 학술대회 (28th Hangul and Korean Language Information Processing Conference) - 한국어에 적합한 단어 임베딩 모델 및 파라미터 튜닝에 관한 연구 외
  Stanford University CS231n - http://cs231n.stanford.edu/
  Creating AI chat bot with Python 3 and Tensorflow [신정규] - https://speakerdeck.com/inureyes/building-ai-chat-bot-using-python-3-and-tensorflow
  파이썬으로 챗봇 만들기 [김선동] - https://www.slideshare.net/KimSungdong1/20170227-72644192?next_slideshow=1
  딥러닝을 이용한 지역 컨텍스트 검색 [김진호] - http://www.slideshare.net/deview/221-67605830
  Developing Korean Chatbot 101 [조재민] - https://www.slideshare.net/JaeminCho6/developing-korean-chatbot-101-71013451
  TensorFlow-Tutorials - https://github.com/golbin/TensorFlow-Tutorials