4. Agenda
1 ABOUT A.I
• What is machine learning
• Myths & facts about AI
• Myths & facts about chatbots
2 HOW I TEST THE AI MODEL
• The right metrics for evaluating the ML model
• How we test the FAQ model
• Demo
3 TAKE AWAY
4 REFERENCES
• Tools & libraries
6. What Is Machine Learning?
Machine learning is the subfield of
computer science that gives
computers the ability to learn without
being explicitly programmed.
7. Myths And Facts About A.I
MYTH: Artificial intelligence and machine learning will wipe out all the jobs.
FACT: A.I is no different from other technological advances in that it helps humans become more effective and processes more efficient.
MYTH: "Cognitive AI" technologies are able to understand and solve new problems the way the human brain can.
FACT: "Cognitive" technologies can't solve problems they weren't designed to solve.
MYTH: You need a Ph.D. to work in machine learning and data science.
FACT: Nowadays, plenty of documents and tutorials on the Internet can guide people step by step into the machine learning world.
8. What Is A Chatbot?
A computer program designed to
simulate conversation with human
users, especially over the Internet.
9. Myths And Facts About Chatbots
MYTH: Chatbots have only been around for a short while.
FACT: ELIZA, one of the most well-known chatbot therapists, was created about 50 years ago.
MYTH: Text or voice is the only way to interact with bots.
FACT: Chatbot platforms actually allow users to interact with them via graphical interfaces or graphical widgets, and recent chatbot platforms follow this development approach.
MYTH: All chatbot platforms use AI.
FACT: Not all chatbot platforms use AI. Most chatbot platforms are rule-based, following a simple, autonomous process along the lines of a decision tree.
11. The Right Metrics For Evaluating ML Models
Regression
• MSPE
• MSAE
• R Square
• Adjusted R Square
Classification
• Precision – Recall
• ROC-AUC
• Accuracy
• Log-Loss
Unsupervised Models
• Rand Index
• Mutual Information
Others
• CV Error
• Heuristic methods to find K
• BLEU Score (NLP)
12. Confusion Matrix
Commonly Used Metrics In Classification

                     | Actual positive                 | Actual negative
Predicted positive   | True positive                   | False positive (Type I error)
Predicted negative   | False negative (Type II error)  | True negative
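The four cells of the matrix can be tallied directly from predicted and actual labels. A minimal sketch, with invented illustrative labels (1 = positive, 0 = negative):

```python
def confusion_matrix(actual, predicted):
    """Return (TP, FP, FN, TN) counts for binary labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fp, fn, tn

actual    = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]
print(confusion_matrix(actual, predicted))  # (2, 1, 1, 2)
```

All the classification metrics on the following slides are ratios of these four counts.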
13. Accuracy
• Percentage of total items classified correctly
• Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN)
Commonly Used Metrics In Classification
14. Recall/Sensitivity/TPR (True Positive Rate)
• Number of items correctly identified as positive out of total true positives
• Formula: Recall = TP / (TP + FN)
Commonly Used Metrics In Classification
15. Precision
• Number of items correctly identified as positive out of total items identified as positive
• Formula: Precision = TP / (TP + FP)
Commonly Used Metrics In Classification
16. F1 Score
• The harmonic mean of precision and recall
• Formula: F1 = 2 × (Precision × Recall) / (Precision + Recall)
Commonly Used Metrics In Classification

Precision  Recall  F1
1          1       1
0.1        0.1     0.1
0.5        0.5     0.5
1          0.1     0.182
0.3        0.8     0.436
0.8        0.3     0.436
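The F1 values in the table above follow directly from the harmonic-mean formula; a short sketch that reproduces them:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

for p, r in [(1, 1), (0.1, 0.1), (0.5, 0.5), (1, 0.1), (0.3, 0.8), (0.8, 0.3)]:
    print(p, r, round(f1_score(p, r), 3))
```

Note that the harmonic mean is symmetric, so (0.3, 0.8) and (0.8, 0.3) both give 0.436, and it is pulled toward the smaller of the two values: (1, 0.1) gives only 0.182.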
18. The Process To Test The FAQ Model
Prepare test data
• Crawl FAQ data
• Generate questions from FAQ data
Run test
• Train model with FAQ data
• Run test
Analyze result
• Pre-process the raw result
• Calculate metrics to evaluate the AI model in classification
• Visualize the metrics
Model result
• Select the threshold value
19. How We Define The Test Data Set
• Collect FAQ question data (manually and automatically)
• Use NLTK to generate new question data (NLG)
• Self-defined question data
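The deck generates question variants with NLTK; as a simplified stand-in (not the actual generation code), a minimal sketch that expands a hypothetical seed phrase with invented prefix templates:

```python
# Hypothetical paraphrase templates; the real pipeline uses NLTK-based NLG.
PREFIXES = ["How do I", "How can I", "What is the way to"]

def generate_variants(seed_action):
    """Return paraphrased test questions for one FAQ action phrase."""
    return [f"{prefix} {seed_action}?" for prefix in PREFIXES]

print(generate_variants("reset my password"))
# ['How do I reset my password?', 'How can I reset my password?',
#  'What is the way to reset my password?']
```

Each generated variant keeps the label of its seed FAQ entry, so the expanded set can be scored against the model's answers automatically.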
20. How We Evaluate The AI Model?
Train with domain X and run the tests defined for domain X.
21. How We Analyze The Result
• Pre-process the raw result.
• Calculate metrics to evaluate the AI model in classification.
• Visualize the metrics.
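The final step of the process is selecting a confidence threshold. A hedged sketch of one common approach, sweeping candidate thresholds and keeping the one with the best F1; the scores, labels, and candidate thresholds are made-up illustrative data:

```python
def f1_at(threshold, scores, labels):
    """F1 when answers scoring >= threshold are treated as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, a in zip(preds, labels) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(preds, labels) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(preds, labels) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]  # model confidence per question
labels = [1,    1,    0,    1,    0,    0]      # 1 = correct answer expected
best = max([0.2, 0.5, 0.7, 0.9], key=lambda t: f1_at(t, scores, labels))
print(best)  # 0.7
```

Maximizing F1 is only one possible criterion; a deployment could instead weight precision higher so the bot rarely gives a confident wrong answer.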
24. Take Away
• Know the main metrics for evaluating an ML model.
• Know how to test a classification AI model.
• Whether working on ___ projects (AI, blockchain, VR, etc.) is difficult depends on your self-learning skills and adaptability.
• Use automation to reduce the time and effort needed to prepare test data.
Artificial intelligence and machine learning will wipe out all the jobs:
Technology has been threatening and displacing jobs throughout history. Telephone switching technology replaced human operators. Automatic call directors replaced receptionists. Word processing and voicemail replaced secretaries; email replaced inter-office couriers. Call center technology innovation has added efficiency and effectiveness at various stages of standing up customer service capabilities: from recruiting new reps using machine learning to screen resumes, to selecting the right training program based on specific learning styles, to call routing based on the sentiment of the caller and the disposition of the rep, to integration of various information sources and channels of communication. In each of these processes, technology augmentation enhanced the capabilities of humans. Were some jobs replaced? Perhaps, but more jobs were created, albeit requiring different skills.
The use of AI-driven chatbots and virtual assistants is another iteration of this ongoing evolution. It needs to be thought of as augmentation rather than complete automation and replacement. Humans engage, machines simplify. There will always be the need for humans in the loop to interact with humans at some level.
Bots and digital workers will enable the “super CSR” of the future and enable increasing levels of service with declining costs. At the same time, the information complexity of our world is increasing and prompting the need for human judgment. Some jobs will be lost, but the need and desire for human interaction at critical decision points will increase, and the CSR’s role will change from answering rote questions to providing better customer service at a higher level, especially for interactions requiring emotional engagement and judgment.
“Cognitive AI” technologies are able to understand and solve new problems the way the human brain can:
Cognitive AI simulates how a human might deal with ambiguity and nuance; however, we are a long way from AI that can extend learning to new problem areas. AI is only as good as the data on which it is trained, and humans still need to define the scenarios and use cases under which it will operate. Within those scenarios, cognitive AI offers significant value, but AI cannot define new scenarios in which it can successfully operate. This capability is referred to as “general AI” and there is much debate about when, if ever, it will emerge. For computers to answer broad questions and approach problems the way that humans do will require technological breakthroughs that are not yet on the horizon.
RMSE (Root Mean Square Error): the square root of the average of the squared differences between the predicted values and the observed values.
MAE (Mean Absolute Error): the average of the absolute differences between the predicted values and the observed values.
BLEU (Bilingual Evaluation Understudy): a score for comparing machine-generated text, such as a translation, against one or more reference texts.
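The two regression metrics can be sketched in a few lines; the sample values are invented for illustration:

```python
import math

def rmse(actual, predicted):
    """Root of the mean squared difference between observed and predicted."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean of the absolute differences between observed and predicted."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual, predicted = [3.0, 5.0, 2.0], [2.0, 5.0, 4.0]
print(rmse(actual, predicted))  # sqrt((1 + 0 + 4) / 3) ≈ 1.291
print(mae(actual, predicted))   # (1 + 0 + 2) / 3 = 1.0
```

Because errors are squared before averaging, RMSE penalizes large outlier errors more heavily than MAE does.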
Recall or Sensitivity or TPR (True Positive Rate): number of items correctly identified as positive out of total true positives, TP/(TP+FN). That is, the fraction of true-positive points among the points that are actually positive; in other words, how well the model finds the positive values (its complement is the rate of missed positive data).
Specificity or TNR (True Negative Rate): number of items correctly identified as negative out of total negatives, TN/(TN+FP).
Precision: number of items correctly identified as positive out of total items identified as positive, TP/(TP+FP). That is, the fraction of true-positive points among the points classified as positive; in other words, how accurate the model's positive predictions are.
False Positive Rate or Type I Error: number of items wrongly identified as positive out of total true negatives, FP/(FP+TN).
False Negative Rate or Type II Error: number of items wrongly identified as negative out of total true positives, FN/(FN+TP).
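A sketch computing the five rates above from one illustrative set of confusion-matrix counts; note that TPR and FNR are complements, as are TNR and FPR:

```python
# Illustrative counts, not from a real model.
TP, FP, FN, TN = 40, 10, 20, 30

tpr = TP / (TP + FN)        # recall / sensitivity
tnr = TN / (TN + FP)        # specificity
precision = TP / (TP + FP)
fpr = FP / (FP + TN)        # Type I error rate
fnr = FN / (FN + TP)        # Type II error rate

print(round(tpr, 3), round(fnr, 3))  # 0.667 0.333
assert abs(tpr + fnr - 1) < 1e-9     # complements by construction
assert abs(tnr + fpr - 1) < 1e-9
```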
Model 1: ideal.
Model 2: poor, because it both predicts positive values with low precision and misses values that are actually positive.
Model 3: balanced.
Model 4: the positive predictions are perfectly precise, but the rate of finding positives is low. Example: the data set has 100 positive values, but the model predicts only one value as positive, and that value is correctly positive.
Model 5: the precision of the positive predictions is low, but the rate of finding positives is high. Example: the data set has 100 positive values; the model predicts 80 values as positive, but only 10 of them are positive.
Model 6: the precision of the positive predictions is high, but the rate of finding positives is low. Example: the data set has 100 positive values; the model predicts 30 values as positive, and 20 of them are positive.