Understanding the Pandemic Through Mining Covid News Using Natural Language Processing
- IEEE CCWC, 2021
Presented by: Nishat Anjum
Authors:
Nafiz Sadman1 Nishat Anjum2 Kishor Datta Gupta3
M. A. Parvez Mahmud4
1. Silicon Orchard Research and Analytics Lab (SORAL, research.siliconorchard.com), Dhaka, Bangladesh
2. Independent University, Bangladesh
3. University of Memphis, Memphis, TN, USA
4. Deakin University, Geelong, Australia
ROAD MAP
INTRODUCTION
OUR RESEARCH AIM & CONTRIBUTION
NNK DATASET
EXPERIMENTAL FINDINGS
LIMITATIONS AND FUTURE WORK
INTRODUCTION
88 million reported cases
1.9 million deaths
As of 12 January 2021 (Weekly Epidemiological Update, worldwide, World Health Organization)
The first cluster of COVID-19 was initially reported on 31 December 2019, when the WHO China Country Office was informed.
Information exchange media
Social Media
Newspaper
Television/Digital news
3.6 bil
2.5 bil
600 mil
http://www.ifabc.org/news/More-People-Read-Newspapers-Worldwide-Than-Use-Web.
https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/
Two types of fight against COVID-19:
a. Tangible
- Front line doctors, nurses, military personnel, NGOs, volunteers, etc.
b. Intangible
- Researchers, scientists, academics, etc.
Few studies are based on Natural Language Processing (NLP) compared to:
- Computer Vision applications
- Chest X-ray classifications1
- CT-scans classifications1
- Genome sequencing2
1 - M. M. Ahsan, K. D. Gupta, M. M. Islam, S. Sen, M. Rahman, M. Shakhawat Hossain et al., “Covid-19 symptoms detection based on nasnetmobile with explainable ai using various imaging modalities,” Machine Learning and Knowledge Extraction, vol. 2, no. 4, pp. 490–504, 2020
2 - G. S. Randhawa, M. P. Soltysiak, H. El Roz, C. P. de Souza, K. A. Hill, and L. Kari, “Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: Covid-19 case study,” Plos one, vol. 15, no. 4, p. e0232391, 2020
- A. Alimadadi, S. Aryal, I. Manandhar, P. B. Munroe, B. Joe, and X. Cheng, “Artificial intelligence and machine learning to fight covid-19,” 2020
- S. Tuli, S. Tuli, R. Tuli, and S. S. Gill, “Predicting the growth and trend of covid-19 pandemic using machine learning and cloud computing,” Internet of Things, p. 100222, 2020
OUR RESEARCH AIM & CONTRIBUTION
1. Assert importance of newspapers (print/digital) in battling
COVID-19 through raising public awareness.
2. Utilize newspaper as primary source of information extraction
using Natural Language Processing (NLP) techniques.
3. Understand how newspapers portray the pandemic in a
developed country and in a developing country.
Contribution:
•Analysis and findings of the information extracted from newspapers.1
•The code used to perform data analysis on the newspapers.1
•The dataset (NNK-Dataset) used in this paper.1,2
1. https://github.com/NNK-Dataset
2. https://doi.org/10.34740/kaggle/dsv/1511505
NNK DATASET
1. Data Collection
10 human annotators (age: 23-25; occupation: undergraduate students)
- The headline must have one or more words directly or indirectly related to COVID-19.
- The content of each news report must have 5 or more keywords directly or indirectly related to COVID-19.
- Avoid taking duplicate reports.
- Maintain a time frame for the newspapers.
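The annotators applied these rules by hand. As a rough illustration only, the same criteria could be checked automatically along the lines below. The keyword set, the function names, and the choice to count keyword occurrences rather than distinct keywords are assumptions made for this sketch and are not taken from the slides; the time-frame rule is left out.

```python
import re

# Hypothetical keyword list; the annotators' actual list is not given in the slides.
COVID_KEYWORDS = {
    "covid", "coronavirus", "pandemic", "quarantine",
    "lockdown", "virus", "mask", "infection",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_keyword_hits(text, keywords=COVID_KEYWORDS):
    """Count token occurrences that appear in the keyword set."""
    return sum(1 for tok in tokenize(text) if tok in keywords)


def accept_report(headline, body, link, seen_links):
    """Apply the headline, body, and duplicate rules to one news report."""
    if link in seen_links:                # avoid duplicate reports
        return False
    if count_keyword_hits(headline) < 1:  # headline: one or more related words
        return False
    if count_keyword_hits(body) < 5:      # body: five or more related keywords
        return False
    seen_links.add(link)
    return True
```

A report is accepted once; a second submission with the same link is rejected by the duplicate check.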
Collected through Google Forms:
Covid-News-USA-NNK 1
- 500 news reports from The Washington Post
- 500 news reports from Star Tribune
Covid-News-BD-NNK 2
- 25 news reports from The Daily Star
- 25 news reports from Prothom Alo
1. https://github.com/NNK-Dataset/USA-NNK/blob/master/usaformlink.md
2. https://github.com/NNK-Dataset/BD-NNK/blob/master/bdformlink.md
2. Data Pre-processing
• Remove hyperlinks.
• Remove non-English and non-alphanumeric characters.
• Remove stop words
• Lemmatization
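The four steps above can be sketched as a single function. This is a minimal sketch, not the authors' code: the stop-word set and the lemma lookup table are tiny hypothetical stand-ins (a real pipeline would typically use a full stop-word list and a dictionary-backed lemmatizer), and the character filter assumes the intent is to keep only English letters and digits.

```python
import re

# Tiny illustrative stand-ins; both are assumptions for this sketch.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "are", "to", "for", "on", "at"}
LEMMA_TABLE = {"cases": "case", "deaths": "death", "hospitals": "hospital", "rising": "rise"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(text):
    """Return cleaned, lemmatized tokens for one headline or report body."""
    text = URL_RE.sub(" ", text)        # 1. remove hyperlinks
    text = NON_ALNUM_RE.sub(" ", text)  # 2. keep English letters and digits only
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOP_WORDS]  # 3. remove stop words
    return [LEMMA_TABLE.get(t, t) for t in tokens]       # 4. lookup-based lemmatization
```

Hyperlinks are removed first so that stripping punctuation does not leave URL fragments behind as tokens.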
3. Data Description
No. of words per headline: 7 - 20
No. of words per body content: 150 - 2100

No. of words per headline: 10 - 20
No. of words per body content: 100 - 1500

Table 1: Covid-News-USA-NNK Table 2: Covid-News-BD-NNK

Date: Date when news was posted
Link: Hyperlink
Newspaper Name: Name of newspaper
Headline Keywords: Keywords extracted from headline
Report Keywords: Keyword extracted from body

Date: Date when news was posted
Link: Hyperlink
Newspaper Name: Name of newspaper
Headline: Keywords extracted from headline
Report: Keyword extracted from body
4. Dataset Repository, Policy and License
• Project stored in Github: https://github.com/NNK-Dataset
• Covid-News-USA-NNK: https://github.com/NNK-Dataset/USA-NNK
• Covid-News-BD-NNK: https://github.com/NNK-Dataset/BD-NNK
• Kaggle: https://doi.org/10.34740/kaggle/dsv/1511505
• License: CC0 (Creative Commons)
EXPERIMENTAL FINDINGS
Word Clouds: Washington Post News (USA)
February, 2020 March, 2020 April, 2020 May, 2020
Word Clouds: Star Tribune News (USA)
February, 2020 March, 2020 April, 2020 May, 2020
Word Clouds: Daily Star News (BD)
March, 2020 April, 2020
Word Clouds: Prothom Alo News (BD)
March, 2020 April, 2020
Covid cases through number extraction:
Cases (based on keywords in news reports) related to COVID-19 from February till
March. The X axis represents the month and the Y axis represents cases in 10,000.
Numeric extraction keywords:
Infected, Died, Infections, Died, Quarantined, Lockdown, Diagnosed.
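One plausible way to pull such counts out of report text is a regular expression that looks for a number followed, within a few words, by one of the extraction keywords. The pattern, the three-word window, and the function name below are assumptions for illustration; the slides do not describe the exact extraction rule.

```python
import re

# Keywords from the slide; matching is case-insensitive.
KEYWORDS = ("infected", "died", "infections", "quarantined", "lockdown", "diagnosed")

# A number such as "35" or "1,200".
NUMBER = r"(\d{1,3}(?:,\d{3})+|\d+)"

# Number, then up to three intervening words (assumed window), then a keyword.
PATTERN = re.compile(
    NUMBER + r"(?:\s+\w+){0,3}?\s+(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Return (keyword, value) pairs found in the text, in order of appearance."""
    results = []
    for match in PATTERN.finditer(text):
        value = int(match.group(1).replace(",", ""))
        results.append((match.group(2).lower(), value))
    return results
```

Summing the extracted values per month would give a series comparable to the plotted one, although a simple pattern like this can double count figures repeated across reports.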
VADER Sentiment Analysis:
- Average: -0.5 to -0.9 (scale: -1 (highly negative) to +1 (highly positive))
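To show where a score on the -1 to +1 scale comes from, here is a heavily simplified, VADER-style sketch. It sums word valences and squashes the total with the normalization VADER uses for its compound score, x / sqrt(x^2 + 15). The lexicon values are hypothetical, and the real tool also handles negation, intensifiers, punctuation, and capitalization, none of which appear here.

```python
import math

# Hypothetical valences for illustration; these are not VADER's lexicon values.
TOY_LEXICON = {"crisis": -3.0, "death": -3.0, "fear": -2.0, "recover": 2.0, "safe": 2.0}


def compound_score(tokens, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word valences and normalize the total into the open interval (-1, 1)."""
    total = sum(lexicon.get(tok, 0.0) for tok in tokens)
    return total / math.sqrt(total * total + alpha)


def average_compound(documents):
    """Average the compound score over a non-empty list of tokenized documents."""
    scores = [compound_score(doc) for doc in documents]
    return sum(scores) / len(scores)
```

Averaging per-report scores in this way is one reading of how a range such as -0.5 to -0.9 could be reported for a set of news items.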
Keyword extraction using PageRank:
- 'China', 'Government', 'Masks', 'Economy', 'Crisis', 'Theft', 'Stock market',
'Jobs', 'Election', 'Missteps', 'Health', 'Response'.
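PageRank-based keyword extraction is commonly done in the TextRank style: words become nodes, co-occurrence within a small window becomes an edge, and the highest-ranked nodes are reported as keywords. The sketch below follows that general recipe; the window size, damping factor, iteration count, and function names are assumptions and may differ from the authors' setup.

```python
from collections import defaultdict


def build_graph(tokens, window=2):
    """Link each word to the words that follow it within the window (undirected)."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Run power-iteration PageRank on an undirected co-occurrence graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m shares its rank equally among its own neighbours.
            incoming = sum(rank[m] / len(graph[m]) for m in graph[n])
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def top_keywords(tokens, k=3):
    """Return the k highest-ranked words from a preprocessed token list."""
    rank = pagerank(build_graph(tokens))
    return sorted(rank, key=rank.get, reverse=True)[:k]
```

The input is expected to be preprocessed tokens (stop words removed, lemmatized), so that frequent function words do not dominate the ranking.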
LIMITATIONS AND FUTURE WORK
- Starting point for an important dataset.
- Assert importance of NLP in newspaper report analysis.
- Dataset open for research and enhancement
THANK YOU

Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 

understanding the pandemic through mining covid news using natural language processing

  • 1. Understanding the Pandemic Through Mining Covid News Using Natural Language Processing - IEEE CCWC, 2021
  • 2. Presented by: Nishat Anjum. Authors: Nafiz Sadman (1), Nishat Anjum (2), Kishor Datta Gupta (3), M. A. Parvez Mahmud (4). Affiliations: (1) Silicon Orchard Research and Analytics Lab (SORAL, research.siliconorchard.com), Dhaka, Bangladesh; (2) Independent University, Bangladesh; (3) University of Memphis, Memphis, TN, USA; (4) Deakin University, Geelong, Australia.
  • 3. ROAD MAP: INTRODUCTION; OUR RESEARCH AIM & CONTRIBUTION; NNK DATASET; EXPERIMENTAL FINDINGS; LIMITATIONS AND FUTURE WORK
  • 5. 88 million reported cases and 1.9 million deaths (as of 12 January 2021; Weekly Epidemiological Update, worldwide, World Health Organization). The first cluster of COVID-19 was initially reported on 31 December 2019, when the WHO China Country Office was informed.
  • 6. Information exchange media: Social Media (3.6 bil), Newspaper (2.5 bil), Television/Digital news (600 mil). Sources: http://www.ifabc.org/news/More-People-Read-Newspapers-Worldwide-Than-Use-Web. https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/
  • 7. Two types of fight against COVID-19: a. Tangible: front-line doctors, nurses, military personnel, NGOs, volunteers, etc. b. Intangible: researchers, scientists, academics, etc.
  • 8. Far less research is based on Natural Language Processing than on: - Computer Vision applications - Chest X-ray classification [1] - CT-scan classification [1] - Genome sequencing [2]. References: [1] M. M. Ahsan, K. D. Gupta, M. M. Islam, S. Sen, M. Rahman, M. Shakhawat Hossain et al., “Covid-19 symptoms detection based on nasnetmobile with explainable ai using various imaging modalities,” Machine Learning and Knowledge Extraction, vol. 2, no. 4, pp. 490–504, 2020. [2] G. S. Randhawa, M. P. Soltysiak, H. El Roz, C. P. de Souza, K. A. Hill, and L. Kari, “Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: Covid-19 case study,” Plos one, vol. 15, no. 4, p. e0232391, 2020. - A. Alimadadi, S. Aryal, I. Manandhar, P. B. Munroe, B. Joe, and X. Cheng, “Artificial intelligence and machine learning to fight covid-19,” 2020. - S. Tuli, S. Tuli, R. Tuli, and S. S. Gill, “Predicting the growth and trend of covid-19 pandemic using machine learning and cloud computing,” Internet of Things, p. 100222, 2020.
  • 9. OUR RESEARCH AIM & CONTRIBUTION
  • 10. 1. Assert the importance of newspapers (print/digital) in battling COVID-19 through raising public awareness. 2. Utilize newspapers as the primary source of information extraction using Natural Language Processing (NLP) techniques. 3. Understand how newspapers portray the pandemic in a developed country and in a developing country.
  • 11. Contribution: • Analysis and findings of the information extracted from newspapers [1]. • The code used to perform data analysis on the newspapers [1]. • The dataset (NNK-Dataset) used in this paper [1, 2]. [1] https://github.com/NNK-Dataset [2] https://doi.org/10.34740/kaggle/dsv/1511505
  • 13. 1. Data Collection: 10 human annotators (age: 23-25; occupation: undergraduates). Collection tool: Google Forms. Selection criteria: the headline must have one or more words directly or indirectly related to COVID-19; the content of each news report must have 5 or more keywords directly or indirectly related to COVID-19; avoid taking duplicate reports; maintain a time frame for the newspapers. Covid-News-USA-NNK [1]: 500 news from The Washington Post, 500 news from Star Tribune. Covid-News-BD-NNK [2]: 25 news from The Daily Star, 25 news from Prothom Alo. [1] https://github.com/NNK-Dataset/USA-NNK/blob/master/usaformlink.md [2] https://github.com/NNK-Dataset/BD-NNK/blob/master/bdformlink.md
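The selection criteria on this slide were applied by human annotators, but they can be sketched as an automatic filter. The snippet below is only an illustration under stated assumptions: the keyword set, the function names, the `headline`/`body` field names, and the choice to count distinct keywords are all hypothetical and do not come from the slides.

```python
import re

# Hypothetical keyword list; the slides do not give the annotators' actual list.
COVID_KEYWORDS = {
    "covid", "coronavirus", "pandemic", "quarantine",
    "lockdown", "virus", "outbreak", "infection",
}


def distinct_hits(text, keywords=COVID_KEYWORDS):
    """Return how many distinct keywords occur in the text (assumed reading of the rule)."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(tokens & keywords)


def select_reports(reports, min_headline=1, min_body=5):
    """Keep reports that meet both keyword thresholds, skipping repeated headlines."""
    seen_headlines = set()
    selected = []
    for report in reports:
        key = report["headline"].strip().lower()
        if key in seen_headlines:
            continue  # "Avoid taking duplicate reports."
        seen_headlines.add(key)
        if (distinct_hits(report["headline"]) >= min_headline
                and distinct_hits(report["body"]) >= min_body):
            selected.append(report)
    return selected
```

Treating two reports with the same headline as duplicates is a simplification; human annotators can also spot the same story under different headlines.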
  • 14. 2. Data Pre-processing • Remove hyperlinks. • Remove non-English, alphanumeric characters. • Remove stop words • Lemmatization
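The four pre-processing steps can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the stop-word list and the lookup table standing in for a lemmatizer are tiny hypothetical samples, and "non-English characters" is assumed here to mean anything other than the letters a-z.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
NON_LETTER_RE = re.compile(r"[^a-z\s]")

# Hypothetical samples; a real pipeline would use full resources,
# for example a complete stop-word list and a dictionary-based lemmatizer.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
              "is", "are", "was", "were", "at", "for"}
TOY_LEMMAS = {"cases": "case", "infections": "infection",
              "reported": "report", "deaths": "death"}


def preprocess(text, stop_words=STOP_WORDS, lemmas=TOY_LEMMAS):
    """Apply the four steps in order and return the cleaned tokens."""
    text = URL_RE.sub(" ", text)                 # 1. remove hyperlinks
    text = NON_LETTER_RE.sub(" ", text.lower())  # 2. keep only the letters a-z
    tokens = [t for t in text.split() if t not in stop_words]  # 3. stop words
    return [lemmas.get(t, t) for t in tokens]    # 4. lookup-based lemmatization
```

Hyperlinks are removed first so that URL fragments do not survive as stray word tokens after punctuation is stripped.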
  • 15. 3. Data Description. Table 1 (Covid-News-USA-NNK): no. of words per headline: 7 - 20; no. of words per body content: 150 - 2100. Table 2 (Covid-News-BD-NNK): no. of words per headline: 10 - 20; no. of words per body content: 100 - 1500. Attributes (first table): Date: date when news was posted; Link: hyperlink; Newspaper Name: name of newspaper; Headline Keywords: keywords extracted from headline; Report Keywords: keywords extracted from body. Attributes (second table): Date: date when news was posted; Link: hyperlink; Newspaper Name: name of newspaper; Headline: keywords extracted from headline; Report: keywords extracted from body.
  • 16. 4. Dataset Repository, Policy and License • Project stored in Github: https://github.com/NNK-Dataset • Covid-News-USA-NNK: https://github.com/NNK-Dataset/USA-NNK • Covid-News-BD-NNK: https://github.com/NNK-Dataset/BD-NNK • Kaggle: https://doi.org/10.34740/kaggle/dsv/1511505 • License: CCO (Creative Commons)
  • 18. Word Clouds: Washington Post News (USA): February, 2020; March, 2020; April, 2020; May, 2020
  • 19. Word Clouds: Star Tribune News (USA): February, 2020; March, 2020; April, 2020; May, 2020
  • 20. Word Clouds: Daily Star News (BD): March, 2020; April, 2020. Prothom Alo News (BD): March, 2020; April, 2020
  • 21. Covid cases through number extraction: Cases (based on keywords in news reports) related to COVID-19 from February till March. The X axis represents the month and the Y axis represents cases in units of 10,000. Numeric extraction keywords: Infected, Died, Infections, Quarantined, Lockdown, Diagnosed.
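One way to picture the number extraction is a regular expression that pairs each number with an extraction keyword appearing shortly after it. The slide does not say how numbers were matched to keywords, so the pattern, the window of up to three intervening words, and the function name below are assumptions made for illustration only.

```python
import re
from collections import defaultdict

# Keywords taken from the slide; the matching rule itself is assumed.
KEYWORDS = ("infected", "died", "infections", "quarantined", "lockdown", "diagnosed")

# A number, then up to three intervening words, then one of the keywords.
PATTERN = re.compile(
    r"(\d[\d,]*)\s+(?:[a-z]+\s+){0,3}?(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Sum the numbers found next to each keyword in one news report."""
    totals = defaultdict(int)
    for number, keyword in PATTERN.findall(text):
        totals[keyword.lower()] += int(number.replace(",", ""))
    return dict(totals)


report = "Officials said 1,200 people were infected and 35 died. Another 400 were quarantined."
print(extract_counts(report))
```

Summing these per-report totals by month would give the kind of series plotted on the slide. A pattern like this only catches numbers written as digits immediately before the keyword.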
  • 22. VADER Sentiment Analysis: average: -0.5 to -0.9 (scale: -1 (highly negative) to +1 (highly positive)). Keyword extraction using PageRank: 'China', 'Government', 'Masks', 'Economy', 'Crisis', 'Theft', 'Stock market', 'Jobs', 'Election', 'Missteps', 'Health', 'Response'.
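PageRank-based keyword extraction is commonly done TextRank-style: words become nodes, words that appear near each other are linked, and PageRank scores the nodes. The sketch below shows that general idea in plain Python and covers only the keyword part of this slide, not the sentiment scoring. The slides do not describe the authors' graph construction, so the co-occurrence window, damping factor, iteration count, and function names are assumptions.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link each word to the words that follow it within the window (undirected)."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Run PageRank by power iteration on an undirected graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbour shares its rank equally among its own neighbours.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def top_keywords(tokens, k=3, window=2):
    """Return the k highest-ranked words, breaking ties alphabetically."""
    rank = pagerank(build_cooccurrence_graph(tokens, window))
    return sorted(rank, key=lambda w: (-rank[w], w))[:k]
```

The input would be the pre-processed tokens of a report; words that co-occur with many other well-connected words rise to the top.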
  • 24. - Starting point for an important dataset. - Assert importance of NLP in newspaper report analysis. - Dataset open for research and enhancement