SlideShare a Scribd company logo
1 of 11
ABUSIVE LANGUAGE
DETECTION
Vishnu Vardhan Reddy Yeruva
Introduction
◦ With the increase in the culture of social media and social networking sites, the use of abusive
language has increased rapidly.
◦ Abusive language can be expressed as dirty words in conversation (oral or text)
◦ Every day, millions of comments are posted on the uploaded posts. Abusive language in online
comments initiates cyber-bullying that targets individuals
How to Tackle
◦ There are many algorithms in ML and DL to deal this problem programmatically.
◦ There is an issue even with this i.e. all of the ML and DL algorithms that are available as LSTM,
CNN are only possible with English as there is lot of rich resources available.
◦ This paper includes the description of an application that helped in one of the very rare
language like URDU. That means it even can help any other language using CNN
Challenges
◦ Text classification using DL models has several advantages over conventional
ML models.
◦ The main challenge for ML models is that their performance is dependent on
feature selection methods, Several studies conclude that there is no universal
feature selection method that works well with all types of ML models.
◦ Hidden layers within a DL model extract the useful features automatically. It is
difficult to observe which features are chosen for a specific task.
Steps to approach
◦ Exploring the performance of DL models to detect abusive language from
online comments.
◦ Empirically investigating the performance of these models on two popular
scripts of the Urdu language:
Urdu and Roman Urdu.
◦ Exploring the architectures of RNN models having one-layer and two-layers of
hidden units.
◦ Comparing the performance of DL models with ML models to detect abusive
language.
Learning Technique
◦ ensemble learning technique was applied to classify abusive text collected from
web pages.
◦ three ML models decision tree (DT), NB, and SVM were used to classify abusive
and non-abusive text of the Indonesian language from social media.
◦ Similarly, n-grams techniques were applied to extract features from Arabic
comments, and then ML models were used to classify these comments into
abusive or non-abusive labels.
Architecture
Comparison
Model Training
◦ Wordset = set(vectors.wv.index2word)
◦ This checks whether the word is in Word2vec (a library) corpus (set of words) or not
◦ If word is in the dataset reporting it as abusive. And lopping it continuously by incrementing
counter in order to find the number of abusive words
Real-Time
◦ In real time while running the application it get’s the data automatically updated with json file
as an URL request
◦ In return it responses back the list of prediction of abusive and non abusive with count.
◦ The response will have a key uuid which will have the same uuid which was provided in the url
Abusive Language Detection.pptx

More Related Content

What's hot

Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
Lukas Masuch
 

What's hot (20)

Vanishing & Exploding Gradients
Vanishing & Exploding GradientsVanishing & Exploding Gradients
Vanishing & Exploding Gradients
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 
Spam Detection Using Natural Language processing
Spam Detection Using Natural Language processingSpam Detection Using Natural Language processing
Spam Detection Using Natural Language processing
 
Final spam-e-mail-detection
Final  spam-e-mail-detectionFinal  spam-e-mail-detection
Final spam-e-mail-detection
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Deep Neural Networks (DNN)
Deep Neural Networks (DNN)Deep Neural Networks (DNN)
Deep Neural Networks (DNN)
 
Email spam detection
Email spam detectionEmail spam detection
Email spam detection
 
GAN - Theory and Applications
GAN - Theory and ApplicationsGAN - Theory and Applications
GAN - Theory and Applications
 
Text classification & sentiment analysis
Text classification & sentiment analysisText classification & sentiment analysis
Text classification & sentiment analysis
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Zero shot-learning: paper presentation
Zero shot-learning: paper presentationZero shot-learning: paper presentation
Zero shot-learning: paper presentation
 
Text Classification
Text ClassificationText Classification
Text Classification
 
Machine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And ApplicationsMachine Learning Ml Overview Algorithms Use Cases And Applications
Machine Learning Ml Overview Algorithms Use Cases And Applications
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Introduction to-machine-learning
Introduction to-machine-learningIntroduction to-machine-learning
Introduction to-machine-learning
 
Machine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domainsMachine learning applications nurturing growth of various business domains
Machine learning applications nurturing growth of various business domains
 
MACHINE LEARNING PPT(ML) rohit.pptx
MACHINE LEARNING  PPT(ML) rohit.pptxMACHINE LEARNING  PPT(ML) rohit.pptx
MACHINE LEARNING PPT(ML) rohit.pptx
 
Machine learning for social media analytics
Machine learning for  social media analyticsMachine learning for  social media analytics
Machine learning for social media analytics
 
House price prediction
House price predictionHouse price prediction
House price prediction
 

Similar to Abusive Language Detection.pptx

A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
kevig
 
MixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptxMixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptx
MariYam371004
 

Similar to Abusive Language Detection.pptx (20)

Unit 5f.pptx
Unit 5f.pptxUnit 5f.pptx
Unit 5f.pptx
 
Language Modeling.docx
Language Modeling.docxLanguage Modeling.docx
Language Modeling.docx
 
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLMCrafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
Crafting Your Customized Legal Mastery: A Guide to Building Your Private LLM
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
 
Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)
 
Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)
 
Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)Named Entity Recognition using Hidden Markov Model (HMM)
Named Entity Recognition using Hidden Markov Model (HMM)
 
2201.00598.pdf
2201.00598.pdf2201.00598.pdf
2201.00598.pdf
 
short_story.pptx
short_story.pptxshort_story.pptx
short_story.pptx
 
Detection of cyber-bullying
Detection of cyber-bullying Detection of cyber-bullying
Detection of cyber-bullying
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
Langauage model
Langauage modelLangauage model
Langauage model
 
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
 
MULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURES
MULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURESMULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURES
MULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURES
 
MixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptxMixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptx
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
NLP.pptx
NLP.pptxNLP.pptx
NLP.pptx
 
1808.10245v1 (1).pdf
1808.10245v1 (1).pdf1808.10245v1 (1).pdf
1808.10245v1 (1).pdf
 

Recently uploaded

Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
heathfieldcps1
 

Recently uploaded (20)

Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Google Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptxGoogle Gemini An AI Revolution in Education.pptx
Google Gemini An AI Revolution in Education.pptx
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 

Abusive Language Detection.pptx

  • 2. Introduction ◦ With the increase in the culture of social media and social networking sites, the use of abusive language has increased rapidly. ◦ Abusive language can be expressed as dirty words in conversation (oral or text) ◦ Every day, millions of comments are posted on the uploaded posts. Abusive language in online comments initiates cyber-bullying that targets individuals
  • 3. How to Tackle ◦ There are many algorithms in ML and DL to deal this problem programmatically. ◦ There is an issue even with this i.e. all of the ML and DL algorithms that are available as LSTM, CNN are only possible with English as there is lot of rich resources available. ◦ This paper includes the description of an application that helped in one of the very rare language like URDU. That means it even can help any other language using CNN
  • 4. Challenges ◦ Text classification using DL models has several advantages over conventional ML models. ◦ The main challenge for ML models is that their performance is dependent on feature selection methods, Several studies conclude that there is no universal feature selection method that works well with all types of ML models. ◦ Hidden layers within a DL model extract the useful features automatically. It is difficult to observe which features are chosen for a specific task.
  • 5. Steps to approach ◦ Exploring the performance of DL models to detect abusive language from online comments. ◦ Empirically investigating the performance of these models on two popular scripts of the Urdu language: Urdu and Roman Urdu. ◦ Exploring the architectures of RNN models having one-layer and two-layers of hidden units. ◦ Comparing the performance of DL models with ML models to detect abusive language.
  • 6. Learning Technique ◦ ensemble learning technique was applied to classify abusive text collected from web pages. ◦ three ML models decision tree (DT), NB, and SVM were used to classify abusive and non-abusive text of the Indonesian language from social media. ◦ Similarly, n-grams techniques were applied to extract features from Arabic comments, and then ML models were used to classify these comments into abusive or non-abusive labels.
  • 9. Model Training ◦ Wordset = set(vectors.wv.index2word) ◦ This checks whether the word is in Word2vec (a library) corpus (set of words) or not ◦ If word is in the dataset reporting it as abusive. And lopping it continuously by incrementing counter in order to find the number of abusive words
  • 10. Real-Time ◦ In real time while running the application it get’s the data automatically updated with json file as an URL request ◦ In return it responses back the list of prediction of abusive and non abusive with count. ◦ The response will have a key uuid which will have the same uuid which was provided in the url