August 2018 | by Manasi Kulkarni
Understanding NLP
or How Securly (almost) aced the Turing test
ABSTRACT
This paper provides a transparent overview of the engineering that allows Securly to detect
students in danger with almost 95% accuracy. Securly deploys complex machine
learning algorithms that analyze data against carefully curated datasets. Securly's Natural
Language Processing (NLP) engines are put through rigorous training and multiple levels of
data analysis that train them to think like humans when detecting grief, depression,
bullying, self-harm, and suicidal thoughts in kids. Given the impact this has on the ability to
save lives, Securly’s NLP engines are being trained to ace the Turing Test, a benchmark
often used to gauge the viability of an AI engine.
Understanding NLP- the securly:// way
CONTENT
Abstract
Introduction
What is the Turing Test?
Understanding NLP
Saving Lives with NLP
How does NLP work at Securly?
Data Pre-processing
Classification
Alert generation
Training the engines
The Future
Conclusion
INTRODUCTION
Human emotions have always been fragile, which is why experiences
like depression, sorrow, and even suicidal ideation, while unfortunate, are nothing new to the
human condition. But when it comes to school-aged kids still in the process of developing
emotionally, the implications of such emotions can at times be fatal.
Technology is learning to be empathetic toward human emotion by recognizing signs of
trouble which might otherwise be overlooked. AI enables Securly to recognize kids who are
suffering and then intervene to make a positive difference. Back in 1950, when the Turing
Test was proposed, the idea of machines that could ace the test and imitate human
thinking seemed fantastic. Today it is possible. Currently, Securly’s NLP engines are able to
recognize signs of distress in students with a ~95% rate of accuracy. When woven into the
fabric of K-12 tech, this can be measured directly in the number of students saved.
WHAT IS THE TURING TEST?
The Turing Test, proposed by famed computer scientist and mathematician Alan Turing,
tests the capability of a machine to think like a human, an important marker of the viability
of any AI engine. When put through the Turing Test, the machine should be able to
convince a judge (usually a human) that the machine is, in fact, a human. If a machine
passes the Turing Test, it is considered to have shown that it can think like a human being.
Given the sensitivity of the information Securly’s AI engines process and the impact it has on
student lives, our engines are trained in a way that allows them to imitate human
thinking with almost 95% accuracy. When our 24 team and NLP engines analyze the
same datasets, the engines interpret the data in the same way the human team does
almost 95% of the time. This has huge implications not just for the reliability of
detection, but also for reducing the reaction time of parents, schools, and first responders
when tragedy seems to be imminent.
UNDERSTANDING NLP
Natural Language Processing (NLP) is what happens when linguistics meets computer
science to analyze language and derive meaning that can be used to better understand
the intricacies of the sentiments involved. This analysis involves processing the language
using various mathematical and linguistic rules. NLP helps us understand word frequency
and decipher sentiments behind sentences or large passages that can later be used
meaningfully for a stated purpose.
SAVING LIVES WITH NLP
At Securly, we use NLP for sentiment analysis that helps us analyze emails, Facebook and
Twitter posts, and search queries to understand if a student is suffering from bullying,
grief, or depression, has suicidal thoughts, or is contemplating self-harm. This view into
students’ state of mind on the basis of their written word has helped us save lives.
NLP forms the first step in our process of understanding students and helps our 24 team
help schools and parents intervene before it is too late. The 24 team is composed of
former school psychologists, therapists, and others who have worked extensively with
troubled students. Even as we serve millions of students and scan through millions of
emails and posts every day, our NLP engine analyzes every incoming message in a matter
of seconds and dramatically cuts reaction times compared with a scenario where only
humans are processing these messages.
HOW DOES NLP WORK AT SECURLY?
The process that seems as simple as
input post >> analyze sentiment >> issue alert/pass through
is, in fact, a long computational process that involves complex machine learning algorithms
that analyze incoming data against carefully curated datasets.
1. Data Pre-processing
All raw data coming into the system needs to be pre-processed to prepare it for our
sentiment analysis engine. Pre-processing strips the noise out of the data so that no words
or characters deemed unnecessary for sentiment analysis are left over in it.
a. Removing stop-words: As a first step, our algorithm removes all commonly occurring
words such as ‘the’, ‘a’, ‘an’, etc. These words do not add special meaning to the text, and
it can be understood just as well without them.
b. Customized Securly pre-processing: Considering the frequency of spam text and
certain other elements of electronic communication, we introduce a second round of
processing where we remove elements that are not part of the actual “conversation” that we
are interested in understanding. This includes timestamps, attachments, meta tags, URLs,
Twitter handles, etc.
c. Lemmatization: This is a very important part of the process which helps us derive the
base word of every word we encounter. A standard NLP library that gives the base word for
every word possible is used for this purpose. Lemmatization takes into account the
morphological information of a word and provides its base word. E.g. for car, cars, and car’s
the base word is car; for am and are, it is be. Lemmatization is more than just looking at the
stem of a word and is very important when it comes to analyzing the sentiment behind a sentence.
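The three pre-processing steps above can be sketched roughly as follows. This is a minimal illustration, not Securly's implementation: the stop-word set, the noise pattern, and the tiny lemma table are hypothetical stand-ins, and a real pipeline would take its stop-word list and lemmatizer from a standard NLP library.

```python
import re

# Hypothetical, abbreviated resources used only for illustration.
STOP_WORDS = {"the", "a", "an"}
LEMMAS = {"cars": "car", "car's": "car", "am": "be", "are": "be"}

# Tokens that are not part of the "conversation": URLs, Twitter handles,
# and simple clock-style timestamps (assumed formats).
NOISE_RE = re.compile(r"https?://\S+|@\w+|\d{1,2}:\d{2}(?::\d{2})?")
PUNCTUATION = ".,!?;:\"()"


def preprocess(text):
    """Return the cleaned, lemmatized tokens of a raw message."""
    # Split on whitespace and trim punctuation from the ends of each token.
    tokens = [t.strip(PUNCTUATION) for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    # a. Remove stop-words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # b. Remove elements that are not part of the conversation.
    tokens = [t for t in tokens if not NOISE_RE.fullmatch(t)]
    # c. Map each remaining word to its base word where one is known.
    return [LEMMAS.get(t, t) for t in tokens]
```

For example, `preprocess("The cars are here: http://x.co @bob 10:30")` yields `["car", "be", "here"]`.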
2. Classification
This pre-processed data is then sent to the sentiment analysis engine for classification. The
data is classified as either clean, bully or grief. Identifying a text as bullying or grief requires
rigorous training of the engine. (We will go into the depths of how the engine is trained in
subsequent sections.) If a text is identified as clean, it is checked against a special
dictionary for filth. Whatever the input, our engine correctly predicts whether the text is
grief, bully, or clean more than 80% of the time.
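The flow described here, a trained classifier followed by a dictionary check for text labeled clean, might look like the sketch below. The `classify` callable stands in for the trained engine, the filth dictionary contains a placeholder word, and returning a separate "filth" label is an assumption made for illustration; the paper does not say what happens when the dictionary check matches.

```python
# Hypothetical placeholder; the real dictionary is curated separately.
FILTH_WORDS = {"badword"}


def label_text(tokens, classify, filth_words=FILTH_WORDS):
    """Label pre-processed tokens as clean, bully, grief, or (assumed) filth.

    `classify` is any callable that maps a token list to one of
    "clean", "bully", or "grief".
    """
    label = classify(tokens)
    # Only text the engine considers clean is checked against the dictionary.
    if label == "clean" and any(t in filth_words for t in tokens):
        return "filth"
    return label
```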
3. Alert generation
All grief and bully instances trigger an alert to the respective school admins and parents. The
alert is also shared with our 24 team, who analyze it individually for criticality and contact the
school, parents or police as required.
The flagged text is also displayed as flagged activity to the school admin (and counselors
or teachers if delegated by the admin) on their Securly admin portal and is available at all
times for reference. Student-specific and time-range-based activity reports can also be
downloaded if they need to be shared with principals, parents, school boards, etc.
TRAINING THE ENGINES
The accuracy of Securly’s NLP and sentiment analysis engines is dependent upon their
training, and so we place great emphasis on ensuring that they are trained above industry
standards. The same pre-processing of data is also applied to the training datasets during
the training phase.
1. Data generation
The Securly datasets are completely driven by the real-life data generated by its filtering
products. We maintain separate datasets for each of the three classes: grief, bully and
clean. The words and phrases in these datasets have a 1:1 mapping with the labels. These
datasets are regularly updated with more words and phrases to ensure greater accuracy of
classification.
The dataset for filth is a dictionary and is not used in the training of the engines, as
identifying filth during analysis does not require any machine learning algorithms.
2. Vectorization using N-gram range
To help us understand data better it is important to break it down into smaller parts or
vectors. This helps analyze a given sentence more accurately. To do this we use an N-gram
range of one to three words. For example, the text ‘I am going out’ is broken down into
groups of one word, two words, and three words. These N-grams are called features.
In our example, the features for ‘I am going out’ will be:
One word: I + am + going + out = 4 features
Two words: I am + am going + going out = 3 features
Three words: I am going + am going out = 2 features
This leads to a total of 9 features for this particular sentence.
Such feature sets are created for every text in the dataset and give us an exhaustive
dictionary of unique features to work with.
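The N-gram breakdown of ‘I am going out’ described above can be reproduced with a few lines of code. This is a minimal sketch that assumes simple whitespace tokenization; the function name `ngram_features` is illustrative and not taken from the paper.

```python
def ngram_features(text, max_n=3):
    """Return all word N-grams of the text for N = 1 .. max_n."""
    words = text.split()
    features = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the sentence.
        for start in range(len(words) - n + 1):
            features.append(" ".join(words[start:start + n]))
    return features
```

Calling `ngram_features("I am going out")` returns the 4 one-word, 3 two-word, and 2 three-word features listed above, 9 in total.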
Thereafter, a scoring mechanism gives us a score for each of the features. If a feature
occurs too many times across the dataset, it is considered commonplace and its score is
reduced. The scoring mechanism is important in computing the total score of a sentence and
classifying it as clean, bully or grief.
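One common way to lower the score of commonplace features is inverse document frequency (IDF) weighting. The paper does not name its scoring mechanism, so the sketch below is only an assumed illustration of the idea: a feature that appears in every text receives the minimum score, and rarer features score higher.

```python
import math
from collections import Counter


def feature_scores(documents):
    """Map each feature to an IDF-style score.

    `documents` is a list of feature lists, one per text.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for features in documents:
        # Count each feature once per text, however often it repeats.
        doc_freq.update(set(features))
    return {f: math.log(n_docs / df) + 1.0 for f, df in doc_freq.items()}


def sentence_score(features, scores):
    """Total score of a sentence: the sum of its feature scores."""
    return sum(scores.get(f, 0.0) for f in features)
```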
We also use the chi-squared statistic to help identify the best features to be used
in an engine. This is necessary considering the volume of the dataset, which generates a
large number of features. Using all the features without prioritization can hurt the
efficiency and accuracy of the engine.
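As a rough illustration of chi-squared feature selection, the sketch below scores each feature against one target class using the standard 2x2 contingency-table formula and keeps the highest-scoring features. The function names and the one-class-versus-rest setup are assumptions for the example, not details from the paper.

```python
def chi_squared(docs, labels, feature, target):
    """Chi-squared statistic between a feature's presence and a class."""
    a = b = c = d = 0
    for features, label in zip(docs, labels):
        has_feature = feature in features
        in_class = label == target
        if has_feature and in_class:
            a += 1  # feature present, target class
        elif has_feature:
            b += 1  # feature present, other class
        elif in_class:
            c += 1  # feature absent, target class
        else:
            d += 1  # feature absent, other class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def top_features(docs, labels, target, k):
    """Return the k features most strongly associated with the class split."""
    vocabulary = sorted({f for features in docs for f in features})
    ranked = sorted(
        vocabulary,
        key=lambda f: chi_squared(docs, labels, f, target),
        reverse=True,
    )
    return ranked[:k]
```

Features that appear equally often inside and outside the target class score zero and are the first to be dropped.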
[Figure: N-gram breakdown of ‘I am going out’ into one-word (4 features), two-word (3 features), and three-word (2 features) groups, 9 features in total.]
3. Classifier/Classification
Here we use supervised machine learning algorithms, such as logistic regression among
others, to help us draw boundaries around features to determine their class.
Considering that every sentence the engine analyzes is a group of features, it is important
to lay down these boundaries to help us determine what class most of the sentence falls
into. The thickness of these boundaries is also determined by these algorithms and helps
us deal with grey areas efficiently.
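To make the idea of a learned boundary concrete, here is a minimal logistic regression sketch trained with stochastic gradient descent on bag-of-features input. It is simplified to two classes (1 = flag, 0 = clean), whereas the engine described here distinguishes three; the learning rate, epoch count, and function names are arbitrary choices for illustration.

```python
import math


def predict_proba(features, weights, bias):
    """Probability that a text with these features belongs to class 1."""
    z = bias + sum(weights.get(f, 0.0) for f in set(features))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(docs, labels, epochs=200, learning_rate=0.5):
    """Fit one weight per feature plus a bias term.

    `docs` is a list of feature lists; `labels` holds 1 or 0 per text.
    """
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for features, y in zip(docs, labels):
            error = y - predict_proba(features, weights, bias)
            # Move the boundary toward classifying this example correctly.
            bias += learning_rate * error
            for f in set(features):
                weights[f] = weights.get(f, 0.0) + learning_rate * error
    return weights, bias
```

A probability near 0.5 marks text close to the boundary, which is one way to think about the grey areas mentioned above.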
4. Cross-validation
A process of cross-validation is used to separate out three sets of 10% of the datasets,
which are then used to test and train the engine. Each set is put through all the parameters
during testing and the results are compared to identify the best engine.
A hold-out set is also identified and set aside before the other sets are put through training.
The best engine is then tested against this hold-out set. The hold-out set remains
constant and is used as the baseline that every new engine has to pass
before it is released for use.
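One possible reading of this procedure is sketched below: a fixed hold-out set is carved off first, then three disjoint test sets, each 10% of the remaining data, are used in turn while the rest is used for training. The fractions, the fixed random seed, and the function names are assumptions for illustration; the paper does not specify how the sets are drawn.

```python
import random


def split_holdout(items, holdout_fraction=0.1, seed=0):
    """Shuffle once with a fixed seed and set aside a constant hold-out set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * holdout_fraction)
    return shuffled[cut:], shuffled[:cut]  # (working set, hold-out set)


def validation_rounds(working, n_sets=3, fraction=0.1):
    """Yield (train, test) pairs with disjoint test sets from the working set."""
    size = round(len(working) * fraction)
    for i in range(n_sets):
        test = working[i * size:(i + 1) * size]
        train = working[:i * size] + working[(i + 1) * size:]
        yield train, test
```

Each candidate engine would be trained and tested on every round, and only the best-performing one would then be evaluated against the hold-out set.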
THE FUTURE
Building on this NLP foundation, which currently processes shorter text (1 sentence)
within 1.87 seconds and longer text (approx. 4KB) within 2.21 seconds, we are working
on incorporating neural networks as well. Research is underway on this
aspect and will help us achieve upwards of 90% accuracy when released.
Also in progress is our Correlations project, which analyzes and assigns a rating to
individual students that corresponds with the seriousness and urgency of their online
activities. The lowest rating reflects harmless and everyday content; the highest indicates
an imminent threat to a person's life. This will give schools a broader view of a student's
suspicious online activity, and help them determine when an incident needs intervention.
In addition, we are constantly updating our hold-out set and working on a
spell-correction algorithm that will help us auto-correct misspelled words, mark them as
distinct features and give us better insight into such text. The difference between a
situation missed and a student saved can be as small as a typo. As the Student Safety
Company, Securly is dedicated to filling in those cracks.
CONCLUSION
There are many movies where computers become too smart and attempt to eradicate
humanity. These cautionary tales generate income for the storyteller but simply do not
reflect the reality of the advancements being made in technology. The reality is that
technologies like Natural Language Processing and Sentiment Analysis are key to providing
Securly and schools an extended awareness of those times when kids may feel alone or,
worse, utterly hopeless. In situations where time is of the essence, technological advances
allow us to be more aware of and more connected with those in need of help. And if further
advancements result in additional lives saved, then rest assured those advancements will
be made, and the Turing Test aced.
Understanding NLP- the securly:// way
13

More Related Content

What's hot

IRJET - Secure Banking Application with Image and GPS Location
IRJET - Secure Banking Application with Image and GPS LocationIRJET - Secure Banking Application with Image and GPS Location
IRJET - Secure Banking Application with Image and GPS LocationIRJET Journal
 
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORD
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORDAN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORD
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORDIJNSA Journal
 
M-Pass: Web Authentication Protocol
M-Pass: Web Authentication ProtocolM-Pass: Web Authentication Protocol
M-Pass: Web Authentication ProtocolIJERD Editor
 
The Immune System of Internet
The Immune System of InternetThe Immune System of Internet
The Immune System of InternetMohit Kanwar
 
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...IJNSA Journal
 
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfing
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-SurfingA Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfing
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfingijcoa
 
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKMALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKijcseit
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 

What's hot (12)

E0962833
E0962833E0962833
E0962833
 
IRJET - Secure Banking Application with Image and GPS Location
IRJET - Secure Banking Application with Image and GPS LocationIRJET - Secure Banking Application with Image and GPS Location
IRJET - Secure Banking Application with Image and GPS Location
 
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORD
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORDAN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORD
AN EFFICIENT IDENTITY BASED AUTHENTICATION PROTOCOL BY USING PASSWORD
 
M-Pass: Web Authentication Protocol
M-Pass: Web Authentication ProtocolM-Pass: Web Authentication Protocol
M-Pass: Web Authentication Protocol
 
J0704055058
J0704055058J0704055058
J0704055058
 
The Immune System of Internet
The Immune System of InternetThe Immune System of Internet
The Immune System of Internet
 
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...
KNOWLEDGE-BASED AUTHENTICATION USING TWITTER CAN WE USE LUNCH MENUS AS PASSWO...
 
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfing
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-SurfingA Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfing
A Novel Crave-Char Based Password Entry System Resistant to Shoulder-Surfing
 
Cyber security
Cyber securityCyber security
Cyber security
 
Smart Password
Smart PasswordSmart Password
Smart Password
 
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORKMALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 

Similar to Nlp whitepaper the securly way

Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...IRJET Journal
 
Karl Mehta Masterclass on AI.pdf
Karl Mehta Masterclass on AI.pdfKarl Mehta Masterclass on AI.pdf
Karl Mehta Masterclass on AI.pdfKarl Mehta
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine LearningIRJET Journal
 
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...IRJET Journal
 
Supervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its applicationSupervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its applicationTara ram Goyal
 
INTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptxINTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptxsrikanthkallem1
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisWriteMyThesis
 
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMIRJET Journal
 
Top 40 Data Science Interview Questions and Answers 2022.pdf
Top 40 Data Science Interview Questions and Answers 2022.pdfTop 40 Data Science Interview Questions and Answers 2022.pdf
Top 40 Data Science Interview Questions and Answers 2022.pdfSuraj Kumar
 
Hr salary prediction using ml
Hr salary prediction using mlHr salary prediction using ml
Hr salary prediction using mlshaiksafi1
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeOsama Ghandour Geris
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfPranavPatil822557
 
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...Wuhan University
 
NLP Techniques for Question Answering.docx
NLP Techniques for Question Answering.docxNLP Techniques for Question Answering.docx
NLP Techniques for Question Answering.docxKevinSims18
 
Machine learning
Machine learningMachine learning
Machine learningAbrar ali
 
B tech vi sem cse ml lecture 1 RTU Kota
B tech vi sem cse ml lecture 1 RTU KotaB tech vi sem cse ml lecture 1 RTU Kota
B tech vi sem cse ml lecture 1 RTU Kotahimanshu swarnkar
 
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...Impetus Technologies
 
introduction to machine learning and nlp
introduction to machine learning and nlpintroduction to machine learning and nlp
introduction to machine learning and nlpMahmoud Farag
 

Similar to Nlp whitepaper the securly way (20)

Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...Emotion Recognition through Speech Analysis using various Deep Learning Algor...
Emotion Recognition through Speech Analysis using various Deep Learning Algor...
 
Karl Mehta Masterclass on AI.pdf
Karl Mehta Masterclass on AI.pdfKarl Mehta Masterclass on AI.pdf
Karl Mehta Masterclass on AI.pdf
 
IRJET - Twitter Sentiment Analysis using Machine Learning
IRJET -  	  Twitter Sentiment Analysis using Machine LearningIRJET -  	  Twitter Sentiment Analysis using Machine Learning
IRJET - Twitter Sentiment Analysis using Machine Learning
 
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...
Live Twitter Sentiment Analysis and Interactive Visualizations with PyLDAvis ...
 
Supervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its applicationSupervised Machine Learning Techniques common algorithms and its application
Supervised Machine Learning Techniques common algorithms and its application
 
INTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptxINTERNSHIP ON MAcHINE LEARNING.pptx
INTERNSHIP ON MAcHINE LEARNING.pptx
 
Hot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesisHot Topics in Machine Learning For Research and thesis
Hot Topics in Machine Learning For Research and thesis
 
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHMANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
ANALYSING SPEECH EMOTION USING NEURAL NETWORK ALGORITHM
 
Top 40 Data Science Interview Questions and Answers 2022.pdf
Top 40 Data Science Interview Questions and Answers 2022.pdfTop 40 Data Science Interview Questions and Answers 2022.pdf
Top 40 Data Science Interview Questions and Answers 2022.pdf
 
Hr salary prediction using ml
Hr salary prediction using mlHr salary prediction using ml
Hr salary prediction using ml
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-code
 
machinecanthink-160226155704.pdf
machinecanthink-160226155704.pdfmachinecanthink-160226155704.pdf
machinecanthink-160226155704.pdf
 
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...Muhammad Usman Akhtar  |  Ph.D Scholar  |  Wuhan  University  |  School of Co...
Muhammad Usman Akhtar | Ph.D Scholar | Wuhan University | School of Co...
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
NLP Techniques for Question Answering.docx
NLP Techniques for Question Answering.docxNLP Techniques for Question Answering.docx
NLP Techniques for Question Answering.docx
 
Machine learning
Machine learningMachine learning
Machine learning
 
B tech vi sem cse ml lecture 1 RTU Kota
B tech vi sem cse ml lecture 1 RTU KotaB tech vi sem cse ml lecture 1 RTU Kota
B tech vi sem cse ml lecture 1 RTU Kota
 
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...
Building a Sentiment Analytics Solution Powered by Machine Learning- Impetus ...
 
introduction to machine learning and nlp
introduction to machine learning and nlpintroduction to machine learning and nlp
introduction to machine learning and nlp
 
Machine learning
Machine learningMachine learning
Machine learning
 

More from Securly

Best practices to shape and secure your 1:1 Chromebook program
Best practices to shape and secure your 1:1 Chromebook programBest practices to shape and secure your 1:1 Chromebook program
Best practices to shape and secure your 1:1 Chromebook programSecurly
 
Best Practices for Configuring YouTube Restricted Mode
Best Practices for Configuring YouTube Restricted ModeBest Practices for Configuring YouTube Restricted Mode
Best Practices for Configuring YouTube Restricted ModeSecurly
 
David's Law
David's Law David's Law
David's Law Securly
 
Auditor by Securly
Auditor by SecurlyAuditor by Securly
Auditor by SecurlySecurly
 
Anti-Bullying Legislation in the United States
Anti-Bullying Legislation in the United StatesAnti-Bullying Legislation in the United States
Anti-Bullying Legislation in the United StatesSecurly
 
What is Securly?
What is Securly?What is Securly?
What is Securly?Securly
 
Lee's Summit
Lee's SummitLee's Summit
Lee's SummitSecurly
 
Baugo Community Schools
Baugo Community SchoolsBaugo Community Schools
Baugo Community SchoolsSecurly
 
Auditor Admin Config - Uswest
Auditor Admin Config - UswestAuditor Admin Config - Uswest
Auditor Admin Config - UswestSecurly
 
Auditor Admin Config - Useast
Auditor Admin Config - UseastAuditor Admin Config - Useast
Auditor Admin Config - UseastSecurly
 
Securly - Pickens County Case Study
Securly - Pickens County Case StudySecurly - Pickens County Case Study
Securly - Pickens County Case StudySecurly
 
Best practices to shape and secure your 1:1 program for Windows
Best practices to shape and secure your 1:1 program for WindowsBest practices to shape and secure your 1:1 program for Windows
Best practices to shape and secure your 1:1 program for WindowsSecurly
 
Student Safety Reimagined - Product Brief
Student Safety Reimagined - Product BriefStudent Safety Reimagined - Product Brief
Student Safety Reimagined - Product BriefSecurly
 
Can social media save kids' lives?
Can social media save kids' lives?Can social media save kids' lives?
Can social media save kids' lives?Securly
 
East Maine School District Case Study
East Maine School District Case StudyEast Maine School District Case Study
East Maine School District Case StudySecurly
 
Securly Product Brief
Securly Product BriefSecurly Product Brief
Securly Product BriefSecurly
 
Tom's River Case Study
Tom's River Case StudyTom's River Case Study
Tom's River Case StudySecurly
 
Managing Screen Time - The Student's Perspective
Managing Screen Time - The Student's PerspectiveManaging Screen Time - The Student's Perspective
Managing Screen Time - The Student's PerspectiveSecurly
 
1:1 Device Theft in K-12 Schools
1:1 Device Theft in K-12 Schools1:1 Device Theft in K-12 Schools
1:1 Device Theft in K-12 SchoolsSecurly
 
Case Study: Webb City R-VII School District
Case Study: Webb City R-VII School DistrictCase Study: Webb City R-VII School District
Case Study: Webb City R-VII School DistrictSecurly
 

More from Securly (20)

Best practices to shape and secure your 1:1 Chromebook program
Best practices to shape and secure your 1:1 Chromebook programBest practices to shape and secure your 1:1 Chromebook program
Best practices to shape and secure your 1:1 Chromebook program
 
Best Practices for Configuring YouTube Restricted Mode
Best Practices for Configuring YouTube Restricted ModeBest Practices for Configuring YouTube Restricted Mode
Best Practices for Configuring YouTube Restricted Mode
 
David's Law
David's Law David's Law
David's Law
 
Auditor by Securly
Auditor by SecurlyAuditor by Securly
Auditor by Securly
 
Anti-Bullying Legislation in the United States
Anti-Bullying Legislation in the United StatesAnti-Bullying Legislation in the United States
Anti-Bullying Legislation in the United States
 
What is Securly?
What is Securly?What is Securly?
What is Securly?
 
Lee's Summit
Lee's SummitLee's Summit
Lee's Summit
 
Baugo Community Schools
Baugo Community SchoolsBaugo Community Schools
Baugo Community Schools
 
Auditor Admin Config - Uswest
Auditor Admin Config - UswestAuditor Admin Config - Uswest
Auditor Admin Config - Uswest
 
Auditor Admin Config - Useast
Auditor Admin Config - UseastAuditor Admin Config - Useast
Auditor Admin Config - Useast
 
Securly - Pickens County Case Study
Securly - Pickens County Case StudySecurly - Pickens County Case Study
Securly - Pickens County Case Study
 
Best practices to shape and secure your 1:1 program for Windows
Best practices to shape and secure your 1:1 program for WindowsBest practices to shape and secure your 1:1 program for Windows
Best practices to shape and secure your 1:1 program for Windows
 
Student Safety Reimagined - Product Brief
Student Safety Reimagined - Product BriefStudent Safety Reimagined - Product Brief
Student Safety Reimagined - Product Brief
 
Can social media save kids' lives?
Can social media save kids' lives?Can social media save kids' lives?
Can social media save kids' lives?
 
East Maine School District Case Study
East Maine School District Case StudyEast Maine School District Case Study
East Maine School District Case Study
 
Securly Product Brief
Securly Product BriefSecurly Product Brief
Securly Product Brief
 
Tom's River Case Study
Tom's River Case StudyTom's River Case Study
Tom's River Case Study
 
Managing Screen Time - The Student's Perspective
Managing Screen Time - The Student's PerspectiveManaging Screen Time - The Student's Perspective
Managing Screen Time - The Student's Perspective
 
1:1 Device Theft in K-12 Schools
1:1 Device Theft in K-12 Schools1:1 Device Theft in K-12 Schools
1:1 Device Theft in K-12 Schools
 
Case Study: Webb City R-VII School District
Case Study: Webb City R-VII School DistrictCase Study: Webb City R-VII School District
Case Study: Webb City R-VII School District
 

Nlp whitepaper the securly way

  • 1. Understanding NLP, or How Securly (almost) aced the Turing test. August 2018 | by Manasi Kulkarni
  • 2. ABSTRACT
    This paper provides a transparent overview of the engineering that allows Securly to detect students in danger with almost 95% accuracy. Securly deploys complex machine learning algorithms that analyze data against carefully curated datasets. Securly's Natural Language Processing (NLP) engines are put through rigorous training and multiple levels of data analysis that train them to think like humans when detecting grief, depression, bullying, self-harm, and suicidal thoughts in kids. Given the impact this has on the ability to save lives, Securly’s NLP engines are being trained to ace the Turing Test - an industry benchmark that defines the viability of any AI engine.
  • 3. CONTENT
    Abstract; Introduction; What is the Turing Test?; Understanding NLP; Saving Lives with NLP; How does NLP work at Securly? (Data Pre-processing, Classification, Alert generation); Training the engines; The Future; Conclusion
  • 4. INTRODUCTION
    There has always existed a level of fragility to human emotions, which is why experiences like depression, sorrow, even suicide ideation, while unfortunate, are nothing new to the human condition. But when it comes to school-aged kids still in the process of developing emotionally, the implications of such emotions can at times be fatal.
    Technology is learning to be empathetic toward human emotion by recognizing signs of trouble which might otherwise go overlooked. AI enables Securly to recognize kids who are suffering and then intervene to make a positive difference. Back in 1950, when the Turing Test was proposed, the evolution of machines that could ace the test and imitate human thinking seemed fantastical. Today it is possible. Currently, Securly’s NLP engines are able to recognize signs of distress in students with a ~95% rate of accuracy. When woven into the fabric of K-12 tech, this can directly be measured in the number of students saved.
  • 5. WHAT IS THE TURING TEST?
    The Turing Test, proposed by famed computer scientist and mathematician Alan Turing, tests the capability of a machine to think like a human, an important marker of the viability of any AI engine. When put through the Turing Test, the machine should be able to convince a judge (usually a human) that the machine is, in fact, a human. If a machine passes the Turing Test, then the machine has proven itself to be as intelligent as a human being and able to think like one.
    Given the sensitivity of the information Securly’s AI engines process and the impact it has on student lives, our engines are trained in a way that allows them to imitate human thinking with almost 95% accuracy. When our 24 teams and NLP engines analyze the same datasets, the engines interpret the data in the same way the human team does almost 95% of the time. This has huge implications not just as a reliable source of detection, but also for reducing the reaction time for parents, schools, and first responders when tragedy seems to be imminent.
  • 6. UNDERSTANDING NLP
    Natural Language Processing (NLP) is what happens when linguistics meets computer science to analyze language and derive meaning that can be used to better understand the intricacies of the sentiments involved. This analysis involves processing the language using various mathematical and linguistic rules. NLP helps us understand word frequency and decipher sentiments behind sentences or large passages that can later be used meaningfully for a stated purpose.
    SAVING LIVES WITH NLP
    At Securly, we use NLP for sentiment analysis that helps us analyze emails, Facebook and Twitter posts, and search queries to understand whether a student is suffering from bullying, grief, or depression; has suicidal thoughts; or is contemplating self-harm. This view into the students’ state of mind on the basis of their written word has helped us save lives. NLP forms the first step in our process of understanding students and helps our 24 team help schools and parents intervene before it is too late. The 24 team comprises former school psychologists, therapists, and others who have worked extensively with troubled students.
    Even as we serve millions of students and scan through millions of emails and posts every day, our NLP engine analyzes every incoming message in a matter of seconds and dramatically cuts reaction times compared with a scenario where only humans are processing these messages.
  • 7. HOW DOES NLP WORK AT SECURLY?
    The process that seems as simple as input post >> analyze sentiment >> issue alert/pass through is, in fact, a long computational process that involves complex machine learning algorithms that analyze incoming data against carefully curated datasets.
    1. Data Pre-processing
    All raw data coming into the system needs to be pre-processed to prepare it for our sentiment analysis engine. This pre-processing strips the noise out of the data so that no words or characters deemed unnecessary for sentiment analysis are left in it.
    a. Removing stop-words: As a first step, our algorithm removes all commonly occurring words such as ‘the’, ‘a’, ‘an’ etc. These words do not add special meaning to the text, which can be understood just as well without them.
  • 8. b. Customized Securly pre-processing: Considering the frequency of spam text and certain other elements of electronic communication, we introduce a second round of processing where we remove elements that are not part of the actual “conversation” that we are interested in understanding. This includes timestamps, attachments, meta tags, URLs, Twitter handles etc.
    c. Lemmatization: This is a very important part of the process, which helps us derive the base word of every word we encounter. A standard NLP library that gives the base word for every word possible is used for this purpose. Lemmatization takes into account the morphological information of a word and provides its base word. E.g. for car, cars, and car’s, the base is car; for am and are, it is be. Lemmatization is more than just looking at the stem of a word and is very important when it comes to analyzing the sentiment behind a sentence.
    2. Classification
    This pre-processed data is then sent to the sentiment analysis engine for classification. The data is classified as either clean, bully or grief. Identifying a text as bullying or grief requires rigorous training of the engine. (We will go into the depths of how the engine is trained in subsequent sections.) If a text is identified as clean, it is checked against a special dictionary for filth. No matter what you throw at it, our engine can accurately predict more than 80% of the time whether the text is grief, bully, or clean.
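The first two pre-processing steps described above, (a) stop-word removal and (b) stripping non-conversational elements, can be illustrated with a minimal Python sketch. This is not Securly's implementation: the stop-word list, the regular expressions, and the function name `preprocess` are all assumptions chosen for illustration, and lemmatization (step c) is left out because it relies on an NLP library that the paper does not name.

```python
import re

# Illustrative stop-word list only; the paper names just 'the', 'a', 'an'.
STOP_WORDS = {"the", "a", "an"}

# Hypothetical patterns for elements that are not part of the "conversation".
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
HANDLE_RE = re.compile(r"@\w+")
TIMESTAMP_RE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\s?(?:[AaPp][Mm])?\b")


def preprocess(text: str) -> list[str]:
    """Strip URLs, tags, handles and timestamps, then drop stop-words."""
    # URLs are removed first so that later patterns do not split them apart.
    for pattern in (URL_RE, TAG_RE, HANDLE_RE, TIMESTAMP_RE):
        text = pattern.sub(" ", text)
    # Lower-case and keep alphabetic tokens (apostrophes allowed).
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


sample = "@sam the test is at 10:30 PM, see https://example.com/page <b>now</b>"
tokens = preprocess(sample)
```

In a real pipeline the token list would then be passed to a lemmatizer before classification.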
  • 9. 3. Alert generation
    All grief and bully instances trigger an alert to the respective school admins and parents. Each alert is also shared with our 24 teams, who analyze it individually for criticality and contact the school, parents or police as required. The flagged text is also displayed as flagged activity to the school admin (and counselors or teachers if delegated by the admin) on their Securly admin portal and is available at all times for reference. Student-specific and time-range-based activity reports can also be downloaded if they need to be shared with principals, parents, school boards etc.
  • 10. TRAINING THE ENGINES
    The accuracy of Securly’s NLP and sentiment analysis engines is dependent upon their training, and so we place great emphasis on ensuring that they are trained above industry standards. The entire pre-processing of data is done during the training phase for the training datasets as well.
    1. Data generation
    The Securly datasets are completely driven by the real-life data generated by its filtering products. We maintain separate datasets for each of the three classes: grief, bully and clean. The words and phrases in these datasets have a 1:1 mapping with the labels. These datasets are regularly updated with more words and phrases to ensure greater accuracy of classification. The dataset for filth is a dictionary and is not used in the training of the engines, as identifying filth during analysis does not require any machine learning algorithms.
    2. Vectorization using N-gram range
    To help us understand data better, it is important to break it down into smaller parts or vectors. This helps analyze a given sentence more accurately. To do this we use the trigram range method (N-grams of one, two, and three words). For example, the text ‘I am going out’ is broken down into groups of one word, two words, and three words. These N-grams are called features. In our example, the features for ‘I am going out’ will be:
    One word: I + am + going + out = 4 features
    Two words: I am + am going + going out = 3 features
    Three words: I am going + am going out = 2 features
    This leads to a total of 9 features for this particular sentence. Such feature sets are created for every text in the dataset and give us an exhaustive dictionary of unique features to work with.
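The N-gram breakdown in the example above can be reproduced with a few lines of Python. This is a sketch only; the function name `ngram_features` and whitespace tokenization are assumptions, not details from the paper.

```python
def ngram_features(text: str, max_n: int = 3) -> list[str]:
    """Return all word N-grams of length 1..max_n, shortest first."""
    words = text.split()
    features = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the sentence.
        for start in range(len(words) - n + 1):
            features.append(" ".join(words[start:start + n]))
    return features


features = ngram_features("I am going out")
# 4 one-word + 3 two-word + 2 three-word features = 9 in total
```

Collecting the features of every text in a dataset into a set would give the dictionary of unique features that the paper describes.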
  • 11. Thereafter, a scoring mechanism gives us a score for each of the features. If a feature occurs too many times, it is considered commonplace and its score reduces. The scoring mechanism is important in identifying the total score of a sentence and classifying it as clean, bully or grief.
    We also use the chi-squared statistic to help identify the best features to be used in an engine. This is necessary considering the volume of the dataset, which generates a large number of features. Using all the features without prioritization can impact the efficiency and accuracy of the engine.
    [Figure: N-gram features of ‘I am going out’: one-word (4 features), two-word (3 features), three-word (2 features); 9 features in total]
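The paper does not name its scoring mechanism. One common scheme with the property described (frequent features score lower) is inverse document frequency, so the sketch below uses that as a stand-in, alongside the Pearson chi-squared statistic for a 2x2 table of feature presence versus class membership. The function names, the `+ 1.0` smoothing term, and the toy data are all assumptions made for illustration.

```python
import math


def idf_scores(documents: list[set[str]]) -> dict[str, float]:
    """Score each feature; features seen in more documents score lower."""
    total = len(documents)
    doc_freq: dict[str, int] = {}
    for features in documents:
        for feature in features:
            doc_freq[feature] = doc_freq.get(feature, 0) + 1
    return {f: math.log(total / df) + 1.0 for f, df in doc_freq.items()}


def chi_squared(documents: list[set[str]], labels: list[str],
                feature: str, target: str) -> float:
    """Pearson chi-squared for feature present/absent vs. target class or not."""
    a = b = c = d = 0
    for features, label in zip(documents, labels):
        present = feature in features
        in_class = label == target
        if present and in_class:
            a += 1          # feature present, target class
        elif present:
            b += 1          # feature present, other class
        elif in_class:
            c += 1          # feature absent, target class
        else:
            d += 1          # feature absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Toy data, invented for this example.
docs = [{"i", "hate", "you"}, {"i", "miss", "her"},
        {"i", "like", "pizza"}, {"you", "hate", "me"}]
labels = ["bully", "grief", "clean", "bully"]

scores = idf_scores(docs)
hate_chi = chi_squared(docs, labels, "hate", "bully")
i_chi = chi_squared(docs, labels, "i", "bully")
```

Here the commonplace feature "i" gets a lower score than "hate", and "hate" is also more strongly associated with the bully class, so a feature-selection step would rank it higher.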
  • 12. 3. Classifier/Classification
    Here we use supervised machine learning algorithms, such as logistic regression among others, to help us draw boundaries around features to determine their class. Considering that every sentence the engine analyzes is a group of features, it is important to lay down these boundaries to help us determine which class most of the sentence falls into. The thickness of these boundaries is also determined by these algorithms and helps us deal with grey areas efficiently.
    4. Cross-validation
    A process of cross-validation is used to separate out three sets of 10% of the datasets, which are then used to test and train the engine. Each set is put through all the parameters during testing and the results are compared to identify the best engine.
    A hold-out set is also identified and set aside before the other sets are put through training. The best engine is then tested against this hold-out set. The hold-out set remains constant and is used as the baseline that every new engine must pass before it is released for use.
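The splitting procedure above can be sketched as follows. The paper gives the three 10% test sets but not the size of the hold-out set or how examples are shuffled, so `holdout_fraction`, `seed`, and the function name `split_dataset` are assumptions for illustration only.

```python
import random


def split_dataset(examples, holdout_fraction=0.1, fold_fraction=0.1,
                  num_folds=3, seed=0):
    """Set aside a hold-out set, then build (train, test) pairs for each fold."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)

    # The hold-out set is removed before any training or testing happens.
    holdout_size = int(len(shuffled) * holdout_fraction)
    fold_size = int(len(shuffled) * fold_fraction)
    holdout = shuffled[:holdout_size]
    remaining = shuffled[holdout_size:]

    rounds = []
    for k in range(num_folds):
        start, stop = k * fold_size, (k + 1) * fold_size
        test_fold = remaining[start:stop]
        train_part = remaining[:start] + remaining[stop:]
        rounds.append((train_part, test_fold))
    return holdout, rounds


holdout, rounds = split_dataset(range(100))
```

Each `(train, test)` pair would be used to train and score one candidate engine; the best candidate is then checked once against `holdout`, which stays fixed across releases.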
  • 13. THE FUTURE
    Building on this NLP foundation, which currently processes shorter text (1 sentence) within 1.87 seconds and longer text (approx. 4KB) within 2.21 seconds, we are working on incorporating the neural networks method as well. Research is underway on this aspect and will help us achieve upwards of 90% accuracy when released.
    Also in progress is our Correlations project, which analyzes and assigns a rating to individual students that corresponds with the seriousness and urgency of their online activities. The lowest rating reflects harmless and everyday content; the highest indicates an imminent threat to a person's life. This will give schools a broader view of a student's suspicious online activity, and help them determine when an incident needs intervention.
    In addition, we are constantly updating our hold-out set and working on a spell-correction algorithm that will help us auto-correct misspelled words, mark them as distinct features and give us better insight into such text. The difference between a situation missed and a student saved can be as small as a typo. As the Student Safety Company, Securly is dedicated to filling in those cracks.
  • 14. CONCLUSION
    There are many movies where computers become too smart and attempt to eradicate humanity. These cautionary tales generate income for the storyteller but simply do not reflect the reality of the advancements being made in technology. The reality is that technologies like Natural Language Processing and Sentiment Analysis are key to providing Securly and schools an extended awareness of those times when kids may feel alone or, worse, utterly hopeless. In situations where time is of the essence, technological advances allow us to be more aware of and more connected with those in need of help. And if further advancements result in additional lives saved, then rest assured those advancements will be made, and the Turing Test aced.