This is the content for the 4th International Conference on Data Management, Analytics and Innovation - ICDMAI 2020.
Here we discuss LSTMs and text classification for multi-class & multi-label problem statements.
Code : https://github.com/amitbcp/icdmai_2020
2. About Us
Abzooba is an Artificial Intelligence (AI) company. We partner with enterprises in their cognitive journey to augment digital transformation.
• Headquartered in Silicon Valley (USA), with delivery centres in Kolkata & Pune (India)
• 200+ employees
• We use xpresso.ai, a set of internal accelerators and toolkits, to deliver custom-built AI and ML solutions to our customers
3. Service Offerings Overview
Big Data and Cloud: collecting and processing of data for ingestion
• Lightning-fast data processing and real-time insights through distributed processing
• Structured and unstructured data ingestion through a scalable data lake
• Expertise in state-of-the-art Big Data and data-preprocessing tools
• Ability to process large volumes of data using parallel processing
Data Science: Deep Learning, Computer Vision and NLP to solve business problems
• Supervised and unsupervised ML algorithms to solve industry-specific use cases
• Creating an efficient ecosystem through natural language understanding
• Deep learning-based computer vision algorithms to gather insights from images
AI Ops: building enterprise-class infrastructure for seamless AI integration
• Deployment of AI solutions in production using a DevOps framework
• Integrated Development, Deployment and Management infrastructure
4. Index
1. Problem Statement
2. Naïve Solution
3. Word Embedding
4. Recurrent Neural Network
5. Ensemble & Evaluation
6. Hands On
6. A Recurrent Neural Pipeline
Build a recommendation system that can recommend technology domains & tags for questions posted on Stack Overflow.
Given a question on Stack Overflow, predict the technology domain & associated tags for it.
7. Multi Class | Multi Label
Document
• Politics: Election, Budget, Ban
• Entertainment: Concert, Movies
• Games: Player, Selection
8. Classification into multiple classes which are independent, and labels which are not mutually exclusive, arranged in a hierarchical manner.
Question
• Dev Ops: Jenkins, AWS, Docker
• QA: Testing, Selenium
• Big Data: Hadoop, Apache Spark
• Haskell
11. Exploratory Data Analysis
Understand Data Sourcing
• How was the data collected?
• Was the data-collection technique biased?
• Is this the complete original data or only a subset?
• Is any bias in the data due to the problem itself or not?
Understand Data Behavior
• Word/character distribution
• Time-based distribution
• Class/label distribution
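As a rough illustration of the distribution checks above, here is a minimal pandas sketch; the file name and the `question` / `tags` column names are assumptions for illustration, not taken from the talk or repository.

```python
import pandas as pd

# Hypothetical EDA sketch: assumes a CSV with question text and pipe-separated tags.
df = pd.read_csv("stackoverflow_questions.csv")  # assumed file name

# Word / character distribution
df["n_words"] = df["question"].str.split().str.len()
df["n_chars"] = df["question"].str.len()
print(df[["n_words", "n_chars"]].describe())

# Class / label distribution (multi-label: one question can carry several tags)
tag_counts = df["tags"].str.split("|").explode().value_counts()
print(tag_counts.head(20))
```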
12. EDA Summary
Group Group Name
1 Programming
2 MS-Development Environment
3 Server-Side Development
4 Mobile App Development
5 Dev Environment
6 Front-end/Designing
7 Dynamic UI
8 MVC
9 Dev Ops
10 Big Data
11 QA
12 Project Management
13 Scripting
14 Business Analytics
14. Machine Learning with Bag of Words
Create a master vocabulary
• Of all the words present in the documents
Replace words with numbers
• Based on their index in the vocabulary
Feed to the ML model
• Document-label pairs are fed to the model
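A bare-bones sketch of these three steps in plain Python; the toy documents and tags below are invented for illustration.

```python
# Step 1: create a master vocabulary from all words in the documents.
docs = ["how to deploy docker on aws", "docker build fails on jenkins"]  # toy documents
vocab = {word: idx for idx, word in enumerate(sorted({w for d in docs for w in d.split()}))}

# Step 2: replace words with their index in the vocabulary.
encoded = [[vocab[w] for w in d.split()] for d in docs]

# Step 3: feed document-label pairs to an ML model (labels here are invented).
labels = [["devops", "aws"], ["devops", "qa"]]
training_pairs = list(zip(encoded, labels))
print(vocab, encoded, training_pairs, sep="\n")
```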
15. Replace Words with Numbers
• Based on their index in the vocabulary
Notes
• Frequency-count based.
• Term frequency-inverse document frequency (TF-IDF).
• The sequence of words is not preserved.
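These two weighting schemes map directly onto scikit-learn's CountVectorizer and TfidfVectorizer; a minimal sketch (the toy documents are assumptions):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["spark streaming on hadoop", "selenium tests for the web app"]  # toy documents

# Frequency-count based representation
count_vec = CountVectorizer()
X_counts = count_vec.fit_transform(docs)
print(count_vec.vocabulary_)   # word -> column index; the word order in the text is lost

# TF-IDF weighted representation
tfidf_vec = TfidfVectorizer()
X_tfidf = tfidf_vec.fit_transform(docs)
print(X_tfidf.toarray())
```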
16. Feed to the ML Model
• Document-label pairs are fed to the model
We use a One-vs-Rest strategy to learn a classifier for each tag.
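One-vs-Rest for multi-label tags can be sketched with scikit-learn as below; the tiny dataset is invented for illustration, and the actual pipeline in the linked repository may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

docs = ["deploy docker image to aws", "run selenium tests in jenkins"]  # toy questions
tags = [["devops", "aws", "docker"], ["qa", "selenium"]]                # toy tag sets

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)  # one binary column per tag

# One binary classifier per tag, trained on TF-IDF features.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LogisticRegression()))
model.fit(docs, Y)

pred = model.predict(["selenium test fails on docker"])
print(mlb.inverse_transform(pred))  # tags whose binary classifier fired
```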
19. What are Word Embeddings?
Word embeddings are feature-learning techniques in NLP where words or phrases from the vocabulary are mapped to vectors of real numbers.
Frequency-based embeddings
• Count vector
• TF-IDF vector
• Co-occurrence vector
Prediction-based embeddings
• CBOW (Continuous Bag of Words)
• Skip-gram model
• Transformer architectures are leading the research front
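A minimal prediction-based embedding sketch using gensim's Word2Vec (assuming gensim 4.x; the toy corpus is invented, a real setup would use the Stack Overflow question text):

```python
from gensim.models import Word2Vec

# Toy tokenised corpus
sentences = [
    ["docker", "image", "fails", "to", "build"],
    ["spark", "job", "runs", "on", "hadoop", "cluster"],
]

# sg=0 -> CBOW (predict a word from its context); sg=1 -> Skip-gram (predict context from a word).
cbow = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=0)
skipgram = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

print(cbow.wv["docker"].shape)            # 50-dimensional real-valued vector
print(skipgram.wv.most_similar("spark"))  # nearest words in embedding space
```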
20. Why do we need them?
• ML algorithms and almost all deep learning architectures are incapable of processing strings or plain text in their raw form.
• Embeddings represent human-understandable language as numeric codes a machine can operate on.
• Better representation leads to better machine understanding.
29. Parameters
1. word_embeddings: the embeddings that we want to stack
2. hidden_size: number of hidden units in the RNN state rolled over time
3. rnn_layers: number of RNN layers to stack; decides how deep the network is
4. bidirectional: whether the model reads the sequence left-to-right, right-to-left, or both
5. reproject_words: whether to re-project (fine-tune) the word embeddings through a trainable layer
6. reproject_words_dimension: embedding dimension after re-projection
7. dropout: dropout to be used
8. rnn_type: type of RNN cell to be used (e.g. LSTM, GRU)
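These parameter names match the Flair library's DocumentRNNEmbeddings; assuming that is the intended API, a minimal sketch (the GloVe embedding choice is illustrative):

```python
from flair.embeddings import WordEmbeddings, DocumentRNNEmbeddings

# Word-level embeddings that the document RNN rolls over.
glove = WordEmbeddings("glove")

document_embeddings = DocumentRNNEmbeddings(
    [glove],                        # word_embeddings: the embeddings we want to stack
    hidden_size=64,                 # size of the RNN hidden state
    rnn_layers=2,                   # how deep the network is
    bidirectional=True,             # read the sequence in both directions
    reproject_words=True,           # re-project (fine-tune) the word embeddings
    reproject_words_dimension=256,  # embedding dimension after re-projection
    dropout=0.5,
    rnn_type="LSTM",
)
```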
30. Challenges with RNNs
• Sensitive to learning rate & batch size
• Training loss can be unstable
• Hard to train
• Early stopping may leave an under-trained model
• Needs a high patience level
• Needs to be trained for longer
• Sequence-to-sequence models work better than sequence-to-vector models
32. Our Model
• Stacked document embeddings
  • Multiple pre-trained and custom-trained (by us) word embeddings, stacked.
  • Enables capturing different relations & features in the document.
  • The embeddings are projected to a trainable embedding layer, enabling fine-tuning for the downstream/explicit task.
• Model
  • LSTM
  • 2 layers
  • 64 hidden units (rolling over time)
  • 256-dimensional embedding re-projection
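A sketch of this architecture in Flair, assuming that library (which the parameter names on the earlier slide suggest); the embedding choices and label set are illustrative, and the TextClassifier signature varies slightly between Flair versions.

```python
from flair.data import Dictionary
from flair.embeddings import WordEmbeddings, FlairEmbeddings, DocumentRNNEmbeddings
from flair.models import TextClassifier

# Multiple pre-trained word embeddings; DocumentRNNEmbeddings stacks them internally.
word_embeddings = [
    WordEmbeddings("glove"),
    FlairEmbeddings("news-forward"),
    FlairEmbeddings("news-backward"),
]

# LSTM rolled over the stacked word embeddings: 2 layers, 64 hidden units,
# with the words re-projected to a trainable 256-dimensional layer for fine-tuning.
document_embeddings = DocumentRNNEmbeddings(
    word_embeddings,
    hidden_size=64,
    rnn_layers=2,
    reproject_words=True,
    reproject_words_dimension=256,
    rnn_type="LSTM",
)

# Illustrative label dictionary; the real one would be built from the Stack Overflow corpus.
label_dict = Dictionary(add_unk=False)
for tag in ["devops", "qa", "big-data"]:
    label_dict.add_item(tag)

classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, multi_label=True)
```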
34. Multi Class | Multi Label
Question
• Dev Ops: Jenkins, AWS, Docker
• QA: Testing, Selenium
• Big Data: Hadoop, Apache Spark
• Haskell
35. Re-Designed Model
• Improved average coverage by 17% by just changing to ensembled models.
• Trade-off with inference time & memory utilization.
• Possible expansion to ensembling on each category.
• Business impact:
  • Slower inference
  • Scalable training
  • Scalable to more domains
36. Re-Designed Model
• Ensembled model of categories
• Helped with class imbalance; an alternative to bootstrapping.
• Flexibility to set a threshold per classifier.
• Flexibility to tune hyperparameters differently for each classifier.
• Negative sampling for the different classifiers.
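One way to picture the per-category ensemble with individual thresholds; this is entirely a hypothetical sketch, and the function and method names below are invented rather than taken from the repository.

```python
# Hypothetical ensemble: one multi-label classifier per category (Dev Ops, QA, Big Data, ...),
# each with its own decision threshold and its own hyperparameters / negative sampling.
def ensemble_predict(question, category_models, thresholds):
    """category_models: dict mapping category -> model exposing predict_scores(text) -> {tag: score}.
    thresholds: dict mapping category -> minimum score for a tag to be emitted."""
    predicted_tags = []
    for category, model in category_models.items():
        scores = model.predict_scores(question)  # invented method name
        predicted_tags.extend(
            tag for tag, score in scores.items() if score >= thresholds[category]
        )
    return predicted_tags
```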
37. Take Away
• Recurrent Neural Networks need to be trained cautiously.
• A single deep learning model is not a silver-bullet solution to every problem.
• Ensemble models can be used to sub-divide the problem and reduce complexity.
• Case sensitivity in models and embeddings is an important factor to take into consideration.
• Data augmentation/extrapolation can increase false positives, apart from generalizing the model.