DEEP LEARNING &
APPLICATIONS IN
NON-COGNITIVE DOMAINS
PART II: PRACTICE
5/12/16 1
Truyen Tran
Deakin University
truyen.tran@deakin.edu.au
prada-research.net/~truyen
AusDM’16, Canberra, Dec 7th 2016
dbta.com
PART II: PRACTICE
APPLYING DEEP LEARNING TO NON-COGNITIVE DOMAINS
 Hands-on:
­ Introducing programming frameworks (Theano, TensorFlow, MXNet)
 Domains how-to:
­ Healthcare
­ Software engineering
­ Anomaly detection
THEANO & TENSORFLOW
 The two most popular frameworks at present. Both in Python.
 Theano
­  Academic-driven. Pioneer.
­  Symbolic computation → can be tricky to debug
­  Wrappers: Lasagne, Keras
 TensorFlow
­  Google → native distributed computing support
­  A lot of support, huge community
­  Slightly bigger/messier code
­  Linux/Mac only, but VirtualBox will help on Windows
­  Wrapper: Keras
MXNET
√ Excellent support for many languages
√ Fast, portable
√ Intuitive syntax
√ Recent choice by AWS
http://bickson.blogspot.com.au/2016/02/mxnet-vs-tensorflow.html
https://github.com/dmlc/mxnet
BUILDING A MODEL
 Everything is a computational graph
 What flows from one node to the next is a tensor
 So simple stacking is fine (the idea behind Keras)
 Fit small datasets first to test the water
­  But be cautious: small data do not always generalize
 Always monitor the gap between train/validation sets: a small gap with high training error indicates underfitting; a big, widening gap indicates overfitting.
 Check the model assumptions
­  Is the input only a vector → FNN?
­  Is it a regular sequence → RNN?
­  Are there repeated motifs → CNN?
­  Is there a mix of static and dynamic features?
­  What does the output look like?
­  A class?
­  A sequence?
­  An image?
­  What are the performance measures? → Surrogate smooth objective functions
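The advice to monitor the train/validation gap can be sketched in plain Python. This is a minimal illustration, not part of the original slides: the function names, the patience value and the loss curves are all made up for the example.

```python
def generalisation_gaps(train_losses, val_losses):
    """Validation loss minus training loss, epoch by epoch."""
    return [v - t for t, v in zip(train_losses, val_losses)]


def early_stop(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a recorded validation curve.

    Training counts as stopped once `patience` epochs have passed
    without a new best validation loss (a common heuristic, assumed here).
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical curves: training loss keeps falling, validation loss turns up.
train = [1.00, 0.70, 0.50, 0.40, 0.30, 0.20, 0.10]
val = [1.00, 0.80, 0.70, 0.72, 0.75, 0.90, 1.10]
gaps = generalisation_gaps(train, val)
stop_epoch, best_epoch = early_stop(val)
```

A gap that keeps widening after `best_epoch` is the overfitting signature the slide warns about; keeping the weights from `best_epoch` is one simple response.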
STEPS
 Prepare a clean big dataset
 Design a suitable architecture → the main ART
 Choose an optimizer (sgd, momentum, adagrad, adadelta, rmsprop, adam)
 Normalise data (very important for fast training & a well-behaved learning curve)
 Shuffle data randomly (extremely important!)
 Run the optimizer
 Sit back & wait (in fact, spend the time monitoring convergence)
 Grid search if time permits (sometimes very important to get correct convergence!)
 Ensemble if time permits
 Reiterate if needed
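The "normalise" and "shuffle" steps can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the helper names are hypothetical, z-score scaling is one choice among several, and the statistics are fitted on training rows only so that nothing leaks from validation data.

```python
import random
import statistics


def fit_scaler(train_rows):
    """Per-column mean and standard deviation, computed on training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply the training-set statistics to any split (train, validation, test)."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def shuffled(rows, labels, seed=0):
    """Shuffle rows and labels together so each row keeps its label."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    return [rows[i] for i in order], [labels[i] for i in order]
```

Typical use would be `means, stds = fit_scaler(train_x)`, then `transform` on every split, then `shuffled(train_x, train_y)` before each pass over the data.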
THINGS TO TAKE CARE OF
 Data quality
 Leakage
­  Never touch validation data for feature engineering
­  Be aware of overlap between training/validation in time-sensitive data
 Memory limitation
 CPU/GPU time
 Always shuffle the data BEFORE training, to mix the labels
 Initialisation matters
 Dropout: almost always helps, normally with bigger models. But be careful with RNNs.
 Numerical overflow/underflow: exp of a large number, log of zero, or division by zero
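The overflow/underflow warning can be made concrete with the usual log-sum-exp trick. The sketch below uses only the standard library; the function names and the `eps` floor are illustrative choices, not something the slides prescribe.

```python
import math


def log_sum_exp(xs):
    """log(sum(exp(x))) computed by first subtracting the maximum."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def softmax(xs):
    """Softmax via log-sum-exp, so exp() never sees a large positive argument."""
    lse = log_sum_exp(xs)
    return [math.exp(x - lse) for x in xs]


def safe_log(p, eps=1e-12):
    """Logarithm with a small floor, avoiding log(0) in cross-entropy losses."""
    return math.log(max(p, eps))
```

For comparison, a naive `math.exp(1000.0)` raises `OverflowError`, whereas `softmax([1000.0, 1001.0, 999.0])` returns ordinary probabilities.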
 TEKsystems
dbta.com
APPLYING TO NON-COGNITIVE DOMAINS
­  Where humans need
extensive training to do well
­  Domains that demand
transparency &
interpretability.
Healthcare
Software engineering
Anomaly detection
http://www.bentoaktechnologies.com/Images/code_scrn.jpg
WHAT MAKES NON-COGNITIVE DOMAINS HARD?
 Great diversity, but datasets may be small
 High uncertainty, low-quality/missing data
 Reusable models do not usually exist
 However, at the end of the day, we need only a few generic things:
­  Vector -> DNN (e.g., highway net)
­  Sequence -> RNN (e.g., LSTM, GRU)
­  Repeated motifs -> CNN
­  Set -> attention (will visit in Part III)
­  Graphs -> Column Networks (will visit in Part III)
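As an illustration of the "Vector -> DNN (e.g., highway net)" entry, a single highway layer can be written in plain Python. This follows the standard formulation y = T(x) * H(x) + (1 - T(x)) * x with a tanh transform and a sigmoid gate; the helper names and the list-of-lists weight layout are assumptions made for the sketch.

```python
import math


def _sigmoid(z):
    """Logistic function written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _affine(W, b, x):
    """Compute W x + b for a weight matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, with H = tanh and T = sigmoid.

    W_h and W_t must be square so the output has the same size as x.
    """
    h = [math.tanh(z) for z in _affine(W_h, b_h, x)]
    t = [_sigmoid(z) for z in _affine(W_t, b_t, x)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

A strongly negative gate bias `b_t` makes the layer pass its input through almost unchanged, which is what lets many such layers be stacked.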
 TEKsystems
HEALTHCARE
INTEGRATED DATA VIEW OF MULTIPLE HOSPITAL SYSTEMS
MULTI DATA INPUT METHODS
FLEXIBLE TO CREATE, EASY TO USE
SECURE AND ACCESSIBLE, ANYWHERE
HOW DOES AI WORK FOR HEALTH?
Diagnosis, Prognosis, Efficiency, Discovery
http://hubpages.com/education/Top-Medical-Inventions-of-The-1950s, http://www.ctrr.net/journal/
HEALTHCARE: CHALLENGES + OPPORTUNITIES
 Long-term dependencies
 Irregular timing
 Mixture of discrete codes and continuous
measures
 Complex interaction of diseases and care
processes
 Cohort of interest can be small (e.g.,
<1K)
 Rich domain knowledge & ontologies
 May include textual notes
 May contain physiological signals (e.g., EEG/
ECG)
 May contain images (e.g., MRI, X-ray, retina)
 Genomics
 Detailed neuronal mapping (US) & simulation
(EU)
 New modalities: social media, wearable devices
THIS TUTORIAL WILL COVER:
 Electronic medical records (EMR)
•  Time-stamped
•  Coded data: diagnosis, procedure & medication
•  Text not considered, but in principle can be mapped into a vector using an LSTM
[Figure: timeline of visits/admissions separated by time gaps, ending at a prediction point marked "?"]
 Physiological measures in the Intensive Care Unit (ICU)
•  Records of the form <Time, Type, Value>
http://www.healthpages.org/brain-injury/brain-injury-intensive-care-unit-icu/
MEDICAL RECORDS: FEEDFORWARD
NETS
[Figure: the history of visits/admissions before the prediction point is fragmented into data segments ([0-3]m, [3-6]m, [6-12]m, [12-24]m, [24-48]m) and aggregated, then pooled and passed through hidden layers to assess future risk at horizons of 15, 30, 60, 120, 180 and 360 days.]
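The fragmentation and aggregation step can be sketched as follows. The segment boundaries are the ones on the slide; counting codes per segment is an assumed, simple form of aggregation, and the function name and event format are hypothetical.

```python
from collections import Counter

# History windows in months before the prediction point, as listed on the slide.
SEGMENTS = [(0, 3), (3, 6), (6, 12), (12, 24), (24, 48)]


def segment_features(events, vocab):
    """Turn (months_before_prediction, code) events into one count vector.

    The result concatenates, segment by segment, the count of every code
    in `vocab`. Events older than the last segment are ignored.
    """
    counts = [Counter() for _ in SEGMENTS]
    for months_ago, code in events:
        for k, (lo, hi) in enumerate(SEGMENTS):
            if lo <= months_ago < hi:
                counts[k][code] += 1
                break
    return [counts[k][code] for k in range(len(SEGMENTS)) for code in vocab]
```

The resulting fixed-length vector is the kind of input a feedforward net can consume, whatever the number of past admissions.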
SUICIDE RISK PREDICTION: MACHINE VERSUS
CLINICIAN
DEEPR: CNN FOR REPEATED MOTIFS AND
SHORT SEQUENCES (NGUYEN ET AL, J-BHI, 2016)
[Figure: Deepr pipeline. A medical record of visits/admissions, with time gaps and transfers, is sequenced into a sentence (one phrase per admission), embedded into word vectors, passed through convolution (motif detection) and max-pooling into a record vector, and used for the output prediction at the prediction point.]
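The sequencing step of this pipeline can be sketched as turning admissions into one "sentence" in which time gaps become words. The gap bins and token names below are assumptions chosen for illustration, as is the `(day_index, codes)` input format.

```python
def gap_token(days):
    """Map a gap in days to a coarse time-gap word (bin edges are assumed)."""
    months = days / 30.0
    if months <= 1:
        return "0-1m"
    if months <= 3:
        return "1-3m"
    if months <= 6:
        return "3-6m"
    if months <= 12:
        return "6-12m"
    return "12+m"


def record_to_sentence(admissions):
    """admissions: list of (day_index, [codes]), sorted by day_index.

    Each admission contributes its codes as one phrase; consecutive
    admissions are separated by a time-gap word.
    """
    tokens = []
    previous_day = None
    for day, codes in admissions:
        if previous_day is not None:
            tokens.append(gap_token(day - previous_day))
        tokens.extend(codes)
        previous_day = day
    return tokens
```

Once records are sentences, standard word-embedding and convolution machinery applies without hand-built features.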
DISEASE EMBEDDING &
MOTIFS DETECTION
E11 I48 I50
Type 2 diabetes mellitus
Atrial fibrillation and flutter
Heart failure
E11 I50 N17
Type 2 diabetes mellitus
Heart failure
Acute kidney failure
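Motif detection by convolution followed by max-pooling can be illustrated on code triples like the ones above. This is a toy sketch: the one-hot "embeddings", the hand-set filter and the function name are assumptions, whereas a trained model would learn both embeddings and filters.

```python
def conv_max_pool(embedded, filters):
    """One max-pooled ReLU activation per filter.

    embedded: list of token vectors (the embedded sentence).
    filters:  list of filters, each a list of `width` weight vectors.
    """
    pooled = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # ReLU floor: negative activations are clipped to zero
        for start in range(len(embedded) - width + 1):
            window = embedded[start:start + width]
            activation = sum(
                w * x
                for weights, vector in zip(filt, window)
                for w, x in zip(weights, vector)
            )
            best = max(best, activation)
        pooled.append(best)
    return pooled


# Toy one-hot embeddings for the four diagnosis codes shown above.
EMB = {
    "E11": [1, 0, 0, 0],
    "I48": [0, 1, 0, 0],
    "I50": [0, 0, 1, 0],
    "N17": [0, 0, 0, 1],
}
# A hand-set width-2 filter that responds most to "E11" directly followed by "I50".
FILTERS = [[EMB["E11"], EMB["I50"]]]
```

With this filter, the sequence E11 I50 N17 scores higher than E11 I48 I50, because only the former contains the two codes side by side.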
DEEPCARE: DYNAMICS
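[Figure: DeepCare's modified LSTM unit. Input, forget and output gates act multiplicatively (*) on the input, the previous memory and the memory. The unit combines history states, current data, the time gap, and the previous and current interventions to produce the current state; states are aggregated over time for prediction. The original figure marks which of these elements are new in DeepCare.]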
DEEPCARE:
STRUCTURE
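[Figure: DeepCare structure. Each admission (disease and intervention codes) is vector-embedded and, together with the time gap, fed to a chain of LSTM (long short-term memory) units running from history to future; the latent states go through multiscale pooling and a neural network that predicts future risks.]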
DEEPCARE: PREDICTION RESULTS
[Figure: two panels, intervention recommendation (precision@3) and unplanned readmission prediction (F-score), each reported at 12 months and 3 months.]
DEEPIC: MORTALITY PREDICTION IN
INTENSIVE CARE UNITS (WORK IN PROGRESS)
 Existing methods: LSTM with
missingness and time-gap as input.
 New method: Deepic
 Steps:
­ Measurement quantization
­ Time gap quantization
­ Sequencing words into sentence
­ CNN
http://www.healthpages.org/brain-injury/brain-injury-intensive-care-unit-icu/
Time,Parameter,Value
00:00,RecordID,132539
00:00,Age,54
00:00,Gender,0
00:00,Height,-1
00:00,ICUType,4
00:00,Weight,-1
00:07,GCS,15
00:07,HR,73
00:07,NIDiasABP,65
00:07,NIMAP,92.33
00:07,NISysABP,147
00:07,RespRate,19
00:07,Temp,35.1
00:07,Urine,900
00:37,HR,77
00:37,NIDiasABP,58
00:37,NIMAP,91
00:37,NISysABP,157
00:37,RespRate,19
00:37,Temp,35.6
00:37,Urine,60
Data: Physionet 2012
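The quantization and sequencing steps listed for Deepic can be sketched on rows like the sample above. The bin edges, the word format and the restriction to two parameters are all assumptions made for the example; in practice the edges would more plausibly come from training-set quantiles.

```python
import bisect

# Hypothetical bin edges per parameter, and for time gaps in minutes.
VALUE_BINS = {"HR": [60, 80, 100], "Temp": [36.0, 37.5]}
GAP_BINS = [15, 60]


def to_minutes(hhmm):
    """Convert an 'HH:MM' offset since admission into minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def quantize(value, edges):
    """Index of the bin that `value` falls into."""
    return bisect.bisect_right(edges, value)


def to_sentence(rows):
    """rows: (time, parameter, value) string triples in time order.

    Measurements become words such as 'HR_1'; a gap word such as 'gap_1'
    is emitted whenever the timestamp advances. Parameters without bins
    are skipped.
    """
    words, previous = [], None
    for time, parameter, value in rows:
        if parameter not in VALUE_BINS:
            continue
        now = to_minutes(time)
        if previous is not None and now > previous:
            words.append("gap_%d" % quantize(now - previous, GAP_BINS))
        words.append("%s_%d" % (parameter, quantize(float(value), VALUE_BINS[parameter])))
        previous = now
    return words
```

The resulting word sequence can then be embedded and passed to a CNN in the same way as a coded medical record.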
DEEPIC: SYMBOLIC & TIME GAP REPRESENTATION
OF DATA
[Figure: Deepic pipeline. Measurement points (measurements and time gaps) are discretized, sequenced into a sentence, embedded into word vectors, passed through convolution (motif detection) and max-pooling into a record vector, and used for the output prediction at the prediction point.]
http://www.bentoaktechnologies.com/Images/code_scrn.jpg
SOFTWARE ANALYTICS
DATA-DRIVEN SOFTWARE ENGINEERING
TOWARDS INTELLIGENT ASSISTANTS
 Goal: To model code, text, team, user, execution, project & enabled business process → answer any queries by developers, managers, users and business
 For now:
­ DeepSoft vision
­ LSTM for code language model
­ LD-RNN for report representation
­ Stacked/deep inference (later)
http://lifelong.engr.utexas.edu/images/course/swpm_b.jpg
ANALYTICS FOR AGILE SOFTWARE PROJECT
MANAGEMENT
http://www.solutionguidance.com/?page_id=1579
CHALLENGES: LONG-TERM TEMPORAL
DEPENDENCIES IN SOFTWARE
 Software is similar to an evolving organism
­ What will happen next to a software system depends heavily on what has previously been done to it.
­ E.g. the implementation of a functionality may constrain how other functionalities are implemented in the future.
­ E.g. a previous change (to fix a bug or add a new feature) may inject new bugs and lead to further changes.
­ E.g. refactoring a piece of code may have long-term benefits in future maintenance.
 Today’s software products undergo rapid cycles of development, testing and release
­ A software project typically has many releases.
­ A release requires the completion of some tasks (i.e. resolution of some issues).
­ An issue is described using natural language (raw data).
­ The resolution of an issue may result in code patches (raw data).
A DEEP LANGUAGE MODEL FOR
SOFTWARE CODE (DAM ET AL, FSE’16 SE+NL)
 A good language model for source code would capture the long-term
dependencies
 The model can be used for various prediction tasks, e.g. defect prediction, code
duplication, bug localization, etc.
 The model can be extended to model software and its development process.
Slide by Hoa Khanh Dam
CHARACTERISTICS OF SOFTWARE CODE
 Repetitiveness
­  E.g. for (int i = 0; i < n; i++)
 Localness
­  E.g. "for (int size" may appear more often than "for (int i" in some source files.
 Rich and explicit structural information
­  E.g. nested loops, inheritance hierarchies
 Long-term dependencies
­  try and catch (in Java), or file open and close, do not immediately follow each other.
Slide by Hoa Khanh Dam
CODE LANGUAGE MODEL
 Previous work has applied RNNs to model software code (White et al, MSR 2015)
 RNNs however do not capture the long-term dependencies in code
Slide by Hoa Khanh Dam
EXPERIMENTS
 Built dataset of 10 Java projects: Ant, Batik, Cassandra, Eclipse-E4, Log4J, Lucene, Maven2, Maven3, Xalan-J,
and Xerces.
 Comments and blank lines removed. Each source code file is tokenized to produce a sequence of code tokens.
­ Integers, real numbers, exponential notation, hexadecimal numbers replaced with <num> token, and
constant strings replaced with <str> token.
­ Replaced less “popular” tokens with <unk>
 Code corpus of 6,103,191 code tokens, with a vocabulary of 81,213 unique tokens.
Slide by Hoa Khanh Dam
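The preprocessing described for the code corpus (numbers to <num>, constant strings to <str>, rare tokens to <unk>) can be sketched with a small regular-expression tokenizer. The pattern, the function names and the `min_count` cut-off are illustrative assumptions, not the authors' actual tooling, and the tokenizer is deliberately much simpler than a real Java lexer.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r'''
    "(?:\\.|[^"\\])*"               # double-quoted string literal
  | 0[xX][0-9a-fA-F]+               # hexadecimal number
  | \d+\.?\d*(?:[eE][+-]?\d+)?      # integer, real or exponential notation
  | [A-Za-z_]\w*                    # identifier or keyword
  | \S                              # any other single symbol
''', re.VERBOSE)


def tokenize(line):
    """Split one line of code into tokens, masking numbers and strings."""
    tokens = []
    for tok in TOKEN_RE.findall(line):
        if tok.startswith('"'):
            tokens.append("<str>")
        elif tok[0].isdigit():
            tokens.append("<num>")
        else:
            tokens.append(tok)
    return tokens


def replace_rare(token_lists, min_count=2):
    """Replace tokens seen fewer than `min_count` times with <unk>."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    return [
        [tok if counts[tok] >= min_count else "<unk>" for tok in tokens]
        for tokens in token_lists
    ]
```

Masking literals and rare identifiers keeps the vocabulary manageable, which matters for a language model whose output layer grows with the number of unique tokens.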
EXPERIMENTS (CONT.)
 Both RNN and LSTM improve with more training data (whose size grows with sequence length).
 LSTM consistently performs better than RNN: an improvement of 4.7% to 27.2% (varying sequence length) and 10.7% to 37.9% (varying embedding size).
Slide by Hoa Khanh Dam
STORY POINT ESTIMATION
 Traditional estimation methods require experts, LOC or function points.
­  Not applicable early
­  Expensive
 Feature engineering is not easy!
 Need a cheap way to start from just the documentation.
LD-RNN FOR REPORT
REPRESENTATION
(CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
 LD = Long Deep
 LSTM for document representation
 Highway-net with tied parameters for
story point estimation
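[Figure: LD-RNN architecture. The words of an issue report (W1 ... W6; the example shown begins "Standardize XD logging to align with") are embedded and fed to an LSTM producing hidden states h1 ... h6, which are pooled into a document representation; a recurrent highway net followed by regression outputs the story point estimate.]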
RESULTS
MAE = Mean Absolute Error
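For reference, the metric can be computed as below; the function name and the example numbers are illustrative only.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and estimated values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical story points: actual vs. estimated.
example_mae = mean_absolute_error([3, 5, 8], [2, 5, 13])
```

Lower is better: an MAE of 2.0 means the estimates are off by two story points on average.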
TASK DEPENDENCY IN SOFTWARE PROJECT
(CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
TASK DEPENDENCY IN SOFTWARE
PROJECT (MORE ON PART III)
[Figure panels: Column networks; Stacked inference]
ANOMALY DETECTION
USING UNSUPERVISED LEARNING (PART III)
dbta.com
This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning
DETECTION METHODS
 Auto-encoder (deterministic): reconstruction error, or an external outlier detector?
 Restricted Boltzmann Machine (probabilistic): detection threshold on the free energy surface
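The RBM route can be sketched for the simplest case, a binary RBM, whose free energy is F(v) = -a.v - sum_j log(1 + exp(b_j + sum_i W[i][j] v_i)). Everything below is an assumption for illustration: the function names, the quantile rule for the threshold, and the restriction to binary units (a mixed-variate RBM has a different visible term).

```python
import math


def softplus(z):
    """log(1 + exp(z)) written so that large |z| does not overflow."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def free_energy(v, W, a, b):
    """Free energy of a binary RBM; W[i][j] links visible i to hidden j."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(
        softplus(bj + sum(W[i][j] * v[i] for i in range(len(v))))
        for j, bj in enumerate(b)
    )
    return -visible_term - hidden_term


def threshold_from_train(energies, quantile=0.95):
    """Pick the detection threshold as a quantile of training free energies."""
    ranked = sorted(energies)
    k = min(len(ranked) - 1, int(quantile * len(ranked)))
    return ranked[k]


def is_anomaly(v, W, a, b, threshold):
    """Flag a point whose free energy exceeds the threshold (low probability)."""
    return free_energy(v, W, a, b) > threshold
```

High free energy corresponds to low unnormalised probability, so thresholding it avoids computing the partition function.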
MIXED DATA
MIXED-VARIATE RBM (TRAN ET AL, 2011)
RESULTS OVER REAL DATASETS
ABNORMALITY
ACROSS
ABSTRACTIONS
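[Figure: anomaly scores F1(x1), F2(x2) and F3(x3) are computed at three levels of abstraction (Mv.RBM, Mv.DBN-L2, Mv.DBN-L3), converted to Rank 1, Rank 2 and Rank 3, and combined by rank aggregation. Weight labels in the figure: WA1, WA2, WD1, WD2, WD3.]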
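Rank aggregation across abstraction levels can be illustrated with a simple mean-rank rule. This is a swapped-in, assumed aggregator chosen for brevity; the slides do not say which aggregation function is used, and the names and scores below are hypothetical.

```python
def ranks(scores):
    """Rank 1 = most anomalous (highest score); ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def aggregate_ranks(score_lists):
    """Mean rank of each item across several scoring levels."""
    per_level = [ranks(scores) for scores in score_lists]
    n_items = len(score_lists[0])
    return [
        sum(level[i] for level in per_level) / len(per_level)
        for i in range(n_items)
    ]


# Hypothetical anomaly scores for three items at three abstraction levels.
F1 = [0.9, 0.1, 0.5]
F2 = [0.8, 0.2, 0.3]
F3 = [0.1, 0.2, 0.9]
mean_ranks = aggregate_ranks([F1, F2, F3])
```

Working with ranks rather than raw scores sidesteps the fact that scores from different levels live on different scales.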
MALICIOUS URL CLASSIFICATION
http://www.indiainfoline.com/article/news-sector-information-technology/india-ranks-4th-in-highest-users-who-clicked-malicious-urls-in-2015-trend-micro-116052700684_1.html
MODEL OF MALICIOUS URLS
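[Figure: the URL is read character by character (e.g., "h t t p : / / w w w . s"); each character is embedded into a char vector (the embedding may be one-hot), convolution performs motif detection, max-pooling produces a record vector, and a feedforward network (FFN) predicts Safe/Unsafe.]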
Train on 900K malicious URLs
1,000K good URLs
Accuracy: 96%
No feature engineering!
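The character-level input with a one-hot embedding can be sketched as follows. The alphabet, the shared "unknown" slot, the padding scheme and `max_len` are assumptions for the example; the slides only state that the embedding may be one-hot.

```python
import string

# Assumed alphabet; any other character shares one extra "unknown" slot.
ALPHABET = string.ascii_lowercase + string.digits + ":/.-_?=&%"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
UNKNOWN = len(ALPHABET)
WIDTH = len(ALPHABET) + 1


def one_hot_url(url, max_len=64):
    """Encode a URL as a max_len x WIDTH matrix of 0/1 values.

    URLs longer than max_len are truncated; shorter ones are padded with
    all-zero rows so every input has the same shape.
    """
    rows = []
    for ch in url.lower()[:max_len]:
        row = [0] * WIDTH
        row[CHAR_INDEX.get(ch, UNKNOWN)] = 1
        rows.append(row)
    rows.extend([0] * WIDTH for _ in range(max_len - len(rows)))
    return rows
```

A matrix of this shape is what a 1-D convolution over characters would consume, which is how the model gets by without hand-built URL features.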
SUMMARY OF PART II
 Hands-on:
­ Introducing programming frameworks (Theano, TensorFlow, MXNet)
 Domains how-to:
­ Healthcare
­ Software engineering
­ Anomaly detection
https://duroosullughatilarabiyyah.files.wordpress.com/2010/07/qa.jpg

Mais conteúdo relacionado

Mais procurados

Artificial intelligence and cognitive modeling have the same problem chapter2
Artificial intelligence and cognitive modeling have the same problem   chapter2Artificial intelligence and cognitive modeling have the same problem   chapter2
Artificial intelligence and cognitive modeling have the same problem chapter2
sabdegul
 

Mais procurados (20)

Deep Learning Explained
Deep Learning ExplainedDeep Learning Explained
Deep Learning Explained
 
Deep learning ppt
Deep learning pptDeep learning ppt
Deep learning ppt
 
3234150
32341503234150
3234150
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
Artificial intelligence and cognitive modeling have the same problem chapter2
Artificial intelligence and cognitive modeling have the same problem   chapter2Artificial intelligence and cognitive modeling have the same problem   chapter2
Artificial intelligence and cognitive modeling have the same problem chapter2
 
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONEXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
 
Soft computing
Soft computingSoft computing
Soft computing
 
03
0303
03
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and Understanding
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
 
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Bhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogueBhadale group of companies ai neural networks and algorithms catalogue
Bhadale group of companies ai neural networks and algorithms catalogue
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Looking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic WebLooking for Commonsense in the Semantic Web
Looking for Commonsense in the Semantic Web
 
State-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domainsState-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domains
 

Semelhante a Deep learning and applications in non-cognitive domains II

Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
Bhupesh Bansal
 
Jorge Galindo - Resume
Jorge Galindo - ResumeJorge Galindo - Resume
Jorge Galindo - Resume
Jorge Galindo
 
Software engineering
Software engineeringSoftware engineering
Software engineering
Fahe Em
 
Software engineering
Software engineeringSoftware engineering
Software engineering
Fahe Em
 
TidalScale Overview
TidalScale OverviewTidalScale Overview
TidalScale Overview
Pete Jarvis
 
Chandan's_Resume
Chandan's_ResumeChandan's_Resume
Chandan's_Resume
Chandan Das
 

Semelhante a Deep learning and applications in non-cognitive domains II (20)

Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
 
Empowering Transformational Science
Empowering Transformational ScienceEmpowering Transformational Science
Empowering Transformational Science
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in Finance
 
LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19
 
Changing the tires on a big data racecar
Changing the tires on a big data racecarChanging the tires on a big data racecar
Changing the tires on a big data racecar
 
The Seven Main Challenges of an Early Warning System Architecture
The Seven Main Challenges of an Early Warning System ArchitectureThe Seven Main Challenges of an Early Warning System Architecture
The Seven Main Challenges of an Early Warning System Architecture
 
Jorge Galindo - Resume
Jorge Galindo - ResumeJorge Galindo - Resume
Jorge Galindo - Resume
 
Designing for the Cloud Tutorial - QCon SF 2009
Designing for the Cloud Tutorial - QCon SF 2009Designing for the Cloud Tutorial - QCon SF 2009
Designing for the Cloud Tutorial - QCon SF 2009
 
Deepak_Tripathi_Resume
Deepak_Tripathi_ResumeDeepak_Tripathi_Resume
Deepak_Tripathi_Resume
 
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for BeginnersThe Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
The Open Source and Cloud Part of Oracle Big Data Cloud Service for Beginners
 
How to Effectively Migrate Data From Legacy Apps
How to Effectively Migrate Data From Legacy AppsHow to Effectively Migrate Data From Legacy Apps
How to Effectively Migrate Data From Legacy Apps
 
Software engineering
Software engineeringSoftware engineering
Software engineering
 
Software engineering
Software engineeringSoftware engineering
Software engineering
 
Big Data - An Overview
Big Data -  An OverviewBig Data -  An Overview
Big Data - An Overview
 
TidalScale Overview
TidalScale OverviewTidalScale Overview
TidalScale Overview
 
Keynote: Building Tomorrow's Ceph - Ceph Day Frankfurt
Keynote: Building Tomorrow's Ceph - Ceph Day Frankfurt Keynote: Building Tomorrow's Ceph - Ceph Day Frankfurt
Keynote: Building Tomorrow's Ceph - Ceph Day Frankfurt
 
HPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and WorkflowsHPC Storage and IO Trends and Workflows
HPC Storage and IO Trends and Workflows
 
Semantics in Sensor Networks
Semantics in Sensor NetworksSemantics in Sensor Networks
Semantics in Sensor Networks
 
2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research2014-10-10-SBC361-Reproducible research
2014-10-10-SBC361-Reproducible research
 
Chandan's_Resume
Chandan's_ResumeChandan's_Resume
Chandan's_Resume
 

Mais de Deakin University

Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of Materials
Deakin University
 

Mais de Deakin University (20)

Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Deep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advancesDeep learning and reasoning: Recent advances
Deep learning and reasoning: Recent advances
 
AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...AI for automated materials discovery via learning to represent, predict, gene...
AI for automated materials discovery via learning to represent, predict, gene...
 
Deep analytics via learning to reason
Deep analytics via learning to reasonDeep analytics via learning to reason
Deep analytics via learning to reason
 
Generative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of MaterialsGenerative AI to Accelerate Discovery of Materials
Generative AI to Accelerate Discovery of Materials
 
Generative AI: Shifting the AI Landscape
Generative AI: Shifting the AI LandscapeGenerative AI: Shifting the AI Landscape
Generative AI: Shifting the AI Landscape
 
From deep learning to deep reasoning
From deep learning to deep reasoningFrom deep learning to deep reasoning
From deep learning to deep reasoning
 
Machine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug DiscoveryMachine Learning and Reasoning for Drug Discovery
Machine Learning and Reasoning for Drug Discovery
 
Deep Learning 2.0
Deep Learning 2.0Deep Learning 2.0
Deep Learning 2.0
 
Machine reasoning
Machine reasoningMachine reasoning
Machine reasoning
 
AI/ML as an empirical science
AI/ML as an empirical scienceAI/ML as an empirical science
AI/ML as an empirical science
 
Machine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin UniversityMachine Reasoning at A2I2, Deakin University
Machine Reasoning at A2I2, Deakin University
 
AI in the Covid-19 pandemic
AI in the Covid-19 pandemicAI in the Covid-19 pandemic
AI in the Covid-19 pandemic
 
Visual reasoning
Visual reasoningVisual reasoning
Visual reasoning
 
AI for tackling climate change
AI for tackling climate changeAI for tackling climate change
AI for tackling climate change
 
AI for drug discovery
AI for drug discoveryAI for drug discovery
AI for drug discovery
 
Deep learning for episodic interventional data
Deep learning for episodic interventional dataDeep learning for episodic interventional data
Deep learning for episodic interventional data
 
Deep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining I
 
Deep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining IIDeep learning for biomedical discovery and data mining II
Deep learning for biomedical discovery and data mining II
 
AI that/for matters
AI that/for mattersAI that/for matters
AI that/for matters
 

Último

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
shivangimorya083
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
shambhavirathore45
 

Último (20)

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

Deep learning and applications in non-cognitive domains II

  • 1. DEEP LEARNING & APPLICATIONS IN NON-COGNITIVE DOMAINS PART II: PRACTICE 5/12/16 1 Truyen Tran Deakin University truyen.tran@deakin.edu.au prada-research.net/~truyen AusDM’16, Canberra, Dec 7th 2016
  • 2. dbta.com PART II: PRACTICE APPLYING DEEP LEARNING TO NON-COGNITIVE DOMAINS  Hand-on: ­ Introducing programming frameworks (Theano, TensorFlow, Mxnet)  Domains how-to: ­ Healthcare ­ Software engineering ­ Anomaly detection 5/12/16 2
  • 3. THEANO & TENSORFLOW  Two most popular frameworks at present. Both in Python.  Theano ­  Academic-driven. Pioneer. ­  Symbolic computation à can be tricky to debug ­  Wrapper: Lasagne, Keras  TensorFlow ­  Google à Native distributed computing support ­  A lot of support, huge community ­  Slightly bigger/messier code ­  Linux/Mac only but VirtualBox will help in Windows ­  Wrapper: Keras 5/12/16 3
  • 4. √ Excellent support for many languages √ Fast, portable √ Intuitive syntax √ Recent choice by AWS 5/12/16 4 http://bickson.blogspot.com.au/2016/02/mxnet-vs-tensorflow.html https://github.com/dmlc/mxnet
  • 5. BUILDING A MODEL
      - Everything is a computational graph
      - From here to there is a tensor
      - So simple stacking is fine (the idea behind Keras)
      - Fit small datasets first to test the water
          - But be cautious: results on small data do not always generalize
      - Always monitor the gap between training and validation performance: a small gap with poor performance on both suggests underfitting; a large, widening gap indicates overfitting.
      - Check the model assumptions
          - Is the input just a vector → FNN?
          - Is it a regular sequence → RNN?
          - Are there repeated motifs → CNN?
          - Is there a mix of static and dynamic features?
          - What does the output look like?
              - A class?
              - A sequence?
              - An image?
          - What are the performance measures? → Surrogate smooth objective functions
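One way to make "monitor the gap" concrete is to track validation loss minus training loss per epoch and flag the point where it keeps growing. The sketch below is only an illustration: the function name `first_widening_epoch` and the `patience` threshold are assumptions, not something taken from the slides.

```python
def first_widening_epoch(train_losses, val_losses, patience=3):
    """Return the first epoch index at which the validation-minus-training
    gap has grown for `patience` consecutive epochs, or None if it never does.

    `patience` is a hypothetical knob; pick it to suit the noise in your curves.
    """
    gaps = [v - t for t, v in zip(train_losses, val_losses)]
    streak = 0
    for epoch in range(1, len(gaps)):
        if gaps[epoch] > gaps[epoch - 1]:
            streak += 1
            if streak >= patience:
                return epoch
        else:
            streak = 0
    return None


# Toy curves: training loss keeps falling while validation loss turns upward.
train = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
val = [1.1, 0.9, 0.8, 0.8, 0.85, 0.9]
print(first_widening_epoch(train, val))
```

A returned epoch is a hint to stop early or add regularisation; a `None` result with high loss on both curves points towards underfitting instead.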
  • 6. STEPS
      - Prepare a clean, big dataset
      - Design a suitable architecture → the main ART
      - Choose an optimizer (sgd, momentum, adagrad, adadelta, rmsprop, adam)
      - Normalise data (very important for fast training & a well-behaved learning curve)
      - Shuffle data randomly (extremely important!)
      - Run the optimizer
      - Sit back & wait (in fact, spend the time monitoring convergence)
      - Grid search if time permits (sometimes very important for getting correct convergence!)
      - Ensemble if time permits
      - Iterate again if needed
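The normalise and shuffle steps can be sketched with the standard library alone. This is a minimal illustration, assuming z-score normalisation per feature column; the helper names `standardize_columns` and `shuffle_together` are made up for this example.

```python
import random
import statistics


def standardize_columns(rows):
    """Z-score each feature column: subtract its mean, divide by its std."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def shuffle_together(features, labels, seed=0):
    """Shuffle features and labels with the same permutation."""
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    return [features[i] for i in order], [labels[i] for i in order]


rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
labels = ["a", "b", "c"]
scaled = standardize_columns(rows)
shuffled_x, shuffled_y = shuffle_together(scaled, labels, seed=1)
```

In practice the means and standard deviations would be computed on the training split only and then reused for validation data, in line with the leakage warning on the next slide.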
  • 7. THINGS TO TAKE CARE OF
      - Data quality
      - Leakage
          - Never touch validation data for feature engineering
          - Be aware of overlap between training and validation sets in time-sensitive data
      - Memory limitations
      - CPU/GPU time
      - Always shuffle the data BEFORE training to mix the labels
      - Initialisation matters
      - Dropout: almost always helps, normally with bigger models. But be careful with RNNs.
      - Numerical overflow/underflow: exp of a large number, log of zero, division by zero
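The overflow/underflow item is commonly handled by shifting scores before exponentiating and by clamping arguments to log. The sketch below shows both tricks; `stable_softmax`, `safe_log` and the `eps` value are illustrative choices rather than anything prescribed by the slides.

```python
import math


def stable_softmax(scores):
    """Softmax that subtracts the maximum score first.

    math.exp(1000.0) raises OverflowError, but after the shift the largest
    exponent is exp(0) = 1, so the computation stays in range.
    """
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def safe_log(p, eps=1e-12):
    """Log with the argument clamped away from zero (eps is an assumed floor)."""
    return math.log(max(p, eps))


probs = stable_softmax([1000.0, 1000.0])
loss = -safe_log(probs[0])
```

The shift does not change the result mathematically, because the common factor cancels between numerator and denominator.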
  • 8. APPLYING TO NON-COGNITIVE DOMAINS
      - Where humans need extensive training to do well
      - Domains that demand transparency & interpretability
      - Healthcare, software engineering, anomaly detection
      - Image sources: TEKsystems, dbta.com, http://www.bentoaktechnologies.com/Images/code_scrn.jpg
  • 9. WHAT MAKES NON-COGNITIVE DOMAINS HARD?
      - Great diversity, but datasets may be small
      - High uncertainty, low-quality/missing data
      - Reusable models do not usually exist
      - However, at the end of the day, we need only a few generic things:
          - Vector -> DNN (e.g., highway net)
          - Sequence -> RNN (e.g., LSTM, GRU)
          - Repeated motifs -> CNN
          - Set -> attention (covered in Part III)
          - Graphs -> Column Networks (covered in Part III)
  • 10. HEALTHCARE (image source: TEKsystems; image text: integrated data view of multiple hospital systems; multi data input methods; flexible to create, easy to use; secure and accessible, anywhere)
  • 11. HOW DOES AI WORK FOR HEALTH? Diagnosis, prognosis, efficiency, discovery. Image sources: http://hubpages.com/education/Top-Medical-Inventions-of-The-1950s, http://www.ctrr.net/journal/
  • 12. HEALTHCARE: CHALLENGES + OPPORTUNITIES
      - Long-term dependencies
      - Irregular timing
      - Mixture of discrete codes and continuous measures
      - Complex interaction of diseases and care processes
      - Cohort of interest can be small (e.g., <1K)
      - Rich domain knowledge & ontologies
      - May include textual notes
      - May contain physiological signals (e.g., EEG/ECG)
      - May contain images (e.g., MRI, X-ray, retina)
      - Genomics
      - Detailed neuronal mapping (US) & simulation (EU)
      - New modalities: social media, wearable devices
  • 13. THIS TUTORIAL WILL COVER:
      - Electronic medical records (EMR)
          - Time-stamped
          - Coded data: diagnosis, procedure & medication
          - Text not considered, but in principle it can be mapped into a vector using an LSTM
          - [Figure: visits/admissions separated by time gaps, leading up to a prediction point]
      - Physiological measures in the Intensive Care Unit (ICU): <Time, Type, Value>
      - Image source: http://www.healthpages.org/brain-injury/brain-injury-intensive-care-unit-icu/
  • 14. MEDICAL RECORDS: FEEDFORWARD NETS
      [Figure: history of visits/admissions separated by time gaps, up to a prediction point, followed by future assessment; time labels 15, 30, 60, 120, 180 and 360 days. The history goes through fragmentation + aggregation into data segments [0-3]m, [3-6]m, [6-12]m, [12-24]m, [24-48]m, then pooling and hidden layers.]
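The fragmentation + aggregation idea can be sketched as binning coded events by how long before the prediction point they occurred, then counting codes per segment. The segment boundaries come from the slide; aggregating by simple counts, the function name `segment_features` and the toy events are assumptions for illustration.

```python
from collections import Counter

# Segment boundaries in months before the prediction point, as on the slide.
SEGMENTS_MONTHS = [(0, 3), (3, 6), (6, 12), (12, 24), (24, 48)]


def segment_features(events, vocabulary):
    """Turn (months_before_prediction, code) events into one flat count vector.

    The vector has len(SEGMENTS_MONTHS) * len(vocabulary) entries; events
    older than the last segment are ignored.
    """
    counts = [Counter() for _ in SEGMENTS_MONTHS]
    for months_back, code in events:
        for idx, (start, end) in enumerate(SEGMENTS_MONTHS):
            if start <= months_back < end:
                counts[idx][code] += 1
                break
    return [counts[idx][code]
            for idx in range(len(SEGMENTS_MONTHS))
            for code in vocabulary]


events = [(1, "E11"), (2, "E11"), (5, "I50"), (30, "E11"), (60, "I50")]
features = segment_features(events, ["E11", "I50"])
```

The resulting fixed-length vector is what a feedforward net can consume, regardless of how many visits a patient had.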
  • 15. SUICIDE RISK PREDICTION: MACHINE VERSUS CLINICIAN
  • 16. DEEPR: CNN FOR REPEATED MOTIFS AND SHORT SEQUENCES (NGUYEN ET AL, J-BHI, 2016)
      [Figure, bottom to top: medical record (visits/admissions, time gaps/transfer) → sequencing (phrase/admission) → embedding (word vector) → convolution (motif detection) → max-pooling (record vector) → output (prediction at the prediction point). Steps are numbered 1 to 5.]
  • 17. DISEASE EMBEDDING & MOTIFS DETECTION
      - E11, I48, I50: Type 2 diabetes mellitus; Atrial fibrillation and flutter; Heart failure
      - E11, I50, N17: Type 2 diabetes mellitus; Heart failure; Acute kidney failure
  • 18. DEEPCARE: DYNAMICS
      [Figure labels: memory, input gate, forget gate, output gate, previous memory, input; previous intervention, history states, current data, time gap, current intervention, current state; aggregation over time → prediction. Annotation: "New in DeepCare".]
  • 19. DEEPCARE: STRUCTURE
      [Figure labels: admission (disease, intervention) and time gap → vector embedding → LSTM (long short-term memory) at each admission, giving latent states over the history → multiscale pooling → neural network → future risks.]
  • 20. DEEPCARE: PREDICTION RESULTS
      [Figure panels: intervention recommendation (precision@3) and unplanned readmission prediction (F-score), each shown for 12 months and 3 months.]
  • 21. DEEPIC: MORTALITY PREDICTION IN INTENSIVE CARE UNITS (WORK IN PROGRESS)
      - Existing methods: LSTM with missingness and time gap as input.
      - New method: Deepic
      - Steps:
          - Measurement quantization
          - Time gap quantization
          - Sequencing words into a sentence
          - CNN
      - Data: PhysioNet 2012. Sample record:
          Time,Parameter,Value
          00:00,RecordID,132539
          00:00,Age,54
          00:00,Gender,0
          00:00,Height,-1
          00:00,ICUType,4
          00:00,Weight,-1
          00:07,GCS,15
          00:07,HR,73
          00:07,NIDiasABP,65
          00:07,NIMAP,92.33
          00:07,NISysABP,147
          00:07,RespRate,19
          00:07,Temp,35.1
          00:07,Urine,900
          00:37,HR,77
          00:37,NIDiasABP,58
          00:37,NIMAP,91
          00:37,NISysABP,157
          00:37,RespRate,19
          00:37,Temp,35.6
          00:37,Urine,60
      - Image source: http://www.healthpages.org/brain-injury/brain-injury-intensive-care-unit-icu/
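The first three Deepic steps (quantize measurements, quantize time gaps, string the resulting words into a sentence) can be sketched as follows. The bin edges, the word format such as `HR_1` and `GAP_1`, and all function names are hypothetical; the slides do not specify them. The rows reuse values from the sample record above.

```python
import bisect

# Hypothetical bin edges; real edges would be chosen from the training data.
VALUE_EDGES = {"HR": [60, 100], "Temp": [36.0, 38.0]}
GAP_EDGES_MINUTES = [15, 60]


def to_minutes(hhmm):
    """Convert an 'HH:MM' offset since admission into minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def quantize(value, edges):
    """Return the index of the bin that `value` falls into."""
    return bisect.bisect_right(edges, value)


def record_to_sentence(rows):
    """Map time-ordered (time, parameter, value) rows to a list of words."""
    words = []
    previous = None
    for time_str, parameter, value in rows:
        if parameter not in VALUE_EDGES:
            continue  # skip parameters without configured bins
        now = to_minutes(time_str)
        if previous is not None and now > previous:
            words.append("GAP_%d" % quantize(now - previous, GAP_EDGES_MINUTES))
        words.append("%s_%d" % (parameter, quantize(value, VALUE_EDGES[parameter])))
        previous = now
    return words


rows = [("00:07", "HR", 73), ("00:07", "Temp", 35.1),
        ("00:37", "HR", 77), ("00:37", "Temp", 35.6)]
sentence = record_to_sentence(rows)
```

The word sequence can then be embedded and fed to a CNN in the same way as a sentence of text.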
  • 22. DEEPIC: SYMBOLIC & TIME GAP REPRESENTATION OF DATA
      [Figure, bottom to top: measurement points (time gaps, measurements) → discretization → sequencing → embedding (word vector) → convolution (motif detection) → max-pooling (record vector) → output (prediction at the prediction point). Steps are numbered 1 to 5.]
  • 24. TOWARDS INTELLIGENT ASSISTANTS
      - Goal: to model code, text, team, user, execution, project & enabled business process → answer any queries by developers, managers, users and business
      - For now:
          - DeepSoft vision
          - LSTM for code language model
          - LD-RNN for report representation
          - Stacked/deep inference (later)
      - Image source: http://lifelong.engr.utexas.edu/images/course/swpm_b.jpg
  • 25. ANALYTICS FOR AGILE SOFTWARE PROJECT MANAGEMENT (image source: http://www.solutionguidance.com/?page_id=1579)
  • 26. CHALLENGES: LONG-TERM TEMPORAL DEPENDENCIES IN SOFTWARE
      - Software is similar to an evolving organism
          - What will happen next to a software system depends heavily on what has previously been done to it.
          - E.g. the implementation of a functionality may constrain how other functionalities are implemented in the future.
          - E.g. a previous change (to fix a bug or add a new feature) may inject new bugs and lead to further changes.
          - E.g. refactoring a piece of code may have long-term benefits in future maintenance.
      - Today’s software products undergo rapid cycles of development, testing and release
          - A software project typically has many releases.
          - A release requires the completion of some tasks (i.e. resolution of some issues).
          - An issue is described using natural language (raw data).
          - The resolution of an issue may result in code patches (raw data).
  • 27. A DEEP LANGUAGE MODEL FOR SOFTWARE CODE (DAM ET AL, FSE’16 SE+NL)
      - A good language model for source code would capture the long-term dependencies.
      - The model can be used for various prediction tasks, e.g. defect prediction, code duplication, bug localization, etc.
      - The model can be extended to model software and its development process.
      - Slide by Hoa Khanh Dam
  • 28. CHARACTERISTICS OF SOFTWARE CODE
      - Repetitiveness
          - E.g. "for (int i = 0; i < n; i++)"
      - Localness
          - E.g. "for (int size" may appear more often than "for (int i" in some source files.
      - Rich and explicit structural information
          - E.g. nested loops, inheritance hierarchies
      - Long-term dependencies
          - E.g. try and catch (in Java), or file open and close, do not immediately follow each other.
      - Slide by Hoa Khanh Dam
  • 29. CODE LANGUAGE MODEL
      - Previous work has applied RNNs to model software code (White et al, MSR 2015)
      - Standard RNNs, however, do not capture the long-term dependencies in code
      - Slide by Hoa Khanh Dam
  • 30. EXPERIMENTS
      - Built a dataset of 10 Java projects: Ant, Batik, Cassandra, Eclipse-E4, Log4J, Lucene, Maven2, Maven3, Xalan-J, and Xerces.
      - Comments and blank lines removed. Each source code file is tokenized to produce a sequence of code tokens.
          - Integers, real numbers, exponential notation and hexadecimal numbers replaced with a <num> token, and constant strings replaced with a <str> token.
          - Less “popular” tokens replaced with <unk>.
      - Code corpus of 6,103,191 code tokens, with a vocabulary of 81,213 unique tokens.
      - Slide by Hoa Khanh Dam
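The token normalisation described above can be approximated with a regular expression. This is a deliberately simplified sketch, not the authors' pipeline: it does not strip comments, it splits operators into single characters, and the `min_count` threshold for "less popular" tokens is an assumed parameter.

```python
import re
from collections import Counter

TOKEN_PATTERN = re.compile(r'''
    "(?:\\.|[^"\\])*"                # double-quoted string literal
  | 0[xX][0-9a-fA-F]+                # hexadecimal number
  | \d+\.?\d*(?:[eE][+-]?\d+)?       # integer, real or exponential notation
  | [A-Za-z_]\w*                     # identifier or keyword
  | \S                               # any other single non-space character
''', re.VERBOSE)


def tokenize(source):
    """Split source text into tokens, mapping literals to <str> and <num>."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(source):
        text = match.group(0)
        if len(text) >= 2 and text.startswith('"'):
            tokens.append("<str>")
        elif text[0] in "0123456789":
            tokens.append("<num>")
        else:
            tokens.append(text)
    return tokens


def replace_rare(token_sequences, min_count=2):
    """Replace tokens seen fewer than `min_count` times overall with <unk>."""
    counts = Counter(tok for seq in token_sequences for tok in seq)
    return [[tok if counts[tok] >= min_count else "<unk>" for tok in seq]
            for seq in token_sequences]


loop_tokens = tokenize('for (int i = 0; i < n; i++)')
literal_tokens = tokenize('x = 0x1F + 1.5e-3 + "hi";')
```

Collapsing literals and rare tokens keeps the vocabulary manageable, which matters for a corpus with tens of thousands of unique tokens.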
  • 31. EXPERIMENTS (CONT.)
      - Both RNN and LSTM improve with more training data (whose size grows with sequence length).
      - LSTM consistently performs better than RNN: improvements of 4.7% to 27.2% (varying sequence length) and 10.7% to 37.9% (varying embedding size).
      - Slide by Hoa Khanh Dam
  • 32. STORY POINT ESTIMATION
      - Traditional estimation methods require experts, LOC or function points.
          - Not applicable early
          - Expensive
      - Feature engineering is not easy!
      - We need a cheap way to start from just the documentation.
  • 33. LD-RNN FOR REPORT REPRESENTATION (CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
      - LD = Long Deep
      - LSTM for document representation
      - Highway net with tied parameters for story point estimation
      - [Figure: words W1 to W6 of the example text "Standardize XD logging to align with" → Embed → LSTM (h1 to h6) → pooling → document representation → Recurrent Highway Net → Regression → story point estimate.]
  • 34. RESULTS (MAE = Mean Absolute Error)
  • 35. TASK DEPENDENCY IN SOFTWARE PROJECT (CHOETKIERTIKUL ET AL, WORK IN PROGRESS)
  • 36. TASK DEPENDENCY IN SOFTWARE PROJECT (MORE ON PART III): column networks, stacked inference
  • 37. ANOMALY DETECTION USING UNSUPERVISED LEARNING (PART III). Image source: dbta.com. This work is partially supported by the Telstra-Deakin Centre of Excellence in Big Data and Machine Learning.
  • 40. MIXED-VARIATE RBM (TRAN ET AL, 2011)
  • 41. RESULTS OVER REAL DATASETS
  • 42. ABNORMALITY ACROSS ABSTRACTIONS
      [Figure labels: models Mv.RBM, Mv.DBN-L2, Mv.DBN-L3; scores F1(x1), F2(x2), F3(x3) giving Rank 1, Rank 2, Rank 3, combined by rank aggregation; weights WA1, WA1, WA2, WD1, WD2, WD3.]
  • 45. MODEL OF MALICIOUS URLS
      - [Figure, bottom to top: URL characters (h t t p : / / w w w . s) → embedding into char vectors (may be one-hot) → convolution (motif detection) → max-pooling (record vector) → prediction with FFN → Safe/Unsafe. Steps are numbered 1 to 4.]
      - Trained on 900K malicious URLs and 1,000K good URLs
      - Accuracy: 96%
      - No feature engineering!
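The input side of such a character-level model can be sketched as mapping each URL to a fixed-length sequence of character indices, optionally expanded to one-hot vectors. The alphabet, the `PAD`/`UNK` indices, `max_len` and the function names are assumptions made for this example; the slide only states that the embedding may be one-hot.

```python
import string

PAD, UNK = 0, 1  # reserved indices for padding and out-of-alphabet characters
ALPHABET = string.ascii_lowercase + string.digits + ":/.-_?=&%"
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_url(url, max_len=16):
    """Lower-case, truncate to max_len, map to indices, then right-pad."""
    indices = [CHAR_TO_INDEX.get(ch, UNK) for ch in url.lower()[:max_len]]
    return indices + [PAD] * (max_len - len(indices))


def one_hot(indices, depth=len(ALPHABET) + 2):
    """Expand each index into a one-hot vector of length `depth`."""
    return [[1 if j == idx else 0 for j in range(depth)] for idx in indices]


codes = encode_url("http://www.s", max_len=14)
vectors = one_hot(codes)
```

These character vectors are what the convolution layer scans for motifs, which is why no hand-crafted URL features are needed.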
  • 46. SUMMARY OF PART II
      - Hands-on: introducing programming frameworks (Theano, TensorFlow, MXNet)
      - Domains how-to:
          - Healthcare
          - Software engineering
          - Anomaly detection