1
Thesis Defense for the degree of
Doctor of Philosophy
Electrical and Computer Engineering
Carnegie Mellon University
Jiang Zhu
jiang.zhu@sv.cmu.edu
Thesis Committee
Prof. Joy Zhang, Chair
Prof. Jason Hong
Prof. Patrick Tague
Dr. Fabio Maino, Cisco Research
2
•  Data collected from physical and soft sensors
•  Identify various behavioral factors from sensor streams
•  Build behavioral models to generate quantifiable metrics
•  Apply these models to a mobile security application: SenSec
Study the fundamental scientific problem of modeling a mobile user’s behavior from heterogeneous sensory time-series
3
4
[Figure: two bar charts comparing North America and Global for 2011 and 2011-2016. Tablets (axis 0 to 300): 15.5, 58.3, 33.6, 256.7. Smartphones (axis 0 to 2500): 138, 253, 698, 2027.]
•  Cisco Visual Networking Index VNI 2012, Cisco Systems Inc. 2012
Smartphone Tablets 3-8
2-4 2016
5
[Figure: Mobile device loss or theft (bar chart, vertical axis 0% to 60%)]
Strategy One survey conducted among a U.S. sample of 3,017 adults aged 18 years and older, September 21-28, 2010, with an oversample in the top 20 cities (based on population).
•  “The 329 organizations polled had collectively lost more than 86,000 devices … with average cost of lost data at $49,246 per device, worth $2.1 billion or $6.4 million per organization.”
“The Billion Dollar Lost-Laptop Study,” conducted by Intel Corporation and the Ponemon Institute, analyzed the scope and circumstances of missing laptop PCs.
6
Reliable, non-intrusive
•  Password: a major source of security vulnerabilities. Easy to guess, reused, forgotten, shared
•  Application: different applications may have different sensitivities
•  Usability: authentication too often, or sometimes too loose
7
Passwords
Normal passwords are not strong enough: they are usually meaningful words that can be remembered
Stringent strong-password rules can be annoying
Most users do not use password-aid tools (Hong et al. 2009)
Fingerprint? Iris recognition? Face recognition? Voice recognition?
Password for the DHS E-file:
Contain from 8 to 16 characters
Contain at least 2 of the following 3 characters: uppercase alphabetic,
lowercase alphabetic, numeric
Contain at least 1 special character (e.g., @, #, $, %, &, *, +, =)
Begin and end with an alphabetic character
Not contain spaces
Not contain all or part of your UserID
Not use 2 identical characters consecutively
Not be a recently used password
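The rule list above can be read as a checklist that a password must pass in full. The following is a minimal sketch of such a checker, not the actual E-file implementation. The function names, the `min_part` length used to interpret "part of your UserID", and the treatment of the listed special characters as the complete allowed set are all assumptions made for illustration.

```python
import string

# Assumption: the slide lists these only as examples ("e.g."), so the real set may differ.
SPECIALS = set("@#$%&*+=")


def contains_user_id_part(password, user_id, min_part=3):
    """True if the password contains the UserID or any min_part-long piece of it.

    min_part is a hypothetical parameter; the slide does not define "part".
    """
    p, u = password.lower(), user_id.lower()
    if u and u in p:
        return True
    return any(u[i:i + min_part] in p for i in range(len(u) - min_part + 1))


def check_password(password, user_id, recent_passwords=()):
    """Return the list of violated rules (an empty list means the password passes)."""
    problems = []
    if not 8 <= len(password) <= 16:
        problems.append("must contain from 8 to 16 characters")
    classes = [
        any(c in string.ascii_uppercase for c in password),
        any(c in string.ascii_lowercase for c in password),
        any(c in string.digits for c in password),
    ]
    if sum(classes) < 2:
        problems.append("must contain at least 2 of: uppercase, lowercase, numeric")
    if not any(c in SPECIALS for c in password):
        problems.append("must contain at least 1 special character")
    if not (password[:1].isalpha() and password[-1:].isalpha()):
        problems.append("must begin and end with an alphabetic character")
    if " " in password:
        problems.append("must not contain spaces")
    if contains_user_id_part(password, user_id):
        problems.append("must not contain all or part of the UserID")
    if any(a == b for a, b in zip(password, password[1:])):
        problems.append("must not use 2 identical characters consecutively")
    if password in recent_passwords:
        problems.append("must not be a recently used password")
    return problems
```

Even an ordinary word such as "password" breaks several rules at once, which illustrates why stringent policies are hard for users to satisfy.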
8
•  Provides a way to passively authenticate while using common,
sensitive applications and services.
•  Allows for rapid detection of unauthorized users, so that their access can be blocked as quickly as possible.
•  Uses a variety of sensors available on common smartphones
9
•  Derived from Behavioral + Biometrics
•  Behavioral: the way a human subject behaves
•  Biometrics: technologies and methods that measure and analyze biological characteristics of the human body
•  Fingerprints, retina, voice patterns
•  Behaviometrics: measurable behavior used to recognize or to verify
•  Identity of a human subject, or
•  Subject’s certain behaviors
10
•  Mobile devices come with embedded sensors
•  Accelerometers, gyroscope, magnetometer
•  GPS receiver
•  WiFi, Bluetooth, NFC
•  Microphone, camera,
•  Temperature, light sensor
•  “Clock” and “Calendar”
•  Connect with other sensors
•  EEG, EMG, GSR
•  Mobile devices are connected to the Internet
•  Upload sensor data to the cloud
•  View information computed on the server side
•  Users carry the device at almost all times.
•  My phone “knows” where I am, what I am doing and my future activities.
11
•  Motion Metrics
•  Location Metrics
•  Interaction Metrics
•  System Metrics
•  Accelerometer
•  activity, motion, hand trembling, driving
style
•  sleeping pattern
•  inferred activity level, steps made per day, estimated calories burned
•  Motion sensors, WiFi, Bluetooth
•  accurate indoor position and trace.
•  GPS
•  outdoor location, geo-trace,
commuting pattern
•  Microphone, camera:
•  From background noise: activity, type
of location.
•  From voice: stress level, emotion
•  Video/audio: additional contexts
•  Keyboard, touches, slides
•  Specific tasks, user interactions, …
12
•  Monitor and track user behavior on smartphones using various
on-device sensors
•  Convert sensory traces and other context information to
Behaviometrics
•  Build statistical models with these features and use them to calculate Certainty Scores as a security measure
•  Trigger various secondary Authentication Schemes when certain applications are launched or certain system functions are invoked.
13
[Diagram: pipeline of Raw Data, Preprocessing, Modeling, Metrics and Evaluation, with Ground Truth]
14
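[Diagram: pipeline stages with their components. Raw data: heterogeneous sensor data (MobiSens Framework). Preprocessing: Feature Extraction, Behavioral Text, Frequency Rep. Modeling: n-gram, Skipped n-gram, Helix, NN, LR, DT, RF, SVM… Metrics: Motion Metrics, Location Metrics, Interaction Metrics. Evaluation: Accuracy, TPR & FPR, DR & FRR. Ground truth: Sim. Attacks, Ctrl. Exp., Auth. Records.]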
15
16
•  Human behavior/activities share some common properties with natural languages
•  Meanings are composed from the meanings of building blocks
•  There exists an underlying structure (grammar)
•  Expressed as a sequence (time-series)
•  Apply the rich set of statistical NLP techniques to mobile sensory data
[Figure: Zipf’s Law. log(freq) plotted against rank of words by frequency]
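A rank-versus-log-frequency curve of this kind can be computed from any token stream, including a stream of behavior labels. The sketch below is illustrative only: the function name is hypothetical, and the use of the natural logarithm is an assumption, since the slide does not state the base.

```python
import math
from collections import Counter


def rank_log_frequency(tokens):
    """Return (rank, log(frequency)) pairs, most frequent token first.

    Assumption: natural logarithm; the slide does not state the base.
    """
    counts = Counter(tokens)
    ranked = sorted(counts.values(), reverse=True)
    return [(rank, math.log(freq)) for rank, freq in enumerate(ranked, start=1)]


# Example with a made-up label stream
points = rank_log_frequency("A1 A1 A1 A1 G2 G2 W1".split())
```

If the stream follows Zipf's law, plotting these points against log(rank) gives an approximately straight, downward-sloping line.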
17
•  Is this play Shakespeare’s work?
•  Comparing the play to Shakespeare’s known
library of works
•  Track word and phrase patterns in the data
•  Calculate the probability of the unknown U given all the known Shakespeare’s work {S}
•  Compare with a threshold θ
•  Authentic work (a=1)
•  Fake, Forgery or Plagiarism (a=0)
$\hat{a} = \mathrm{sign}\left[P(U \mid \{S\}) > \theta\right]$
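The authorship test above can be sketched with a small language model. This example assumes a bigram model with add-one smoothing and scores U by its average log-probability so that the threshold does not depend on sequence length. Those choices, the function names, and the value of `theta` are illustrative assumptions, not the models used in the thesis.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Collect context counts, bigram counts and vocabulary size from the known corpus {S}."""
    contexts = Counter(tokens[:-1])
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab_size = len(set(tokens)) + 1  # +1 reserves probability mass for unseen tokens
    return contexts, bigrams, vocab_size


def avg_log_prob(tokens, model):
    """Average add-one-smoothed bigram log-probability of the unknown sequence U."""
    contexts, bigrams, vocab_size = model
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens")
    total = 0.0
    for prev, cur in pairs:
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
    return total / len(pairs)


def is_authentic(tokens, model, theta):
    """Return a = 1 if the score exceeds the threshold theta, else a = 0."""
    return 1 if avg_log_prob(tokens, model) > theta else 0


model = train_bigram("to be or not to be that is the question".split())
same_style = is_authentic("to be or not".split(), model, theta=-2.0)
other_style = is_authentic("question the is that".split(), model, theta=-2.0)
```

The same scoring applies unchanged when the tokens are behavior labels rather than words, with the owner's history playing the role of {S}.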
18
Quantization Clustering
19
•  Convert feature vector series to label streams – dimension reduction
•  Step window with assigned length
A1 A2 A1 A4
G2 G5 G2 G2
W2 W1 W2
P1 P3 P6 P1
A2 G2G5 W1 P1P3 A1A4 G2 W1W2 P1
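The conversion from feature vectors to label streams can be sketched as nearest-centroid quantization followed by a step window. The centroids, the label prefix, and the choice of non-overlapping windows are assumptions made for illustration; the slide does not specify how the clusters are obtained or how the window advances.

```python
def quantize(vectors, centroids, prefix):
    """Map each feature vector to the label of its nearest centroid, e.g. 'A1', 'A2'."""
    labels = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
        labels.append(f"{prefix}{dists.index(min(dists)) + 1}")
    return labels


def step_windows(labels, length):
    """Group a label stream into consecutive, non-overlapping windows of the given length."""
    return ["".join(labels[i:i + length]) for i in range(0, len(labels), length)]


# Made-up 2-D accelerometer features and two hypothetical cluster centroids
accel_labels = quantize([(1, 0), (9, 9), (0, 1), (10, 8)], [(0, 0), (10, 10)], "A")
words = step_windows(accel_labels, 2)
```

Each window becomes one "word" of the behavior text, so a multi-dimensional sensor series is reduced to a one-dimensional symbol sequence.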
20
[Diagram: system architecture. Stages: Sensing, Preprocessing, Modeling, Applications. Preprocessing: Feature Construction, Behavior Text Generation. Modeling: N-gram Model, Skipped N-gram Model, Helix, DT, SVM, Classification, Threshold. Applications: Prediction, Recognition, Anomaly Detection, ... Evaluation against Ground Truth: ROC, Accuracy, Precision, Recall, UX.]
21
[Diagram: SenSec components. Sensor Fusion and Segmentation; Quantization; Clustering; Activity Recognition; Risk Analysis Tree; Certainty of Risk compared (<) with Application Sensitivity; Application Access Control.]
22
[Diagram: Sensing, Preprocessing (Feature Construction, Behavior Text Generation), Modeling (N-gram Model) and Inference (Classifier, Binary Classifier, Threshold), with outputs User Authentication and User Classification.]
•  SenSec collects sensor data
• Motion sensors
• GPS and WiFi scanning
• In-use applications and their traffic patterns
•  SenSec module builds user behavior models
• Unsupervised activity segmentation; model the sequence using a language model
• Build a Risk Analysis Tree (DT) to detect anomalies
• Combine the above to estimate risk (online): certainty score
•  Application Access Control Module activates authentication based on the score and a customizable threshold.
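The access-control step can be sketched as a comparison between the current certainty score and a per-application threshold. This sketch assumes that a higher score means more confidence that the current user is the owner. That reading, the names used, and the default threshold are assumptions, not SenSec's actual interface.

```python
from dataclasses import dataclass


@dataclass
class AccessDecision:
    app: str
    certainty: float
    threshold: float
    require_auth: bool


def decide(app, certainty, thresholds, default_threshold=0.5):
    """Require explicit authentication when certainty falls below the app's threshold.

    Assumption: certainty is in [0, 1] and higher means "more likely the owner".
    """
    threshold = thresholds.get(app, default_threshold)
    return AccessDecision(app, certainty, threshold, certainty < threshold)


# Hypothetical sensitivities: email is more sensitive than a weather app
thresholds = {"email": 0.8, "weather": 0.2}
email = decide("email", 0.6, thresholds)      # prompts for authentication
weather = decide("weather", 0.6, thresholds)  # launches without a prompt
```

Keeping the threshold per application lets the same score block a sensitive application while leaving low-risk ones uninterrupted.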
23
•  Accelerometer
• Used to summarize
acceleration stream
• Calculated separately for each
dimension [x,y,z,m]
• Meta features:
Total Time, Window Size
•  GPS: location string from Google Maps API and mobility path
•  WiFi: SSIDs, RSSIs and path
•  Applications: bitmap of well-known applications
•  Application Traffic Pattern: TCP/UDP traffic pattern vectors:
[ remote host, port, rate ]
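The per-dimension accelerometer summary can be sketched as follows. The slide does not list the statistics used, so mean, standard deviation, minimum and maximum are assumptions here, as is the reading of `m` as the vector magnitude. The function name and input layout are hypothetical.

```python
import math
import statistics


def window_features(samples):
    """Summarize one window of accelerometer samples.

    samples: list of (t, x, y, z) tuples, t in seconds.
    Assumption: 'm' is the magnitude sqrt(x^2 + y^2 + z^2).
    """
    columns = {
        "x": [s[1] for s in samples],
        "y": [s[2] for s in samples],
        "z": [s[3] for s in samples],
        "m": [math.sqrt(s[1] ** 2 + s[2] ** 2 + s[3] ** 2) for s in samples],
    }
    features = {}
    for name, values in columns.items():
        features[f"{name}_mean"] = statistics.mean(values)
        features[f"{name}_std"] = statistics.pstdev(values)
        features[f"{name}_min"] = min(values)
        features[f"{name}_max"] = max(values)
    # Meta features named on the slide
    features["total_time"] = samples[-1][0] - samples[0][0]
    features["window_size"] = len(samples)
    return features


features = window_features([(0.0, 3, 4, 0), (0.1, 0, 0, 5)])
```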
24
•  Offline data collection (for training and testing)
Pick up the device from a desk
Unlock the device using the right slide pattern
Invoke the Email app from the "Home Screen"
Type on the soft keyboard
Lock the device by pressing the "Power" button
Put the device back on the desk
25
26
• 71.3% true-positive rate with a 13.1% false-positive rate, using only 2-3 days of training data
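For reference, the two rates quoted above are computed from confusion-matrix counts as shown in this sketch. The counts in the example are arbitrary illustrative numbers, not the experiment's data, and the function name is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Return (true-positive rate, false-positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of positive cases that were detected
    fpr = fp / (fp + tn)  # share of negative cases that were wrongly flagged
    return tpr, fpr


# Arbitrary counts for illustration only
tpr, fpr = rates(tp=8, fn=2, fp=1, tn=9)
```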
27
•  Various Behaviometric applications can be framed as a classification problem
•  Prerequisite: a learned Behaviometric model
•  Input: a given observation of the user’s behavior
•  Output: some decisions based on the observation and the model
•  Behavioral text representation may lead to a huge feature space
•  Algorithms to handle a large feature space
•  Smart feature set construction to limit the size of the feature space
•  Identification and anomaly detection work better if the users are
performing the same activity
•  Activity recognition before identification/detection
•  High FPR
•  Incorporate other factors and metrics
•  UX improvements
28
29
•  The language model in general builds a single model for all types of activities.
•  Often the way people perform a certain activity is enough to distinguish them (as in the PoC).
•  Build models that identify the differences in how people perform the same activity, or the same class of activities.
[Figure: Accelerometer X-axis, acceleration over time, for standing, walking and running]
[Diagram: Activity Class Extraction feeding per-class models: AC-1 Model, AC-2 Model, …, AC-n Model]
30
Figure 4.7: Accelerometer readings (X, Y and Z axes) for standing, walking and running, shown as acceleration over time, along with the discrete Fourier transform (amplitude over frequency)
31
•  Extract the activity level of motion time series data by examining
magnitude of the data and frequency composition via DFT
•  Activity Level =
•  Activity Class can be arbitrary segments of this function which has
range [0, 1)
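The two ingredients named above, frequency composition via the DFT and segmentation of the [0, 1) range into activity classes, can be sketched as follows. The activity-level formula itself is not reproduced here, so the sketch takes the level as an input. The equal-width segmentation and the function names are assumptions.

```python
import cmath
import math


def dft_amplitudes(signal):
    """Amplitude of each non-negative frequency bin of a real-valued signal (direct DFT)."""
    n = len(signal)
    amplitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        amplitudes.append(abs(coeff))
    return amplitudes


def activity_class(level, n_classes):
    """Map an activity level in [0, 1) to one of n_classes segments.

    Assumption: equal-width segments; the slide allows arbitrary ones.
    """
    if not 0.0 <= level < 1.0:
        raise ValueError("activity level must be in [0, 1)")
    return int(level * n_classes)


# A made-up signal with 2 cycles per 16-sample window peaks at frequency bin 2
signal = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
amplitudes = dft_amplitudes(signal)
dominant_bin = amplitudes.index(max(amplitudes))
```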
[Figure: Accelerometer Y-axis for standing, walking and running: acceleration over time, and amplitude over frequency]
32
•  t-SNE to map feature space into 2-D plane
•  We first tested the activity recognition method on ambulatory behavior with 381 samples of standing, walking, and running. It correctly classifies 359/381 samples, giving an accuracy of 94.23%
•  For user identification we ranked the features and used
the top features to create models using classical ML
algorithms
33
•  We approach the authentication problem by building a multivariate Gaussian distribution for each activity class
•  We fit the parameters, mean and standard deviation, of each feature on a training set that we define to contain only non-anomalous data
•  Then calculate the probability of some unknown data being generated by the model with
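A minimal sketch of this kind of Gaussian anomaly check follows. It assumes independent features (a diagonal covariance), which matches the per-feature mean and standard deviation mentioned above but may differ from the model actually used. The log-density threshold `log_eps` and all names are hypothetical.

```python
import math
import statistics


def fit(rows):
    """Fit (mean, std) per feature from non-anomalous training vectors."""
    return [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*rows)]


def log_density(x, params):
    """Log of the product of per-feature Gaussian densities (features assumed independent).

    Note: a feature with zero standard deviation would need special handling.
    """
    total = 0.0
    for value, (mu, sigma) in zip(x, params):
        total += -0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (value - mu) ** 2 / (2 * sigma ** 2)
    return total


def is_anomalous(x, params, log_eps):
    """Flag x when its log-density under the fitted model is below the threshold."""
    return log_density(x, params) < log_eps


# Made-up training data for one activity class
params = fit([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]])
typical = is_anomalous([2.0, 12.0], params, log_eps=-10.0)
unusual = is_anomalous([10.0, 30.0], params, log_eps=-10.0)
```

One such parameter set would be fitted per activity class, and the threshold chosen on held-out data to balance detection rate against false rejections.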
34
•  Each range is an activity class with lower ranges representing
activities with low magnitude or low frequency and higher ranges
representing activities with high magnitude and high frequency.
•  110GB dataset of accelerometer data from 25 users.
35
•  Using only motion data can lead to a very high false positive rate
•  Combine motion with a location factor to mitigate this
•  Verified that most users spend the majority of their time in a small number of places. Incorporating location in the experiments leads to reduced FPRs
36
37
•  Hypothesis: the micro-behavior with which a user interacts with the soft keyboard reflects his/her cognitive and physical characteristics.
Cognitive fingerprints: typing rhythms, correction rate, delay between keys, duration at each key…
Physical characteristics: area of pressure, amount of pressure, position of contact, shift…
38
•  Keystroke Dynamics are a popular subject
Many papers—focusing primarily on desktops
Great success for passwords, good success for arbitrary text
Typing rate, key-to-key latencies are the primary features
Once people are skilled at typing, they develop natural rhythms (on
desktops)
•  Detecting keystroke patterns on mobile phones is challenging
Focus on Desktop-like attributes
Typing rate, timing, di-graphs, tri-graphs, etc.
•  Use background applications to “sniff” keystrokes
Without direct access to keyboard
Successful demonstrations using accelerometers
39
•  Frequent use
•  Typically single user
•  Context awareness
•  Protected applications vs. Non-
protected
•  Current and historical patterns
•  Touchscreens provide wealth
of data
•  Touch location, pressure, finger
size, finger drift
•  Wide variety of other sensors
•  Accelerometers, gyroscopes
•  Limited computing power
•  Need to use efficient algorithms
•  Finite battery life
•  Users are sensitive to battery life
impact
•  Highly mobile
•  Typical usage: lying down, sitting,
walking, passenger in car/train/
subway system
•  Need to behave gracefully
40
•  Location pressed on keys
•  Length of press
key down to key up
•  Force of press
Varies across device types
•  Change of force over key press
•  Size of finger
Varies across device types
•  Drift of finger during press
•  Recent accelerometer history
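The per-keystroke features listed above can be derived from the touch samples recorded between key down and key up. The sketch below assumes each sample carries a timestamp, position, pressure and contact size. The field and function names are hypothetical, and recent accelerometer history is left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class TouchSample:
    t: float         # timestamp in seconds
    x: float         # touch position
    y: float
    pressure: float  # device-dependent units
    size: float      # contact size, device-dependent units


def key_press_features(samples):
    """Derive features for one key press from its samples (key down first, key up last)."""
    first, last = samples[0], samples[-1]
    pressures = [s.pressure for s in samples]
    sizes = [s.size for s in samples]
    return {
        "x": first.x,                                      # location pressed on the key
        "y": first.y,
        "duration": last.t - first.t,                      # key down to key up
        "mean_pressure": sum(pressures) / len(pressures),  # force of press
        "pressure_change": last.pressure - first.pressure, # change of force over the press
        "mean_size": sum(sizes) / len(sizes),              # size of finger
        "drift": math.hypot(last.x - first.x, last.y - first.y),  # finger drift during press
    }


press = key_press_features([
    TouchSample(0.00, 100.0, 200.0, 0.4, 0.2),
    TouchSample(0.05, 103.0, 204.0, 0.6, 0.2),
])
```

Because force and finger size vary across device types, such features would typically be normalized per device before modeling.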
41
42
43
44
•  From finger down to finger up
45
46
•  13 initial users after a short recruiting drive
•  2-week collection period
•  86,000 keystrokes
•  430,000 data points @ ~5/keystroke
•  99% accuracy with just 1 keystroke
•  Data split into training and testing:
Training Data for Model: 50%
CV: 15%
Training for Keys: 15%
Final Testing: 15%
CV for Keys: 10%
47
48
Anomaly Detection Rate: 67.7%
FRR (FPR): 4.6%
49
Anomaly Detection Rate: 84.8%, increased from 67.7% for 5 strokes
FRR (FPR): 2.2%, reduced from 4.6% for 5 strokes
50
51
•  Alpha test started in 6/2012, 1st Google Play Store release in 10/2012
•  False positives: a 13% FPR still annoys users sometimes
•  75% of alpha users mentioned feeling annoyed at least once when SenSec prompted for a passcode. Some of them were confused by the UI.
“I couldn’t get the passcode right multiple times when trying to answer an
important phone call because I was single-handed”
“I just entered my passcode a min ago and it asked me again”
“I was so lost about this UI. What’s this training and testing knob?”
52
•  Use an adaptive model
•  Add the trace data from shortly before a false positive to the training data and update the model
•  Add a sliding pattern in addition to passcode validation
•  A confirmed false positive grants a “free ride” for a configurable duration
•  Assumption: a just-authenticated user should control the device for a given period of time
•  The “free ride” period ends immediately if an abrupt context change is detected.
•  Interaction Metrics added as part of the sensor fusion
•  Motion, location and interaction metrics work together
•  Different user flow: reactive triggering vs. proactive triggering
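The "free ride" rule can be sketched as a small timer that is started when a false positive is confirmed and cancelled on an abrupt context change. The class and method names, and the use of plain seconds for time, are assumptions for illustration. How a context change is detected is outside this sketch.

```python
class FreeRide:
    """Suppress re-authentication prompts for a configurable period."""

    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.expires_at = None  # no free ride granted yet

    def grant(self, now):
        """Call when the user has just authenticated after a false positive."""
        self.expires_at = now + self.duration_s

    def context_changed(self):
        """Call when an abrupt context change is detected: end the free ride immediately."""
        self.expires_at = None

    def active(self, now):
        """True while prompts should be suppressed."""
        return self.expires_at is not None and now < self.expires_at


ride = FreeRide(duration_s=300)
ride.grant(now=100)
during = ride.active(now=200)  # still inside the free-ride period
ride.context_changed()
after = ride.active(now=201)   # ended by the context change
```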
53
Figure 6.3: SenSec Protection Options: Secured, Protected via either Passcode or Sliding Pattern
54
Figure 6.2: SenSec Tutorial
Figure 6.4: SenSec Home Screen
55
•  Data collected from physical and soft sensors
•  Identify various behavioral factors from sensor streams
•  Build behavioral models to generate quantifiable metrics
•  Apply these models to a mobile security application: SenSec
Study the fundamental scientific problem of modeling a mobile user’s behavior from heterogeneous sensory time-series
56
•  Propose the concept of Behaviometrics to study the fundamental scientific problem of modeling a mobile user’s behaviors from heterogeneous sensor time series
•  Adopt a language approach to solve Behaviometric problems via various NLP techniques
•  Unsupervised algorithm Helix to discover the hierarchical structure, i.e. grammar, in activity recognition
•  Investigate how existing statistical learning algorithms can be adopted and applied in the Behaviometric context when the NLP approach is insufficient
•  Derive an effective yet simple “activity level” metric from time series to be used in identification and detection. Use location metrics to augment motion metrics to reduce FPR.
•  Develop and deploy versions of the SenSec app and adapt them through user feedback.
57
•  Extended data set for System Metrics
TCP, UDP traffic; sound; ambient lighting; battery status, etc.
•  Data and Modeling
Gain more insights into the data, features and factorized relationships among
various sensors
•  Enhanced security of SenSec components prepared for
commercial release
Integration with Android security framework and other applications
•  Privacy as expectation (Liu et al., 2012)
Users need to know where the data resides, how the data is going to be used
and shared. Whom to trust the data with?
•  Energy efficiency
58
•  CyLab at Carnegie Mellon
•  Northrop Grumman Cybersecurity Research Consortium
•  Cisco
Research award for “A Language Approach in Behavioral Modeling”
Research award for “Privacy Preserved Personal Big Data Analytics through Fog Computing”
•  Special thanks: Dr. Hao Hu, Jiatong Zhou, Dr. Flavio Bonomi, Cisco Systems;
Sky Hu, Twitter Inc.; Yuan Tian, Google Inc.; Pang Wu, Samsung Research
North America.
59
“Mobile behaviometrics: Models and applications”, in Proceedings of the Second IEEE/CIC International Conference on Communications in China (ICCC), Xi’An, China, August 12-14, 2013 [with H.Hu, S.Hu, P.Wu, J.Zhang]
“MobiSens: A Versatile Mobile Sensing Platform for Real-world Applications”, MONE, 2013, [with P.Wu, J.Zhang]
"SenSec: Mobile Application Security through Passive Sensing," to appear in the Proceedings of International Conference
on Computing, Networking and Communications. (ICNC 2013). San Diego, USA. January 28-31, 2013 [with P.Wu, X.Wang,
J.Zhang]
“Towards Accountable Mobility Model: A Language Approach on User Behavior Modeling in Office WiFi Networks”, accepted
to ICCCN 2011, Maui, HI, Aug 1-5, 2011 [with Y.Zhang]
"Retweet Modeling Using Conditional Random Fields," in the Proceedings of DMCCI 2011: ICDM 2011 Workshop on Data
Mining Technologies for Computational Collective Intelligence, December 11, 2011.[ with H.Peng, D.Piao, R.Yan and
Y.Zhang]
“Mobile Lifelogger - recording, indexing, and understanding a mobile user's life", in the Proceedings of The Second
International Conference on Mobile Computing, Applications, and Services, Santa Clara, CA, Oct 25-28, 2010 [With
S.Chennuru, P.Cheng, Y.Zhang]
"SensCare: Semi-Automatic Activity Summarization System for Elderly Care", MobiCase 2011, Los Angeles, CA, October
24-27, 2011. [with Pang Wu, Huan-kai Peng,Joy Ying Zhang]
"Helix: Unsupervised Grammar Induction for Structured Human Activity Recognition," to appear in the Proceedings of The
IEEE International Conference on Data Mining series (ICDM), Vancouver, Canada, Dec 11-14, 2011.[with Huan-Kai Peng,
Pang Wu, and Ying Zhang]
"Statistically Modeling the Effectiveness of Disaster Information in Social Media," to appear in the Proceedings of IEEE
Global Humanitarian Technology Conference (GHTC), Seattle, Washington, Oct. 30 - Nov. 1st, 2011.[with Fei Xiong,
Dongzhen Piao, Yun Liu, and Ying Zhang]
"A dissipative network model with neighboring activation," to appear in THE EUROPEAN PHYSICAL JOURNAL B.[with F.
Xiong, Y. Liu, J. Zhu, Z. J. Zhang, Y. C. Zhang, and J. Zhang]
"Opinion Formation with the Evolution of Network," to appear in the Proceedings of 2011 Cross-Strait Conference on
Information Science and Technology and iCube, TaiBei, China, Dec 8-9, 2011.[with F.Xiong, Y.Liu, Y.Zhang]
Thank you.
61
[Diagram: Risk Analysis Tree. Decision nodes: Location (At home, At work, Between home and work), Time of the Day (8am-6pm, Other), Day of the week (Mon.-Fri., Sat. & Sun.), Activity (idle, Walking, Driving), Traveling Speed, Gait. Leaves: R = f(handholding, indoor loc, app); R = f(WiFi trace, app, time); R = f(geo trace, app, time); Alert!]

Mais conteúdo relacionado

Mais procurados

Wearable technologies: what's brewing in the lab?
Wearable technologies: what's brewing in the lab?Wearable technologies: what's brewing in the lab?
Wearable technologies: what's brewing in the lab?Daniel Roggen
 
Wearable Computing - Part II: Sensors
Wearable Computing - Part II: SensorsWearable Computing - Part II: Sensors
Wearable Computing - Part II: SensorsDaniel Roggen
 
I tac tics_ntelligent infra_r&d
I tac tics_ntelligent infra_r&dI tac tics_ntelligent infra_r&d
I tac tics_ntelligent infra_r&dArpan Pal
 
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012Charith Perera
 
Near field communication
Near field communicationNear field communication
Near field communicationDheeraj Raja
 
The DemaWare Service-Oriented AAL Platform for People with Dementia
The DemaWare Service-Oriented AAL Platform for People with DementiaThe DemaWare Service-Oriented AAL Platform for People with Dementia
The DemaWare Service-Oriented AAL Platform for People with DementiaYiannis Kompatsiaris
 
Cps innovation lab kolkata iiest
Cps innovation lab kolkata iiestCps innovation lab kolkata iiest
Cps innovation lab kolkata iiestArpan Pal
 
Deep-learning based single object tracker for night surveillance
Deep-learning based single object tracker for night surveillance  Deep-learning based single object tracker for night surveillance
Deep-learning based single object tracker for night surveillance IJECEIAES
 

Mais procurados (8)

Wearable technologies: what's brewing in the lab?
Wearable technologies: what's brewing in the lab?Wearable technologies: what's brewing in the lab?
Wearable technologies: what's brewing in the lab?
 
Wearable Computing - Part II: Sensors
Wearable Computing - Part II: SensorsWearable Computing - Part II: Sensors
Wearable Computing - Part II: Sensors
 
I tac tics_ntelligent infra_r&d
I tac tics_ntelligent infra_r&dI tac tics_ntelligent infra_r&d
I tac tics_ntelligent infra_r&d
 
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012
 
Near field communication
Near field communicationNear field communication
Near field communication
 
The DemaWare Service-Oriented AAL Platform for People with Dementia
The DemaWare Service-Oriented AAL Platform for People with DementiaThe DemaWare Service-Oriented AAL Platform for People with Dementia
The DemaWare Service-Oriented AAL Platform for People with Dementia
 
Cps innovation lab kolkata iiest
Cps innovation lab kolkata iiestCps innovation lab kolkata iiest
Cps innovation lab kolkata iiest
 
Deep-learning based single object tracker for night surveillance
Deep-learning based single object tracker for night surveillance  Deep-learning based single object tracker for night surveillance
Deep-learning based single object tracker for night surveillance
 

Semelhante a Behaviometrics: Behavior Modeling from Heterogeneous Sensory Time-Series

GaitProjectProposal
GaitProjectProposalGaitProjectProposal
GaitProjectProposalVivek Kumar
 
Human Activity Recognition in Android
Human Activity Recognition in AndroidHuman Activity Recognition in Android
Human Activity Recognition in AndroidSurbhi Jain
 
From Context-awareness to Human Behavior Patterns
From Context-awareness to Human Behavior PatternsFrom Context-awareness to Human Behavior Patterns
From Context-awareness to Human Behavior PatternsVille Antila
 
MDM-2013, Milan, Italy, 6 June, 2013
MDM-2013, Milan, Italy, 6 June, 2013MDM-2013, Milan, Italy, 6 June, 2013
MDM-2013, Milan, Italy, 6 June, 2013Charith Perera
 
Iit kgp workshop
Iit kgp workshopIit kgp workshop
Iit kgp workshopArpan Pal
 
Iot architecture
Iot architectureIot architecture
Iot architectureAnam Iqbal
 
Mobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextMobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextFlorent Stroppa
 
Io t research_arpanpal_iem
Io t research_arpanpal_iemIo t research_arpanpal_iem
Io t research_arpanpal_iemArpan Pal
 
A benchmark dataset to evaluate sensor displacement in activity recognition
A benchmark dataset to evaluate sensor displacement in activity recognitionA benchmark dataset to evaluate sensor displacement in activity recognition
A benchmark dataset to evaluate sensor displacement in activity recognitionOresti Banos
 
Embedded Sensing and Computational Behaviour Science
Embedded Sensing and Computational Behaviour ScienceEmbedded Sensing and Computational Behaviour Science
Embedded Sensing and Computational Behaviour ScienceDaniel Roggen
 
Detection and Prevention of security vulnerabilities associated with mobile b...
Detection and Prevention of security vulnerabilities associated with mobile b...Detection and Prevention of security vulnerabilities associated with mobile b...
Detection and Prevention of security vulnerabilities associated with mobile b...Clinton DSouza
 
Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...ISA Interchange
 
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Jiang Zhu
 
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...Milan Milenkovic
 
Detecting and Improving Distorted Fingerprints using rectification techniques.
Detecting and Improving Distorted Fingerprints using rectification techniques.Detecting and Improving Distorted Fingerprints using rectification techniques.
Detecting and Improving Distorted Fingerprints using rectification techniques.sandipan paul
 
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...IJERA Editor
 
Computer Vision for Measurement & FR
Computer Vision for Measurement & FRComputer Vision for Measurement & FR
Computer Vision for Measurement & FRRekaNext Capital
 
Gesture final report new
Gesture final report newGesture final report new
Gesture final report newchithiracyriac
 

Semelhante a Behaviometrics: Behavior Modeling from Heterogeneous Sensory Time-Series (20)

GaitProjectProposal
GaitProjectProposalGaitProjectProposal
GaitProjectProposal
 
Human Activity Recognition in Android
Human Activity Recognition in AndroidHuman Activity Recognition in Android
Human Activity Recognition in Android
 
From Context-awareness to Human Behavior Patterns
From Context-awareness to Human Behavior PatternsFrom Context-awareness to Human Behavior Patterns
From Context-awareness to Human Behavior Patterns
 
MDM-2013, Milan, Italy, 6 June, 2013
MDM-2013, Milan, Italy, 6 June, 2013MDM-2013, Milan, Italy, 6 June, 2013
MDM-2013, Milan, Italy, 6 June, 2013
 
Iit kgp workshop
Iit kgp workshopIit kgp workshop
Iit kgp workshop
 
iotarchitecture-190506052723.pdf
iotarchitecture-190506052723.pdfiotarchitecture-190506052723.pdf
iotarchitecture-190506052723.pdf
 
Iot architecture
Iot architectureIot architecture
Iot architecture
 
Mobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile contextMobile user experience conference 2009 - The rise of the mobile context
Mobile user experience conference 2009 - The rise of the mobile context
 
Io t research_arpanpal_iem
Io t research_arpanpal_iemIo t research_arpanpal_iem
Io t research_arpanpal_iem
 
A benchmark dataset to evaluate sensor displacement in activity recognition
A benchmark dataset to evaluate sensor displacement in activity recognitionA benchmark dataset to evaluate sensor displacement in activity recognition
A benchmark dataset to evaluate sensor displacement in activity recognition
 
Embedded Sensing and Computational Behaviour Science
Embedded Sensing and Computational Behaviour ScienceEmbedded Sensing and Computational Behaviour Science
Embedded Sensing and Computational Behaviour Science
 
Secure you
Secure you Secure you
Secure you
 
Detection and Prevention of security vulnerabilities associated with mobile b...
Detection and Prevention of security vulnerabilities associated with mobile b...Detection and Prevention of security vulnerabilities associated with mobile b...
Detection and Prevention of security vulnerabilities associated with mobile b...
 
Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...Influence of time and length size feature selections for human activity seque...
Influence of time and length size feature selections for human activity seque...
 
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics Guest Lecture: SenSec - Mobile Security through BehavioMetrics
Guest Lecture: SenSec - Mobile Security through BehavioMetrics
 
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...
ISCC 2013 keynote "Pervasive Sensing and IoT Cooking Recipe: Just add People ...
 
Detecting and Improving Distorted Fingerprints using rectification techniques.
Detecting and Improving Distorted Fingerprints using rectification techniques.Detecting and Improving Distorted Fingerprints using rectification techniques.
Detecting and Improving Distorted Fingerprints using rectification techniques.
 
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...
Hand Motion Gestures For Mobile Communication Based On Inertial Sensors For O...
 
Computer Vision for Measurement & FR
Computer Vision for Measurement & FRComputer Vision for Measurement & FR
Computer Vision for Measurement & FR
 
Gesture final report new
Gesture final report newGesture final report new
Gesture final report new
 

Behaviometrics: Behavior Modeling from Heterogeneous Sensory Time-Series

  • 1. 1 Thesis Defense for the degree of Doctor of Philosophy Electrical and Computer Engineering Carnegie Mellon University Jiang Zhu jiang.zhu@sv.cmu.edu Thesis Committee Prof. Joy Zhang, Chair Prof. Jason Hong Prof. Patrick Tague Dr. Fabio Maino, Cisco Research
  • 2. 2 Study the fundamental scientific problem of modeling a mobile user’s behavior from heterogeneous sensory time-series •  Data collected from physical and soft sensors •  Identify various behavioral factors from sensor streams •  Build behavioral models to generate quantifiable metrics •  Apply these models to a mobile security application: SenSec
  • 3. 3
  • 5. 5 [Chart: Mobile Device Loss or Theft, axis 0% to 60%] Strategy One survey conducted among a U.S. sample of 3,017 adults aged 18 and older, September 21-28, 2010, with an oversample in the top 20 cities (based on population). •  "The 329 organizations polled had collectively lost more than 86,000 devices … with average cost of lost data at $49,246 per device, worth $2.1 billion or $6.4 million per organization." "The Billion Dollar Lost-Laptop Study," conducted by Intel Corporation and the Ponemon Institute, analyzed the scope and circumstances of missing laptop PCs.
  • 6. 6 Reliable Non-intrusive Password Application Usability A major source of security vulnerabilities. Easy to guess, reuse, forgotten, shared Different applications may have different sensitivities Authentication too-often or sometimes too loose
  • 7. 7 Passwords •  Normal passwords are not strong enough: usually meaningful words that can be remembered •  Stringent strong-password rules can be annoying •  Most users do not use password-aid tools (Hong et al. 2009) •  Fingerprint? Iris recognition? Face recognition? Voice recognition? Password for the DHS E-file: •  Contain from 8 to 16 characters •  Contain at least 2 of the following 3 character types: uppercase alphabetic, lowercase alphabetic, numeric •  Contain at least 1 special character (e.g., @, #, $, %, &, *, +, =) •  Begin and end with an alphabetic character •  Not contain spaces •  Not contain all or part of your UserID •  Not use 2 identical characters consecutively •  Not be a recently used password
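The DHS E-file rules quoted on slide 7 can be written down as a checker, which makes it easy to see how many independent constraints a user has to satisfy at once. This is only a sketch: the function name, the exact special-character set (the slide gives examples, not a complete list), the "3 or more consecutive UserID characters" reading of "all or part of your UserID", and the `recent` history argument are all assumptions made for illustration.

```python
# Special characters given as examples on the slide; treating them as the full set is an assumption.
SPECIALS = set("@#$%&*+=")

def check_password(password, user_id, recent=()):
    """Return the list of violated rules (an empty list means the password passes)."""
    problems = []
    if not 8 <= len(password) <= 16:
        problems.append("length must be 8 to 16 characters")
    classes = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
    ]
    if sum(classes) < 2:
        problems.append("need at least 2 of: uppercase, lowercase, numeric")
    if not any(c in SPECIALS for c in password):
        problems.append("need at least 1 special character")
    if not (password[:1].isalpha() and password[-1:].isalpha()):
        problems.append("must begin and end with an alphabetic character")
    if " " in password:
        problems.append("must not contain spaces")
    # Assumption: "part of your UserID" means any run of 3+ UserID characters.
    uid, low = user_id.lower(), password.lower()
    if any(uid[i:i + 3] in low for i in range(len(uid) - 2)):
        problems.append("must not contain all or part of the UserID")
    if any(a == b for a, b in zip(password, password[1:])):
        problems.append("must not use 2 identical characters consecutively")
    if password in recent:
        problems.append("must not be a recently used password")
    return problems

print(check_password("Ab3$efgh", "jzhu"))   # passes every rule
print(check_password("password", "jzhu"))   # fails several rules
```

Eight separate rules for one credential illustrate the slide's point that strict password policies trade usability for strength.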
  • 8. 8 •  Provides a way to passively authenticate while using common, sensitive applications and services. •  Allows for rapid detection of unauthorized users, so their access can be blocked as quickly as possible. •  Uses a variety of sensors available on common smartphones
  • 9. 9 Behavioral + Biometrics → Behaviometrics •  Derived from •  Behavioral: the way a human subject behaves •  Biometrics: technologies and methods that measure and analyze biological characteristics of the human body •  Fingerprints, eye retina, voice patterns •  Behaviometrics: measurable behavior used to recognize or to verify •  Identity of a human subject, or •  Subject’s certain behaviors
  • 10. 10 •  Mobile devices come with embedded sensors •  Accelerometers, gyroscope, magnetometer •  GPS receiver •  WiFi, Bluetooth, NFC •  Microphone, camera •  Temperature, light sensor •  “Clock” and “Calendar” •  Connect with other sensors •  EEG, EMG, GSR •  Mobile devices are connected with the Internet •  Upload sensor data to the cloud •  Viewing information computing on the server side •  Users carry the device almost all the time. •  My phone “knows” where I am, what I am doing and my future activities.
  • 11. 11 •  Motion Metrics •  Location Metrics •  Interaction Metrics •  System Metrics •  Accelerometer •  activity, motion, hand trembling, driving style •  sleeping pattern •  inferred activity level, steps made per day, estimated calorie burned •  Motion sensors, WiFi, Bluetooth •  accurate indoor position and trace. •  GPS •  outdoor location, geo-trace, commuting pattern •  Microphone, camera: •  From background noise: activity, type of location. •  From voice: stress level, emotion •  Video/audio: additional contexts •  Keyboard, touches, slides •  Specific tasks, user interactions, …
  • 12. 12 •  Monitor and track user behavior on smartphones using various on-device sensors •  Convert sensory traces and other context information to Behaviometrics •  Build statistical models with these features and use them for calculation of Certainty Scores as security measure •  Trigger various secondary Authentication Schemes when certain application is launched or certain system function is invoked.
  • 14. 14 [Diagram labels] MobiSens Framework; Heterogeneous Sensor Data; Feature Extraction; Motion Metrics; Location Metrics; Interaction Metrics; Behavioral Text; Frequency Rep.; n-gram; Skipped n-gram; Helix, NN, LR, DT, RF, SVM…; Accuracy; TPR & FPR; DR & FRR; Sim. Attacks; Ctrl. Exp.; Auth. Records
  • 15. 15
  • 16. 16 •  Human behavior/activities share some common properties with natural languages •  Meanings are composed from meanings of building blocks •  There exists an underlying structure (grammar) •  Expressed as a sequence (time-series) •  Apply rich sets of statistical NLP techniques to mobile sensory data [Plot: Zipf’s Law, log(freq) vs. rank of words by frequency]
  • 17. 17 •  Is this play Shakespeare’s work? •  Compare the play to Shakespeare’s known library of works •  Track word and phrase patterns in the data •  Calculate the probability of the unknown work U given all of Shakespeare’s known works {S} •  Compare with a threshold θ •  Authentic work (a=1) •  Fake, forgery or plagiarism (a=0) $\hat{a} = \mathrm{sign}[P(U \mid \{S\}) > \theta]$
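The authorship test on slide 17 (score an unknown sequence against a model of known work, then compare with a threshold θ) can be sketched with a small bigram model. The model order, the add-one smoothing, the per-token length normalization, and all function names below are assumptions chosen for illustration; the slide only states the decision rule.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count bigrams over the known sequences {S}."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        tokens = ["<s>"] + list(seq)
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab

def avg_log_prob(seq, model):
    """Average log P(token | previous token) with add-one smoothing."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot reserves probability mass for unseen tokens
    tokens = ["<s>"] + list(seq)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))
    return total / max(len(tokens) - 1, 1)

def is_authentic(seq, model, theta):
    """Decision rule: a = 1 if the score exceeds the threshold, else 0."""
    return 1 if avg_log_prob(seq, model) > theta else 0

known = [["to", "be", "or", "not", "to", "be"]] * 3
model = train_bigram(known)
print(is_authentic(["to", "be"], model, theta=-1.0))
print(is_authentic(["xx", "yy"], model, theta=-1.0))
```

The same rule carries over to behavior text: replace words with behavior labels and the "known works" with the owner's recorded traces.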
  • 19. 19 •  Convert feature vector series to label streams – dimension reduction •  Step window with assigned length A1 A2 A1 A4 G2 G5 G2 G2 W2 W1 W2 P1 P3 P6 P1 A2 G2G5 W1 P1P3 A1A4 G2 W1W2 P1
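Slide 19's conversion of a sensor series into a label stream using a step window can be illustrated as follows. This is a deliberately simplified sketch: it assumes a scalar series, non-overlapping windows, and fixed bin edges, whereas the slides only say that feature vectors are quantized into labels such as A1 or G2. The function name and the bin edges are hypothetical.

```python
def to_label_stream(samples, window, prefix, bin_edges):
    """Turn a scalar series into labels such as 'A1', one label per step window."""
    labels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        mean = sum(chunk) / window
        # Label index counts how many bin edges the window mean reaches, starting at 1.
        index = 1 + sum(mean >= edge for edge in bin_edges)
        labels.append(f"{prefix}{index}")
    return labels

accel = [0.1, 0.2, 1.5, 1.7, 0.1, 0.0, 3.2, 3.0]
print(to_label_stream(accel, window=2, prefix="A", bin_edges=[1.0, 2.0]))
```

Each sensor gets its own prefix (A for accelerometer, G for GPS, and so on), so the resulting streams can be interleaved into one behavior text.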
  • 21. 21 Quantization Risk Analysis Tree Clustering Activity Recognition < Application Sensitivity Application Access Control Certainty of Risk Sensor Fusion and Segmentation Application Access Control
  • 22. 22 [Pipeline diagram] Sensing; Preprocessing; Modeling; Inference; Feature Construction; Behavior Text Generation; N-gram Model; Classifier; Binary Classifier; Threshold; User Authentication; User Classification •  SenSec collects sensor data • Motion sensors • GPS and WiFi scanning • In-use applications and their traffic patterns •  SenSec module builds user behavior models • Unsupervised activity segmentation, modeling the sequence with a language model • Building a Risk Analysis Tree (DT) to detect anomalies • Combining the above to estimate risk online: the certainty score •  Application Access Control Module activates authentication based on the score and a customizable threshold.
  • 23. 23 •  Accelerometer • Used to summarize acceleration stream • Calculated separately for each dimension [x,y,z,m] • Meta features: Total Time, Window Size •  GPS: location string from Google Map API and mobility path •  WiFi: SSIDs, RSSIs and path •  Applications: Bitmap of well-known applications •  Application Traffic Pattern: TCP UDP traffic pattern vectors: [ remote host, port, rate ]
  • 24. 24 •  Offline data collection (for training and testing) •  Pick up the device from a desk •  Unlock the device using the right slide pattern •  Invoke the Email app from the "Home Screen" •  Type on the soft keyboard •  Lock the device by pressing the "Power" button •  Put the device back on the desk
  • 25. 25
  • 26. 26 • 71.3% True Positive Rate with 13.1% False Positive Rate, with only 2-3 days of training data
  • 27. 27 •  Various Behaviometric applications can be framed as a classification problem •  Prerequisite: a learned Behaviometric model •  Input: a given observation of user’s behavior •  Output: some decisions based on the observation and the model •  Behavioral text representation may lead to a huge feature space •  Algorithms to handle large feature space •  Smart feature set construction to limit size of the feature space •  Identification and anomaly detection work better if the users are performing the same activity •  Activity recognition before identification/detection •  High FPR •  Incorporate other factors and metrics •  UX improvements
  • 28. 28
  • 29. 29 •  The language model in general builds a single model for all types of activities. •  Often the way people perform a certain activity is enough to distinguish them (as in the PoC). •  Build models that distinguish how different people perform the same activity, or the same class of activities. [Plot: Accelerometer X-axis, acceleration vs. time, for standing, walking, running] [Diagram: Activity Class Extraction feeding AC-1 Model, AC-2 Model, …, AC-n Model]
  • 30. 30 Figure 4.7: Accelerometer readings (X, Y and Z axes; acceleration vs. time and amplitude vs. frequency) for standing, walking and running, along with the discrete Fourier transform
  • 31. 31 •  Extract the activity level of motion time-series data by examining the magnitude of the data and its frequency composition via DFT •  Activity Level = •  Activity Class can be arbitrary segments of this function, which has range [0, 1) [Plots: Accelerometer Y-axis, acceleration vs. time and amplitude vs. frequency, for standing, walking, running]
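Slide 31 says the activity level is derived from the magnitude of the motion data and its frequency composition via DFT, but the formula itself is not preserved in this transcript. The sketch below therefore does not reconstruct that formula; it only computes the two ingredients the slide names, for one window of one accelerometer axis. The function names, the mean removal, and the choice of RMS as the magnitude measure are assumptions.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT; returns |X[k]| for k = 0 .. n/2 (enough for a real-valued signal)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]

def motion_summary(signal, sample_rate_hz):
    """Return (RMS magnitude, dominant frequency in Hz) for one window."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant (e.g. gravity) component
    rms = math.sqrt(sum(x * x for x in centered) / n)
    mags = dft_magnitudes(centered)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])  # skip the DC bin
    return rms, peak_bin * sample_rate_hz / n

# Synthetic 2 Hz oscillation sampled at 32 Hz for 2 seconds.
window = [math.sin(2 * math.pi * 2 * t / 32) for t in range(64)]
print(motion_summary(window, sample_rate_hz=32))
```

Standing, walking and running would be expected to differ in both numbers, which is consistent with the time-domain and frequency-domain plots shown for the three activities.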
  • 32. 32 •  t-SNE to map feature space into 2-D plane •  First tested activity recognition method on ambulatory behavior with 381 samples of standing, walking, and running. Correctly classifies 359/381 samples, giving an accuracy of 94.23% •  For user identification we ranked the features and used the top features to create models using classical ML algorithms
  • 33. 33 •  We approach the authentication problem by building a Multivariate Gaussian distribution for each activity class •  We fit the parameters, mean and standard deviation, for each feature on a training set that we define to contain only non-anomalous data •  Then calculate the probability of some unknown data being generated by the model with
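The Gaussian approach on slide 33 can be sketched as below. Because the slide mentions fitting a mean and standard deviation per feature, this sketch assumes independent features (a multivariate Gaussian with diagonal covariance); the density expression on the slide is not preserved in the transcript, so the formula used here is the standard one rather than a quotation. The function names, the small variance floor, and the threshold `log_epsilon` are hypothetical.

```python
import math

def fit_gaussian(rows):
    """Fit a per-feature mean and standard deviation on non-anomalous training rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        max(math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n), 1e-9)  # floor avoids /0
        for j in range(d)
    ]
    return means, stds

def log_density(x, means, stds):
    """Log of the product of univariate Gaussian densities, one per feature."""
    return sum(
        -0.5 * math.log(2 * math.pi * s * s) - (v - m) ** 2 / (2 * s * s)
        for v, m, s in zip(x, means, stds)
    )

def is_anomaly(x, params, log_epsilon):
    """Flag x when its log-density under the owner's model falls below the threshold."""
    return log_density(x, *params) < log_epsilon

train = [[1.0, 10.0], [1.2, 9.5], [0.8, 10.5], [1.1, 10.2], [0.9, 9.8]]
params = fit_gaussian(train)
print(is_anomaly([1.0, 10.0], params, log_epsilon=-10.0))
print(is_anomaly([5.0, 20.0], params, log_epsilon=-10.0))
```

One such model would be fitted per activity class, so an observation is only compared against data from the same kind of activity.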
  • 34. 34 •  Each range is an activity class with lower ranges representing activities with low magnitude or low frequency and higher ranges representing activities with high magnitude and high frequency. •  110GB dataset of accelerometer data from 25 users.
  • 35. 35 •  Using only motion data can lead to a very high false positive rate •  Combine motion with the location factor to mitigate this •  Verify that most users spend the majority of their time in a small number of places. Incorporating location in the experiments leads to reduced FPRs
  • 36. 36
  • 37. 37 •  Hypothesis: the micro-behavior with which a user interacts with the soft keyboard reflects his/her cognitive and physical characteristics. •  Cognitive fingerprints: typing rhythms, correction rate, delay between keys, duration at each key… •  Physical characteristics: area of pressure, amount of pressure, position of contact, shift…
  • 38. 38 •  Keystroke dynamics are a popular subject •  Many papers, focusing primarily on desktops •  Great success for passwords, good success for arbitrary text •  Typing rate, key-to-key latencies are the primary features •  Once people are skilled at typing, they develop natural rhythms (on desktops) •  Detecting keystroke patterns on mobile phones is challenging •  Focus on desktop-like attributes •  Typing rate, timing, di-graphs, tri-graphs, etc. •  Use background applications to “sniff” keystrokes •  Without direct access to keyboard •  Successful demonstrations using accelerometers
  • 39. 39 •  Frequent use •  Typically single user •  Context awareness •  Protected applications vs. non-protected •  Current and historical patterns •  Touchscreens provide wealth of data •  Touch location, pressure, finger size, finger drift •  Wide variety of other sensors •  Accelerometers, gyroscopes •  Limited computing power •  Need to use efficient algorithms •  Finite battery life •  Users are sensitive to battery life impact •  Highly mobile •  Typical usage: lying down, sitting, walking, passenger in car/train/subway system •  Need to behave gracefully
  • 40. 40 •  Location pressed on keys •  Length of press key down to key up •  Force of press Varies across device types •  Change of force over key press •  Size of finger Varies across device types •  Drift of finger during press •  Recent accelerometer history
  • 41. 41
  • 42. 42
  • 43. 43
  • 44. 44 •  From finger down to finger up
  • 45. 45
  • 46. 46 •  13 initial users after short recruiting drive •  2 week long collection period •  86,000 keystrokes •  430,000 data points @ ~5/keystroke •  Data split into training and testing: •  99% accuracy with just 1 key stroke Training Data for Model 50% CV 15% Training for Keys 15% Final Testing 15% CV for Keys 10%
  • 47. 47
  • 48. 48 Anomaly Detection Rate: 67.7% FRR(FPR):4.6%
  • 49. 49 Anomaly Detection Rate:84.8%, increased from 67.7% for 5 strokes FRR(FPR): 2.2%, reduced from 4.6% for 5 strokes
  • 50. 50
  • 51. 51 •  Alpha test started in 6/2012, 1st Google Play Store release in 10/2012 •  False Positive: 13% FPR still annoying users sometimes •  75% alpha users mentioned feeling annoyed when SenSec prompted for passcode, at least once. Some of them are confused by the UI. “I couldn’t get the passcode right multiple times when trying to answer an important phone call because I was single-handed” “I just entered my passcode a min ago and it asked me again” “I was so lost about this UI. What’s this training and testing knob?”
  • 52. 52 •  Use adaptive model •  Adding the trace data shortly before a false positive to the training data and update the model •  Add sliding pattern in addition to passcode validation •  A confirmed false positive will grant a “free ride” for a configurable duration •  Assumption: just authenticated user should control the device for a given period of time •  “free ride” period will end immediately if abrupt context change is detected. •  Interaction Metrics added as part of the sensory fusion •  Motion, location and interaction metrics work together •  Different user flow: Reactive triggering vs. Proactive triggering
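The "free ride" idea on slide 52 (a confirmed false positive suppresses further prompts for a configurable duration, unless an abrupt context change is detected) can be sketched as a small state holder. The class and method names are hypothetical, and how SenSec actually detects an abrupt context change is not described in the slides, so it appears here only as an externally triggered call.

```python
class FreeRide:
    """Suppress authentication prompts for a while after a confirmed false positive."""

    def __init__(self, duration_s):
        self.duration_s = duration_s   # configurable free-ride length in seconds
        self.expires_at = None         # None means no free ride is active

    def confirm_false_positive(self, now):
        # The user just authenticated successfully, so assume they keep control for a while.
        self.expires_at = now + self.duration_s

    def context_changed(self):
        # An abrupt context change ends the free ride immediately.
        self.expires_at = None

    def should_prompt(self, anomaly_detected, now):
        active = self.expires_at is not None and now < self.expires_at
        return anomaly_detected and not active

ride = FreeRide(duration_s=60)
ride.confirm_false_positive(now=0)
print(ride.should_prompt(True, now=30))   # inside the free-ride window
ride.context_changed()
print(ride.should_prompt(True, now=31))   # free ride cancelled
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch easy to test.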
  • 53. 53 6.1 User Interface and User Experience Challenges. Figure 6.3: SenSec Protection Options: Secured, Protected via either Passcode or Sliding Pattern
  • 54. 54 6.1 User Interface and User Experience Challenges. Figure 6.2: SenSec Tutorial. Figure 6.3: SenSec Protection Options: Secured, Protected via either Passcode or Sliding Pattern. Figure 6.4: SenSec Home Screen
  • 55. 55 Study the fundamental scientific problem of modeling a mobile user’s behavior from heterogeneous sensory time-series •  Data collected from physical and soft sensors •  Identify various behavioral factors from sensor streams •  Build behavioral models to generate quantifiable metrics •  Apply these models to a mobile security application: SenSec
  • 56. 56 •  Propose the concept of Behaviometrics to study the fundamental scientific problem of modeling a mobile user’s behaviors from heterogeneous sensor time series •  Adopt a language approach to solve Behaviometric problems via various NLP techniques •  Unsupervised algorithm Helix to discover the hierarchical structure, i.e. grammar, in activity recognition •  Investigate adopting existing statistical learning algorithms in the Behaviometric context where the NLP approach is insufficient •  Derive an effective yet simple “activity level” metric from time series to be used in identification and detection. Use location metrics to augment motion metrics to reduce FPR. •  Develop and deploy versions of the SenSec app and adapt them through user feedback.
  • 57. 57 •  Extended data set for System Metrics TCP, UDP traffic; sound; ambient lighting; battery status, etc. •  Data and Modeling Gain more insights into the data, features and factorized relationships among various sensors •  Enhanced security of SenSec components prepared for commercial release Integration with Android security framework and other applications •  Privacy as expectation (Liu et al., 2012) Users need to know where the data resides, how the data is going to be used and shared. Whom to trust the data with? •  Energy efficiency
  • 58. 58 •  CyLab at Carnegie Mellon •  Northrop Grumman Cybersecurity Research Consortium •  Cisco Research award for “A Language Approach in Behavioral Modeling” •  Research award for “Privacy Preserved Personal Big Data Analytics through Fog Computing” •  Special thanks: Dr. Hao Hu, Jiatong Zhou, Dr. Flavio Bonomi, Cisco Systems; Sky Hu, Twitter Inc.; Yuan Tian, Google Inc.; Pang Wu, Samsung Research North America.
  • 59. 59 •  "Mobile behaviometrics: Models and applications", in Proceedings of the Second IEEE/CIC International Conference on Communications in China (ICCC), Xi’An, China, August 12-14, 2013 [with H.Hu, S.Hu, P.Wu, J.Zhang] •  "MobiSens: A Versatile Mobile Sensing Platform for Real-world Applications", MONE, 2013 [with P.Wu, J.Zhang] •  "SenSec: Mobile Application Security through Passive Sensing," to appear in the Proceedings of International Conference on Computing, Networking and Communications (ICNC 2013), San Diego, USA, January 28-31, 2013 [with P.Wu, X.Wang, J.Zhang] •  "Towards Accountable Mobility Model: A Language Approach on User Behavior Modeling in Office WiFi Networks", accepted to ICCCN 2011, Maui, HI, Aug 1-5, 2011 [with Y.Zhang] •  "Retweet Modeling Using Conditional Random Fields," in the Proceedings of DMCCI 2011: ICDM 2011 Workshop on Data Mining Technologies for Computational Collective Intelligence, December 11, 2011 [with H.Peng, D.Piao, R.Yan and Y.Zhang] •  "Mobile Lifelogger - recording, indexing, and understanding a mobile user's life", in the Proceedings of The Second International Conference on Mobile Computing, Applications, and Services, Santa Clara, CA, Oct 25-28, 2010 [with S.Chennuru, P.Cheng, Y.Zhang] •  "SensCare: Semi-Automatic Activity Summarization System for Elderly Care", MobiCase 2011, Los Angeles, CA, October 24-27, 2011 [with Pang Wu, Huan-kai Peng, Joy Ying Zhang] •  "Helix: Unsupervised Grammar Induction for Structured Human Activity Recognition," to appear in the Proceedings of The IEEE International Conference on Data Mining series (ICDM), Vancouver, Canada, Dec 11-14, 2011 [with Huan-Kai Peng, Pang Wu, and Ying Zhang] •  "Statistically Modeling the Effectiveness of Disaster Information in Social Media," to appear in the Proceedings of IEEE Global Humanitarian Technology Conference (GHTC), Seattle, Washington, Oct. 30 - Nov. 1st, 2011 [with Fei Xiong, Dongzhen Piao, Yun Liu, and Ying Zhang] •  "A dissipative network model with neighboring activation," to appear in The European Physical Journal B [with F. Xiong, Y. Liu, J. Zhu, Z. J. Zhang, Y. C. Zhang, and J. Zhang] •  "Opinion Formation with the Evolution of Network," to appear in the Proceedings of 2011 Cross-Strait Conference on Information Science and Technology and iCube, TaiBei, China, Dec 8-9, 2011 [with F.Xiong, Y.Liu, Y.Zhang]
  • 61. 61 Location Time of the Day Day of the week R =f(handholding, indoor loc, app) Alert! Activity R =f(WiFi trace, app, time) Traveling Speed Gait R =f(geo trace, app, time) At home At work Between home and work 8am-6pm Other Mon.-Fri. Sat. & Sun idle Walking Driving

Editor's Notes

  1. Smartphones: Globally, the number of smartphones grew 60% in 2011, reaching 698 million, and is projected to grow 2.9-fold between 2011 and 2016, reaching 2,027 million. In North America, it grew 58% during 2011, reaching 138 million, and is projected to grow 1.8-fold between 2011 and 2016, reaching 253 million. Tablets: Globally, the number of mobile-connected tablets grew 3.4-fold in 2011, reaching 33.6 million, and is projected to grow 7.6-fold between 2011 and 2016, reaching 256.7 million. In North America, it grew 2.7-fold during 2011, reaching 15.5 million, and is projected to grow 3.8-fold between 2011 and 2016, reaching 58.3 million.
  2. As mobile applications and devices are becoming ubiquitous, it is crucial for mobile users to privately and securely interact with their environment and data and for mobile services to trust the identity of the user. While mobile devices such as smartphones make our lives convenient in ways that were unimaginable before, applications such as email, web browsing, social network, shopping and online banking know too much about our private lives. Mobility introduces additional security and privacy challenges in being able to provide services in a way that neither compromises the environment of users nor their data. Protecting a user's privacy and ensuring the accountability of mobile applications in a seamless and non-intrusive way poses great challenges to next generation mobile computing platforms. Recently, a new survey* has revealed that 36 percent of consumers in the United States have either lost their mobile phone or had it stolen. Another survey† has also revealed that 329 organizations polled had collectively lost more than 86,000 devices with average cost of lost data at $49,246 per device, worth $2.1 billion or $6.4 million per organization. Given the high loss rate and high cost associated with these losses, accountable schemes are needed to protect the data on the mobile devices.
  3. Reliable authentication is an essential requirement for a mobile device and its applications. Today, passwords are the most common form of authentication. This results in two potential problems. First, passwords are a major source of security vulnerabilities, as they are often easy to guess, re-used, forgotten, shared with others, and susceptible to social engineering attacks. Second, to secure the data and applications on a mobile device, the mobile system would prompt the user for authentication quite often, which results in serious usability issues. We also observe that different applications on a mobile device may have different sensitivities to the aforementioned threats and data loss. For example, the Angry Birds game on an Android device is less sensitive than the Contact List or Photo Album should the device be operated by an unauthorized user. A one-size-fits-all approach to authentication may be either too loose for some applications, which exposes them to risk, or too tight for others, which causes usability problems.
  4. As a quick overview of our work: in order to enforce application security, we monitor and track user behavior through the traces collected from the on-device sensors. We then convert these traces and other context information into behavior features. We adopt an n-gram model to model the user's behavior and use it to monitor behavior and calculate a certainty score. This score is fed to the smartphone's authentication module to enforce the security of the various applications and their data on the device.
  5. We convert the raw sensory data into a behavior text representation, i.e., sequences of behavior labels. Each behavior label is considered a "word" in the language. We then train a continuous n-gram language model on these sequences.
  6. SenSec constantly collects sensory data from the accelerometer, gyroscope, GPS, WiFi, microphone or even the camera. By analyzing the sensory data, it constructs the context under which the mobile device is used. This includes locations, movements, usage patterns, etc. From the context, the system can calculate the certainty that the device is at risk. Each application on the mobile device is assigned a sensitivity value, either manually or automatically. When the user invokes an application, SenSec compares the certainty with this application's sensitivity level. If the sensitivity passes the certainty threshold, an authentication mechanism is employed to enforce the security policy for that application.
  7. t-SNE plots show clear clusters for the feature vectors of different users.
  8. Models were trained with 3000 keystrokes from the primary user and 2000 from each of 3 other users. These models were tested against [on average] 539 "primary user" keystrokes and 489 keystrokes from a wide variety of other users (not used to train the model).
  9. That brings me to the end of my presentation. Thank you very much for your attention.
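
The pipeline described in notes 4 to 6 (behavior labels treated as words, an n-gram model over them, and a score compared against per-application sensitivity) can be sketched roughly as follows. This is a minimal illustration, not the SenSec implementation: the function names, the add-one smoothing, the use of average log-probability as the score, and the threshold values are all assumptions made for the example. Here a higher score means the recent behavior looks more like the owner's, and a more sensitive application demands a higher score before it skips authentication.

```python
import math
from collections import Counter


def train_ngram(labels, n=3):
    """Count n-grams and their (n-1)-label contexts in a behavior label sequence."""
    ngrams, contexts = Counter(), Counter()
    for i in range(len(labels) - n + 1):
        gram = tuple(labels[i:i + n])
        ngrams[gram] += 1
        contexts[gram[:-1]] += 1
    return {"n": n, "ngrams": ngrams, "contexts": contexts,
            "vocab_size": len(set(labels))}


def behavior_score(model, labels):
    """Average log-probability of the labels under the model.

    Uses add-one smoothing (an assumption for this sketch) so that
    unseen behavior gets a small but non-zero probability.
    """
    n = model["n"]
    v = model["vocab_size"] + 1  # one extra slot for unseen labels
    total, count = 0.0, 0
    for i in range(len(labels) - n + 1):
        gram = tuple(labels[i:i + n])
        p = (model["ngrams"][gram] + 1) / (model["contexts"][gram[:-1]] + v)
        total += math.log(p)
        count += 1
    return total / count if count else float("-inf")


def requires_authentication(score, app_threshold):
    """Ask for explicit authentication when the score falls below the
    application's threshold (hypothetical policy for illustration)."""
    return score < app_threshold


# Hypothetical per-application thresholds: sensitive apps demand a higher score.
APP_THRESHOLDS = {"game": -2.0, "contacts": -0.5}

if __name__ == "__main__":
    owner_trace = ["home_idle", "home_walk"] * 20
    model = train_ngram(owner_trace, n=3)
    unusual = ["drive"] * 6
    for app, threshold in APP_THRESHOLDS.items():
        score = behavior_score(model, unusual)
        print(app, round(score, 3), requires_authentication(score, threshold))
```

With these example thresholds, the same unusual trace triggers authentication for the contact list but not for the game, which is the per-application behavior the notes describe.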