Driving Style Analysis
based on Trip Segmentation.
A Comparative Multi-Technique Approach
Marco Brambilla, Andrea Mauri, Paolo Mascetti
@marcobrambi
Agenda
Intro
Problem Definition
Dataset
Data Exploration and Preliminaries
Trip Segmentation Techniques
Validation
Conclusions
Intro: Relevance
1.24 million traffic-related fatalities occur annually
worldwide
Currently the leading cause of death for people aged
between 15 and 29 years
Majority of cases due to improper or risky driving
behavior
Source: World Health Organisation (WHO)
Intro: Driving Process
Driving Process: driving a car is a complex task
that requires taking informed decisions
based on information at different levels,
such as the driver's own state and other drivers'
behavior.
Intro: Relevant Information
Vehicle’s Status
Contextual Info
• Road State
• Weather Conditions
• Traffic Info
• Road Risk
• Traffic
Problem Statement
Data-driven driver profiling
with respect to driving risk
Essentially: Multivariate Time Series Segmentation
Application scenarios in insurance, promoting
pay-how-you-drive (PHYD) business models
State of the Art and Challenges
State of the art: many works on identification and
recognition of behavioural patterns (line following,
accelerations, braking, etc.), maneuver
recognition, behavioural scoring, and prediction of
driver intentions.
Supervised learning techniques require an intensive
and expensive data gathering process.
Proposed Solution
Unsupervised techniques to profile drivers'
behaviour based on recurrent patterns identified
through driving path segmentation
Comparison of 3 different approaches and use of
all of them for consolidated results
1. Unsupervised Segmentation Based on Clustering
2. Unsupervised Segmentation Based on HMM
3. Unsupervised Topic Extraction
Contextual Scenes
Observed driving behaviours that are
repeated in each driver's behaviour and
also across different drivers.
A reduced representation of the original
Multivariate Time Series conveying a
simplified characterization
Further reasoning is then applied
ETL Process
3 Steps:
Extract: read collected files and select candidate features
Transform:
Filter and Grouping
Features computation
Load: produce a unique dataset
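The three steps above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the column names in CANDIDATE_COLUMNS and the in-memory CSV text are assumptions, since the slides do not describe the raw file schema.

```python
import csv
import io

# Hypothetical column names; the slides do not list the raw file schema.
CANDIDATE_COLUMNS = ["acc_x", "acc_y", "speed_x", "speed_y", "yaw"]


def extract(trip_csv_text):
    """Extract: parse one trip file and keep only the candidate columns."""
    reader = csv.DictReader(io.StringIO(trip_csv_text))
    return [{c: row.get(c, "") for c in CANDIDATE_COLUMNS} for row in reader]


def transform(rows, trip_id):
    """Transform: drop incomplete rows, convert to floats, tag with the trip id."""
    records = []
    for row in rows:
        if any(v is None or v == "" for v in row.values()):
            continue  # filter step: skip samples with missing measurements
        record = {c: float(row[c]) for c in CANDIDATE_COLUMNS}
        record["trip_id"] = trip_id
        records.append(record)
    return records


def load(trips):
    """Load: concatenate all transformed trips into one dataset."""
    dataset = []
    for trip_id, text in trips.items():
        dataset.extend(transform(extract(text), trip_id))
    return dataset
```

Calling load with a dictionary that maps a trip identifier to the text of its CSV file returns a single list of records, one per valid sample, each tagged with its trip.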
PreProcessing
[Diagram: Trip File.csv → Extract → Transform → Load → Global dataset.csv]
Datasets
Collection devices:
Xsens MTi-G-710 (27 users)
and cell phones (10 users)
Retrieved signals:
Acceleration measurements
Altitude
GPS Positioning
Speeding
Orientation
Mounted in-vehicle aligned with
direction of movement.
No Ground truth knowledge
Features Selected
Acceleration (on Y and X axes),
Speed (on Y and X axes)
Difference in yaw
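The "difference in yaw" feature could be computed along these lines. The sketch assumes yaw is given in radians and that the difference is taken between consecutive samples; neither detail is stated in the slides. Wrapping the result keeps a turn across the ±π boundary from showing up as a spurious jump of almost 2π.

```python
import math


def yaw_differences(yaw_rad):
    """Return consecutive yaw differences wrapped into [-pi, pi)."""
    diffs = []
    for prev, cur in zip(yaw_rad, yaw_rad[1:]):
        # Shift by pi, wrap with modulo 2*pi, then shift back.
        d = (cur - prev + math.pi) % (2 * math.pi) - math.pi
        diffs.append(d)
    return diffs
```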
Pre-Analysis 1: Data Exploration
Pre-Analysis 2: Application of Driving
Safety Existing Analyses
Vaiana et al. propose a Driving Safety Diagram based on the analysis of
longitudinal and lateral accelerations.
Aggressiveness Index formulation:
(A = Aggressive, S = Safe points)
Graphical representation:
DP-Means
1. Unsupervised Segmentation Based
on DP-Means Clustering
Problem: Bayesian nonparametric techniques require expensive sampling methods or
variational techniques.
DP-means: proposed by Kulis et al., revisiting k-means: k-means-like objective function +
penalty on the number of clusters
A new cluster is created whenever a point is farther than λ away from every already existing centroid.
Note:
Clustering results depend on data ordering.
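A minimal pure-Python sketch of the DP-means loop described above. It is an illustration, not the implementation used in the talk. The threshold lam is compared against the squared Euclidean distance, which is an assumption in line with the usual penalized k-means objective; the function and parameter names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(points, lam, max_iter=100):
    """Cluster points, opening a new cluster when a point is too far from all centroids."""
    centroids = [mean(points)]  # start from a single global cluster
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        # Assignment step: points are visited in input order, so the result
        # depends on data ordering.
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > lam:
                centroids.append(list(p))  # open a new cluster at this point
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Update step: recompute the centroid of every non-empty cluster.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
        if not changed:
            break
    # Drop clusters that ended up empty and renumber the labels.
    used = sorted(set(labels))
    remap = {old: new for new, old in enumerate(used)}
    return [remap[lab] for lab in labels], [centroids[k] for k in used]
```

Larger values of lam give fewer, coarser clusters; shuffling the input can change the outcome, which is the ordering issue noted above.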
Clusters
Silhouette
Results
Centroids
Distribution of features across
clusters
Trip Segmentation Examples
Hidden Markov Models
Unsupervised Segmentation based on
HMM
Goal: identify latent structure given observed data points,
assuming the existence of Gaussian hidden states.
Assign to each observed point the corresponding hidden state.
Hidden Markov Models (HMM):
Observations and hidden states
Markovian properties
Continuous observations
Unsupervised Segmentation based on
HMM
Training:
Baum-Welch EM algorithm to learn model parameters
Decoding:
Viterbi decoding to assign to each observed point the most
likely hidden state
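The decoding step can be illustrated with a small Viterbi routine in log space. This sketch assumes one-dimensional Gaussian emissions and takes the model parameters as given (for instance, as learned beforehand by Baum-Welch); the slides use multivariate observations, and all names here are hypothetical. All start and transition probabilities must be strictly positive.

```python
import math


def log_gauss(x, mu, sigma):
    """Log-density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def viterbi(obs, start, trans, means, sigmas):
    """Return the most likely hidden state sequence for the observations."""
    n = len(start)
    # delta[s]: best log-probability of any state path ending in state s.
    delta = [math.log(start[s]) + log_gauss(obs[0], means[s], sigmas[s])
             for s in range(n)]
    back = []  # back[t][s]: best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev = delta
        ptr, delta = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            delta.append(prev[best] + math.log(trans[best][s])
                         + log_gauss(x, means[s], sigmas[s]))
        back.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

Consecutive samples that share a decoded state form one segment of the trip.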
HMM Results
A variation was also applied: inertial HMM, which lowers the probability of
switching between states to enforce state persistence. Sensible for driving.
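One simple way to picture state persistence is a "sticky" transition matrix with a large self-transition probability. This is only an illustration of the idea: the inertial variant used in the talk may regularize training in a different way, and the names below are hypothetical.

```python
def sticky_transitions(n_states, stay):
    """Build a transition matrix that stays in the same state with probability `stay`."""
    if n_states < 2 or not 0.0 < stay < 1.0:
        raise ValueError("need n_states >= 2 and 0 < stay < 1")
    off = (1.0 - stay) / (n_states - 1)  # remaining mass spread evenly
    return [[stay if i == j else off for j in range(n_states)]
            for i in range(n_states)]


def expected_dwell(stay):
    """Expected number of consecutive samples spent in a state (geometric law)."""
    return 1.0 / (1.0 - stay)
```

With stay = 0.98, a state lasts 50 samples on average, so isolated noisy points are less likely to split a segment.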
HMM Results
Clusters as hidden states.
Example of Trip Segmentation
Topic Extraction
Topic Extraction Approach
What is topic extraction?
It models topical concepts belonging to a set of textual documents.
Data are described as documents, and the components are distributions of
terms that reflect recurring patterns, named topics.
Hierarchical Dirichlet Processes (HDPs)
Soft-clustering technique based on non-parametric Bayesian theory.
The number of topics is not set a priori, but learned from data.
Posterior probability approximated by the variational inference algorithm by
Wang et al.
Results:
Most relevant topics for each document and terms distribution in each topic.
Topic Extraction Process
Data Quantization
Documents creation
Topics Extraction
Topics Evaluation
Quantization – Binning Process
with static binning strategy
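A sketch of static binning followed by document creation. The bin edges, the term naming scheme (feature name plus bin index) and the fixed window length are all assumptions made for illustration; the slides do not specify them.

```python
import bisect


def quantize(value, edges):
    """Return the index of the static bin that contains value."""
    return bisect.bisect_right(edges, value)


def to_terms(sample, edges_by_feature):
    """Turn one multivariate sample into one term per feature."""
    return [f"{name}_b{quantize(sample[name], edges)}"
            for name, edges in edges_by_feature.items()]


def make_documents(samples, edges_by_feature, window):
    """Group consecutive samples into documents of `window` samples each."""
    docs = []
    for start in range(0, len(samples), window):
        doc = []
        for sample in samples[start:start + window]:
            doc.extend(to_terms(sample, edges_by_feature))
        docs.append(doc)
    return docs
```

The resulting lists of terms can then be fed to any topic model that accepts bag-of-words input.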
Documents
Terms Relevance on Top 7 Topics
Linguist…
Terms Relevance on Top 7 Topics
… and data analyst perspectives
…
Comparison and Validation
Big Issue: How to Compare?
1) Point-to-point or point distribution
2) Resulting grouping of trips
3) Perceived user similarity of trips
Solution 1: Point-to-Point
Overlap of clusters? Per trip? Overall?
Solution 2: Moving from Points to
Trips
Can we cluster trips based on how observation points have
been clustered?
→ Simple K-means clustering of trips for each approach.
→ Comparison of overlap of the different clusters
Coherent with original question: grouping of trips (and thus
drivers) by driving behavior
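One possible way to move from points to trips is to describe each trip by the share of its points that fall in each cluster or hidden state, and then run K-means on those profiles. The profile representation and the explicit initial centroids are assumptions made to keep this sketch short and deterministic.

```python
from collections import Counter


def trip_profile(point_labels, n_states):
    """Share of a trip's points assigned to each cluster or hidden state."""
    if not point_labels:
        raise ValueError("trip has no points")
    counts = Counter(point_labels)
    return [counts.get(s, 0) / len(point_labels) for s in range(n_states)]


def kmeans(vectors, init_centroids, max_iter=50):
    """Plain K-means with explicit initial centroids; returns one label per vector."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centroids[k])))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```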
Result of overlap analysis
K-means with K=6 clusters.
DP-means vs. HMM: 74% overlap
DP-means vs. Topic: 44%
HMM vs. Topic: 48%
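The slides do not define how overlap is measured. One plausible reading, sketched here as an assumption, is the share of trips that land in corresponding groups under the best one-to-one matching of cluster labels between two approaches. The exhaustive search over permutations is only practical for small K, such as K=6.

```python
from itertools import permutations


def overlap(labels_a, labels_b, k):
    """Best-case fraction of trips grouped consistently by two clusterings."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    # contingency[i][j]: trips with label i in clustering A and label j in B.
    contingency = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        contingency[a][b] += 1
    # Try every one-to-one matching of A labels to B labels.
    best = max(sum(contingency[i][perm[i]] for i in range(k))
               for perm in permutations(range(k)))
    return best / len(labels_a)
```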
Human Validation of Trip Groups
Experts (knowledgeable about driving styles and driving
paths recorded) identify possible groups of trips in the
dataset
Problem:
- Unable to distinguish 6 categories of groups
- Only 3 categories are feasible
- Best matching 6→3 categories for each method
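A possible sketch of the 6→3 matching: each automatically found cluster is mapped to the expert category that is most frequent among its trips, and the share of trips that agree with the experts is then computed. The majority-vote rule, the agreement measure and the category names in the usage example are assumptions; the slides do not describe the matching procedure.

```python
from collections import Counter, defaultdict


def best_mapping(cluster_labels, expert_labels):
    """Map every cluster to the expert category most frequent among its trips."""
    by_cluster = defaultdict(Counter)
    for c, e in zip(cluster_labels, expert_labels):
        by_cluster[c][e] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}


def agreement(cluster_labels, expert_labels):
    """Fraction of trips whose mapped category matches the expert category."""
    mapping = best_mapping(cluster_labels, expert_labels)
    hits = sum(mapping[c] == e for c, e in zip(cluster_labels, expert_labels))
    return hits / len(cluster_labels)
```

For example, with six clusters and three hypothetical categories:

```python
clusters = [0, 0, 1, 1, 2, 3, 4, 5, 5, 5]
experts = ["calm", "calm", "calm", "calm", "mixed",
           "mixed", "aggr", "aggr", "aggr", "mixed"]
print(best_mapping(clusters, experts))
print(agreement(clusters, experts))
```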
Results
Conclusions
Three different clustering techniques of driving
behavior over trips
-> segmentation
Clustering of trips based on behavior
-> up to 74% overlap over 6 clusters
-> 100% overlap over 3 clusters
User Validation
-> 96% precision over 3 clusters
Future Work
About collection process:
Gathering process including contextual information (road
risk, traffic status, weather conditions)
Larger dataset to improve inference performance
About implemented methods:
Smarter data ordering for DP-means
Relax independence assumption in HMM
Improvements in data discretization process for HDP
Marco Brambilla, @marcobrambi, marco.brambilla@polimi.it
http://datascience.deib.polimi.it
Thanks! Questions?

 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 

Último (20)

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 

Driving Style Analysis through Unsupervised Trip Segmentation

  • 1. Driving Style Analysis based on Trip Segmentation. A Comparative Multi-Technique Approach Marco Brambilla, Andrea Mauri, Paolo Mascetti @marcobrambi
  • 2. Agenda Intro Problem Definition Dataset Data Exploration and Preliminaries Trip Segmentation Techniques Validation Conclusions
  • 3. Intro: Relevance 1.24 million traffic-related fatalities occur annually worldwide. Currently the leading cause of death for people aged between 15 and 29 years. Majority of cases due to improper or risky driving behavior. Source: World Health Organisation (WHO)
  • 4. Intro: Driving Process Driving Process: driving a car is a complex task that requires taking informed decisions based on information pertaining to different levels, such as the driver's own state and other drivers’ behavior.
  • 5. Intro: Relevant Information Vehicle’s Status Contextual Info • Road State • Weather Conditions • Traffic Info • Road Risk • Traffic
  • 6. Problem Statement Data-driven driver profiling with respect to driving risk Essentially: Multivariate Time Series Segmentation Application scenarios in insurance, promoting pay-how-you-drive (PHYD) business models
  • 7. State of the Art and Challenges State of the art: many works on identification and recognition of behavioural patterns (line following, accelerations, braking etc.) and maneuver recognition, behavioural scoring, prediction of driver intentions. Supervised Learning techniques require an intensive and expensive gathering process.
  • 8. Proposed Solution Unsupervised techniques to profile drivers' behaviour based on recurrent patterns identified through driving path segmentation. Comparison of 3 different approaches and use of all of them for consolidated results: 1. Unsupervised Segmentation Based on Clustering 2. Unsupervised Segmentation Based on HMM 3. Unsupervised Topic Extraction
  • 9. Contextual Scenes Observed driving behaviours that are repeated in each driver's behaviour and also across different drivers. A reduced representation of the original Multivariate Time Series conveying a simplified characterization Further reasoning is then applied
  • 10. ETL Process 3 Steps: Extract: read collected files and select candidate features. Transform: filtering, grouping and feature computation. Load: produce a single dataset. (Diagram labels: Trip File.csv, Extract, PreProcessing, Transform, Load, Global dataset.csv)
  • 11. Datasets Collection Device : Xsens MTi-G-710 (27 users) And cell phones (10 users) Retrieved Signals : Acceleration measurements Altitude GPS Positioning Speeding Orientation Mounted in-vehicle aligned with direction of movement. No Ground truth knowledge
  • 12. Features Selected Acceleration (on Y and X axes), Speed (on Y and X axes) Difference in yaw
  • 13. Pre-Analysis 1: Data Exploration
  • 14. Pre-Analysis 1: Data Exploration
  • 15. Pre-Analysis 2: Application of Existing Driving Safety Analyses Vaiana et al. propose a Driving Safety Diagram based on the analysis of longitudinal and lateral accelerations. Aggressiveness Index formulation: (A = Aggressive, S = Safe points) Graphical representation:
  • 17. 1. Unsupervised Segmentation Based on DP-Means Clustering Problem: Bayesian nonparametric techniques require expensive sampling methods or variational techniques. DP-means: proposed by Kulis et al., revisiting k-means: k-means-like objective function + penalty. A new cluster is created whenever a point is farther than λ away from every already existing centroid. Note: clustering results depend on data ordering.
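The DP-means rule on slide 17 can be illustrated with a short sketch. This is a minimal pure-Python version written for illustration, not the authors' implementation: the function names, the `max_iter` parameter, the choice to compare squared Euclidean distance against `lam`, and the toy points are all assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def dp_means(points, lam, max_iter=100):
    """Cluster points; open a new cluster when a point is too far from all centroids.

    Assumption of this sketch: 'too far' means squared distance > lam.
    """
    centroids = [mean(points)]          # start from one global cluster
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):  # visiting order affects the result
            dists = [sq_dist(p, c) for c in centroids]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] > lam:       # farther than the threshold from every centroid
                centroids.append(tuple(p))
                best = len(centroids) - 1
            if labels[i] != best:
                labels[i] = best
                changed = True
        # Recompute each centroid as the mean of its members.
        # Empty clusters simply keep their previous centroid.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = mean(members)
        if not changed:
            break
    return labels, centroids


# Toy data: two well-separated groups of 2-D points (hypothetical values).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
labels, centroids = dp_means(points, lam=4.0)
```

Because points are visited in a fixed order and new clusters are seeded at the first point that violates the threshold, shuffling `points` can change the outcome, which is the ordering sensitivity noted on the slide.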
  • 22. Distribution of features across clusters
  • 23. Distribution of features across clusters
  • 27. Unsupervised Segmentation based on HMM Goal: identify latent structure given observed data points, assuming the existence of Gaussian hidden states. Assign to each observed point the corresponding hidden state. Hidden Markov Models (HMM): observations and hidden states, Markovian properties, continuous observations
  • 28. Unsupervised Segmentation based on HMM Training: Baum-Welch EM algorithm to learn model parameters Decoding: Viterbi decoding to assign to each observed point the most likely hidden state
  • 29. HMM Results A variation was also applied: inertial HMM, with lower transition probabilities that enforce state persistence. Sensible for driving.
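The decoding step and the effect of state persistence can be sketched as follows. This is a minimal pure-Python Viterbi decoder for a one-dimensional Gaussian HMM with hand-picked parameters; it does not include Baum-Welch training, and raising the self-transition probability is used here only as a simple stand-in for the persistence idea, not as the inertial HMM formulation used in the slides. All names and numbers are assumptions.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(obs, start_p, trans_p, means, variances):
    """Return the most likely hidden-state sequence for the observations."""
    n = len(means)
    score = [math.log(start_p[s]) + log_gauss(obs[0], means[s], variances[s])
             for s in range(n)]
    back = []                                # back-pointers, one list per step
    for x in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            # Best predecessor state for landing in state s at this step.
            best = max(range(n), key=lambda r: prev[r] + math.log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans_p[best][s])
                         + log_gauss(x, means[s], variances[s]))
        back.append(pointers)
    state = max(range(n), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):          # follow back-pointers to the start
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state model: state 0 centred at 0, state 1 centred at 5.
start = [0.5, 0.5]
means, variances = [0.0, 5.0], [1.0, 1.0]
obs = [0.0, 0.0, 2.6, 0.0, 0.0]              # one borderline sample in the middle

loose = viterbi(obs, start, [[0.5, 0.5], [0.5, 0.5]], means, variances)
sticky = viterbi(obs, start, [[0.99, 0.01], [0.01, 0.99]], means, variances)
```

With uniform transitions the borderline sample flips the decoded state for a single step; with strongly persistent transitions the decoder stays in the same state, giving longer and more stable segments.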
  • 30. HMM Results Clusters as hidden states.
  • 31. HMM Results Clusters as hidden states.
  • 32. Example of Trip Segmentation
  • 34. Topic Extraction Approach What is topic extraction? Model topical concepts belonging to a set of textual documents. Data are described as documents, and the components are distributions of terms that reflect recurring patterns, named Topics. Hierarchical Dirichlet Processes (HDPs): soft-clustering technique based on non-parametric Bayesian theory. The number of topics is not set a priori, but learned from data. Posterior probability approximated by the Variational Inference algorithm by Wang et al. Results: most relevant topics for each document and term distribution in each topic.
  • 35. Topic Extraction Process Data Quantization Documents creation Topics Extraction Topics Evaluation
  • 36. Quantization – Binning Process with static binning strategy
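The quantization and document-creation steps can be sketched as follows. The slides do not give the bin edges or the way terms are encoded, so the feature names, edges, and the `name + bin index` term format below are purely hypothetical.

```python
from bisect import bisect_right


def quantize(value, edges):
    """Map a continuous value to a bin index using fixed (static) bin edges."""
    return bisect_right(edges, value)


def to_term(sample, edges_by_feature):
    """Encode one multivariate sample as a single 'word'."""
    return "_".join(f"{name}{quantize(sample[name], edges)}"
                    for name, edges in edges_by_feature.items())


def trip_to_document(samples, edges_by_feature):
    """Turn a trip (sequence of samples) into a document (sequence of terms)."""
    return [to_term(s, edges_by_feature) for s in samples]


# Hypothetical static bin edges for two features.
edges_by_feature = {"acc": [-1.0, 1.0], "spd": [10.0, 20.0]}
trip = [{"acc": 0.2, "spd": 25.0},
        {"acc": -1.5, "spd": 5.0}]
document = trip_to_document(trip, edges_by_feature)
```

Each trip then becomes a bag of terms that a topic model such as HDP can consume in the same way as a text document.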
  • 38. Terms Relevance on Top 7 Topics Linguist…
  • 39. Terms Relevance on Top 7 Topics … and data analyst perspectives …
  • 41. Big Issue: How to Compare? 1) Point-to-point or point distribution 2) Resulting grouping of trips 3) Perceived user similarity of trips
  • 42. Solution 1: Point-to-Point Overlap of clusters? Per trip? Overall?
  • 45. Solution 2: Moving from Points to Trips Can we cluster trips based on how observation points have been clustered? → Simple K-means clustering of trips for each approach. → Comparison of overlap of the different clusters. Coherent with original question: grouping of trips (and thus drivers) by driving behavior
  • 46. Result of overlap analysis K-means with K=6 clusters. DP-means vs. HMM: 74% overlap DP-means vs. Topic: 44% HMM vs. Topic: 48%
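The slides do not state how the overlap percentage is computed. One plausible definition, sketched here as an assumption, is the fraction of trips that receive the same group once the arbitrary cluster labels of the two methods are matched as well as possible. The function name and the toy labels are hypothetical.

```python
from itertools import permutations


def overlap(labels_a, labels_b, k):
    """Best-case agreement between two clusterings with k clusters each.

    Cluster ids are arbitrary, so every relabelling of the first clustering
    is tried and the one with the highest agreement is kept.
    """
    counts = [[0] * k for _ in range(k)]     # contingency table
    for a, b in zip(labels_a, labels_b):
        counts[a][b] += 1
    best = max(sum(counts[i][perm[i]] for i in range(k))
               for perm in permutations(range(k)))
    return best / len(labels_a)


# Toy example: six trips grouped by two different methods.
method_a = [0, 0, 1, 1, 2, 2]
method_b = [1, 1, 0, 0, 2, 0]
score = overlap(method_a, method_b, k=3)
```

Trying every permutation is affordable for K=6 (720 relabellings); for larger K an assignment solver would be the usual replacement.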
  • 47. Human Validation of Trip Groups Experts (knowledgeable about driving styles and driving paths recorded) identify possible groups of trips in the dataset. Problem: - Unable to distinguish 6 categories of groups - Only 3 categories are feasible - Best matching 6→3 categories for each method
  • 49. Conclusions Three different clustering techniques of driving behavior over trips -> segmentation Clustering of trips based on behavior -> up to 74% overlap over 6 clusters -> 100% overlap over 3 clusters User Validation -> 96% precision over 3 clusters
  • 50. Future Work About the collection process: gathering process including contextual information (road risk, traffic status, weather conditions); larger dataset to improve inference performance. About the implemented methods: smarter data ordering for DP-means; relax the independence assumption in HMM; improvements in the data discretization process for HDP
  • 51. Marco Brambilla, @marcobrambi, marco.brambilla@polimi.it http://datascience.deib.polimi.it Thanks! Questions?