Smartphones are equipped with many sensors that provide detailed, continuous information about the device's location and movement. Using these signals to infer vehicle movement is challenging because of signal noise, unknown phone orientation, varying sensor quality across devices, and other factors. Signal processing and feature engineering are generally difficult and require deep domain knowledge and manual pattern recognition. We discuss how deep learning can be leveraged in this context to automate signal processing and feature engineering. We present several applications of deep learning in vehicle telematics, as well as a deep learning architecture designed to learn sensor embeddings for vehicle movement events. One challenge is that model training requires huge volumes of sensor data, which must be processed efficiently. We present a solution that uses Spark for model development and batch deployment.
2. Stand for Safety
“We want Uber to be the safest transportation platform on the planet. Safety should be our number-one priority. We have to, as a company, stand for safety.”
Dara Khosrowshahi (2018)
5. Sensor Data (Driver Device)
● GPS
○ Absolute location, velocity and time
○ Low frequency (~0.5Hz)
● IMU
○ Relative motion of phone
○ Accelerometer: 3D linear acceleration
○ Gyroscope: 3D angular velocity
○ High frequency (~25Hz)
10. Pre-filtering
● High-frequency data result in a very large number of time steps
○ Pre-filtering: identify a specific time window of interest
○ Window segmentation: divide the input sequence into small windows
● Design choice: window segmentation
○ [Figure: input sequence divided into window 0, window 1, window 2, ..., window T]
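Window segmentation could be sketched as follows. This is a minimal illustration, not the pipeline from the talk: the function name `segment_windows`, the one-second window length, and the 6-channel layout (3 accelerometer + 3 gyroscope axes) are assumptions made for the example.

```python
import numpy as np

def segment_windows(signal, window_size, step=None):
    """Split a (num_steps, num_channels) array into fixed-length windows.

    Trailing samples that do not fill a whole window are dropped.
    Returns an array of shape (num_windows, window_size, num_channels).
    """
    signal = np.asarray(signal)
    if step is None:
        step = window_size  # default: non-overlapping windows
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    num_steps = signal.shape[0]
    if num_steps < window_size:
        # Not enough samples for even one window.
        return np.empty((0, window_size) + signal.shape[1:], dtype=signal.dtype)
    starts = range(0, num_steps - window_size + 1, step)
    return np.stack([signal[s:s + window_size] for s in starts])

# Hypothetical input: 10 s of IMU data at ~25 Hz with 6 channels.
imu = np.random.default_rng(0).normal(size=(250, 6))
windows = segment_windows(imu, window_size=25)             # one-second windows
overlapping = segment_windows(imu, window_size=25, step=5)  # 80% overlap
```

Each window then becomes one time step for the downstream sequence model, so the sequence length drops from the number of raw samples to the number of windows.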
12. Feature Extraction
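● Raw data: window 1, window 2, ..., window t, ..., window T
● Each window is mapped to a new feature vector by one of:
○ Sample summary: time-domain stats (min, max, mean, sd) and frequency-domain features (FFT)
○ 1-D CNN
● The new feature vectors are fed, window by window, to the LSTM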
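The sample-summary option can be illustrated with a short sketch. The function names, the number of FFT bins kept, and the choice to skip the DC bin are assumptions for this example; the slides only name the statistics (min, max, mean, sd) and the FFT.

```python
import numpy as np

def summarize_window(window, num_fft_bins=4):
    """Map one (window_size, num_channels) window to a 1-D feature vector."""
    window = np.asarray(window, dtype=float)
    # Time-domain statistics, computed per channel (sd is the population sd).
    stats = np.concatenate([
        window.min(axis=0),
        window.max(axis=0),
        window.mean(axis=0),
        window.std(axis=0),
    ])
    # Frequency-domain features: magnitude of the real FFT along time.
    spectrum = np.abs(np.fft.rfft(window, axis=0))
    # Skip bin 0 (DC), which only repeats the mean, and keep the next bins.
    fft_feats = spectrum[1:num_fft_bins + 1].ravel()
    return np.concatenate([stats, fft_feats])

def extract_features(windows, num_fft_bins=4):
    """Map (num_windows, window_size, num_channels) to (num_windows, num_features)."""
    return np.stack([summarize_window(w, num_fft_bins) for w in windows])

# Hypothetical input: 10 windows of 25 samples with 6 IMU channels.
windows = np.random.default_rng(0).normal(size=(10, 25, 6))
features = extract_features(windows)  # one feature vector per window
```

The resulting per-window vectors form a much shorter sequence than the raw samples, which is the input shape a recurrent model such as an LSTM expects.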
13. Data Augmentation
● Sensor readings depend on phone orientation
● Create augmented data by artificially rotating phone
○ New sensor readings
○ Label stays the same
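A rotation-based augmentation along these lines might look like the sketch below. It assumes accelerometer and gyroscope readings are stored as (num_steps, 3) arrays in the phone's frame; the helper names and the QR-based way of drawing a random rotation are illustrative choices, not taken from the talk.

```python
import numpy as np

def random_rotation_matrix(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # normalize the column signs of the QR factor
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q

def rotate_imu(accel, gyro, rotation):
    """Apply the same rotation to every accelerometer and gyroscope sample."""
    # Rows are samples, so right-multiplying by R^T rotates each 3-vector by R.
    return accel @ rotation.T, gyro @ rotation.T

rng = np.random.default_rng(0)
accel = rng.normal(size=(250, 3))  # hypothetical 3D linear acceleration
gyro = rng.normal(size=(250, 3))   # hypothetical 3D angular velocity
label = 1                          # hypothetical event label

R = random_rotation_matrix(rng)
accel_aug, gyro_aug = rotate_imu(accel, gyro, R)
augmented_example = (accel_aug, gyro_aug, label)  # new readings, same label
```

Using one rotation for both sensors keeps them consistent with a single simulated phone orientation, and restricting the draw to proper rotations (determinant +1) avoids mirroring the angular velocity.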
14. Model Dev Pipeline
● DL
○ Data: Sensor (Driver)
○ Label/Feature: Event definition, Feature
○ Model: Multi-layer LSTM, xM trips in training, Saved protocol buffer
○ Score: Sensor embedding
● Non-DL
○ Data: Sensor (Driver), Map, Trip, Other
○ Label/Feature: Telematics, Trip
○ Model: SparkML Transformer, XGBoost, xM trips in training, xM trips in validation, Saved model pipeline
○ Score: Score and classify
15. Horovod
● Open source library developed at Uber
● Distributed training for TensorFlow, Keras & PyTorch
● Uses bandwidth-optimal communication protocols & makes use of advanced networking
● Seamlessly installs via pip install horovod
16. Petastorm
● Open source library developed at Uber ATG
● Enables deep learning directly from Parquet
● Supports TensorFlow, PyTorch, and PySpark
● Apache Parquet store: Apache Parquet as a dataframe with tensors, nd-arrays and scalars (e.g. images, lidar point clouds)