Project page
http://www.hirokatsukataoka.net/research/transitionalactionrecognition/transitionalactionrecognition.html
Herein, we address the transitional action class, a class that lies between actions. Transitional actions should be useful for producing short-term action predictions while one action is transitioning into another. However, transitional action recognition is difficult because actions and transitional actions partially overlap each other. To deal with this issue, we propose a subtle motion descriptor (SMD) that identifies the subtle differences between actions and transitional actions. The two primary contributions of this paper are as follows: (i) defining transitional actions for short-term action predictions that permit earlier predictions than early action recognition, and (ii) utilizing a convolutional neural network (CNN) based SMD to present a clear distinction between actions and transitional actions. Using three different datasets, we show that our proposed approach produces better results than other state-of-the-art models. The experimental results show the effectiveness of our proposed model in recognition performance, as well as its ability to comprehend temporal motion in transitional actions.
【BMVC2016】Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature
1. Recognition of Transitional Action for Short-Term Action Prediction using Discriminative Temporal CNN Feature
Hirokatsu Kataoka, Ph.D.
Computer Vision Research Group (CVRG), AIST
http://www.hirokatsukataoka.net/
Yudai Miyashita (TDU), Masaki Hayashi (Liquid Inc., Keio Univ.),
Kenji Iwata, Yutaka Satoh (AIST)
2. Related work: Early Action Recognition
• [Ryoo, ICCV2011]
M. S. Ryoo, “Human Activity Prediction: Early Recognition of Ongoing Activities from Streaming Videos”, International Conference on Computer Vision (ICCV), pp.1036-1043, 2011.
3. Related work: Action Prediction
• [Kataoka+, VISAPP2016]
[Figure: activity prediction over a time series. Given: Daytime (Time Zone, x_timezone), Walking (Previous Activity, x_previous), Sitting (Current Activity, x_current). Not given: ??? (Next Activity), with θ = “Using a PC”.]
H. Kataoka, Y. Aoki, K. Iwata, Y. Satoh, “Activity Prediction using a Space-Time CNN and Bayesian Framework”, in VISAPP, 2016.
4. Problem of related works
• Early action recognition
– Action recognition in an early frame of the action
– Sufficient cues are required, so it is almost equivalent to action recognition
• Action prediction
– Predicts the future outright, which makes the prediction unstable
5. Proposal
• Transitional Action (TA): an action class covering the interval in which one action transitions into another
– TA contains cues for prediction: earlier than early action recognition
– Recognition-like prediction of the future action: more stable prediction
[Applications] Autonomous driving, active safety and robotics
[Figure: timeline t1 ... t12. Walk straight (Action) is followed by Cross (Action), with Walk straight – Cross (Transitional action) between them.
【Proposal】 Short-term action prediction recognizes “cross” at time t5.
【Previous works】 Early action recognition recognizes “cross” at time t9.
Δt marks the gap between the two recognition times.]
6. Problem settings
Framework | Problem
Action Recognition | $f(F^{A}_{1 \ldots t}) \rightarrow A_t$
Early Action Recognition | $f(F^{A}_{1 \ldots t-L}) \rightarrow A_t$
Action Prediction | $f(F^{A}_{1 \ldots t}) \rightarrow A_{t+L}$
Transitional Action Recognition | $f(F^{TA}_{1 \ldots t}) \rightarrow A_{t+L}$
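The four problem settings differ only in which frames are observed, whether those frames carry action (A) or transitional action (TA) labels, and the time of the predicted action. A minimal sketch of that bookkeeping follows; the function name `problem_setting` and the tuple layout are illustrative assumptions, not part of the original slides.

```python
def problem_setting(framework, t, lag):
    """Return (last observed frame, frame label type, target time).

    t   : current time index
    lag : the offset L used in the slide's formulas
    The framework keys are hypothetical names chosen for this sketch.
    """
    settings = {
        # f(F^A_{1..t}) -> A_t
        "action_recognition": (t, "A", t),
        # f(F^A_{1..t-L}) -> A_t
        "early_action_recognition": (t - lag, "A", t),
        # f(F^A_{1..t}) -> A_{t+L}
        "action_prediction": (t, "A", t + lag),
        # f(F^TA_{1..t}) -> A_{t+L}
        "transitional_action_recognition": (t, "TA", t + lag),
    }
    return settings[framework]


if __name__ == "__main__":
    for name in ("action_recognition", "early_action_recognition",
                 "action_prediction", "transitional_action_recognition"):
        print(name, problem_setting(name, t=9, lag=4))
```

Action prediction and transitional action recognition share the same observed range and target time; the only difference in this sketch is that the observed frames belong to a TA class rather than an A class.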
7. Difference
Early Action Recognition: $f(F^{A}_{1 \ldots t-L}) \rightarrow A_t$
[Figure: timeline t1 ... t12 with Walk straight (Action) followed by Cross (Action)]
The objective action is A(cross)
- Early action recognition responds late
8. Difference
Action Prediction: $f(F^{A}_{1 \ldots t}) \rightarrow A_{t+L}$
[Figure: timeline t1 ... t12 with Walk straight (Action) followed by Cross (Action)]
The objective action is A(cross)
- Action prediction is unstable
9. Difference
Transitional Action Recognition: $f(F^{TA}_{1 \ldots t}) \rightarrow A_{t+L}$
[Figure: timeline t1 ... t12 with Walk straight (Action) followed by Cross (Action), and Walk straight – Cross (Transitional action) between them]
The objective action is A(cross)
- Transitional action recognition is reasonable
10. Details of transitional action (TA)
• Annotation for TA
– TA and normal action (NA) classes partially overlap each other
• Difficulty of TA
– NA and TA are temporally mixed
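To make the overlap concrete, here is a small sketch of frame-level annotation in which a TA interval shares frames with the NAs on either side. The interval boundaries are hypothetical values chosen for illustration, and `annotate_frames` is an assumed helper name; neither comes from the slides.

```python
def annotate_frames(num_frames, intervals):
    """Assign a set of labels to each frame.

    intervals: list of (label, start, end), 1-indexed and inclusive.
    Returns a list of label sets, one per frame.
    """
    labels = [set() for _ in range(num_frames)]
    for name, start, end in intervals:
        for t in range(start, end + 1):
            labels[t - 1].add(name)
    return labels


# Hypothetical intervals on a 12-frame timeline (t1 ... t12).
intervals = [
    ("walk straight", 1, 7),           # NA
    ("walk straight - cross", 5, 9),   # TA, overlapping both NAs
    ("cross", 8, 12),                  # NA
]
labels = annotate_frames(12, intervals)

# Frames carrying more than one label are where NA and TA are mixed.
overlap = [t + 1 for t, s in enumerate(labels) if len(s) > 1]
print(overlap)
```

Frames with more than one label are the ones a classifier has to disambiguate, which is the difficulty this slide points to.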
11. Subtle Motion Descriptor (SMD)
• A discriminative temporal CNN feature
– Separates the NA and TA classes
12. Subtle Motion Descriptor (SMD)
• Activation feature from VGG-16
– Fully-connected layer (N = 4,096)
– Based on pooled time series (PoT) [Ryoo+, CVPR2015]
14. Subtle Motion Descriptor (SMD)
• Temporal pooling from $\Delta V_t$
– Positive (plus) and negative (minus) values
– Near-zero values are also pooled (→ this is the contribution of SMD)
– The threshold TH is fixed experimentally
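The pooling described on this slide can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes $\Delta V_t$ is the frame-to-frame difference of the per-frame CNN features, that positive and negative changes are sum-pooled separately, and that near-zero changes (absolute value at most TH) are pooled by summing their magnitudes. The function name `subtle_motion_pooling` and the TH value in the usage example are hypothetical.

```python
import numpy as np


def subtle_motion_pooling(features, th):
    """Pool temporal feature differences into one descriptor.

    features : array of shape (T, N), one N-dim activation per frame
               (N = 4,096 for the fully-connected layer on the slide).
    th       : threshold TH separating near-zero changes from the rest.
    Returns a vector of length 3 * N: [plus, minus, near_zero].
    """
    features = np.asarray(features, dtype=float)
    # Assumed definition: delta[t] = V[t + 1] - V[t]
    delta = np.diff(features, axis=0)

    # Sum of clearly positive changes per dimension.
    plus = np.where(delta > th, delta, 0.0).sum(axis=0)
    # Sum of clearly negative changes per dimension.
    minus = np.where(delta < -th, delta, 0.0).sum(axis=0)
    # Sum of magnitudes of near-zero changes (assumed pooling rule).
    near_zero = np.where(np.abs(delta) <= th, np.abs(delta), 0.0).sum(axis=0)

    return np.concatenate([plus, minus, near_zero])


if __name__ == "__main__":
    # Toy input: 3 frames, 2 feature dimensions.
    toy = [[0.0, 0.0], [1.0, 0.05], [0.5, 0.0]]
    print(subtle_motion_pooling(toy, th=0.1))
```

In the toy input, the second dimension only changes by small amounts, so it contributes nothing to the plus and minus parts and is captured only by the near-zero part, which is the kind of subtle motion the descriptor is meant to retain.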
15. Datasets
• Temporal action datasets
– NTSEL [Kataoka+, ITSC2015]
• Walk (NA), cross (NA), bicycle (NA), turn (TA) with human bbox
– UTKinect-Action [Xia+, CVPRW2012]
• 10 ordered NAs (e.g. walk, throw, sit)
• 8 TAs (excluding push/pull; next page)
• Without human bbox
– Watch-n-Patch [Wu+, CVPR2015]
• 10 daily NAs (e.g. read, turn on monitor, leave office)
• 10 most frequent TAs (next page)
• Without human bbox
23. Comparison with PoT
• Subtle motion is effective for transitional action recognition
– NTSEL: +2.18%, +8.63%
– UTKinect: +7.19%, +4.31%
– Watch-n-Patch: +4.82%, +5.12%
24. Conclusion
• Two contributions:
1. Definition of transitional action for short-term action prediction
2. Subtle Motion Descriptor (SMD) to classify transitional and normal actions