研究室輪読 Feature Learning for Activity Recognition in Ubiquitous Computing
1. Feature Learning for Activity
Recognition in Ubiquitous Computing
Thomas Plötz, Nils Y. Hammerla, and Patrick Olivier
School of Computer Science
Newcastle University
The 22nd International Joint Conference on Artificial
Intelligence (IJCAI 2011)
Paper reading on Jun. 9 at Matsuo Lab, presented by Yusuke Iwasawa (D1)
2. Outline
1. Introduction
2. State-of-the-Art
3. Feature Learning for Activity Recognition
3.1. PCA based Feature Learning
3.2. Deep Learning for Feature Extraction
4. Experimental Evaluation
4.1. Datasets
4.2. Features Analyzed: Overview
4.3. Result
5. Conclusion
4. Activity Recognition (AR)
• … is a core concern of the ubiquitous computing (ubicomp) community
[Atallah and Yang, 2009]
• In general, sensors are utilized to capture aspects of movement or a
user’s behavior.
5. Typical Activity Recognition Chain [Bulling et al.]
• Raw data acquisition: Sensor data
• Preprocessing: Filtering etc.
• Segmentation: Sliding Window
(Predominant Approach)
• Feature Extraction:
• Based on Frames
• Classification:
• General classifier (KNN)
※Figure cited from [Bulling:2014jm]
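The sliding-window segmentation step above can be sketched in a few lines (a minimal illustration; the function name, window length, and overlap are assumptions for the example, not values from the paper):

```python
import numpy as np

def sliding_windows(signal, win_len, step):
    """Segment a (T, D) multichannel sensor stream into overlapping frames.
    Returns an array of shape (num_frames, win_len, D)."""
    starts = range(0, signal.shape[0] - win_len + 1, step)
    return np.stack([signal[s:s + win_len] for s in starts])

# Example: a 3-axis accelerometer stream, 50-sample windows, 50% overlap
stream = np.random.randn(1000, 3)
frames = sliding_windows(stream, win_len=50, step=25)
print(frames.shape)  # (39, 50, 3)
```

Each frame then feeds the feature-extraction and classification stages of the chain.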
6. AR Feature Extraction
• Almost all previous work used heuristically selected measures
• Time Domain: Average, Standard Deviation etc.
• Frequency Domain: FFT coefficients
• Little systematic research
• One of the major shortcomings of current AR [Lukowicz et al., 2010]
※Figure cited from [Bulling:2014jm]
Topic of this paper
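Such heuristically selected per-frame features can be sketched as follows (the specific statistics and the number of FFT coefficients kept are illustrative assumptions, not the paper's exact feature set):

```python
import numpy as np

def heuristic_features(frame):
    """Compute heuristic per-frame features of the kind listed above:
    time-domain statistics plus leading FFT magnitude coefficients.
    `frame` has shape (win_len, D); features are concatenated over channels."""
    feats = []
    for ch in frame.T:                            # iterate over sensor channels
        fft_mag = np.abs(np.fft.rfft(ch))[:5]     # first 5 FFT coefficients
        feats.extend([ch.mean(), ch.std(), *fft_mag])
    return np.array(feats)

frame = np.random.randn(50, 3)
print(heuristic_features(frame).shape)  # (21,) = 3 channels x (2 stats + 5 FFT)
```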
7. Contributions
1. Proposing a simple workflow that integrates unsupervised feature
learning techniques into general activity recognition procedures
• Principal Component Analysis (PCA) and Deep Learning
2. Demonstrating the suitability of feature learning for ubicomp activity
recognition tasks on four public datasets
• Showing how the automatically extracted features outperform
standard features across a range of AR applications
New classes of activity analysis
• Such as behavioral analysis or skill assessment
• These tasks require a quantitative, in-depth analysis of the
underlying data
11. Feature Learning
• Use feature learning (representation learning) to automatically
learn good representations of the data
• Whereas heuristic feature design requires domain-specific expertise,
representation learning can find good representations from the data by
optimizing some objective function
• Approaches include energy minimization [LeCun et al., 2006], manifold
learning [Huo et al., 2004], and deep learning
12. Integration Framework of AR and Feature Learning
• Sensor Data
• Multidimensional time-series data
• Frame Extraction
• The sensor data is segmented
sequentially into frames
of n samples each
• Training
• Data used for learning
(in frame units)
• Test
• Data used for testing
(in frame units)
• FEX
• Parameters are learned on the
training data and also applied to the test data
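The FEX step can be sketched with PCA as the feature learner (a minimal NumPy illustration that fits the projection on training frames and reuses it on test frames; function names and sizes are assumptions, not the paper's settings):

```python
import numpy as np

def fit_pca(train_frames, k):
    """Learn a PCA feature extractor (FEX) on flattened training frames.
    Returns (mean, components), with components of shape (k, n_features)."""
    X = train_frames.reshape(len(train_frames), -1)
    mu = X.mean(axis=0)
    # SVD of the centered data matrix gives the principal directions in Vt
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def apply_pca(frames, mu, components):
    """Project frames onto the learned components (used for train AND test)."""
    X = frames.reshape(len(frames), -1)
    return (X - mu) @ components.T

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 50, 3))   # 200 training frames
test = rng.normal(size=(40, 50, 3))     # 40 held-out frames
mu, comps = fit_pca(train, k=10)
print(apply_pca(test, mu, comps).shape)  # (40, 10)
```

The key point of the framework is visible here: the extractor's parameters (`mu`, `comps`) come only from the training frames and are then frozen for the test frames.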
13. Design Criteria
1. Capable of extracting generally applicable representations, not
limited to specific AR tasks.
2. Must not rely on the availability of ground truth annotations of the
training data.
3. Benefits from larger datasets, but not dependent on them.
4. Provides intrinsic information.
5. Must be computationally feasible and applicable in real-time
application contexts.
(Excerpted from the paper)
This paper examines PCA and Deep Learning and evaluates them against these five criteria.
In short: (2) labels are not necessarily available in large quantities, so we need to
learn, (3) from as many datasets as possible, representations that are (4) intrinsic
and (1) usable across a variety of tasks, and to do so (5) quickly.
14. PCA & Deep Learning for Feature Learning
• Principal Component Analysis
• Application example in representation learning: [Karhunen and Joutsensalo] “Representation and
separation of signals using nonlinear PCA type learning,” Neural
Networks, vol. 7, no. 1, pp. 113–127, Jan. 1994, among others
PCA
Deep Learning
• Adopts a Deep Belief Network with an RBM at each layer
• Autoencoder [Hinton, 2007]
• One input layer, one output layer, and an even number of hidden layers
• Every layer is fully connected
• The layers of the autoencoder network are learned greedily in a bottom-
up procedure, by treating each pair of subsequent layers in the encoder
as a Restricted Boltzmann Machine (RBM)
• Trained following [Hinton et al., 2006]
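The greedy bottom-up RBM stacking can be sketched as follows (a heavily simplified CD-1 sketch with no bias terms and full-batch updates; all names and hyperparameters are illustrative, not the settings of [Hinton et al., 2006]):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.05, epochs=20):
    """Train one RBM with CD-1 (contrastive divergence), simplified:
    no biases, full-batch weight updates."""
    W = rng.normal(scale=0.01, size=(data.shape[1], n_hidden))
    for _ in range(epochs):
        # positive phase: hidden activations given the data
        h_prob = sigmoid(data @ W)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # negative phase: one step of Gibbs reconstruction
        v_recon = sigmoid(h_sample @ W.T)
        h_recon = sigmoid(v_recon @ W)
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
    return W

def greedy_pretrain(data, layer_sizes):
    """Stack RBMs bottom-up: each layer's hidden activations become
    the next layer's training data."""
    weights, x = [], data
    for n_hidden in layer_sizes:
        W = train_rbm(x, n_hidden)
        weights.append(W)
        x = sigmoid(x @ W)
    return weights

frames = rng.random((100, 30))          # toy input frames
Ws = greedy_pretrain(frames, [20, 10])
print([W.shape for W in Ws])            # [(30, 20), (20, 10)]
```

After this unsupervised pretraining, the encoder weights can be fine-tuned and the top hidden layer used as the learned feature representation.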
28. References
• ‘Types of samples’, http://psychology.ucdavis.edu/faculty_sites/sommerb/sommerdemo/sampling/types.htm
• ‘楽しいAutoEncoderと学習の世界’, http://vaaaaaanquish.hatenablog.com/entry/2013/12/03/033850
• ‘Convolutional Neural Network’, http://ceromondo.blogspot.jp/2012/09/convolutional-neural-network.html
• [Prandi:2014] C. Prandi, P. Salomoni, and S. Mirri, “mPASS: Integrating People Sensing and
Crowdsourcing to Map Urban Accessibility,” IEEE Consumer Communications and
Networking Conference (CCNC 2014): People Centric Sensing and Communications (PCSC), 2014.
• [Plötz et al.] T. Plötz, N. Y. Hammerla, and P. Olivier, “Feature learning for activity
recognition in ubiquitous computing,” IJCAI Proceedings-International Joint…, 2011.
• [Glatt et al.] R. Glatt et al., “Proposal for a Deep Learning Architecture for Activity
Recognition,” International Journal of Engineering & …, 2014.
• [Zeng et al.] M. Zeng, L. T. Nguyen, B. Yu, O. J. Mengshoel, J. Zhu, and P. Wu,
“Convolutional Neural Networks for Human Activity Recognition using Mobile Sensors,”
mlt.sv.cmu.edu
• [Vollmer et al.] Christian Vollmer et al., “Learning Features for Activity Recognition with
Shift-Invariant Sparse Coding,” 2013.
• [Bhattacharya et al.] “Using unlabeled data in a sparse-coding framework for human activity
recognition,” 2014.
29. References on the Paper 1
[Atallah and Yang, 2009] L. Atallah and G. Yang. The use of pervasive sensing for behaviour profiling – a
survey. Pervasive and Mobile Computing, (5):447–464, 2009.
[Chou, 1995] K.C. Chou. A novel approach to predicting protein structural classes in a (20–1)-D amino
acid composition space. Proteins: Structure, Function, and Bioinformatics, 21(4):319–344, 1995.
[Cox and Oakes, 1984] D.R. Cox and D. Oakes. Analysis of survival data. Monographs on statistics and
applied probability. Chapman and Hall, 1984.
[Figo et al., 2010] D. Figo, P. Diniz, D. Ferreira, and J. Cardoso. Preprocessing techniques for context
recognition from accelerometer data. Pervasive and Ubiquitous Computing, pages 645–662, 2010.
[Frank et al., 2010] J. Frank, S. Mannor, and D. Precup. Activity and gait recognition with time-delay
embeddings. In Proc. AAAI Conf. on Artificial Intelligence, 2010.
[Hinton et al., 2006] G.E. Hinton, S. Osindero, and Y.W. Teh. A fast learning algorithm for deep belief
nets. Neural Computation, 18(7):1527–1554, 2006.
[Hinton, 2007] G.E. Hinton. To recognize shapes, first learn to generate images. Progress in Brain
Research, 165:535–547, 2007.
[Huo et al., 2004] Xiaoming Huo, Xuelei Ni, and Andre K. Smith. A survey of manifold-based learning
methods. In Recent Advances in Datamining of Enterprise Data: Algorithms and Applications, pages
691–745. 2004.
30. References on the Paper 2
[Huynh and Schiele, 2005] Tâm Huynh and Bernt Schiele. Analyzing features for activity recognition.
In Proc. Joint Conf. on Smart Objects and Ambient Intelligence, 2005.
[Huynh et al., 2008] Tâm Huynh, Mario Fritz, and Bernt Schiele. Discovery of activity patterns using
topic models. In Proc. Int. Conf. on Ubiquitous Computing, 2008.
[LeCun et al., 2006] Y. LeCun, S. Chopra, and R. Hadsell. A tutorial on energy-based learning. In Predicting
Structured Data. MIT Press, 2006.
[Lukowicz et al., 2010] Paul Lukowicz, Stephen Intille, and Jamie A. Ward, editors. Proc. Int. Workshop
on How To Do Good Research In Activity Recognition: Experimental methodology, performance
evaluation and reproducibility, 2010.
[Minnen et al., 2006] David Minnen, Thad Starner, M. Essa, and Charles Isbell. Discovering characteristic
actions from on-body sensor data. In Proc. Int. Symp. on Wearable Comp., 2006.
[Pham and Olivier, 2009] Cuong Pham and Patrick Olivier. Slice & dice: Recognizing food preparation
activities using embedded accelerometers. In Proc. Europ. Conf. Ambient Intelligence, 2009.
[Roggen et al., 2010] D. Roggen, A. Calatroni, M. Rossi, T. Holleczek, K. Förster, G. Tröster, P.
Lukowicz, D. Bannach, G. Pirkl, A. Ferscha, J. Doppler, C. Holzmann, M. Kurz, G. Holl, R. Chavarriaga, M.
Creatura, and J. del R. Millán. Collecting complex activity data sets in highly rich networked sensor
environments. In 7th Int. Conf. Networked Sensing Sys., 2010.
[Zappi et al., 2008] Piero Zappi, Clemens Lombriser, Thomas Stiefmeier, Elisabetta Farella, Daniel
Roggen, Luca Benini, and Gerhard Tröster. Activity recognition from on-body sensors: Accuracy-power
trade-off by dynamic sensor selection. In Proc. Europ. Conf. on Wireless Sensor Networks, 2008.