
[系列活動] Emotion-AI: 運用人工智慧實現情緒辨識

情緒的知覺與產生是人類的重要演化工具,藉以強化自身以及同物種成員生存的機會,對於仰賴社會互動與溝通以促進繁衍與生存的人類尤其重要。人類互動中的情緒訊號與內在情緒狀態會自然地驅動並反映於各種外顯行為中。Affective Computing (情感運算) 由 MIT Media Lab 的 Rosalind Picard 於 1995 年的論文中提出,學術界也因此開啟了近二十年解碼情緒訊號、感知人類內在情感狀態的相關技術研究。近年來,以聲音特徵與臉部表情為主、運用人工智慧計算的情緒辨識,已為下一代與人相關的應用 (例如:人機互動、教育、醫療、娛樂、商業等) 帶來新的巨大機會。
本課程將以運用人工智慧實現情緒辨識為主軸,內容包含情感運算背景及其應用簡介、情緒行為資料庫收集與處理、相關人工智慧技術,以及現今情緒辨識的發展與未來。希望學員在吸收本課程提供的知識後,能更快速整合相關知識與工具來開發情緒辨識相關應用。
課程網頁介紹: http://foundation.datasci.tw/emotion-ai-171216/


  1. 1. Emotion-AI: 運用人工智慧實現情緒辨識 李祈均 資料科學年會課程 December 16th, 2017 1
  2. 2. Affective Computing (1995): the study and development of systems and devices that can recognize, interpret, process, and simulate human affects. Professor Rosalind Picard, MIT Media Lab. International Conference on Affective Computing and Intelligent Interaction (ACII); ACII 2017 @ San Antonio 2
  3. 3. 背景簡介及其應用 情緒行為資料庫收集與處理 情緒辨識人工智慧技術 情緒辨識的現在發展與未來 3
  4. 4. 什麼是情緒 ? 4
  5. 5. “People also smile when they are miserable.” ― Paul Ekman, Telling Lies: Clues to Deceit in the Marketplace, Politics, and Marriage 5
  6. 6. 6
  7. 7. 五個階段: 1 Philosophy: Discuss emotion with philosophy; 2 Mind-Body Dualism: Combining the physical world with emotion; 3 Turn to Practical: Combining the physical and emotion to start to apply the systems on humans; 4 Modern Theory; 5 Cognitive Process / Cognitive Theory 7
  8. 8. Plato’s horses: Plato described emotion and reason as two horses pulling us in opposite directions; in a successful person, the Reason horse is more in control. https://aquileana.wordpress.com/2014/04/14/platos-phaedrus-the-allegory-of-the-chariot-and-the-tripartite-nature-of-the-soul/ Philosophy: Discuss emotion with philosophy 8
  9. 9. Stoicism Aristippus Philosophy Discuss emotion with philosophy 9
  10. 10. Mind-Body Dualism In the 17th century, René Descartes viewed the body’s emotional apparatus as largely hydraulic. He believed that when a person felt angry or sad it was because certain internal valves opened and released such fluids as bile and phlegm Mind-Body Dualism Combining physical world with emotion 10
  11. 11. Charles Darwin believed that emotions were beneficial for evolution because emotions improved chances of survival. For example, the brain uses emotion to keep us away from a dangerous animal (fear), away from rotting food and fecal matter (disgust), in control of our resources (anger), and in pursuit of a good meal or a good mate (pleasure and lust). Damasio, Antonio R. Looking for Spinoza: Joy, Sorrow, and the Feeling Brain. New York, NY: Harcourt, Inc., 2003. Turn to Practical: Combining the physical and emotion and starting to apply the systems on humans 11
  12. 12. 美國心理學之父 James, William. 1884. "What Is an Emotion?" Mind. 9, no. 34: 188-205. 當身體產生(生理)變化時,我們感受到這些變化,這就是情緒。 Our feeling of the same changes as they occur is the emotion Modern Theory 12
  13. 13. James- Lange Cannon-Bard Schachter & Singer 13
  14. 14. James-Lange: 情緒並非來自外在刺激,而是由生理變化產生。刺激引發自主神經系統的活動,造成生理狀態上的改變,生理上的反應導致了情緒。批評: 1. 不同的情緒會有相似的生理變化,例如興奮和憤怒都會心跳加速; 2. 生理變化如來自自主神經支配,個體不應知道自己有此情緒; 3. 如以人工方法給個體生理變化,不能產生真正的情緒,例如腎上腺素注射無法引發害怕。 Cannon-Bard: 修正 James 的理論,因其過於強調自主神經系統在情緒中的作用,認為情緒刺激應由丘腦處理。外在刺激傳到視丘和下視丘,再傳到大腦皮質,同時引發生理反應和情緒。批評: 1. 處理情緒的主要部分為下視丘 (hypothalamus) 和邊緣系統 (limbic system),而非視丘。 Schachter & Singer (Two-factor theory of emotion): 刺激引起知覺而使個體對情境有認知考量,加上對生理變化之認知解釋,兩項認知(對自己生理變化的認知、對刺激情境性質的認知)同時引起情緒表達。實驗: 沙赫特和辛格認為,情緒是認知因素和生理喚醒狀態兩者交互作用的產物;生理喚醒是情緒激活的必要條件,但真正的情緒體驗是由對喚醒狀態賦予的「標記」決定的。這種「標記」的賦予是一種認知過程,個體利用過去經驗和當前環境的信息對自身喚醒狀態作出合理的解釋,正是這種解釋決定著產生怎樣的情緒。 14
  15. 15. 16
  16. 16. 17
  17. 17. 三大學派: 1 認知心理學、2 行為學派、3 精神分析學派 18
  18. 18. • 神經症的心理治療方法 • 醫療實踐中逐步形成的一種心理學理論 • 從外顯和內隱的方面描述了內驅力、感情、衝突、心理和人格等現象 • 從情緒的角度看,弗洛伊德把情緒放置在內驅力和無意識的框架之內 3 精神分析學派 19
  19. 19. • 情緒只是有機體對待特定環境的一種反應 • 從反應模式和活動水平兩方面去描述情緒。 • 情緒是一種遺傳的反應模式,它包括整個的身體機制,特別是 內臟和腺體活動系統的深刻變化。 • 操作條件反射論者斯金納特別註意從動物在個體生活中的習得 行為研究情緒,發展了用條件反射技術來引發情緒的方法,並 把挫折效應作為研究情緒的一個標準方法。 2 行為學派 20
  20. 20. 1 認知心理學 • 以心智處理來思考與推理的模式 (判斷、評價和思考過程)。 • 思考與推理在人類大腦中的運作便像電腦軟體在電腦裏運作相似。 • 談到輸入、表徵、計算或處理,以及輸出等概念認知心理學理論 • 從認知學、社會學和文化的角度,情緒不僅僅來自生理反應,還受到 信息處理過程、社會交流方式和文化背景影響。 Cognitive Process Cognitive Theory 21
  21. 21. • James-Lange • 即使沒有大腦皮質參與,人也可以產生情緒(即沒有自主意識、 沒有認知的情況下)。 • 生理變化伴隨著情緒產生,調節制約人們對情緒的感受,但是並 不直接造成情緒。情緒也可以反過來導致生理變化,並產生包括 戰鬥、逃跑、撫育在內的適應行為。 • 神經解剖學 • 哺乳動物大腦中有三個獨立的神經迴路,分別控制三種情緒反應 • 憤怒、恐懼、悲傷、厭惡四種情緒各自有獨特的自主神經系統反 應。 22
  22. 22. 認知學視角 阿諾德(Arnold)與拉扎勒斯(Lazarus)的認知-評價理論 (Appraisal) 情緒來自正在進行著的環境中好的或不好的信息的生理心理反應,它依賴於 短時或持續的評價。在發生之前,人要對刺激進行解釋和評估。如果一個人 對刺激做出肯定的評價,他就會接近它;否則,就會躲避它。 湯姆金斯(Tomkins)和伊扎德(Izard)的動機-分化理論 主張情緒具有動機的性質。該理論以情緒為核心,以人格結構為基礎,論述情 緒的性質與功能。認為情緒是人格系統的核心動力。情緒特徵來源於個體的生 理結構,遺傳是某種情緒的特徵和強度水平的決定因素。認知是情緒產生的重 要因素,但認知不等同於情緒,也不是其產生的唯一原因,只是參與情緒激活 與調節的過程。 23
  23. 23. [情緒產生流程圖: 1 外在刺激(轉變、要求、衝突)、2 個人理念以及心態、3 事物理解、4 感覺產生、5 情緒發生、6 行為產生;另標註 個人需求/動機(生理、心理、精神)、個人背景(社會文化、經驗)、個人狀態(記憶力、專注力、判斷力)] 24
  24. 24. [情緒產生流程圖,同投影片23] 人隱藏狀態 25
  25. 25. [情緒產生流程圖,同投影片23] 這可以是個 INFERENCE 的問題嗎? 26
  26. 26. 情緒是共同語言嗎? 可以TAG上共同標籤? 27
  27. 27. Charles Darwin believed that emotions were beneficial for evolution because emotions improved chances of survival. For example, the brain uses emotion to keep us away from a dangerous animal (fear), away from rotting food and fecal matter (disgust), in control of our resources (anger), and in pursuit of a good meal or a good mate (pleasure and lust). Damasio, Antonio R. Looking for Spinoza: Joy, Sorrow, and the Feeling Brain. New York NY: Harcourt, Inc., 2003. 達爾文 28
  28. 28. 研究情緒和面部表情的先驅 二十世紀最傑出的100位心理學家 Paul Ekman 也在想這個問題 29
  29. 29. 研究西方人和新幾內亞原始部落居民的面部表 情,他要求受訪者辨認各種面部表情的圖片, 並且要用面部表情來傳達自己所認定的情緒狀 態,結果他發現某些基本情緒(快樂、悲傷、 憤怒、厭惡、驚訝和恐懼)的表達在兩種文化 中都很雷同 30
  30. 30. Are There Universal Facial Expressions? Just guess 31 近期不同實驗、不同結果
  31. 31. • 對臉部肌肉群的運動及其對表情的控制作用做了深入研究 • 開發了面部動作編碼系統 (Facial Action Coding System,FACS) • 根據人臉的解剖學特點,將其劃分成若干既相互獨立又相互聯繫的運動單元(AU) • 分析了這些運動單元的運動特徵及其所控制的主要區域以及與之相關的表情 這個後面我們還會再帶到 32
  32. 32. 社會文化 一個人所處的社會環境改變,他的情緒構成也會發生改變。和中國嬰兒相比,美國嬰兒的情緒反應更強烈,更具有表現力。這也許是因為兩種文化中,成年人對情緒的表達就不一樣。研究人員試圖找出中國人和美國人對「基本情緒」的認知有什麼差異。結果顯示,中美文化中的人對於喜悅、憤怒、悲傷、恐懼的認知一樣。但是中國人把「愛」看做悲傷的情緒,並且中國人認為「羞惡之心」也是一種基本情緒。於是美國人的基本情緒中有兩個正面的(喜悅、愛)和三個負面的(憤怒、悲傷、恐懼);中國人的基本情緒中有一個正面的(喜悅)和五個負面的(愛、憤怒、悲傷、恐懼、羞恥)。 Mascolo, M. F., Fischer, K. W., & Li, J. (2003). Dynamic development of component system of emotions: Pride, shame, and guilt in China and the United States. In R. J. Davidson, K. R. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp. 375-408). New York: Oxford University Press. Shaver, P. R., Wu, S., & Schwartz, J. C. (1992). Cross-cultural similarities and differences in emotion and its representation: A prototype approach. In 33
  33. 33. [圖: 情緒受 1 刺激、2 需求、3 狀態、4 背景 影響。1 外在刺激(轉變、要求、衝突): 情況的認知不同,給的刺激不同; 2 個人需求/動機(生理、心理、精神): 精神狀況、心理狀況; 3 個人狀態(記憶力、專注力、判斷力): 對當下事件的判斷影響認知; 4 個人背景(社會文化、經驗): 西方與東方對情緒的解釋不同] 34
  34. 34. 所以呢? 35 有label 可以下嗎?
  35. 35. “there is no limit to the number of possible different emotions “ William James 36
  36. 36. Silvan Tomkins (1962) concluded that there are eight basic emotions: • surprise, interest, joy, rage, fear, disgust, shame, and anguish Carroll Izard (the University of Delaware 1993) • 12 discrete emotions labeled: • Interest, Joy, Surprise, Sadness, Anger, Disgust, Contempt, Self-Hostility, Fear, Shame, Shyness, and Guilt • Differential Emotions Scale or DES-IV 37
  37. 37. Ekman 在1972年提出的基本情感 (現今流行) • 憤怒 • 厭惡 • 恐懼 • 快樂 • 悲傷 • 驚訝 38
  38. 38. Dimensional models of emotion Define emotions according to one or more dimensions • Wilhelm Max Wundt(1897) • three dimensions: "pleasurable versus unpleasurable", "arousing or subduing" and "strain or relaxation” • Harold Schlosberg (1954) • three dimensions of emotion: "pleasantness– unpleasantness", "attention–rejection" and "level of activation” • Prevalent • incorporate valence and arousal dimensions 39
  39. 39. 比較知名幾個模型 • Circumplex model • Vector model • Positive activation – negative activation (PANA) model • Plutchik's model • PAD emotional state model • Lövheim cube of emotion • Cowen & Keltner 2017 40
  40. 40. Circumplex model : Perceptual • developed by James Russell (1980) • two-dimensional circular space, containing arousal and valence dimensions • arousal represents the vertical axis and valence represents the horizontal axis • prevalent use as ‘labels’ 41
  41. 41. Positive activation – Negative activation (PANA) Self Report • created by Watson and Tellegen in 1985 • suggests that positive affect and negative affect are two separate systems (responsible for different functions) • states of higher arousal tend to be defined by their valence • states of lower arousal tend to be more neutral in terms of valence • the vertical axis represents low to high positive affect • the horizontal axis represents low to high negative affect. • the dimensions of valence and arousal lay at a 45- degree rotation over these axes 42
  42. 42. 43
  43. 43. Cowen & Keltner • 2017, University of California, Berkeley researchers Alan S. Cowen & Dacher Keltner (PNAS) • 27 distinct emotions • http://news.berkeley.edu/2017/09/06/27- emotions/ • (A.) Admiration. (B.) Adoration. (C.) Aesthetic appreciation. (D.) Amusement. (E.) Anger. (F.) Anxiety. (G.) Awe. (H.) Awkwardness. (I.) Boredom. (J.) Calmness. (K.) Confusion. (L.) Craving. (M.) Disgust. (N.) Empathic pain. (O.) Entrancement. (P.) Excitement. (Q.) Fear. (R.) Horror. (S.) Interest. (T.) Joy. (U.) Nostalgia. (V.) Relief. (W.) Romance. (X.) Sadness (Y.) Satisfaction (Z.) Sexual desire. (Ω.) Surprise. 44
  44. 44. 在想一下: Affective Computing 運用人工智慧實現情緒辨識 Many Theories Many Models/Annotations Take Away? 比較Stable的說法 45
  45. 45. [情緒產生流程圖,同投影片23] 這可以是個 (Data-driven AI Learning and Inference) 的問題嗎? 46
  46. 46. 歷史、理論、可行性 現今商業上呢? 47
  47. 47. Affective Computing 情感運算 reference: https://www.gartner.com/newsroom/id/3412017/ fast growing, but still not a mature technique 48
  48. 48. Face Affective Computing 一些公司的例子 SpeechBody GestureMulti-Modal PhysiologyLanguage reference: http://blog.ventureradar.com/2016/09/21/15-leading-affective-computing-companies-you-should-know/ 49
  49. 49. Education Health Care Gaming Advertisement Retail Legal Emotion Recognition AS Part of Larger System API, SDK 50
  50. 50. 舉幾個例子: 51
  51. 51. Little Dragon (Affectiva- Education) “make learning more enjoyable and more effective, by providing an educational tool that is both universal and personalized” reference: https://www.affectiva.com/success-story/ https://www.youtube.com/watch?v=SmjAa8iMkjU 52
  52. 52. 53
  53. 53. Nevermind (Affectiva- Gaming) bio-feedback horror game “sense a player’s facial expressions for signs of emotional distress, and adapt game play accordingly” reference: https://www.affectiva.com/success-story/c https://www.youtube.com/watch?v=NGr0orAqRH4&t=497s 54
  54. 54. Brain Power (Affectiva- Health Care) The World’s First Augmented Reality Smart-Glass-System to empower children and adults with autism to teach themselves crucial social and cognitive skills. reference: https://www.affectiva.com/success-story/ https://www.youtube.com/watch?v=qfoTprgWyns 55
  55. 55. 56
  56. 56. MediaRebel (Affectiva- Legal) • Legal video deposition management platform MediaRebel uses Affectiva’s Emotion SDK for facial expression analysis and emotion recognition. • Intelligent analytical features include: • Search transcript based upon witness emotions • Instantly playback testimony based upon select emotions • Identify positive, negative & neutral witness behavior reference: https://www.affectiva.com/success-story/ https://www.mediarebel.com/ 57
  57. 57. shelfPoint (Affectiva- Retail) • Cloverleaf is a retail technology company for the modern brick-and-mortar marketer and merchandise • shelfPoint solution: brands and retailers can now capture customer engagement and sentiment data at the moment of purchase decision — something previously unavailable in physical retail stores. reference: https://www.affectiva.com/success-story/ https://www.youtube.com/watch?v=S9gDqpF6kLs https://www.youtube.com/watch?v=W6UnahO_zXs 58
  58. 58. 59
  59. 59. 在想一下: 運用人工智慧實現情緒辨識 60
  60. 60. [情緒產生流程圖,同投影片23] 這可以是個 Data-driven AI Learning and Inference 的問題嗎? 61
  61. 61. 變成一個機器學習(人工智慧)問題時 資料哪裡來? 怎麼收? 62
  62. 62. 公開資料庫 63 Year Database Language Setting Protocol Elicitation 1997 DES Dan. Single Scr. Induced 2000 GEMEP Fre. Single Scr. & Spo. Acted 2005 eNTERFACE' 05 Eng. Single Scr. Induced 2007 HUMAINE Eng. TV Talk Scr. & Spo. Mix. 2008 VAM Ger. TV Talk Spo. Acted 2008 IEMOCAP Eng. Dyadic Scr. & Spo. Acted 2009 SAVEE Eng. Single Spo. Acted 2010 CIT Eng. Dyadic Scr. & Spo. Acted 2010 SEMAINE Eng. Dyadic Scr. Mix. 2013 RECOLA Fre. Dyadic Spo. Acted 2016 CHEAVD Chi. TV talk Spo. Posed 2017 NNIME Chi. Dyadic Spo. Acted 另一個重要點:怎麼評分?
  63. 63. Language: Danish Participants: 4 (Man: 2; Female: 2) Recordings: • Audio Total: 0.5 hours Sentences: 5200 utterances Labels: • Perspectives: Naïve-Observer • Rater: 20 • Discrete session-level annotation • Categorical (5) DES: DESIGN, RECORDING AND VERIFICATION OF A DANISH EMOTIONAL SPEECH DATABASE 64 Engberg, Inger S., et al. "Design, recordingand verification of a Danish emotional speech database."Fifth European Conferenceon Speech Communication and Technology.1997. Available: Tom Brøndsted (tom@brondsted.dk)
  64. 64. DES • Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification1 (Cat.:0.676) • Automatic emotional speech classification2 (Cat.:0.516) 65 1Yun, Sungrack, and Chang D. Yoo. "Loss-scaled large-margin Gaussian mixture models for speech emotion classification."IEEE Transactions on Audio, Speech, and Language Processing20.2 (2012): 585-598. 2Ververidis, Dimitrios, Constantine Kotropoulos, and Ioannis Pitas. "Automatic emotional speech classification." Acoustics, Speech, and Signal Processing, 2004. Proceedings.(ICASSP'04). IEEE International Conference on. Vol. 1. IEEE, 2004..
  65. 65. Language: French Participants: 10 (Man: 5; Female: 5) Recordings: • Dual-channel Audio • HD Video • Manual Transcript • Face & Head • Body Posture & Gestures Sentences: 7300 sequences Labels: • Perspectives: Naïve-Observer • Discrete session-level annotation • Categorical (18) GEMEP: Geneva Multimodal Emotion Portrayals corpus 66 Bänziger, Tanja, Hannes Pirker, and K. Scherer. "GEMEP-GEnevaMultimodal Emotion Portrayals:A corpus for the study of multimodal emotional expressions." Proceedings of LREC. Vol. 6. 2006. Bänziger, Tanja, and Klaus R. Scherer. "Using actor portrayals to systematicallystudy multimodalemotion expression: The GEMEPcorpus." International conference on affective computing and intelligentinteraction. Springer, Berlin, Heidelberg, 2007. Available: Tanja Bänziger(Tanja.Banziger@ pse.unige.ch)
  66. 66. GEMEP • Multimodal emotion recognition from expressive faces, body gestures and speech (Cat.: 0.571) 67 Kessous, Loic, Ginevra Castellano, and George Caridakis. "Multimodal emotion recognition in speech-based interaction using facial expression, body gesture and acoustic analysis." Journal on MultimodalUser Interfaces 3.1 (2010): 33-48.
  67. 67. Language: English Participants: 42 (Man: 34; Female: 24) (14 different nationalities) Recordings: • Dual-channel Audio • HD Video • Script Total: 1166 video sequences Emotion-related atmosphere: • To express six emotions eNTERFACE' 05: The eNTERFACE’05 Audio-Visual Emotion Database 68 Martin, Olivier, et al. "The enterface’05 audio-visual emotion database."Data Engineering Workshops,2006. Proceedings.22nd International Conferenceon. IEEE, 2006. Available: O. Martin (martin@tele.ucl.ac.be)
  68. 68. eNTERFACE' 05 • Sparse autoencoder- based feature transfer learning for speech emotion recognition1 (Cat.: 59.1) • Unsupervised learning in cross-corpus acoustic emotion recognition2 (Val./Act.:0.574/0.616) 69 1Deng, Jun, et al. "Sparse autoencoder-basedfeature transfer learning for speech emotion recognition." Affective Computing and IntelligentInteraction (ACII), 2013 Humaine Association Conference on. IEEE, 2013. 2Zhang, Zixing, et al. "Unsupervised learning in cross-corpus acoustic emotion recognition." Automatic Speech Recognition and Understanding(ASRU), 2011 IEEE Workshop on. IEEE, 2011.
  69. 69. Language: English Participants: Many (Include 8 datasets) Recordings : (Naturalistic (TV shows, interviews)/Induced data) • Audio • Video • Gesture • Emotion words Labels: • Perspectives: Naïve-Observer • Rater: 4 • Continuous-in-time annotation • Dimensional (8) [Intensity, Activation, Valence, Power, Expect, Word] • Discrete annotation (5) • Emotion-related states • Key Event • Everyday Emotion words… HUMAINE: Addressing the Collection and Annotation of Naturalistic and Induced Emotional Data 70 Douglas-Cowie, Ellen, et al. "The HUMAINE database: addressingthe collection and annotation of naturalistic and induced emotional data." Affective computingand intelligent interaction (2007): 488-500. Available: E.Douglas-Cowie@qub.ac.uk
  70. 70. HUMAINE • A Multimodal Database for Affect Recognition and Implicit Tagging1 (Val./Act.:0.761/0.677) • Abandoning Emotion Classes - Towards Continuous Emotion Recognition with Modelling of Long- Range Dependencies2 (Val./Act.[MSE]:0.18/0.08) 71 1Soleymani, Mohammad, et al. "A multimodaldatabase for affect recognition and implicittagging." IEEE Transactions on Affective Computing 3.1 (2012): 42-55. 2Wöllmer, Martin, et al. "Abandoning emotion classes-towards continuous emotion recognitionwith modellingof long-range dependencies."Ninth Annual Conference of the International Speech CommunicationAssociation. 2008.
  71. 71. Language: German TV shows Participants: 47 Recordings: • Audio • Video • Face • Manual Transcript Total: 12 hours Sentences: 946 utterances Labels: • Perspectives: Peer, Director, Self, Naïve-Observer • Rater: 17 • Continuous-in-time annotation • Dimensional (Valence-Activation-Dominance) for Audio • Discrete session-level annotation • Categorical (7) for Faces VAM: The Vera am Mittag German Audio-Visual Spontaneous Speech Database 72 Grimm, Michael, Kristian Kroschel,and Shrikanth Narayanan."The Vera am Mittag German audio-visual emotional speech database."Multimedia and Expo, 2008 IEEE International Conferenceon. IEEE, 2008. Available: Michael.Grimm@ieee.org
  72. 72. VAM • Towards robust spontaneous speech recognition with emotional speech adapted acoustic models1 (Word ACC.: 42.75) • Selecting training data for cross-corpus speech emotion recognition: Prototypicality vs. generalization Speech Adapted Acoustic Models2 • (Val./Act.: 0.502/0.677) 73 1Vlasenko, Bogdan, Dmytro Prylipko, and Andreas Wendemuth. "Towards robust spontaneous speech recognition with emotional speech adapted acoustic models." Poster and Demo Track of the 35th German Conference on ArtificialIntelligence, KI-2012, Saarbrucken, Germany. 2012. 2Schuller, Björn, et al. "Selecting training data for cross-corpus speech emotion recognition: Prototypicalityvs. generalization."Proc. 2011 Afeka-AVIOS Speech Processing Conference, Tel Aviv, Israel. 2011.
  73. 73. Language: English Participants: 10 (Man: 5; Female: 5) Recordings: • Dual-channel Audio • HD Video • Manual Transcript • 53 Marker Motion (Face and Head) Total: 12 hours, 50 sessions (3 min/session) Sentence: 6904 sentences Labels: • Perspectives: Naïve-Observer、Self (6/10) • Rater: 6 • Continuous-in-time annotation • Dimensional (Valence-Activation-Dominance) • Discrete session-level annotation • Categorical (5) IEMOCAP: The Interactive Emotional Dyadic Motion Capture database 74 Busso, Carlos, et al. "IEMOCAP: Interactiveemotional dyadic motion capturedatabase." Languageresourcesand evaluation 42.4 (2008): 335. Available: Anil Ramakrishna (akramakr@usc.edu)
  74. 74. IEMOCAP • Tracking continuous emotional trends of participants during affective dyadic interactions using body language and speech information1 (Val./Act./Dom.:0.619/0.637 /0.62) • Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions2 (Cat./Val./Act.:0.552/0.634/0 .650) 75 1Metallinou, Angeliki, Athanasios Katsamanis, and Shrikanth Narayanan. "Tracking continuous emotional trends of participants during affective dyadic interactionsusing body language and speech information."Image and Vision Computing 31.2 (2013): 137-152. 2Lee, Chi-Chun, et al. "Modeling mutual influenceof interlocutoremotion states in dyadic spoken interactions."Tenth Annual Conference of the International Speech CommunicationAssociation. 2009.
  75. 75. Language: English Participants: 4 (Man: 4) Recordings: • Dual-channel Audio • Video • Face Maker Sentences: 480 utterances Labels: • Perspectives: Naïve-Observer • Discrete session-level annotation • Categorical (6) SAVEE: Surrey Audio-Visual Expressed Emotion database 76 Jackson, P., and S. Haq. "Surrey Audio-Visual Expressed Emotion(SAVEE) Database." University of Surrey: Guildford, UK (2014). Available: P Jackson (p.jackson@surrey.ac.uk)
  76. 76. SAVEE 77 . 2S. Haq and P.J.B. Jackson. "Speaker-DependentAudio-VisualEmotion Recognition", In Proc. Int'l Conf. on Auditory-Visual Speech Processing, pages 53-58, 2009. 3S. Haq, P.J.B. Jackson, and J.D. Edge. Audio-VisualFeature Selection and Reduction for Emotion Classification. In Proc. Int'l Conf. on Auditory-Visual Speech Processing, pages 185-190, 2008 • Speaker-Dependent Audio- Visual Emotion Recognition1 (Cat.: 97.5) • Audio-Visual Feature Selection and Reduction for Emotion Classification3 (Cat.: 96.7)
  77. 77. Language: English Participants: 16 (Man: 7; Female: 9) Recordings: • Dual-channel Audio • HD Video • Transcript • Body gesture Total: 48 dyadic sessions Sentences: 2162 sentence Labels: • Perspectives: Naïve-Observer • Rater: 3 • Discrete session-level annotation • Continuous-in-time annotation • Dimensional (Valence-Activation-Dominance) CIT: The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations 78 Metallinou, Angeliki, et al. "The USC CreativeIT database: A multimodal database of theatrical improvisation." Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality (2010): 55. Metallinou, Angeliki, et al. "The USC CreativeIT database of multimodaldyadic interactions: From speech and full body motion capture to continuous emotional annotations." Languageresources and evaluation50.3 (2016): 497-521. Available: Manoj Kumar (prabakar@usc.edu)
  78. 78. CIT 79 1Yang, Zhaojun, and Shrikanth S. Narayanan. "Modelingdynamics of expressive body gestures in dyadic interactions."IEEE Transactions on Affective Computing 8.3 (2017): 369-381. 2Yang, Zhaojun, and Shrikanth S. Narayanan. "AnalyzingTemporal Dynamics of Dyadic Synchrony in Affective Interactions." INTERSPEECH. 2016. 3Chang, Chun-Min, and Chi-Chun Lee. "Fusion of multiple emotion perspectives: Improvingaffect recognitionthrough integrating cross-lingualemotion information." Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2017. • Analyzing Temporal Dynamics of Dyadic Synchrony in Affective Interactions2
  79. 79. Language: English Participants: 150 Recordings: • Dual-channel Audio • HD Video • Manual Transcript Multi-Interaction (like TV talk show): • Human vs. Human • Semi-human vs. Human • Machine vs. Human Total: 959 dyadic sessions (3 min/session) Labels: • Perspectives: Naïve-Observer • Rater: 8 • Continuous-in-time annotation • Dimensional (Valence-Activation) • Discrete Categorical (27) SEMAINE: The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent 80 McKeown, Gary, et al. "The semaine database: Annotatedmultimodal recordsof emotionally colored conversationsbetween a person and a limited agent." IEEE Transactionson Affective Computing 3.1 (2012): 5-17. Available: eula@semaine-db.eu
  80. 80. SEMAINE • Building autonomous sensitive artificial listeners1 • A Dynamic Appearance Descriptor Approach to Facial Actions Temporal Modeling2 (0.701) 81 1Schroder, Marc, et al. "Buildingautonomous sensitive artificiallisteners." IEEE Transactions on Affective Computing 3.2 (2012): 165-183. 2Jiang, Bihan, et al. "A dynamic appearance descriptor approach to facial actions temporal modeling."IEEE transactions on cybernetics 44.2 (2014): 161-174.
  81. 81. Language: French Participants: 46 (Man: 19; Female: 27) Recordings: • Dual-channel Audio • HD Video (15 facial action units) • Electrocardiogram • Electrothermal activity Total: 11 hours, 102 dyadic sessions (3 min/session) Sentence: 1306 sentence Labels: • Perspectives: Self, Naïve-Observer • Rater: 6 • Continuous-in-time annotation • Dimensional (Valence-Activation) RECOLA: Remote Collaborative and Affective Interactions 82 Ringeval, Fabien, et al. "Introducing the RECOLA multimodal corpusof remote collaborative and affective interactions." Automatic Face and Gesture Recognition (FG), 2013 10th IEEE International Conference and Workshops on. IEEE, 2013. Available: Fabien Ringeval (faboem.ringeval@image.fr)
  82. 82. RECOLA • Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data1 (Val./Act.: 0.804/0.528 ) • End-to-end speech emotion recognition using a deep convolutional recurrent network2 (Val./Act.: 0.741/0.325 ) • Face Reading from Speech— Predicting Facial Action Units from Audio Cues3 (Predict Facial Action Units from Audio Cues: 0.650 ) 83 1Ringeval, Fabien, et al. "Predictionof asynchronous dimensional emotion ratings from audiovisualand physiologicaldata." Pattern RecognitionLetters 66 (2015): 22-30. 2Trigeorgis, George, et al. "Adieu features? End-to-end speech emotion recognition using a deep convolutionalrecurrent network." Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on. IEEE, 2016. 3Ringeval, Fabien, et al. "Face Reading from Speech—PredictingFacial Action Units from Audio Cues." Sixteenth Annual Conference of the International Speech CommunicationAssociation. 2015.
  83. 83. Language: Chinese Participants: 238 Recordings: • Audio • Video (34 films, 2 TV series, 4 TV shows) Total: 2.3 hours, Labels: • Rater: 4 • Discrete session-level annotation • Fake/suppressed emotions • Multi-emotion annotation for some segments • Categorical (26 non-prototypical) 2017 Multimodal Emotion Recognition Challenge (MEC 2017: http://www.chineseldc.org/htdocsEn/emotion.html) CHEAVD: A Chinese natural emotional audio-visual database 84 Li, Ya, et al. "CHEAVD: a Chinese naturalemotional audio–visual database." Journal of Ambient Intelligence and Humanized Computing 8.6 (2017): 913- 924. Available: Ya Li (yli@nlpr.ia.ac.cn)
  84. 84. CHEAVD • MEC 2016: the multimodal emotion recognition challenge of CCPR 20161 (Cat.: 37.03) • Chinese Speech Emotion Recognition2 (Cat.: 47.33) • Transfer Learning of Deep Neural Network for Speech Emotion Recognition3 (Cat.: 50.01) 85 1Li, Ya, et al. "MEC 2016: the multimodal emotion recognition challenge of CCPR 2016." Chinese Conference on Pattern Recognition. Springer Singapore, 2016. 2Zhang, Shiqing, et al. "Feature Learning via Deep Belief Network for Chinese Speech Emotion Recognition." Chinese Conference on Pattern Recognition. Springer Singapore, 2016. 3Huang, Ying, et al. "Transfer Learning of Deep Neural Network for Speech Emotion Recognition." Chinese Conference on Pattern Recognition. Springer Singapore, 2016.
  85. 85. Language: Chinese Participants: 44 (Man: 20; Female: 24) Recordings: • Dual-channel Audio • HD Video • Manual Transcript • Electrocardiogram Total: 11 hours, 102 dyadic sessions (3 min/session) Sentences: 6029 utterances Labels: • Perspectives: Peer, Director, Self, Naïve-Observer • Rater: 49 • Continuous-in-time annotation • Discrete session-level annotation • Dimensional (Valence-Activation) • Categorical (6) NNIME: The NTHU-NTUA Chinese Interactive Multimodal Emotion Corpus 86 Huang-ChengChou, Wei-Cheng Lin, Lien-ChiangChang, Chyi-ChangLi, Hsi-Pin Ma, Chi-Chun Lee "NNIME: The NTHU-NTUA Chinese InteractiveMultimodal Emotion Corpus" in Proceedings of ACII 2017 Available: Huang-Cheng Chou (hc.chou@gapp.nthu.edu.tw) Chi-Chun Lee (cclee@ee.nthu.edu.tw)
  86. 86. NNIME • Cross-Lingual Emotion Information1,3(sessio n) (Val./Act.: 0.682/0.604) • Dyad-Level Interaction2 (Cat.: 0.65) 87 1Chun-Min Chang, Bo-Hao Su, Shih-Chen Lin, Jeng-Lin Li, Chi-Chun Lee*, "A Boostrapped Multi-ViewWeighted Kernel Fusion Framework for Cross-Corpus Integration of Multimodal Emotion Recognition"in Proceedingsof ACII 2017 2Yun-Shao Lin, Chi-Chun Lee*, "DerivingDyad-Level InteractionRepresentation using InterlocutorsStructural and Expressive Multimodal Behavior Features" in Proceedings of the InternationalSpeech CommunicationAssociation (Interspeech),pp. 2366-2370, 2017 3Chun-Min Chang, Chi-Chun Lee*, "Fusion of Multiple Emotion Perspectives:Improving Affect RecognitionThrough Integrating Cross-Lingual Emotion Information" in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.5820-5824, 2017
  87. 87. Access These Emotion Database 88 Year Database Website 1997 DES http://kom.aau.dk/~tb/speech/Emotions/ 2000 GEMEP https://www.affective-sciences.org/gemep/ 2005 eNTERFACE' 05 http://www.enterface.net/enterface05/ 2007 HUMAINE http://emotion-research.net/download/pilot-db/ 2008 VAM http://emotion-research.net/download/vam 2008 IEMOCAP http://sail.usc.edu/iemocap/ 2009 SAVEE http://kahlan.eps.surrey.ac.uk/savee/ 2010 CIT http://sail.usc.edu/CreativeIT/ImprovRelease.htm 2010 SEMAINE https://semaine-db.eu/ 2013 RECOLA https://diuf.unifr.ch/diva/recola/download.html 2016 CHEAVD Upon request 2017 NNIME http://nnime.ee.nthu.edu.tw/ 當然很多公司有自己私人的
  88. 88. Key take-away 多模態資料 (哪些可量測行為) 情緒標籤 (怎麼標、請誰標) 數量 (多少人、多少種、多久) 準確率不一定!! 收集真的很辛苦 : 其他媒介呢 ?(Maybe) 89
  89. 89. [情緒產生流程圖,同投影片23] 這可以是個 Data-driven AI Learning and Inference 的問題嗎? 90
  90. 90. [情緒產生流程圖,同投影片23] 這可以是個 Data-driven AI Learning and Inference 的問題嗎? 91
  91. 91. Speech Text Gesture Face 人外顯行為 Human Expression 意圖與目的 交流互動 傳達理念 附加情緒 感受情緒 萃取情緒 92
  92. 92. 語音與文字之情感計算 Paralinguistic Expression Linguistic Expression 93
  93. 93. 正向情緒 •Achievement 88.4% •Amusement 90.4% •Contentment 52.4% •Pleasure 61.6% •Relief 83.9% 分類情緒 易引發情緒之語音組 •篩選自前一實驗的樣本 •結果平均落在73%-94% •Amusement及disgust容易區分 •Pleasure和sadness難辨別 •部分的非語言性語音容易混淆 加入”以上皆非”選項 •與分類情緒比較 •Sadness及relief上升24%及 17.5% •Amusement下降12% •平均上升7.9% 聽見情緒? 人? 非語言性語音 Achievement, Amusement, Anger, Contentment, Disgust, Pleasure, Relief, Sadness, Surprise 平均: 69.9% Sauter, Disa. An investigation into vocal expressions of emotions: the roles of valence, culture, and acoustic factors. University of London, University College London (United Kingdom), 2007. 編碼 解碼 94
  94. 94. Laugh Cry Sigh Whisper Whine 副語言語音情緒 Laukka, Petri, et al. "Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations." Frontiers in Psychology 4 (2013). Gupta, Rahul, et al. "Detecting paralinguistic events in audio stream using context in features and probabilistic decisions." Computer Speech & Language 36 (2016): 72-92. Laughter & Fillers 2015 IS2013 sub-challenge AUC for Detection Laughter : 95.3 % Fillers : 90.4 % Cross-Culture 2013 Universal Emotion Non-Verbal Signals Speak : India, USA, Kenya, Singapore, Listen : Sweden 95
  95. 95. 那機器怎麼從語音辨識? Sahu, Saurabh & Gupta, Rahul & Sivaraman, Ganesh & AbdAlmageed, Wael & Espy-Wilson, Carol. (2017). Adversarial Auto-Encoders for Speech Based Emotion Recognition. 1243-1247. 10.21437/Interspeech.2017-1421. Rao, K. Sreenivasa, Shashidhar G. Koolagudi, and Ramu Reddy Vempada. "Emotion recognition from speech using global and local prosodic features." International journal of speech technology 16.2 (2013): 143-160. Lalitha, S., et al. "Emotion detection using MFCC and Cepstrum features." Procedia Computer Science 70 (2015): 29-35. Huang, Che-Wei, and Shrikanth Narayanan. "Characterizing Types of Convolution in Deep Convolutional Recurrent Neural Networks for Robust Speech Emotion Recognition." arXiv preprint arXiv:1706.02901 (2017). Lee, Jinkyu, and Ivan Tashev. "High-level feature representation using recurrent neural network for speech emotion recognition." INTERSPEECH. 2015. Emo-DB Prosodic SVM 62.43% MFCC ANN 85.7% Deep Convolution High-Level Representation (time series) 96
  96. 96. 語音情緒中的特徵 Dimosa, Kostis, Leopold Dickb, and Volker Dellwoc. "Perception of levels of emotion in speech prosody." The Scottish Consortium for ICPhS (2015). Erickson, Donna. "Expressive speech: Production, perception and application to speech synthesis." Acoustical Science and Technology 26.4 (2005): 317-325. Sauter, Disa. An investigation into vocal expressions of emotions: the roles of valence, culture, and acoustic factors. University of London, University College London (United Kingdom), 2007. Erickson, Donna. "Expressive speech: Production, perception and application to speech synthesis." Acoustical Science and Technology 26.4 (2005): 317-325. “emotional prosody does not function categorically, distinguishing only different emotions, but also indicates different degrees of the expressed emotion.” pitch and pitch variation is especially important for people to recognize emotion from non-verbal sounds voice quality  tension Some Experiments : change the sound (remove pitch, noisy channel, …) 語音描述性特徵 (descriptors) 與情緒息息相關 A Review : Research Findings of Acoustic and Perceptual Studies 97
  97. 97. 語音情緒辨識流程 (Flow chart): 原始訊號 → 前處理 → 特徵擷取 (Learning Representation) → 辨識器 (Discriminative Model) 98
  98. 98. 特徵擷取 (Low-level Descriptors) Low Level Descriptors (10 – 15 ms) Mel Frequency Cepstral Coefficients Pitch Signal Energy Loudness Voice Quality (Jitter, Shimmer) Log Filterbank Energies Linear Prediction Cepstral Coefficients CHROMA and CENS Features (Music) Compute 原始訊號 Statistics Method Continuous Qualitative Spectral Pitch Energy Formants Voice quality : Harsh, tense, breathy LPC MFCC LFPC 99
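以下是一個以 librosa 萃取 frame-level LLD(MFCC、音高、能量)再做簡單統計 functionals 的最小示意;檔名 utterance.wav、取樣率與視窗/位移長度都是示意用的假設,並非課程指定的特徵組合。
```python
# 最小示意:以 librosa 萃取 frame-level LLD,再以統計量 (functionals) 形成 utterance-level 特徵向量。
import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)              # 假設的輸入音檔

frame, hop = 400, 160                                         # 25 ms / 10 ms @ 16 kHz
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=frame, hop_length=hop)      # 頻譜類 LLD
rms = librosa.feature.rms(y=y, frame_length=frame, hop_length=hop)   # 能量
f0 = librosa.yin(y, fmin=50, fmax=400, sr=sr,
                 frame_length=frame, hop_length=hop)           # 音高軌跡

feats = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1),
                        [rms.mean(), rms.std()],
                        [np.nanmean(f0), np.nanstd(f0)]])      # 簡單的 functionals
print(feats.shape)                                             # (30,)
```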
  99. 99. 例子: 用音高辨別情緒 Arias, Juan Pablo, Carlos Busso,and NestorBecerra Yoma. "Shape-based modeling of the fundamental frequency contour for emotion detection in speech." ComputerSpeech & Language28.1(2014): 278-294. emotionally salient temporal segments • 75.8% in binary emotion classification • Dot, dash : subjective, dev. of sujective • Solid : objective 100
  100. 100. 更多特徵計算—語音的產生 Source Filter (ex) High arousal Physically Vocal Production System • Respiration • Vocal Fold Vibration • Articulation increase tension in laryngeal musculature 喉肌肉組織 raised subglottis pressure change production of sound at glottis vocal quality Johnstone, Tom & Scherer, Klaus. (2000). Vocal communication of Emotion. Handbook of Emotions,. . 101
  101. 101. 更多特徵計算—語音的感知 Mel-scale Filter Bank The response of the basilar membrane as a function of frequency, measured at six different distances from the stapes The psychoacoustical transfer function Stern, Richard M., and Nelson Morgan. "Features based on auditory physiology and perception." Techniques for Noise Robustness in Automatic Speech Recognition (2012): 193227. 102
  102. 102. Support Vector Machine (SVM) Convolutional Neural Network Hidden Markov Model (HMM) Recurrent Neural Network 情緒辨識模型 Time series Model 103
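下面是分類器階段的一個最小示意,假設已備妥 utterance-level 特徵 X 與情緒類別標籤 y(此處以亂數佔位),並以 UAR(unweighted average recall)評估;SVM 與其參數僅供說明,並非唯一選擇。
```python
# 最小示意:在抽好的特徵上訓練 SVM,以 UAR 評估(資料為佔位用亂數)。
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import recall_score

X = np.random.randn(200, 30)               # 佔位:utterance-level 特徵
y = np.random.randint(0, 4, size=200)      # 佔位:四類情緒標籤

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)

uar = recall_score(y_te, clf.predict(X_te), average="macro")   # UAR
print("UAR:", uar)
```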
  103. 103. 語音情緒深度學習 End to End – From LLD to Deep Learning 深度學習模型擷取更全面的語音特徵資訊 Z. Aldeneh and E. M. Provost, "Using regional saliency for speech emotion recognition," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017, pp. 2741-2745.doi: 10.1109/ICASSP.2017.7952655 C. W. Huang and S. S. Narayanan, "Deep convolutional recurrent neural network with attention mechanism for robust speech emotion recognition," 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, 2017, pp. 583-588. doi: 10.1109/ICME.2017.8019296 signal Neural Network emotion CNN for Time Series Signal Attention 104
  104. 104. 語音特徵擷取工具 跨平台 YAAFE, an Easy to Use and EfficientAudio Feature Extraction Software, B.Mathieu,S.Essid, T.Fillon,J.Prado, G.Richard,proceedingsof the 11th ISMIR conference, Utrecht,Netherlands,2010. Florian Eyben, Felix Weninger, Florian Gross, Björn Schuller: “RecentDevelopments in openSMILE,the Munich Open-Source Multimedia Feature Extractor”, In Proc. ACM Multimedia (MM), Barcelona,Spain, ACM, ISBN 978-1-4503-2404-5,pp. 835-838,October 2013. doi:10.1145/2502081.2502224 Paul Boersma & David Weenink (2013): Praat: doing phonetics by computer [Computer program]. 105
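若使用 openSMILE,另有官方的 Python 介面 opensmile 可以直接取出 eGeMAPS 等特徵集;以下為最小示意,檔名與特徵集選擇為假設,實際可用的 feature set 與版本以該套件文件為準。
```python
# 最小示意:用 opensmile 套件抽取 eGeMAPS functionals(檔名為假設)。
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)
features = smile.process_file("utterance.wav")   # 回傳 pandas DataFrame
print(features.shape)
```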
  105. 105. 語音與文字之情感計算 Paralinguistic Expression Linguistic Expression 106
  106. 106. 情緒 認知 語言 讀到語言內的情緒? Schwarz-Friesel, Monika. "Language and emotion." The Cognitive Linguistic Perspective, in: Ulrike Lüdtke (Hg.), Emotion in Language. Theory–Research–Application, Amsterdam (2015): 157-173. 轉換關係 情感 判斷 Lexicon Grammar Ideational Meaning 具象化 語言是一種幫助人類獲知情緒的管道 Lindquist, Kristen A., Jennifer K. MacCormack, and Holly Shablack. "The role of language in emotion: predictions from psychological constructionism." Frontiers in psychology 6 (2015). 107
  107. 107. 辨識語意情緒 Human Behavior Evaluation •Cuple’s Therapy •Oral Presentation Reviews •Hotels  HBRNN •Amazon  Cross-Lingual •Movie (93%), Book (92%), DVD (93%), …  PNN + RBM Tweets •Positive & Negative •DCNN & LSTM 巨量資料 缺乏標註 結構複雜 Ain, Qurat Tul, et al. "Sentiment analysis using deep learning techniques: a review." Int J Adv Comput Sci Appl 8.6 (2017): 424 108
  108. 108. Review Article Social Media Talk 語意分析, 文字帶有甚麼? It’s terrible! What Texts Tell Us (Topics) Emotional Polarity It’s cool! Parts of Speech (POS) tagsN-Gram https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html VB VBD NN NNS JJ JJR JJS IN TO 109
  109. 109. Dictionary-Based Sentiment Analysis 關鍵字 情緒字典 對應關係 句子 110
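字典法的核心就是「關鍵字查表 + 計分」;以下為最小示意,其中的迷你詞表只是示意用的假設,實務上會換成 LIWC 等情緒字典。
```python
# 最小示意:以正負向關鍵字查表計算句子極性;詞表為示意用假設,非真實情緒字典。
import re

POSITIVE = {"love", "enjoy", "great", "happy", "cool"}
NEGATIVE = {"hate", "terrible", "sad", "awful"}

def lexicon_polarity(text: str) -> float:
    """由正負向關鍵字出現次數算出 [-1, 1] 的極性分數。"""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

print(lexicon_polarity("It's terrible!"))           # -1.0
print(lexicon_polarity("It's cool, I love it"))     # 1.0
```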
  110. 110. 建立情緒字典 Changqin Quan and Fuji Ren. 2009. Construction of a blog emotion corpus for Chinese emotional expression analysis. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 (EMNLP '09), Association for Computational Linguistics, Stroudsburg, PA, USA, 1446-1454. Mohammad, Saif M., and Peter D. Turney. "Crowdsourcing a word–emotion association lexicon." Computational Intelligence 29.3 (2013): 436-465. Pennebaker, James W., et al. The development and psychometric properties of LIWC2015. 2015. 假設: 關鍵字富含先天的正負面向,句意的正負面來自這些關鍵字的使用。 LIWC (Linguistic Inquiry Word Count) 分類字典: 語意種類(不只情緒),64 個類別,約 4500 字;情緒正面/負面: 406/499。 Seed Word / Gold Standard 關鍵字: 1. 由標記者選出具有情緒的字眼 2. 半自動化利用文字結構及相關性找出其他情緒關鍵字(形容詞 → 副詞、名詞、動詞)。 群眾智慧: 以選擇題的方式由非固定標記者標註,以不同問題設計驗證,如哪個字像這個字、這個字跟快樂有無關聯。 111
  111. 111. Data-driven 方法呢? Sentiment Analysis (Unsupervised) Data-Driven Latent Structure → (representation) → recognition 112
  112. 112. Sentiment Analysis (Supervised) 標記者間的標記相關度 = 0.76 各情緒約落在 0.6 到 0.79 使用關鍵字辨識 = 0.66 自動化情緒辨識 = 0.73 (Naïve Bayes, SVM) Aman, Saima, and Stan Szpakowicz. "Identifying expressions of emotion in text." Text, speech and dialogue. Springer Berlin/Heidelberg, 2007. Feature Representation Classifier Emotion Label 113
  113. 113. 近年深度學習方式Deep Model Lopez, Marc Moreno, and Jugal Kalita. "Deep Learning applied to NLP." arXiv preprint arXiv:1703.03091 (2017). . Embed Embed Embed I LSTM LSTM LSTM love it positive 114
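投影片上的 Embed → LSTM → 分類流程,用 PyTorch 可寫成如下的最小示意;詞彙量、各層維度與玩具批次資料都是假設值,並非論文中的模型設定。
```python
# 最小示意:embedding + LSTM 的句子情感分類模型(維度與資料皆為假設)。
import torch
import torch.nn as nn

class LSTMSentiment(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_classes)

    def forward(self, token_ids):            # token_ids: (batch, seq_len)
        x = self.embed(token_ids)             # (batch, seq_len, embed_dim)
        _, (h, _) = self.lstm(x)               # h: (1, batch, hidden_dim)
        return self.out(h[-1])                 # 各情感類別的 logits

model = LSTMSentiment()
toy_batch = torch.randint(0, 10000, (4, 12))   # 4 句、每句 12 個 token id
print(model(toy_batch).shape)                   # torch.Size([4, 2])
```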
  114. 114. 沒文字怎麼辦? 115
  115. 115. Automatic Speech Recognition (ASR) 語音辨識 f( ) = Speech Text Challenging Task speaker gender Mapping / Translation 116 把聲音拆成小小的 phoneme 再組起來辨識
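沒有人工逐字稿時,可先接一個現成的 ASR,再把輸出文字送進文字情緒流程;以下以 SpeechRecognition 套件做最小示意,檔名與是否使用 Google Web Speech API 都是假設。
```python
# 最小示意:用 SpeechRecognition 套件把語音轉成文字,再交給文字情緒分析。
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("utterance.wav") as source:        # 假設的錄音檔
    audio = recognizer.record(source)

try:
    text = recognizer.recognize_google(audio, language="zh-TW")
    print(text)                                      # 之後可送入文字情緒流程
except sr.UnknownValueError:
    text = ""                                        # 辨識失敗時退回純語音特徵
```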
  116. 116. 整合語音與文字 Aldeneh, Zakaria & Khorram, Soheil & Dimitriadis, Dimitrios & Mower Provost, Emily. (2017). Pooling acoustic and lexical features for the prediction of valence. 68-72. 10.1145/3136755.3136760. AffectNatural Language Non-Verbal Speech Bio- Information Image … Pooling Intermediate Representation Performance Robustness 117
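語音與文字最直接的整合方式之一,是 feature-level 的串接 (pooling) 後共用一個分類器;以下是精神上接近 Aldeneh et al. (2017) 的最小示意,特徵維度、標籤與所用分類器都是假設,並非原論文的網路架構。
```python
# 最小示意:把 acoustic 與 lexical 特徵串接後共同訓練 valence 分類器(資料為佔位亂數)。
import numpy as np
from sklearn.linear_model import LogisticRegression

acoustic = np.random.randn(200, 30)       # 例如 openSMILE functionals
lexical = np.random.randn(200, 50)        # 例如平均詞向量
valence = np.random.randint(0, 2, 200)    # 佔位:二元 valence 標籤

fused = np.hstack([acoustic, lexical])    # early (feature-level) fusion
clf = LogisticRegression(max_iter=1000).fit(fused, valence)
print(clf.score(fused, valence))          # 融合模型在訓練集上的正確率
```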
  117. 117. • 對臉部肌肉群的運動及其對表情的控制作用做了深入研究 • 開發了面部動作編碼系統 (Facial Action Coding System,FACS) • 根據人臉的解剖學特點,將其劃分成若干既相互獨立又相互聯繫的運動單元(AU) • 分析了這些運動單元的運動特徵及其所控制的主要區域以及與之相關的表情 這個我們在這邊帶到 118 臉部表情
  118. 118. Facial Action Coding System (FACS) 面部動作編碼系統
  119. 119. FACS The tool for annotating facial expressions What The Face Reveals is strong evidence for the fruitfulness of the systematic analysis of facial expression Paul Ekman and Wallace V. Friesen 1976
  120. 120. Action Unit (AUs) 動作單元 • AUs are considered to be the smallest visually discernible facial movement • As AUs are independent of any interpretation, they can be used as the basis for recognition of basic emotions • It’s an explicit means of describing all possible movements of face in 46 action points
  121. 121. Action Unit (AUs) • FACS is a tool for measuring facial expressions • Each observable component of facial moment is called an AUs • All facial expressions can be broken down into their constituent AU’s AU Description Example AU Description Example 1 Inner Brow Raiser 12 Lip Corner Puller 4 Brow Lowerer 13 Cheek Puffer 7 Lid Tightener 20 Lip stretcher
  122. 122. AU framework 臉部動作單位基礎架構
  123. 123. Facial Expressions of Emotion (e.g., happy, fear, disgust, surprise, etc) Automatic face & Facial feature detection Face alignment Multiple image windows at a variety of Locations and scales Feature extraction: Facilitate subsequentlearning and generalization,leadingto betterhuman interpretation Image filter: Modify or enhance theimage Facial AU (e.g., AU1, AU7, AU6+ASU15, etc) Rule-based classifier 一個簡易辨識的架構 e.g., Gabor filter coefficients
  124. 124. Facial Expressions of Emotion (e.g., happy, fear, disgust, surprise, etc) Automatic face & Facial feature detection Face alignment Multiple image windows at a variety of Locations and scales Feature extraction: Facilitate subsequentlearning and generalization,leadingto betterhuman interpretation Image filter: Modify or enhance theimage Facial AU (e.g., AU1, AU7, AU6+ASU15, etc) Rule-based classifier "Recognizing action units for facial expression analysis.“ Tian, Y-I., Takeo Kanade, and Jeffrey F. Cohn.
  125. 125. Recognize AUs for Facial Expression Analysis - Rule-based Classifier 規則式分類器 Informed by FACS AUs, they group the facial features into upper and lower parts because the facial actions in two sides are relatively independent for AU recognition [14] P. Ekman and W.V. Friesen, The facial action coding system: A technique for the measurement of facial movement single AU detection combined AU detection
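所謂 rule-based classifier,概念上就是把偵測到的 AU 組合對應到基本情緒;以下的最小示意採用常被引用的 EMFACS 式規則(如 happiness ≈ AU6+AU12),僅供說明,並非 Tian et al. 論文中的原始規則表。
```python
# 最小示意:以 AU 組合的覆蓋率做 rule-based 情緒判斷(規則為示意用,非原論文規則表)。
RULES = {
    "happiness": {6, 12},          # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},
    "surprise":  {1, 2, 5, 26},
    "anger":     {4, 5, 7, 23},
}

def classify_aus(detected_aus):
    """回傳規則被偵測 AU 覆蓋得最好的情緒;覆蓋率太低則視為 neutral。"""
    scores = {emo: len(aus & detected_aus) / len(aus) for emo, aus in RULES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.5 else "neutral"

print(classify_aus({6, 12, 25}))   # -> happiness
```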
  126. 126. Recognize AUs for Facial Expression Analysis - Results 結果 AU detection Ekman-Hager Single AU detection Combine AU detection Recognition rate Upper face 75 % 86.7 % Lower face 95.8 % 90.7 % AU detection cross database Test databases Train databaseCohn- Kanade Ekman- Hager Recognition rate Upper face 93.2 % 86.7 % Ekman- Hager Lower face 90.7 % 93.4 % Cohn- Kanade 一次預測多個比較準: 連動 AU準確率其實相當不錯, 跨資料庫也是
  127. 127. Facial Expressions of Emotion (e.g., happy, fear, disgust, surprise, etc) Automatic face & Facial feature detection Face alignment Multiple image windows at a variety of Locations and scales Feature extraction: Facilitate subsequentlearning and generalization,leadingto betterhuman interpretation Image filter: Modify or enhance theimage Facial AU (e.g., AU1, AU7, AU6+ASU15, etc) Rule-based classifier "Recognizing Facial Expressions of Emotion using Action Unit Specific Decision Thresholds “ Mustafa Sert, and Nukhet Aksoy AAM face track model
  128. 128. Recognizing Facial Expressions of Emotion using Action Unit Specific Decision Thresholds • Extract facial images from Active Appearance Model (AAM) 主動式外觀模型 to form an appearance model • Facial AU multi-class classification using ADT 適應性決策模型 for both AU detection and facial expression recognition • ADT learns a separate decision threshold T_i for each AU category, and assigns instance x to category i if and only if f_i(x) = w_i^T Φ(x) + b_i > T_i, where Φ is the mapping function that maps the SVM input to a high-dimensional space
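上面 f_i(x) = w_i^T Φ(x) + b_i > T_i 的判準,可以理解成在每個 AU 的 SVM 分數上各自套一個門檻;以下用 scikit-learn 的 LinearSVC 做最小示意,特徵、標籤與門檻值都是示意用假設,並非論文中以 ADT 學出的門檻。
```python
# 最小示意:對每個 AU 各訓練一個 SVM,並以各自的門檻 T_i 決定 AU 是否出現。
import numpy as np
from sklearn.svm import LinearSVC

X = np.random.randn(300, 40)                        # 佔位:外觀特徵(如 AAM)
Y = np.random.randint(0, 2, size=(300, 5))          # 佔位:每張臉 5 個 AU 的有無

detectors = [LinearSVC(C=1.0).fit(X, Y[:, i]) for i in range(Y.shape[1])]
thresholds = np.array([0.0, 0.1, -0.2, 0.05, 0.0])  # 示意:每個 AU 一個門檻 T_i

scores = np.column_stack([d.decision_function(X) for d in detectors])
active_aus = scores > thresholds                    # AU i 出現 iff f_i(x) > T_i
print(active_aus[:3])
```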
  129. 129. Recognizing Facial Expressions of Emotion using Action Unit Specific Decision Thresholds ↑: Prototypic and major variants of AU combinations for facial expression fear. ‘+’ denotes logical AND ‘,’ indicates logical OR Facial expression recognition accuracy of the proposed scheme. → Bold bracketed numbers indicate best result, bold numbers denote second best
  130. 130. Recognizing Facial Expressions of Emotion using Action Unit Specific Decision Thresholds • ADT-based AU detector along with the rule- based emotion classifier (B&D) outperforms the baseline methods (A&C) • ↑Among the proposed method, D gives best results in all facial emotion categories except surprise • → The proposed ADT scheme outperforms the baseline method by an average F1-score of 6.383% for 17 AUs • → It gives superior performance in terms of F1- score compared with the baseline method for all AUs except AU2
  131. 131. Facial Expressions of Emotion (e.g., happy, fear, disgust, surprise, etc) Automatic face & Facial feature detection Face alignment Multiple image windows at a variety of Locations and scales Feature extraction: Facilitate subsequentlearning and generalization,leadingto betterhuman interpretation Image filter: Modify or enhance theimage Facial AU (e.g., AU1, AU7, AU6+ASU15, etc) Rule-based classifier "Compound facial expressions of emotion: from basic research to clinical applications“ Shichuan Du, and Aleix M. Martinez Observations under distinct compound emotions
  132. 132. Compound facial expressions of emotion 複合式情緒面部表情
  133. 133. Compound facial expressions of emotion • AU intensity shown in a cumulative histogram for each AU and emotion category • The x-axis in these histograms specifies the intensity of activation • The y-axis in these histograms defines the cumulative percentage of intensity (scale 0 to 1) • Numbers between zero and one specify the percentage of people using the specified and smaller intensities. Fig. AUs used to express a compound emotion are consistent with the AUs used to express its component categories
  134. 134. Key take-away 找到AU很重要 標出AU很重要 ! ! ! 學辨識AU可以用DNN 可以整合語音+文字+臉部 135
  135. 135. 136 AffectNatural Language Non- Verbal Speech Physiology Face Body Gestures 人的下一個行為模態?
  136. 136. 肢體語言 • 非語言溝通(肢體動作與臉部表情)源自於達爾文 『人類與動物的情緒表達』論文中 • 以動作分析系統來描述肢體動作 • 透過動作分析,有下列基本動作特徵描述: (1)軀幹運動(伸展,鞠躬) (2)手臂運動(開,關) (3)垂直方向(向上,向下) (4)矢狀方向(向前,向後) (5)力(強,輕) (6)速度(快,慢) (7)直接性(直接,間接) • 心理學實驗運用動作特徵找尋與辨識情緒之間的關係 reference: de Meijer, M. The contribution of general features of body movement to the attribution of emotions. Journal of Nonverbal Behavior 13, 4 (1989), 247–268. 137
  137. 137. 肢體與情緒相關的研究 • Psychology • Bull, P. E. Posture and gesture. Pergamon press, 1987. • Pollick, F. E., Paterson, H. M., Bruderlin, A., and Sanford, A. J. Perceiving affect from arm movement. Cognition 82, 2 (2001), B51–B61. • Coulson, M. Attributing emotion to static body postures: Recognition accuracy, confusions, and viewpoint dependence. Journal of nonverbal behavior 28, 2 (2004), 117–139. • Boone, R. T., and Cunningham, J. G. Children’s decoding of emotion in expressive body movement: The development of cue attunement. Developmental psychology 34 (1998), 1007–1016. • de Meijer, M. The contribution of general features of body movement to the attribution of emotions. Journal of Nonverbal Behavior 13, 4 (1989), 247–268. • Engineer • Balomenos, T., Raouzaiou, A., Ioannou, S., Drosopoulos, A., Karpouzis, K., and Kollias, S. Emotion analysis in man-machine interaction systems. In Machine learning for multimodal interaction. Springer, 2005, 318–328. • Coulson, M. Attributing emotion to static body postures: Recognition accuracy, confusions, and viewpoint dependence. Journal of nonverbal behavior 28, 2 (2004), 117–139. reference: Stefano Piana, Alessandra Staglianò, Francesca Odone, Alessandro Verri, Antonio Camurri, “Real-time Automatic Emotion Recognition from Body Gestures” in ArXiv 2014 138
  138. 138. 怎麼收資料? 12 actors four female and eight males aged between 24 and 60 total of about 100 videos separate clips of expressive gesture reference: 1) Stefano Piana, Alessandra Staglianò, Francesca Odone, Alessandro Verri, Antonio Camurri, “Real- time Automatic Emotion Recognition from Body Gestures” in ArXiv 2014 2) Amol S. Patwardhan and Gerald M. Knapp, “Augmenting Supervised Emotion Recognition with Rule-Based Decision Model.” in ArXiv 2016 QualisysKinect 139
  139. 139. Data Validation – Human annotation 人來評? The sole 3D skeleton is a guarantee that the user is not exploiting other information Not easy for human to recognize emotion only based on gesture reference: Stefano Piana, Alessandra Staglianò, Francesca Odone, Alessandro Verri, Antonio Camurri, “Real-time Automatic Emotion Recognition from Body Gestures” in ArXiv 2014 140
  140. 140. Skeleton based feature anger sadness happiness fear surprise disgust reference: 1)Stefano Piana, Alessandra Staglianò, Francesca Odone, Alessandro Verri, Antonio Camurri, “Real-time Automatic Emotion Recognition from Body Gestures” in ArXiv 2014 2) Piana, S., Stagliano`, A., Camurri, A., and Odone, F. A set of full-body movement features for emotion recognition to help children affected by autism spectrum condition. In IDGEI International Workshop (2013). Histogram: Energy on each of frames 141
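投影片中的能量直方圖,概念上可由骨架關節點的速度平方和求得;以下為最小示意,關節陣列的形狀 (frames × joints × 3)、fps 與 bin 數皆為假設。
```python
# 最小示意:由 3D 骨架序列計算每個 frame 的動作能量,再做直方圖特徵。
import numpy as np

def motion_energy_histogram(joints, fps=30.0, bins=10):
    """joints: (n_frames, n_joints, 3) 的 3D 關節座標(如 Kinect/Qualisys/OpenPose 輸出)。"""
    velocity = np.diff(joints, axis=0) * fps        # 每個關節逐 frame 的速度
    energy = (velocity ** 2).sum(axis=(1, 2))       # 每個 frame 的純量能量
    hist, _ = np.histogram(energy, bins=bins, density=True)
    return hist

demo = np.random.randn(120, 25, 3).cumsum(axis=0)   # 佔位:4 秒的假關節軌跡
print(motion_energy_histogram(demo))
```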
  141. 141. Classification Result Qualisys Data with 310 gestures Kinect Data with 579 gestures Clean Dataset Noisy Dataset Almost the same to the human’s recognition ability 142
  142. 142. Skeleton Capture Method 系統 Kinect reference: https://itp.nyu.edu/classes/dance-f16/kinect/, https://github.com/CMU-Perceptual-Computing-Lab/openpose https://www.qualisys.com/ expensive, sophisticated system with multiple high speed camera cheap, easy to get RGB-D 3D camera device free, new software system with CNN Qualisys OpenPose 143
  143. 143. OpenPose: CNN based Method 144 直接對影片處理
  144. 144. Pose difference/movement indicative of ‘arousal’ mostly 145 AffectNatural Language Non- Verbal Speech Physiology Face Body Gestures 多行為整合的重要
  145. 145. [情緒產生流程圖,同投影片23] 這可以是個 Expressive Data AI Learning and Inference 的問題嗎? 146
  146. 146. [情緒產生流程圖,同投影片23] 這可以是個 Internal Data AI Learning and Inference 的問題嗎? 147
  147. 147. 多重迷走神經理論-Polyvagal theory Stephen Porges 148
  148. 148. 人類的層級反應策略 • 第一層次:原始副交感神經系統。 • 通過抑制新陳代謝活動來應對威脅。 • 以非主動性(immobilization)行為,如裝死、昏厥、停止等策略來應對生命危險。 • 第二層次:交感神經系統。 • 通過增強新陳代謝功能,調高腎上腺素以提高人的應對能力。 • 產生「戰鬥或逃離」(fight-flight)的主動性(mobilization)行為選擇。 • 第三層次:社會神經系統,哺乳動物所獨有。 • 行為調節上採用與社會交流相聯繫的策略,比如表情、語音和傾聽等。 • 沒有新陳代謝的興奮或腎上腺素的變化。 • 對臉、喉、咽部肌肉神經控制的增強,於是得以產生複雜的與社會交往相聯繫的臉部表情和語調。 149
  149. 149. 社會神經系統 • 神經系統的進化決定了人類情緒表達形式、交流質量和調節行為的能力 • 抑制系統通過降低心率、降低血壓,抑制心臟交感活性等調節手段,維持個體生長所需要的新陳代謝的平衡狀態。 • 心率能夠作為一個敏感的信號去反應和識別社交活動 reference: D.S.Quintana,A.J.Guastella,T.Outhred,I.B.Hickie,andA.H.Kemp. Heart rate variability is associated with emotion recognition: direct evidence for a relationship between the autonomic nervous system and social cognition. Int. J. of Psychophysiol, 86(2):168–172, 2012 http://blog.sina.com.cn/s/blog_753e49f90100pop2.html http://www.xzbu.com/6/view-2908185.htm 150
  150. 150. 心率變化 HRV (Heart Rate Variability) • 自律神經系統(ANS) • 包含交感與副交感神經系統 • 因各種情感刺激而產生作用 • 比方說受到驚嚇時,人會不自主地心跳加速、臉色發青,這就是交感神 經在運作 • 心率變異性(HRV) • 反映了交感與副交感神經系統的平衡狀態 • 多項與自律神經活躍有關的因素,皆會使心率變異降低,譬如:血壓變 化、呼吸、身體或心理壓力,甲狀腺亢進與藥物治療等 • 心率變異分析(HRV analysis) • 有一套完整並且標準化的評斷方法[2] reference: 1) María Teresa Valderas , Juan Bolea, Pablo Laguna, Montserrat Vallverdú, Raquel Bailón, “Human Emotion Recognition Using Heart Rate Variability Analysis with Spectral Bands Based on Respiration” in Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. 2) Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology (1996) Heart rate variability. Standards of measurement, physiological interpretation, and clinical use. Eur Heart J 17(3):354-81 3) D.S.Quintana,A.J.Guastella,T.Outhred,I.B.Hickie,andA.H.Kemp. Heart rate variability is associated with emotion recognition: direct evidence for a relationship between the autonomic nervous system and social cognition. Int. J. of Psychophysiol, 86(2):168–172, 2012. 151
  151. 151. HRV 生理上的指標與意義 心律變異度頻域分析測量指標、定義及臨床意義 指標 單位 定義 頻譜範圍 臨床意義 總功率 total power, TP ms2 全部正常心跳間 期之變異數高頻、 低頻、極低頻的 總和 ≤0.4Hz 整體心律變異度 評估 極低頻範圍功率 very low frequency power, VLFP ms2 極低頻範圍正常 心跳間期之變異 ≤0.04Hz 生理意義不明 低頻範圍功率 low frequency power, LFP ms2 低頻範圍正常心 跳間期之變異數 0.04-0.15Hz 代表交感與副交 感神經活性 高頻範圍功率 high frequency power, HFP ms2 高頻範圍正常心 跳間期之變異數 0.15-0.4Hz 代表副交感神經 活性 標準化低頻功率 normalized LFP, nLFP 標準化單位,n.u. LF/(TP-VLF) 交感神經活性定 量指標 標準化高頻功率 normalized HFP,nHFP 標準化單位,n.u. HF/(TP-VLF) 副交感神經活性 定量指標 低、高頻功率的 比值 LF/HF 無單位 低、高頻功率的 比值 代表自律神經活 性平衡 https://zh.wikipedia.org/wiki/%E5%BF%83%E7%8E%87%E8%AE%8A%E7%95%B0%E5%88%86%E6%9E%90 152
  152. 152. 資料收集方法 • Emotion elicitation • real experiences • film clips • problem solving • computer game interfaces • images • spoken words • music • Movie clips method • emotion inducing method more efficient than others verified by previous studies • 4 films (3- 10 min for each one) • 4 emotion: angry, fear, sad and happy • ECG data was record for 90 sec at 2 min before the end of movies. reference: 1)Han Wen Guo, Yu Shun Huang, Jen Chien Chien, Jiann Shing Shieh, “Short-term Analysis of Heart Rate Variability for Emotion Recognition via a Wearable ECG Device” in Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2015 2) Mimma Nardelli, Gaetano Valenza, Alberto Greco, Antonio Lanata, Enzo Pasquale Scilingo, “Recognizing Emotions Induced by Affective Sounds through Heart Rate Variability” in IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, VOL. 6, NO. 4, OCTOBER- DECEMBER 2015 induced 153
  153. 153. ECG process pipeline reference: Abhishek Vaish and Pinki Kumari, “A Comparative Study on Machine Learning Algorithms in Emotion State Recognition Using ECG” in Proceedings of the Second International Conference on Soft Computing for Problem Solving (SocProS 2012), December 28-30, 2012 154
  154. 154. ECG Feature Extraction: HRV Time Domain Feature Frequency Domain Feature reference: Han Wen Guo, Yu Shun Huang, Jen Chien Chien, Jiann Shing Shieh, “Short-term Analysis of Heart Rate Variability for Emotion Recognition via a Wearable ECG Device” in Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2015 1. MeanRRI: average of resultant RR intervals. 2. CVRR: the ratio of the standard deviation and mean of RR intervals. 3. SDRR: standard deviation of the RR intervals. 4. SDSD: standard deviation of the successive differences of the RR intervals. 1. LF (low frequency): standardized LF power (0.04-0.15 Hz) 2. HF (high frequency): standardized HF power (0.15-0.4 Hz) 3. LHratio: the ratio of LF/HF Statistic Feature: the shapes of the probability distributions, evaluate the distribution 155
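上面列出的時域與頻域 HRV 指標,可以直接由 RR interval 序列算出;以下是 numpy/scipy 的最小示意,RR 序列為佔位亂數,重取樣頻率與 Welch 參數為常見設定而非論文原設定,實際流程還需先從 ECG 偵測 R 波。
```python
# 最小示意:由 RR intervals(秒)計算時域與頻域 HRV 特徵(RR 序列為佔位亂數)。
import numpy as np
from scipy.signal import welch
from scipy.interpolate import interp1d

rr = 0.8 + 0.05 * np.random.randn(300)      # 佔位的 RR interval 序列(秒)

# 時域指標
mean_rri = rr.mean()
sdrr = rr.std()
sdsd = np.diff(rr).std()
cvrr = sdrr / mean_rri

# 頻域指標:先把不等間隔的 RR 序列重取樣成等間隔,再以 Welch 法估 PSD
t = np.cumsum(rr)
fs = 4.0                                    # 常見的 4 Hz 重取樣
grid = np.arange(t[0], t[-1], 1 / fs)
rr_even = interp1d(t, rr)(grid)
f, psd = welch(rr_even - rr_even.mean(), fs=fs, nperseg=256)

lf = np.trapz(psd[(f >= 0.04) & (f < 0.15)], f[(f >= 0.04) & (f < 0.15)])
hf = np.trapz(psd[(f >= 0.15) & (f < 0.40)], f[(f >= 0.15) & (f < 0.40)])
print(mean_rri, sdrr, sdsd, cvrr, lf / hf)   # MeanRRI, SDRR, SDSD, CVRR, LF/HF
```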
  155. 155. Analysis on Feature Time Domain Feature Frequency Domain Feature Statistics Feature reference: Han Wen Guo, Yu Shun Huang, Jen Chien Chien, Jiann Shing Shieh, “Short-term Analysis of Heart Rate Variability for Emotion Recognition via a Wearable ECG Device” in Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2015 156 平衡的機制
  156. 156. Classifier reference: Han Wen Guo, Yu Shun Huang, Jen Chien Chien, Jiann Shing Shieh, “Short-term Analysis of Heart Rate Variability for Emotion Recognition via a Wearable ECG Device” in Intelligent Informatics and Biomedical Sciences (ICIIBMS),2015 157
  157. 157. 這是個例子, 還有很多 158
  158. 158. [情緒產生流程圖,同投影片23] 這還是個問題嗎? 當然! 159
  159. 159. 160 情緒辨識的現在發展與未來
  160. 160. Group-Level EmotionThin Slice 技術上的開發 : 納入多重因子 Multi-Task Cross Corpus Common ground Cross Lingual Perspective 161
  161. 161. LLD 這麼多資料庫: 每個不一樣,演算法又這麼多怎麼比較? Encoding
  162. 162. 五個資料庫 比較 Result & Discussion (Binary Classification: Unweighted Average Recall) Database Act. Feature Rep. Val. Feature Rep. CIT 0.658 Praat BoAW 0.613 Praat FV IEMOCAP 0.769 EGEMAPS Func. 0.663 Praat FV NNIME 0.65 Praat FV 0.564 Praat BoAW RECOLA 0.634 EGEMAPS Func. 0.602 Praat BoAW VAM 0.811 ComP_LLD FV 0.665 EGE_LLD BoW
163. 163. Variational Deep Embedding, Fisher Scoring. Dyadic interaction: do we need to take it into account for emotion recognition? Answer: results are better if we do! How can it be done?
164. 164. Generated Perspectives, Multi-view Kernel Fusion. Cross-lingual speech emotion recognition? Integrating language characteristics.
1. Chun-Min Chang, Bo-Hao Su, Shih-Chen Lin, Jeng-Lin Li, Chi-Chun Lee, "A Bootstrapped Multi-View Weighted Kernel Fusion Framework for Cross-Corpus Integration of Multimodal Emotion Recognition", in Proceedings of ACII 2017.
2. Chun-Min Chang, Chi-Chun Lee, "Fusion of Multiple Emotion Perspectives: Improving Affect Recognition Through Integrating Cross-Lingual Emotion Information", in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2017.
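As a hedged illustration of the general idea of weighted kernel fusion (our own simplification, not the exact framework of the cited papers), each "view" contributes a kernel matrix over the same utterances, and a convex combination of the kernels is handed to a precomputed-kernel SVM:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def fuse_kernels(view_features, weights):
    """Convex combination of per-view RBF kernels computed over the same samples."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()              # keep the combination convex
    kernels = [rbf_kernel(X) for X in view_features]
    return sum(w * K for w, K in zip(weights, kernels))

# Toy usage: two views (e.g., audio and lexical features) for the same utterances.
rng = np.random.default_rng(0)
audio, text = rng.normal(size=(20, 10)), rng.normal(size=(20, 5))
labels = rng.integers(0, 2, size=20)
K = fuse_kernels([audio, text], weights=[0.6, 0.4])
clf = SVC(kernel="precomputed").fit(K, labels)
print(clf.predict(K[:3]))  # rows: kernel between the first 3 samples and the training set
```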
165. 165. Technical developments: Thin Slice, Group-Level Emotion, Multi-Task, Cross Corpus, Cross Lingual, Common Ground, Perspective
166. 166. The influence of emotion on human behavior: memory, cognition, emotion
167. 167. How emotion affects human cognition and behavior (attention, memory, feeling, judgment):
• Neural center: fear tends to activate the amygdala; besides enabling a rapid response to external threats, this fear response also makes memories more vivid.
• Clinical finding: when people are in a (positive/negative) emotional state, the number of faces they detect increases.
• Bodily response of emotion: emotion is the underlying substrate, reflected in the function of particular brain regions, while feelings are the various reactions that emotion triggers.
• Emotional judgment: the decisions people make differ depending on the emotional state they are in at the time.
168. 168. The value of emotion research in healthcare:
• Autism, alexithymia: impairments in reading and understanding emotion
• Schizophrenia, obsessive-compulsive disorder: impairments in emotion regulation
• The influence of emotion on health: optimism strengthens health, while boredom has negative effects; people want to be healthy, and the decisions they make in daily life and the behaviors that follow shape their health status.
169. 169. (Diagram) Emotion-driven individual behavior: optimism, boredom
170. 170. The influence of optimism/pessimism on health behavior. Mental well-being: research on modern health determinants attributes health to environment, genes, the healthcare system, and lifestyle (as shown in the figure), with people's control over their own lifestyle accounting for a large share. In a study of optimism and health among adult Czech university students, no strong direct correlation was found, but optimism was highly correlated with indicators of mental well-being; the authors explain that optimistic people more easily reach a calm and stable emotional state, hold positive attitudes, get enough sleep, maintain good social relationships, and enjoy what they do.
Ref: https://ac.els-cdn.com/S1877042815003080/1-s2.0-S1877042815003080-main.pdf?_tid=238e46fe-da36-11e7-86e8-00000aab0f26&acdnat=1512531351_6db4641b5d3531d365e0f207f474d65f
171. 171. The influence of boredom on health behavior. Well-being: boredom is highly correlated with most addiction problems, yet the same emotion can also drive a person's productivity ("Boredom, it turns out, can be a dangerous and disruptive state of mind that damages your health", Mann). Boredom leads to addiction mostly through limited life choices: for example, adolescents facing a monotonous study routine seek stimulation, and with curiosity, immature values, and peer pressure, health-damaging problems such as alcohol abuse, gambling, and binge eating appear. Most people without addiction problems report that they are rarely bored, mainly because their pursuit of a sense of accomplishment leaves them little room for these habits; finding a purpose in life is therefore a good remedy for addiction.
Ref: http://alcoholrehab.com/drug-addiction/boredom-and-substance-abuse/
Ref: On the Function of Boredom (Shane W. Bench)
172. 172. How emotion research helps in autism. Social impairment in autism: social difficulty is a core symptom of autism, largely because affected individuals have trouble reading emotions; research shows that early education in emotional knowledge helps children with autism integrate into society.
• Initiating social contact: after emotion-focused therapy, patients with autism more actively express the desire to socialize.
• Eye contact: reading other people's emotions usually relies on their eyes; patients who receive training show improvement here.
• Sharing about oneself: after emotion-focused therapy, patients learn to share about themselves with peers.
• Emotional expression: emotion education helps patients describe more complex emotions.
Ref: The Facilitation of Social-Emotional Understanding and Social Interaction in High-Functioning Children with Autism
Ref: Social Skills Deficits in Children with Autism Spectrum Disorders: Evidence Based Interventions
173. 173. Education, learning, and emotion. Learning motivation has become a factor in learning behavior that cannot be ignored (Pintrich, 1991, p. 199). Motivational, cognitive, developmental, and educational psychology all work to understand how human learning operates; emotion has become a basic element of learning, both in teaching and in the learning process, so understanding how emotion works is very important for educators (excerpted from a special issue of the Educational Psychologist). Related elements: planning, goals, interest, self, emotion, sense of achievement.
Ref: The Importance of Students' Goals in Their Emotional Experience of Academic Failure: Investigating the Precursors and Consequences of Shame (Jeannine E. Turner)
174. 174. Publication: "The Importance of Students' Goals in Their Emotional Experience of Academic Failure: Investigating the Precursors and Consequences of Shame". Finding: giving students long-term plans helps increase their flexibility in learning. This study examines cases of academic failure: shame, failure, and dismay often lead to dropping out, low self-esteem, and other negative outcomes, but guiding students to form short-term and long-term learning goals increases their flexibility, strengthens their self-regulating behavior, and improves their ability to withstand setbacks. (Jeannine E. Turner)
175. 175. Emotion, consumer behavior, and advertising. fMRI imaging confirms that consumer behavior is more related to emotion (feelings and experience) than to information. Advertising research finds that consumers' emotional response to an ad influences them far more than the ad content itself; the same research finds that an ad's likeability correlates with the sales lift it produces; and positive emotion is more related to brand loyalty than any other judgment.
Ref: https://blog.hubspot.com/marketing/emotions-in-advertising-examples
Emotional content tends to deepen people's impressions:
• Happiness: happy videos are easily shared
• Fear: frightening videos are usually used for awareness and prevention campaigns
• Sadness: life-insurance brands often evoke consumers' sadness, usually with children or parents as the theme
• Anger: anger can prompt people to reflect
176. 176. Emotion can be evoked by music. Music and human emotion have always been inseparable; psychologists are still searching for the direct link between them. Swathi Swaminathan outlines three emotion-related hypotheses:
• emotion colors how music is perceived, and emotion is also aroused by music (communication and perception of emotion in music)
• emotional consequences of music listening
• emotion can serve as a basis for predicting a user's music preferences (predictors of music preferences)
Ref: Current Emotion Research in Music Psychology (Swathi Swaminathan)
177. 177. The importance of emotion; applications of emotion. Just being able to SENSE EMOTION... we leave the possibilities to your imagination.
178. 178. (The same flow diagram as before: an external stimulus and personal factors lead to interpretation, feeling, emotion, and finally behavior.) We're not done yet!
179. 179. Functional Magnetic Resonance Imaging (fMRI)
  180. 180. 181
181. 181. Synopsis of fMRI
• Uses a standard MRI scanner
• Acquires a series of images (numbers)
• Measures changes in blood oxygenation
• Uses non-invasive, non-ionizing radiation
• Can be repeated many times; can be used for a wide range of subjects
• Combines good spatial and reasonable temporal resolution
182. 182. Blood-Oxygen-Level Dependent (BOLD)
183. 183. Emotion perception decoding from fMRI. (Pipeline diagram) Components: fMRI dataset, interaction behavior, SPM preprocessing, behavior observation, machine learning, emotion.
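The slide gives the pipeline only at the block-diagram level; as an illustrative sketch under our own assumptions (the model choice, voxel-selection step, and names below are ours), decoding emotion perception from preprocessed fMRI data can be framed as classifying per-trial voxel patterns:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score, StratifiedKFold

def decode_emotion(trial_voxels, emotion_labels, n_voxels=500):
    """Classify emotion labels from per-trial voxel patterns (trials x voxels)."""
    decoder = make_pipeline(
        # keep only the most informative voxels, selected inside each fold
        SelectKBest(f_classif, k=min(n_voxels, trial_voxels.shape[1])),
        LogisticRegression(max_iter=1000),
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(decoder, trial_voxels, emotion_labels, cv=cv).mean()
```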
184. 184. Emotional modules: activated brain regions
185. 185. Co-activation graph for each emotion category. A) Force-directed graphs for each emotion category, based on the Fruchterman-Reingold spring algorithm. B) The same connections in the anatomical space of the brain. Connected brain regions, not a single region.
186. 186. What is emotion? So?
187. 187. (The same flow diagram again: an external stimulus and personal factors lead to interpretation, feeling, emotion, and behavior.) Now that we see the possibility of this kind of INFERENCE, are there even more directions to pursue?
188. 188. So who are we?
189. 189. Our Research: Human-centered Behavioral Signal Processing (BSP). Prof. Shrikanth Narayanan: seeking a window into the human mind and traits through an engineering approach.
S. Narayanan and P. G. Georgiou, "Behavioral signal processing: Deriving human behavioral informatics from speech and language," Proceedings of the IEEE, vol. 101, no. 5, pp. 1203-1233, 2013.
Daniel Bone, Chi-Chun Lee, Theodora Chaspari, James Gibson, Shrikanth Narayanan, "Signal Processing and Machine Learning for Mental Health Research and Clinical Applications," IEEE Signal Processing Magazine.
人類行為訊號暨互動計算研究室 (EECS713) / Behavioral Informatics and Interaction Computation Laboratory (BIIC)
190. 190. Our Technology: Human-centric Decision Analytics. Research & development core technology.
(Diagram) Signal Processing: spatial-temporal modeling, de-noising, feature extraction. Machine Learning: supervised, unsupervised, semi-supervised. Decision Analytics: high-dimensional behavior space, non-linear predictive recognition, multimodal integration, experts' decision mechanisms.
Core technologies:
• Speech & Language: diarization, speaker ID, ASR, paralinguistic descriptors, Emotion-AI, sentiment, word-topic representation
• Computer Vision: segmentation, tracking, image-video descriptors
• Multimodal Fusion: joint speech-language-gesture modeling for multimodal prediction, multi-party interaction modeling
• Representation Learning: behavior embedded space learning, clinical health informatics data representation
• Predictive Learning: deep-learning and machine-learning based predictive modeling
191. 191. Our Application: Human-centered Exemplary BSP Domains. BIIC interdisciplinary research projects: ASD, PAIN, EHR, fMRI, EMO-AI, ORAL, Flow. Key applications: Affective Computing, Mental Health, Clinical Health, Education, Neuroscience, Consumer Behavior. Interdisciplinary: human-centered applications.
192. 192. Technology, data, human behavior, artificial intelligence, and interdisciplinary collaboration provide experts with decision tools and open up entirely new possibilities. Like the microscope, this is not just about "magnifying": it lets us research and develop technology applications that meaningfully help society. Computing beyond the status quo in making a positive impact.
193. 193. Our Vision: Human-Centric Computing (HCC): "…computationally innovate human-centric empowerment enabling next-generation entity intelligence". (Diagram labels) Factual, Conceptual, Procedural, Metacognitive; Computation Blueprints: Behavior Computing, Health Analytics, Affect Recognition, Empathic Computing, Social Computing, Value-Sensitive Technology, Affective Feedback, Interpersonal Relationship Computing, Cognitive Feedback; Internal States: Fulfillment, Empowerment, Motivation; External Functions.
194. 194. BIIC Lab members. PhD students and graduate students: 張鈞閔, 陳金博, 李政霖, 林畇劭, 蔡福昇, 周惶振; 洪振瀛 (physician).
195. 195. BIIC Lab @ NTHU EE, http://biic.ee.nthu.edu.tw. THANK YOU . . .
