14. Commands
Dashboard => Show Something
● Health Status
● Sum of Biz TX
● Sys Resources
● …
Push or Pull Data
Target Services / Systems
Watchers
Controllers
Events (Conditions / Thresholds)
Console => Do Something
● Reset or Clean Cache
● On / Off Functions
● Notification
● ...
Feedback
(Adjust Conditions / Thresholds by ML)
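The loop on this slide (watchers receive data, events fire on conditions or thresholds, controllers run console commands, feedback adjusts the thresholds) can be sketched roughly as below. This is a minimal illustration only: the class names `Event`, `Watcher`, and `Controller`, the metric name, and the command name are assumptions made for the example, and `feedback` is a manual stand-in for the ML-driven adjustment the slide mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    """A threshold on one metric plus the console command to run when it trips."""
    metric: str
    threshold: float
    command: str  # hypothetical command name, e.g. "clean_cache"


class Controller:
    """Console side ("do something"): maps command names to actions."""

    def __init__(self) -> None:
        self.actions: Dict[str, Callable[[], None]] = {}
        self.log: List[str] = []

    def register(self, name: str, action: Callable[[], None]) -> None:
        self.actions[name] = action

    def run(self, name: str) -> None:
        self.actions[name]()
        self.log.append(name)


class Watcher:
    """Receives pushed or pulled metrics and fires events that exceed thresholds."""

    def __init__(self, events: List[Event], controller: Controller) -> None:
        self.events = events
        self.controller = controller

    def observe(self, metrics: Dict[str, float]) -> List[str]:
        fired = []
        for event in self.events:
            value = metrics.get(event.metric)
            if value is not None and value > event.threshold:
                self.controller.run(event.command)
                fired.append(event.command)
        return fired

    def feedback(self, metric: str, new_threshold: float) -> None:
        """Adjust a threshold; a manual stand-in for ML-driven feedback."""
        for event in self.events:
            if event.metric == metric:
                event.threshold = new_threshold
```

For example, registering `cache.clear` under `"clean_cache"` and calling `watcher.observe({"memory_used_pct": 95.0})` against a threshold of 90 would clear the cache and record the command in `controller.log`.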
22. Levels of Health Check
● Light / Static Health Check
● Layer Health Check
● Deep Health Check
23. Levels of Health Check - Example
[Architecture diagram. Components: Route 53, ELB (Internet-Facing), ASG, Web Servers, Web App, ELB (Internal ELB), ASG, App Servers, Third Party Services, Service A, Service B]
[Health-Checker: Light Health Check, Layer Health Check, Deep Health Check]
24. Levels of Health Check
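● Light / Static Health Check
○ The application itself is healthy, e.g. Tomcat or IIS is running normally
● Layer Health Check
○ Communication between one app and another is healthy, e.g. Tomcat to Redis
○ When something goes wrong, pinpoint the node where the problem lies
● Deep Health Check
○ Verify that the service's own business logic works, e.g. login, checkout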
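One way to picture the three levels is a runner that goes from the cheapest check to the most expensive and stops at the first failing level, which helps narrow down where the problem sits. The sketch below is an assumption-laden illustration: `run_levels`, the check names, and the stop-early policy are not from the slides, and each check is simply a callable returning a boolean.

```python
from typing import Callable, Dict

Check = Callable[[], bool]


def _safe(check: Check) -> bool:
    """Treat any exception raised by a check as a failure."""
    try:
        return bool(check())
    except Exception:
        return False


def run_levels(light: Check, layer: Dict[str, Check], deep: Dict[str, Check]) -> dict:
    """Run light, then layer, then deep checks; skip later levels once one fails."""
    light_ok = _safe(light)
    layer_results: Dict[str, bool] = {}
    deep_results: Dict[str, bool] = {}
    if light_ok:
        # Layer level: app-to-app links such as Tomcat to Redis.
        layer_results = {name: _safe(c) for name, c in layer.items()}
        if all(layer_results.values()):
            # Deep level: business logic such as login or checkout.
            deep_results = {name: _safe(c) for name, c in deep.items()}
    healthy = (
        light_ok
        and all(layer_results.values())
        and all(deep_results.values())
    )
    return {
        "light": light_ok,
        "layer": layer_results,
        "deep": deep_results,
        "healthy": healthy,
    }
```

If the hypothetical `tomcat_to_redis` layer check fails, the report shows `light` as passing, names the failing link under `layer`, and leaves `deep` empty because those checks were never run.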
25. Service Dependencies (Internal)
[Diagram: Service A, Service B, Service C, Service D, Service E (Third Party)]
26. Levels of Health Check
● Light / Static Health Check - Application Self
● Layer Health Check - App to App
● Deep Health Check - Service Self
● Service Health Check - Service to Services
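The fourth level, service to services, can be sketched as a walk over a dependency map that reports every unhealthy service reachable from a given one. The edges and health values used here are hypothetical and not read from the diagram; `unhealthy_dependencies` is an illustrative name, and a service missing from `own_health` is treated as unhealthy by assumption.

```python
from typing import Dict, List, Set


def unhealthy_dependencies(
    service: str,
    own_health: Dict[str, bool],
    depends_on: Dict[str, List[str]],
) -> List[str]:
    """Return the unhealthy services among `service` and everything it depends on."""
    seen: Set[str] = set()
    stack = [service]
    failing = []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and shared dependencies
        seen.add(current)
        if not own_health.get(current, False):
            failing.append(current)
        stack.extend(depends_on.get(current, []))
    return sorted(failing)
```

With hypothetical edges such as A to B and C, B to D, and C to E, a failing third-party Service E would surface in the result for Service A and Service C but not for Service B.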
28. Designing Health Checks
● Levels of Health Check: Light/Static, Layer, Deep
● Design the health check as an API to improve testability
○ ElasticSearch Health
● Give every request a unique MessageId; every role should have a Health API and a Test / Fake Mode
● Clarify third-party dependencies; important services should be included in the health check scope
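A framework-free sketch of such a Health API follows. It assumes the handler returns an HTTP-style status code and a JSON body, tags each response with a unique MessageId, and, in fake mode, returns canned passing results without calling any dependency. The function name `health_endpoint`, the JSON field names, and this interpretation of Test / Fake Mode are assumptions for illustration.

```python
import json
import uuid
from typing import Callable, Dict, Optional, Tuple


def health_endpoint(
    role: str,
    checks: Dict[str, Callable[[], bool]],
    message_id: Optional[str] = None,
    fake: bool = False,
) -> Tuple[int, str]:
    """Return (status_code, json_body) for one health request."""
    # Reuse the caller's MessageId if given so the request can be traced end to end.
    message_id = message_id or str(uuid.uuid4())
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        if fake:
            results[name] = True  # fake mode: do not touch real dependencies
            continue
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    healthy = all(results.values())
    body = {
        "messageId": message_id,
        "role": role,
        "mode": "fake" if fake else "live",
        "status": "ok" if healthy else "fail",
        "checks": results,
    }
    return (200 if healthy else 503), json.dumps(body)
```

Important third-party dependencies would be registered in `checks` alongside internal ones, so a failing external service shows up by name in the response body.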
122. References
● Amazon CloudWatch Documentation
● AWS re:Invent 2014: Amazon CloudWatch Deep Dive
● AWS re:Invent 2016: From Monolithic to Microservices
● Deep Health Check Pattern
● SRE: How Google Runs Production Systems