SlideShare uma empresa Scribd logo
1 de 46
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
建立安全高效的資料分析平台加速
金融創新
T r a c k 1 | S e s s i o n 6
HC Lo
Solutions Architect
Amazon Web Services
Cliff Chao-kuan Lu
Principal Cloud Architect
EMQ
Agenda
The value of data in FSI/Fintech
Building a modern data platform on AWS
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Data is a strategic asset for every organization
The world’s most
valuable resource is
no longer oil, but data.
David Parkins, 2017, The Economist
“
”
Data creates the next edge for financial institutions
Core processing Client facing
• Data lineage and
traceability
• Markets in Financial
Instruments Directive
(MiFID II)
• Consolidated Audit Trail
(CAT)
• Anti-money laundering
Compliance &
Regulatory Reporting
Business
Analytics
• Gain a holistic view of
the business
• Identify market trends
and opportunities
• Capture usage data from
multiple devices
• Fraud detection
Markets Surveillance &
Trading
• Aggregate traditional
market data with
alternative data
• Markets surveillance
• Portfolio optimization
• Back-testing trading /
investment strategies
Customer
Experience
• Capture interaction data
• Create targeted products
and services
• Provide personalized
experiences with timely,
tailored messages
• Reach customers via
their preferred channel
Challenge
• Need to handle 150 billion events per day
• Need to run complex surveillance queries
over 20+ PB of data to detect and analyze
illegal market activity
Solution
• Data lake – S3
• Data processing – Amazon EMR, Amazon
Redshift
Benefits
• increase agility, speed, and cost savings
while also operating at scale
• Estimate to save 10~20 million USD annually
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Typical steps of building a data platform
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Common considerations
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Easy to build
1
Comprehensive
and open
2
Secure
infrastructure
3
Common considerations
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Easy to build Comprehensive
and open
Secure
infrastructure
1 2 3
Financial institutions are
collecting an unprecedented
amount of data
1. Axis Corporate, Understanding Big Data in Financial Services, https://axiscorporate.com/us/infographic/understanding-big-data-in-financial-services-infographic/
• 90% of data worldwide has been
generated in the last five years
• 2.2 million terabytes of new data is
created everyday1
• Structured, semi-structured, and
unstructured data
Data silos to
OLTP ERP CRM LOB
Data warehouse silo 1
Business
intelligence
Devices Web Sensors Social
Data warehouse silo 2
Business
intelligence Machine
learning
BI +
analyticsData
warehousing
Data lakes
Open formats
Central catalog
Traditional data warehousing approaches don’t scale
Build on robust data lake with Amazon S3
Store any data in any format
99.999999999% durability
Global replication capabilities
Secure, compliant, and auditable
Cost-effective storage classes
Data
warehousing
Analytics Machine
learning
Data lake
Amazon S3
Choose the right data lake storage class
Raw data
Amazon S3
Standard
Production
data lake
Amazon S3
Intelligent-Tiering
Historical
data
Amazon S3 Glacier or
S3 Glacier Deep Archive
ETL
Amazon S3
Standard
Online
cool data
Amazon S3
Standard
Infrequent Access
Optimize costs for all stages of data lake workflows
Serverless ETL and data integration with AWS Glue
Serverless provisioning, configuration, and
scaling to run your ETL jobs
Pay only for the resources used for jobs
Automates the effort in building,
maintaining, and running ETL jobs
Makes it easy to schedule recurring ETL jobs,
chain multiple jobs together, or invoke jobs
on-demand
Manage the dependencies between your jobs,
automatically scales underlying resources,
and retries jobs if they fail
Common considerations
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Easy to build Comprehensive
and open
Secure
infrastructure
1 2 3
Comprehensive data analytics services
Data
warehousing
Big data
processing
Interactive
query
Operational
analytics
Real-time
analytics
Comprehensive data analytics services
Data
warehousing
Big data
processing
Interactive
query
Operational
analytics
Real-time
analytics
Amazon Kinesis
Data Analytics
Amazon EMR
Amazon
Elasticsearch Service
Amazon AthenaAmazon Redshift
Big data processing: Amazon EMR
Amazon
EMR
PIG
SQL
Amazon S3
EMRFS
Amazon EC2
Application
Data
processing
framework
Storage
Infra
Managed petabytes of unstructured
or semi-structured data
Support 21 open source projects,
including Apache Hadoop, Spark,
HBase, Presto, etc.
Decouple compute and storage
Enterprise-level security and
reliability
Cost optimization
Data warehousing: Amazon Redshift
JDBC/ODBC
Amazon Redshift
Managed Storage
RDS
PostgreSQL
Aurora
PostgreSQL
S3 Data lake
Analyze exabytes of data across data warehouses,
data lakes, and operational databases
Massively parallel data processing
Columnar storage, data compression, and zone maps
reduce the amount of I/O needed to perform queries
Cost-optimize workloads by paying compute and
storage separately
Security out of the box, at no extra cost
Interactive query: Amazon Athena
Amazon S3
Data lake
AWS Glue
Amazon
Athena
Amazon
QuickSight
AWS IoT
AI/ML
Devices Web Sensors Social
Pay per query
Save 30%–90% on
per-query costs
through compression
$
Serverless,
zero infrastructure,
zero administration
Query instantly
Point to S3 and start
querying with
standard SQL queries
SQL
Scales automatically
based on complexity
of queries
Common considerations
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Easy to build Comprehensive
and open
Secure
infrastructure
1 2 3
• Can our data lake scale to support
a multi-account strategy?
• Can data stewards have ownership
of their data and metadata once
it’s in the data lake?
• Can our users access the data lake
in accordance with corporate
authentication standards?
• Can data lake user interfaces
integrate with our Active Directory?
Multi-account supportCompliance programs
Authorization
Authentication
Encryption Private network connectivity
• What regulatory and data privacy
issues should we consider when
building a data lake?
• What services should we use to
meet our compliance obligations?
• Does the data lake support role-
based authorization for accessing
data and metadata in the data
lake?
• How do we ensure that users only
access data that they are allowed
to see?
• Does the data lake architecture
support our corporate encryption
standards for data at rest and in
motion?
• Does the data lake integrate with
our key management service?
• Can all traffic to and from the data
lake transfer over private and
secure network links?
• Can we block all internet access to
and from our data lake?
Common security considerations for data lakes
AWS Lake Formation: simplify security management
Summary: turn data into insights
Move data2 Clean, prep, and
catalog data
3
Configure and
enforce security and
compliance policies
4
Make data available
for analytics
5
Set up
storage
1
Easy to build Comprehensive
and open
Secure
infrastructure
1 2 3
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
EMQ 公司介紹
原分析流程
引入無伺服器資料分析服務
總結
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
A cross-border
settlement solution
for any financial
transaction
FinTech since 2014
Licensed in Hong Kong, Indonesia,
Singapore, and Taiwan
15+ bank partners
50+ payout countries
15,000+ cash pickup location
200,000+ bank branches
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
EMQ 由資料驅動業務
VPC
EMQ Services
PostgreSQL
運用 Amazon Redshift 後查詢效率提升
VPC
Amazon RedshiftPostgreSQL Amazon QuickSightAWS Data Pipeline
Amazon Simple
Storage Service
EMQ Services
使用 Data Pipeline
與 Redshift
• 需要 S3 作為中介
• 由 Data Pipeline 配置
Redshift 的結構
• Redshift 本身可用於去除重
複資料列
• 在運行報告的時間外,叢集
會閒置;因此平均利用率低
• T 系列有助於控制成本
Redshift CPU Utilization (%)
尋找更適合現階段的
資料分析架構
技術選型目標
• 支持多種輸入、輸出介面;
可由PostgreSQL 輸入、由 QuickSight 查詢
• 使用主流框架,或支持以主流語言操作資料
• 日常維運門檻必須低
• 資訊安全需容易稽核、配置
• 效能門檻不高,但需要…
• Output Partitioning
• 差異備份
• 資料操作,Join / Flatten / Remove / Mask
Output Partitioning
https://aws.amazon.com/tw/blogs/big-data/work-with-partitioned-data-in-aws-glue/
查詢時帶入過濾條件,可減
少載入的資料量,加速查詢
並減低成本
差異備份
• 只讀取更新的資料列,載入資料量小
• 要求遞增欄位
• 大多 ETL 工具包含類似機制
• 部分服務原生支持資料合併
候選名單
Data Pipeline Database Migration Service Glue
儲存 / 分析
服務
Redshift Redshift S3 / Athena S3 / Athena
Output
Partitioning
不支持但夠快 不支持但夠快 不支持 支持
差異備份 任意遞增欄位 需要 Primary Key
遞增 Primary Key
任意遞增欄位
資料操作 Shell Script 不支持 JSON 等結構操作 Python / Spark
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Athena
EMQ 由資料驅動業務: 新架構、無伺服器
VPC
PostgreSQL Amazon QuickSight
Amazon Simple
Storage Service
EMQ Services
AWS Glue
Why AWS Glue
• 無伺服器,託管服務
• 計費模型單純:用多少,付多少
• 支持多種資料來源及輸出格式
• Glue 的輸出可以再成為 Glue 的輸入
• Glue 妥善整合其他 AWS 服務
• 容易學習、理解
• 稽核與配置檢查相對容易
• 抽象結構容易理解、調度功能符合需求
• Job 可運行 Python 或 Spark
Pushdown Predicate
• Spark SQL 在資料源過濾查詢條件
• 若透過索引完成,可縮短執行時間,並減少運算、傳輸量
• 目前還不支援 ODBC / JDBC 資料源
Why Amazon Athena
• 無伺服器,託管服務
• 計費模型單純:用多少,付多少
• 透過 SQL 介面查詢資料
• 與其他 AWS 服務整合密切
• 可用於創建新資料集 (S3) 與資料表 (Glue)
• 查詢結果儲存於 S3,容易管理與重用
• 透過 QuickSight 查詢並創建報表
In The Works – 持續優化費用與效能
• 我們的查詢仍有待優化,目前有部分 QuickSight 操作超時
計劃引入 Partition、暫存表、並將複雜的 Join 移至 Glue 以改善
• 若無法優雅解決差異備份問題,將引入或轉換至 EMR
• Pushdown Predicate for JDBC / ODBC
• 費用與效能
• 大多業務報告以 Glue / Athena 運行,持續向 best practices 靠攏
• 正與分析師協同重構,預估可提供每小時資料更新,並將費用減低爲 1/8
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
總結 – EMQ 引入無伺服器資料分析服務的效益
• 減緩學習曲線
• 預設功能堪用、日常維運門檻低
• 整合其他 AWS 服務,能承襲已有的知識
• 底層為開源專案,兼作爲學習的基石
• 快速迭代,持續進化
• 計費模式讓 EMQ 不需在 Day1 預估用量,可以嘗試各種架構設計
• 在 AWS 生態系內外有多種服務可整合、協作、相互替代
• AWS 提供的新服務與新功能,讓 EMQ 的選項益加豐富
Thank you!
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
HC Lo
hclo@amazon.com
Cliff Chao-kuan Lu
clifflu@emq.com

Mais conteúdo relacionado

Mais procurados

Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxAmazon Web Services
 
Track 6 Session 2_ 搭建現代化的資料數據湖.pptx
Track 6 Session 2_ 搭建現代化的資料數據湖.pptxTrack 6 Session 2_ 搭建現代化的資料數據湖.pptx
Track 6 Session 2_ 搭建現代化的資料數據湖.pptxAmazon Web Services
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxAmazon Web Services
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Amazon Web Services
 
Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Amazon Web Services
 
DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%Amazon Web Services
 
Moving your commercial databases to Amazon RDS
Moving your commercial databases to Amazon RDSMoving your commercial databases to Amazon RDS
Moving your commercial databases to Amazon RDSAmazon Web Services
 
2016 summits - future of enterprise it
2016 summits - future of enterprise it2016 summits - future of enterprise it
2016 summits - future of enterprise itAmazon Web Services
 
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Amazon Web Services
 
Migrate & Modernize your legacy Microsoft applications with AWS
Migrate & Modernize your legacy Microsoft applications with AWSMigrate & Modernize your legacy Microsoft applications with AWS
Migrate & Modernize your legacy Microsoft applications with AWSAmazon Web Services
 
Develop a Custom Data Solution Architecture with NorthBay
Develop a Custom Data Solution Architecture with NorthBayDevelop a Custom Data Solution Architecture with NorthBay
Develop a Custom Data Solution Architecture with NorthBayAmazon Web Services
 
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Amazon Web Services
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john malloryAmazon Web Services
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSAmazon Web Services
 
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程Amazon Web Services
 
Track 1 Session 3_建構安全高效的電子設計自動化環境
Track 1 Session 3_建構安全高效的電子設計自動化環境Track 1 Session 3_建構安全高效的電子設計自動化環境
Track 1 Session 3_建構安全高效的電子設計自動化環境Amazon Web Services
 
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptxTrack 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptxAmazon Web Services
 
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全Amazon Web Services
 
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxTrack 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxAmazon Web Services
 

Mais procurados (20)

Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptxTrack 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
Track 6 Session 1_進入 AI 領域的第一步驟_資料平台的建置.pptx
 
Track 6 Session 2_ 搭建現代化的資料數據湖.pptx
Track 6 Session 2_ 搭建現代化的資料數據湖.pptxTrack 6 Session 2_ 搭建現代化的資料數據湖.pptx
Track 6 Session 2_ 搭建現代化的資料數據湖.pptx
 
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptxTrack 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
Track 4 Session 2_MAD03 容器技術和 AWS Lambda 讓您專注「應用優先」.pptx
 
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
Track 4 Session 1_MAD01 如何活用事件驅動架構快速擴展應用
 
Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows Migra le tue file shares in cloud con FSx for Windows
Migra le tue file shares in cloud con FSx for Windows
 
DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%DEM06 How Demandbase Cut Its Container Costs by 79%
DEM06 How Demandbase Cut Its Container Costs by 79%
 
Moving your commercial databases to Amazon RDS
Moving your commercial databases to Amazon RDSMoving your commercial databases to Amazon RDS
Moving your commercial databases to Amazon RDS
 
2016 summits - future of enterprise it
2016 summits - future of enterprise it2016 summits - future of enterprise it
2016 summits - future of enterprise it
 
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
Understanding AWS Managed Databases and Analytic Services - AWS Innovate Otta...
 
Migrate & Modernize your legacy Microsoft applications with AWS
Migrate & Modernize your legacy Microsoft applications with AWSMigrate & Modernize your legacy Microsoft applications with AWS
Migrate & Modernize your legacy Microsoft applications with AWS
 
Develop a Custom Data Solution Architecture with NorthBay
Develop a Custom Data Solution Architecture with NorthBayDevelop a Custom Data Solution Architecture with NorthBay
Develop a Custom Data Solution Architecture with NorthBay
 
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
Track 6 Session 6_ 透過 AWS AI 服務模擬、部署機器人於產業之應用
 
Modernizing upstream workflows with aws storage - john mallory
Modernizing upstream workflows with aws storage -  john malloryModernizing upstream workflows with aws storage -  john mallory
Modernizing upstream workflows with aws storage - john mallory
 
The Future of Enterprise IT
The Future of Enterprise IT The Future of Enterprise IT
The Future of Enterprise IT
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWS
 
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
Track 4 Session 3_ 利用 AWS Step Functions 建構穩健的業務處理流程
 
Track 1 Session 3_建構安全高效的電子設計自動化環境
Track 1 Session 3_建構安全高效的電子設計自動化環境Track 1 Session 3_建構安全高效的電子設計自動化環境
Track 1 Session 3_建構安全高效的電子設計自動化環境
 
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptxTrack 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
Track 6 Session 3_如何藉由 AWS AI 和機器學習平台搭建多功能的 AI 解決方案.pptx
 
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全
Track 5 Session 1_如何藉由多層次防禦搭建網路應用安全
 
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptxTrack 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
Track 5 Session 3_ 迎戰DDoS攻擊的資安最佳實踐.pptx
 

Semelhante a Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx

Finding Meaning in the Noise: Understanding Big Data with AWS Analytics
Finding Meaning in the Noise: Understanding Big Data with AWS AnalyticsFinding Meaning in the Noise: Understanding Big Data with AWS Analytics
Finding Meaning in the Noise: Understanding Big Data with AWS AnalyticsAmazon Web Services
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...DATAVERSITY
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
State of the Union: Database & Analytics
State of the Union: Database & AnalyticsState of the Union: Database & Analytics
State of the Union: Database & AnalyticsAmazon Web Services
 
Financial Services Analytics on AWS
Financial Services Analytics on AWSFinancial Services Analytics on AWS
Financial Services Analytics on AWSAmazon Web Services
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스Amazon Web Services Korea
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
Euronext_AWS_talend_connect_paris_2018.pdf
Euronext_AWS_talend_connect_paris_2018.pdfEuronext_AWS_talend_connect_paris_2018.pdf
Euronext_AWS_talend_connect_paris_2018.pdfAmazon Web Services
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantageAmazon Web Services
 
Big Data on AWS - Toronto FSI Symposium - October 2016
Big Data on AWS - Toronto FSI Symposium - October 2016Big Data on AWS - Toronto FSI Symposium - October 2016
Big Data on AWS - Toronto FSI Symposium - October 2016Amazon Web Services
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)Denodo
 
Modern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesModern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesAmazon Web Services
 
Module 3 - QuickSight Overview
Module 3 - QuickSight OverviewModule 3 - QuickSight Overview
Module 3 - QuickSight OverviewLam Le
 
FSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory ReportingFSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory ReportingAmazon Web Services
 
Data Con LA 2022 - Modern Data Strategy
Data Con LA 2022 - Modern Data StrategyData Con LA 2022 - Modern Data Strategy
Data Con LA 2022 - Modern Data StrategyData Con LA
 
Tapdata Product Intro
Tapdata Product IntroTapdata Product Intro
Tapdata Product IntroTapdata
 

Semelhante a Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx (20)

Finding Meaning in the Noise: Understanding Big Data with AWS Analytics
Finding Meaning in the Noise: Understanding Big Data with AWS AnalyticsFinding Meaning in the Noise: Understanding Big Data with AWS Analytics
Finding Meaning in the Noise: Understanding Big Data with AWS Analytics
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
 
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
 
AWSome Day Galway Intro
AWSome Day Galway IntroAWSome Day Galway Intro
AWSome Day Galway Intro
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
State of the Union: Database & Analytics
State of the Union: Database & AnalyticsState of the Union: Database & Analytics
State of the Union: Database & Analytics
 
Financial Services Analytics on AWS
Financial Services Analytics on AWSFinancial Services Analytics on AWS
Financial Services Analytics on AWS
 
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
클라우드에서의 데이터 웨어하우징 & 비즈니스 인텔리전스
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Euronext_AWS_talend_connect_paris_2018.pdf
Euronext_AWS_talend_connect_paris_2018.pdfEuronext_AWS_talend_connect_paris_2018.pdf
Euronext_AWS_talend_connect_paris_2018.pdf
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantage
 
Big Data on AWS - Toronto FSI Symposium - October 2016
Big Data on AWS - Toronto FSI Symposium - October 2016Big Data on AWS - Toronto FSI Symposium - October 2016
Big Data on AWS - Toronto FSI Symposium - October 2016
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
 
Modern Data Architectures for Business Outcomes
Modern Data Architectures for Business OutcomesModern Data Architectures for Business Outcomes
Modern Data Architectures for Business Outcomes
 
Module 3 - QuickSight Overview
Module 3 - QuickSight OverviewModule 3 - QuickSight Overview
Module 3 - QuickSight Overview
 
IBM Cloud pak for data brochure
IBM Cloud pak for data   brochureIBM Cloud pak for data   brochure
IBM Cloud pak for data brochure
 
FSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory ReportingFSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory Reporting
 
Data Con LA 2022 - Modern Data Strategy
Data Con LA 2022 - Modern Data StrategyData Con LA 2022 - Modern Data Strategy
Data Con LA 2022 - Modern Data Strategy
 
Tapdata Product Intro
Tapdata Product IntroTapdata Product Intro
Tapdata Product Intro
 

Mais de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Come costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSCome costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Come costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWSCome costruire un'architettura Serverless nel Cloud AWS
Come costruire un'architettura Serverless nel Cloud AWS
 

Track 1 Session 6_建立安全高效的資料分析平台加速金融創新_HC+EMQ Cliff(已檢核,上下無黑邊).pptx

  • 1. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. 建立安全高效的資料分析平台加速 金融創新 T r a c k 1 | S e s s i o n 6 HC Lo Solutions Architect Amazon Web Services Cliff Chao-kuan Lu Principal Cloud Architect EMQ
  • 2. Agenda The value of data in FSI/Fintech Building a modern data platform on AWS
  • 3. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 4. Data is a strategic asset for every organization The world’s most valuable resource is no longer oil, but data. David Parkins, 2017, The Economist “ ”
  • 5. Data creates the next edge for financial institutions Core processing Client facing • Data lineage and traceability • Markets in Financial Instruments Directive (MiFID II) • Consolidated Audit Trail (CAT) • Anti-money laundering Compliance & Regulatory Reporting Business Analytics • Gain a holistic view of the business • Identify market trends and opportunities • Capture usage data from multiple devices • Fraud detection Markets Surveillance & Trading • Aggregate traditional market data with alternative data • Markets surveillance • Portfolio optimization • Back-testing trading / investment strategies Customer Experience • Capture interaction data • Create targeted products and services • Provide personalized experiences with timely, tailored messages • Reach customers via their preferred channel
  • 6. Challenge • Need to handle 150 billion events per day • Need to run complex surveillance queries over 20+ PB of data to detect and analyze illegal market activity Solution • Data lake – S3 • Data processing – Amazon EMR, Amazon Redshift Benefits • increase agility, speed, and cost savings while also operating at scale • Estimate to save 10~20 million USD annually
  • 7. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 8. Typical steps of building a data platform Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1
  • 9. Common considerations Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1 Easy to build 1 Comprehensive and open 2 Secure infrastructure 3
  • 10. Common considerations Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1 Easy to build Comprehensive and open Secure infrastructure 1 2 3
  • 11. Financial institutions are collecting an unprecedented amount of data 1. Axis Corporate, Understanding Big Data in Financial Services, https://axiscorporate.com/us/infographic/understanding-big-data-in-financial-services-infographic/ • 90% of data worldwide has been generated in the last five years • 2.2 million terabytes of new data is created everyday1 • Structured, semi-structured, and unstructured data
  • 12. Data silos to OLTP ERP CRM LOB Data warehouse silo 1 Business intelligence Devices Web Sensors Social Data warehouse silo 2 Business intelligence Machine learning BI + analyticsData warehousing Data lakes Open formats Central catalog Traditional data warehousing approaches don’t scale
  • 13. Build on robust data lake with Amazon S3 Store any data in any format 99.999999999% durability Global replication capabilities Secure, compliant, and auditable Cost-effective storage classes Data warehousing Analytics Machine learning Data lake Amazon S3
  • 14. Choose the right data lake storage class Raw data Amazon S3 Standard Production data lake Amazon S3 Intelligent-Tiering Historical data Amazon S3 Glacier or S3 Glacier Deep Archive ETL Amazon S3 Standard Online cool data Amazon S3 Standard Infrequent Access Optimize costs for all stages of data lake workflows
  • 15. Serverless ETL and data integration with AWS Glue Serverless provisioning, configuration, and scaling to run your ETL jobs Pay only for the resources used for jobs Automates the effort in building, maintaining, and running ETL jobs Makes it easy to schedule recurring ETL jobs, chain multiple jobs together, or invoke jobs on-demand Manage the dependencies between your jobs, automatically scales underlying resources, and retries jobs if they fail
  • 16. Common considerations Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1 Easy to build Comprehensive and open Secure infrastructure 1 2 3
  • 17. Comprehensive data analytics services Data warehousing Big data processing Interactive query Operational analytics Real-time analytics
  • 18. Comprehensive data analytics services Data warehousing Big data processing Interactive query Operational analytics Real-time analytics Amazon Kinesis Data Analytics Amazon EMR Amazon Elasticsearch Service Amazon AthenaAmazon Redshift
  • 19. Big data processing: Amazon EMR Amazon EMR PIG SQL Amazon S3 EMRFS Amazon EC2 Application Data processing framework Storage Infra Managed petabytes of unstructured or semi-structured data Support 21 open source projects, including Apache Hadoop, Spark, HBase, Presto, etc. Decouple compute and storage Enterprise-level security and reliability Cost optimization
  • 20. Data warehousing: Amazon Redshift JDBC/ODBC Amazon Redshift Managed Storage RDS PostgreSQL Aurora PostgreSQL S3 Data lake Analyze exabytes of data across data warehouses, data lakes, and operational databases Massively parallel data processing Columnar storage, data compression, and zone maps reduce the amount of I/O needed to perform queries Cost-optimize workloads by paying compute and storage separately Security out of the box, at no extra cost
  • 21. Interactive query: Amazon Athena Amazon S3 Data lake AWS Glue Amazon Athena Amazon QuickSight AWS IoT AI/ML Devices Web Sensors Social Pay per query Save 30%–90% on per-query costs through compression $ Serverless, zero infrastructure, zero administration Query instantly Point to S3 and start querying with standard SQL queries SQL Scales automatically based on complexity of queries
  • 22. Common considerations Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1 Easy to build Comprehensive and open Secure infrastructure 1 2 3
  • 23. • Can our data lake scale to support a multi-account strategy? • Can data stewards have ownership of their data and metadata once it’s in the data lake? • Can our users access the data lake in accordance with corporate authentication standards? • Can data lake user interfaces integrate with our Active Directory? Multi-account supportCompliance programs Authorization Authentication Encryption Private network connectivity • What regulatory and data privacy issues should we consider when building a data lake? • What services should we use to meet our compliance obligations? • Does the data lake support role- based authorization for accessing data and metadata in the data lake? • How do we ensure that users only access data that they are allowed to see? • Does the data lake architecture support our corporate encryption standards for data at rest and in motion? • Does the data lake integrate with our key management service? • Can all traffic to and from the data lake transfer over private and secure network links? • Can we block all internet access to and from our data lake? Common security considerations for data lakes
  • 24. AWS Lake Formation: simplify security management
  • 25. Summary: turn data into insights Move data2 Clean, prep, and catalog data 3 Configure and enforce security and compliance policies 4 Make data available for analytics 5 Set up storage 1 Easy to build Comprehensive and open Secure infrastructure 1 2 3
  • 26. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 28. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 29. A cross-border settlement solution for any financial transaction FinTech since 2014 Licensed in Hong Kong, Indonesia, Singapore, and Taiwan 15+ bank partners 50+ payout countries 15,000+ cash pickup location 200,000+ bank branches
  • 30. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 32. 運用 Amazon Redshift 後查詢效率提升 VPC Amazon RedshiftPostgreSQL Amazon QuickSightAWS Data Pipeline Amazon Simple Storage Service EMQ Services
  • 33. 使用 Data Pipeline 與 Redshift • 需要 S3 作為中介 • 由 Data Pipeline 配置 Redshift 的結構 • Redshift 本身可用於去除重 複資料列 • 在運行報告的時間外,叢集 會閒置;因此平均利用率低 • T 系列有助於控制成本 Redshift CPU Utilization (%) 尋找更適合現階段的 資料分析架構
  • 34. 技術選型目標 • 支持多種輸入、輸出介面; 可由PostgreSQL 輸入、由 QuickSight 查詢 • 使用主流框架,或支持以主流語言操作資料 • 日常維運門檻必須低 • 資訊安全需容易稽核、配置 • 效能門檻不高,但需要… • Output Partitioning • 差異備份 • 資料操作,Join / Flatten / Remove / Mask
  • 36. 差異備份 • 只讀取更新的資料列,載入資料量小 • 要求遞增欄位 • 大多 ETL 工具包含類似機制 • 部分服務原生支持資料合併
  • 37. 候選名單 Data Pipeline Database Migration Service Glue 儲存 / 分析 服務 Redshift Redshift S3 / Athena S3 / Athena Output Partitioning 不支持但夠快 不支持但夠快 不支持 支持 差異備份 任意遞增欄位 需要 Primary Key 遞增 Primary Key 任意遞增欄位 資料操作 Shell Script 不支持 JSON 等結構操作 Python / Spark
  • 38. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 39. Amazon Athena EMQ 由資料驅動業務: 新架構、無伺服器 VPC PostgreSQL Amazon QuickSight Amazon Simple Storage Service EMQ Services AWS Glue
  • 40. Why AWS Glue • 無伺服器,託管服務 • 計費模型單純:用多少,付多少 • 支持多種資料來源及輸出格式 • Glue 的輸出可以再成為 Glue 的輸入 • Glue 妥善整合其他 AWS 服務 • 容易學習、理解 • 稽核與配置檢查相對容易 • 抽象結構容易理解、調度功能符合需求 • Job 可運行 Python 或 Spark
  • 41. Pushdown Predicate • Spark SQL 在資料源過濾查詢條件 • 若透過索引完成,可縮短執行時間,並減少運算、傳輸量 • 目前還不支援 ODBC / JDBC 資料源
  • 42. Why Amazon Athena • 無伺服器,託管服務 • 計費模型單純:用多少,付多少 • 透過 SQL 介面查詢資料 • 與其他 AWS 服務整合密切 • 可用於創建新資料集 (S3) 與資料表 (Glue) • 查詢結果儲存於 S3,容易管理與重用 • 透過 QuickSight 查詢並創建報表
  • 43. In The Works – 持續優化費用與效能 • 我們的查詢仍有待優化,目前有部分 QuickSight 操作超時 計劃引入 Partition、暫存表、並將複雜的 Join 移至 Glue 以改善 • 若無法優雅解決差異備份問題,將引入或轉換至 EMR • Pushdown Predicate for JDBC / ODBC • 費用與效能 • 大多業務報告以 Glue / Athena 運行,持續向 best practices 靠攏 • 正與分析師協同重構,預估可提供每小時資料更新,並將費用減低爲 1/8
  • 44. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 45. 總結 – EMQ 引入無伺服器資料分析服務的效益 • 減緩學習曲線 • 預設功能堪用、日常維運門檻低 • 整合其他 AWS 服務,能承襲已有的知識 • 底層為開源專案,兼作爲學習的基石 • 快速迭代,持續進化 • 計費模式讓 EMQ 不需在 Day1 預估用量,可以嘗試各種架構設計 • 在 AWS 生態系內外有多種服務可整合、協作、相互替代 • AWS 提供的新服務與新功能,讓 EMQ 的選項益加豐富
  • 46. Thank you! © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. HC Lo hclo@amazon.com Cliff Chao-kuan Lu clifflu@emq.com