AWS Data-Driven Insights Learning Series ANZ Sep 2019 Part 1
1. © 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS presents
Data-Driven Insights Learning Series
Brisbane | 19 September
Adelaide | 24 September
Perth | 26 September
Auckland | 10 October
2. Welcome
Learn how to
Build Next Gen Data Lakes
and Analytics Platforms
3. Agenda Overview
09:10 – 09:50 How to modernise your analytics and data architecture
Vicky Falconer, Big Data & Analytics Business Development, AWS
09:50 – 10:20 Best practices sharing of real-life use cases, featuring:
Brisbane: IntelliHQ, Port of Brisbane, Sportcor
Adelaide: Oz Minerals, University of SA
Perth: Citic Pacific Mining, Kamala Tech, WESCEF
Auckland: Afterpay, ESP NZ
10:20 – 10:50 Morning Tea
10:50 – 11:00 AWS Training & Certification Learning paths: Data Analytics and AI & ML
11:00 – 12:00 Time to Value – Lake Formation
Jason Hunter, Senior Data & Analytics Specialist, AWS
Syed Jaffry, Solutions Architect, AWS
12:00 – 13:00 Networking lunch
13:00 – 13:45 The future of cloud data warehousing
Tom McMeekin, Solutions Architect, AWS
13:45 – 14:45 AI & Machine Learning and data lakes: A platform to build business outcomes from data
Eric Greene, AI & ML Solutions Specialist, AWS
Jenny Davies, Solutions Architect, AWS
Will Badr, AI & ML Solutions Specialist, AWS
14:45 – 15:00 Close
4. Thank you to our partners
5.
Vicky Falconer
Big Data & Analytics Business Development Lead,
AWS
How to modernise your analytics
and data architecture
6. Why is data strategic?
7. This is data
8. This is data
9. This is data
10. “How a Carsales hackathon spawned an AI innovation”
- itNews – 29 January 2018
“Cyclops image recognition tool automatically selects and assigns angles
to each image uploaded onto the Carsales website.
Automotive classifieds site Carsales has had a pretty solid run of luck with its
hackathons over the years, but a three-day innovation fest held this time
last year might prove to have been the catalyst for its best success story so
far.
It was at this particular hackathon that developers came up with and coded
a working prototype of a piece of image recognition software that vastly
improves the accuracy and consistency of photos uploaded to the site.”
11. Future = Flex + Foundations
14. Drivers of change: Analytical needs have evolved
Data grows >10x every 5 years; platforms live for 15 years; 1,000x scale
*IDC, Data Age 2025: The Evolution of Data to Life-Critical. Don’t Focus on Big Data; Focus on the Data That’s Big, April 2017.
[Chart: Organic revenue growth; values 11, 8, 5, 4; label "more valuable"]
*Aberdeen: Angling for Insight in Today’s Data Lake, Michael Lock, SVP Analytics and Business Intelligence
Hadoop, Elasticsearch, Presto, Spark
Democratization of data
Governance & control
How do I provide democratized access to data to enable informed decisions while at the same time enforcing data governance and preventing mismanagement of the data?
15. Multiple users
16. Design Principles
De-couple
Design to flex & adjust
10x data at speed
Test and fail fast
Experimentation at scale
No more silos
Foundations for AI/ML
Supporting multiple personas
Not invented yet
17. Modern Data Architecture
Data sources: Transactions; Connected devices; Social media; Web logs / clickstream
Ingestion
Event Pipelines (Near-real time)
Batch Data Pipelines
Machine Learning
Serving
Business Outcomes
• Revenue Lift
• Market acquisition
• Customer delight
• Brand advocacy
• Personalisation
• Next best action
• Credit Risk
• Supply Chain Optimisation
18. Modern data architecture
Data sources: Transactions; ERP; Connected devices; Social media; Web logs / clickstream
INGEST: AWS Database Migration; AWS Direct Connect; Internet Interfaces; Amazon Kinesis
BATCH DATA PIPELINES (Historic)
RAW DATA: Amazon S3
ETL: AWS Glue; Amazon EMR
STAGED DATA (Data Lake): Amazon S3, cleansed & processed data
EVENT PIPELINES (Near-real time)
EVENT CAPTURE: Amazon Kinesis Data Streams
Amazon Kinesis Data Firehose
STREAM ANALYSIS: Amazon SageMaker; Amazon Kinesis Data Analytics
DECISIONS: Apps & dashboards subscribe to alerts, notifications, events to enable time-sensitive decision-making
Amazon Managed Streaming for Kafka
SERVING
Semi/Unstructured: Amazon EMR
Schemaless: Amazon Elasticsearch
Direct Query: Amazon Athena
Data Warehouse: Amazon Redshift
Legacy Apps: Amazon RDS
Near-Zero Latency: Amazon DynamoDB
Machine Learning: Amazon SageMaker
Consumers: Data analysts; Data scientists; Business users; Automation / events; Engagement platforms
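As a rough illustration of the event-capture step in this architecture, the sketch below builds a record for Amazon Kinesis Data Streams and sends it with boto3's `put_record`. The stream name `clickstream-events`, the event fields and the helper names are assumptions made for illustration; they do not come from the slides.

```python
import json


def build_kinesis_record(event: dict, stream_name: str = "clickstream-events") -> dict:
    """Build keyword arguments for boto3's Kinesis put_record call.

    The stream name and the use of user_id as partition key are
    hypothetical; choose a key that spreads load across shards.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": str(event["user_id"]),
    }


def send_event(event: dict) -> dict:
    """Send one event; needs boto3, AWS credentials and an existing stream."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("kinesis")
    return client.put_record(**build_kinesis_record(event))


# Building the record needs no AWS access; send_event() would transmit it.
record = build_kinesis_record({"user_id": 42, "page": "/pricing"})
print(record["StreamName"], record["PartitionKey"])
```

Keeping the payload builder separate from the network call makes the event shape easy to inspect before anything is sent.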
19. What does good look like?
1. Clarity around the ‘why’ – clearly anchored in business value
2. The organisation is on board
3. Data strategy (ML strategy)
4. Culture of experimentation
5. People strategy -> data
6. Architect for the future – think long term
20. You need to start building the muscle now – skills, the experiments, the enabling culture and the enabling platform
21. Getting started
• Data Driven Enterprise (D2E)
• Data Warehouse Migration (DWM)
• Experiment to Value (D2V)
22. Thank you
Vicky Falconer
falcnr@amazon.com
23. Voice of the customer: Best practices sharing of real-life use cases
Graham Preston, Development Manager, WESCEF
Rutu Ayachit, Data Analytics Lead, CITIC Pacific Mining
Doug Hull, IT Manager, Kamala Tech
25. Nitric Acid Yield Optimisation
Extract: Data for selected sensors extracted every 5 mins using Macroview datapump
Ingest: Raw data ingested by AWS, meta tags applied and results stored in data lake
Process: Data cleansed and engineered using a Spark cluster and stored in data lake
Serve: Curated data loaded to data mart every 15 mins and optimisation model applied
Visualise: Dashboard performs live query of data mart to advise operating conditions
[Diagram labels: Raw Data, Production, Moisture, V1, V2, Yield Algorithm, Operator Dashboard]
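The pipeline on this slide extracts sensor data every 5 minutes and loads curated data every 15 minutes. A minimal sketch of that roll-up in plain Python follows; the sensor name, the readings and the use of a simple mean are assumptions for illustration, not the customer's actual Spark job or yield algorithm.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def window_start(ts: datetime, minutes: int = 15) -> datetime:
    """Round a timestamp down to its window start (minutes must divide 60)."""
    return ts - timedelta(
        minutes=ts.minute % minutes, seconds=ts.second, microseconds=ts.microsecond
    )


def aggregate(readings, minutes: int = 15) -> dict:
    """Average (timestamp, sensor, value) readings per sensor and window."""
    buckets = defaultdict(list)
    for ts, sensor, value in readings:
        buckets[(window_start(ts, minutes), sensor)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


# Hypothetical 5-minute readings for one sensor.
readings = [
    (datetime(2019, 9, 26, 9, 0), "moisture", 1.0),
    (datetime(2019, 9, 26, 9, 5), "moisture", 2.0),
    (datetime(2019, 9, 26, 9, 10), "moisture", 3.0),
    (datetime(2019, 9, 26, 9, 15), "moisture", 5.0),
]
curated = aggregate(readings)
for (start, sensor), value in sorted(curated.items()):
    print(start.isoformat(), sensor, value)
```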
26. Morning Tea
Resuming at 10:50am
27. AWS Training & Certification
Learning Paths: Data Analytics & AI/ML
28. The skills gap in numbers
68% of Information Technology decision-makers reported a gap between their team’s skill levels and the knowledge required to achieve organisational objectives.
600% increase in job postings featuring “AWS”
Customers need expertise
Source: 2018 Global Knowledge IT Skills and Salary Report
29. Why is training important?
AWS Training and Certification leads to measurable results …
Increase employee engagement: 30% increased employee satisfaction, leading to retention~
Faster time-to-market
Lower business risk
Increase profitability
4.4x more likely to overcome operational and performance concerns*
4.7x more likely to improve IT staff productivity*
80% faster to adopt cloud*
*Source: IDC White Paper, sponsored by AWS, Train to Accelerate Your Cloud Strategy, October 2017
~Source: 2018 Global Knowledge IT Skills and Salary Report
30. Solution Based Learning Paths
Machine Learning: Developer
Machine Learning: Decision Maker
Machine Learning: Data Scientist
Big Data Specialty
31. What’s in a plan? Three key pillars
Classroom training: In-person and virtual classes to learn from accredited instructors
Digital training: Digital training for on-demand learning
AWS Certification: AWS Certification to validate knowledge
32.
Jason Hunter
Sr Data & Analytics Specialist
AWS
Time to Value – Lake Formation
33. Data Lakes evolve the traditional approach
Sources: OLTP, ERP, CRM, LOB; Devices, Web, Sensors, Social
Data Warehouse; Business Intelligence
Data Lake; Catalog
Analytics: Machine Learning, DW Queries, Big data processing, Interactive, Real-time
Relational and non-relational data
TBs-EBs scale
Schema defined during analysis
Diverse analytical engines to gain insights
Designed for low-cost storage and analytics
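"Schema defined during analysis" is often called schema-on-read: raw records are stored as they arrive, and each analysis applies its own schema when reading. The sketch below shows the idea in plain Python; the sample records, field names and `SCHEMA` mapping are invented for illustration.

```python
import json

# Raw records stored as they arrived; fields differ from line to line.
RAW_LINES = [
    '{"device": "sensor-1", "temp": "21.5", "ts": "2019-09-19T09:00:00"}',
    '{"device": "sensor-2", "temp": "19.0", "site": "Perth"}',
]

# Schema chosen by the analyst at read time: column name -> converter.
SCHEMA = {"device": str, "temp": float}


def read_with_schema(lines, schema):
    """Parse raw JSON lines, keeping only schema columns and casting types."""
    for line in lines:
        raw = json.loads(line)
        yield {
            col: (cast(raw[col]) if col in raw else None)
            for col, cast in schema.items()
        }


rows = list(read_with_schema(RAW_LINES, SCHEMA))
print(rows)
```

A different analysis could read the same raw lines with a different schema, for example one that includes `site`, without rewriting the stored data.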
34. AWS databases and analytics
Broad and deep portfolio, allowing for agility and scale
Analytics
Amazon Redshift: Data warehousing
Amazon EMR: Hadoop + Spark
Athena: Interactive analytics
Kinesis Analytics: Real-time
Amazon Elasticsearch Service: Operational Analytics
Databases
RDS: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
Aurora: MySQL, PostgreSQL
DynamoDB: Key value, Document
ElastiCache: Redis, Memcached
Neptune: Graph
Timestream: Time Series
QLDB: Ledger Database
RDS on VMware
Business Intelligence & Machine Learning
Amazon QuickSight; Amazon SageMaker; Amazon Comprehend; Amazon Rekognition; Amazon Lex; Amazon Transcribe; AWS DeepLens
Data Lake
S3/Amazon Glacier
AWS Glue: ETL & Data Catalog
Lake Formation: Data Lakes
Blockchain
Managed Blockchain; Blockchain Templates
Data Movement
Database Migration Service | Snowball | Snowmobile | Kinesis Data Firehose | Kinesis Data Streams | Data Pipeline | Direct Connect
AWS Marketplace
250+ solutions; 730+ Database solutions; 600+ Analytics solutions; 25+ Blockchain solutions; 20+ Data lake solutions; 30+ solutions
35. Typical steps of building a data lake
1. Set up storage
2. Move data
3. Cleanse, prep, and catalog data
4. Configure and enforce security and compliance policies
5. Make data available for analytics
36. AWS Lake Formation
Build a data lake in days, not months: Build and deploy a fully managed data lake with a few clicks
Enforce security policies across multiple services: Centrally define security, governance, and auditing policies in one place and enforce those policies for all users and all applications
Combine different analytics approaches: Empower analyst and data scientist productivity, giving them self-service discovery and safe access to all data from a single catalog
37. Fastest way to build secure data lakes
Lake Formation (Data Lakes): Blueprints; ML-based data prep; Data Catalog; Access Control
Data Lake Storage
Integrated services: AWS Glue; Amazon Redshift (Data warehousing); Amazon EMR (Hadoop + Spark); Athena (Interactive analytics); Amazon QuickSight
Comprehensive list of integrated tools enables every user equally
Centralized management of fine-grained permissions empowers security officers
Simplified ingest and cleaning enables data engineers to build faster
Cost-effective, durable storage with global replication capabilities
44. Workflows: Orchestrate repeatable data pipelines
Easy way to create and visualise
your business transformation
rules
Allows for parameters and
pipeline state to be shared
across stages
Dynamic views allow inspection
of current running workflows
for diagnostic and current state
information.
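To make the idea of sharing parameters and pipeline state across stages concrete, here is a small plain-Python sketch. It is a conceptual illustration only and does not use the Lake Formation or AWS Glue workflow APIs; the stage functions, the `target` parameter and the status strings are all hypothetical.

```python
def extract(state: dict) -> dict:
    """Pretend to pull rows from a source system."""
    return {"rows": [1, 2, 3, None]}


def cleanse(state: dict) -> dict:
    """Drop missing values produced by the previous stage."""
    return {"rows": [r for r in state["rows"] if r is not None]}


def load(state: dict) -> dict:
    """Record how many rows would be written to the target."""
    return {"loaded": len(state["rows"]), "target": state["target"]}


def run_workflow(stages, params: dict):
    """Run stages in order, passing one shared state dict between them."""
    state = dict(params)  # workflow parameters seed the shared state
    history = []          # per-stage status, useful for inspecting a run
    for stage in stages:
        state.update(stage(state))
        history.append((stage.__name__, "SUCCEEDED"))
    return state, history


state, history = run_workflow([extract, cleanse, load], {"target": "curated"})
print(state["loaded"], history)
```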
45. Simplified and more granular security permissions
Control data access with simple
grant and revoke permissions
Specify permissions on tables
and columns rather than on
buckets and objects
Easily view policies granted to a
particular user
Audit all data access in one
place
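A column-level grant of this kind can be sketched with boto3's Lake Formation client, whose `grant_permissions` call accepts a `TableWithColumns` resource. The role ARN, database, table and column names below are placeholders, and the helper functions are assumptions for illustration rather than anything shown in the session.

```python
def build_column_grant(principal_arn: str, database: str, table: str, columns) -> dict:
    """Build keyword arguments for lakeformation.grant_permissions.

    Grants SELECT on the named columns only, rather than on the
    underlying bucket or objects.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def grant(request: dict) -> None:
    """Apply the grant; needs boto3 and data lake administrator credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("lakeformation").grant_permissions(**request)


# Placeholder identifiers; replace with real ones before calling grant().
request = build_column_grant(
    "arn:aws:iam::123456789012:role/analyst",
    "sales_lake",
    "orders",
    ["order_id", "total"],
)
print(request["Resource"]["TableWithColumns"]["ColumnNames"])
```

The matching `revoke_permissions` call takes the same request shape.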
47. Search and collaborate across multiple teams and users
Text based search across all of
your metadata
Add attributes like data owners,
stewards, and others as table
properties
Add data sensitivity level,
column definitions, and others
as column properties
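The sketch below illustrates text search over table and column properties using an in-memory catalog in plain Python. It does not call the Lake Formation or Glue search APIs; the tables, owners and sensitivity labels are invented for illustration.

```python
# Hypothetical catalog entries with table-level and column-level properties.
CATALOG = [
    {
        "table": "orders",
        "properties": {"data_owner": "sales-team", "steward": "j.smith"},
        "columns": {
            "email": {"sensitivity": "pii"},
            "total": {"sensitivity": "public"},
        },
    },
    {
        "table": "sensor_readings",
        "properties": {"data_owner": "plant-ops"},
        "columns": {"moisture": {"sensitivity": "internal"}},
    },
]


def search(catalog, text: str):
    """Return names of tables whose metadata contains the text (case-insensitive)."""
    needle = text.lower()
    hits = []
    for entry in catalog:
        haystack = [entry["table"], *entry["properties"].values()]
        for column, props in entry["columns"].items():
            haystack.append(column)
            haystack.extend(props.values())
        if any(needle in str(value).lower() for value in haystack):
            hits.append(entry["table"])
    return hits


print(search(CATALOG, "PII"))
print(search(CATALOG, "ops"))
```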
48. AWS Lake Formation pricing
No additional charges – Only pay for the
underlying services used.
49. Getting Started
50. Security Personas in Lake Formation
Data Lake Admins
• Run and operate the data lake
• Define secure storage boundaries
• Manage users
• Audit/optimize data lake
Data Lake Users
• Create, consume and curate data sets
• Configure and manage access controls across data assets
51. Security Permissions in Lake Formation
[Diagram labels: LF Users; Required Permission Scope; Database; Table]
52. Step 1: Use Blueprints to ingest data
Select source system and data to import
Specify the location to load your data into
Provide frequency of loads
53. Imported data catalogued for access
54. Step 2: Grant permissions to securely share data
55. Step 3: Run query in Amazon Athena
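Outside the console, the same step can be sketched with boto3's Athena client and its `start_query_execution` call. The SQL, database name and S3 results location below are placeholders chosen for illustration, and the helper functions are assumptions rather than part of the demo.

```python
def build_athena_query(sql: str, database: str, output_location: str) -> dict:
    """Build keyword arguments for athena.start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: dict) -> str:
    """Start the query and return its execution id; needs boto3 and credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Placeholder values; the caller sees only the data they have been granted.
request = build_athena_query(
    "SELECT * FROM orders LIMIT 10",
    "sales_lake",
    "s3://example-athena-results/",
)
print(request["QueryExecutionContext"])
```

The query runs asynchronously, so a real script would poll `get_query_execution` with the returned id before fetching results.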
56. Demo
57. Demo: Build end-to-end data pipeline
Load data; Process; Configure & Secure; Make available
DL Engineer; Data Steward / Owner; Data Analyst / Scientist
58. How we can help
Data Lab
• Brainstorming
• Data platform architecture
• Building of prototype within your accounts that can be brought into production
• Work side-by-side with Amazon experts
Data & Analytics Learning Training and Certification
• Practical education on Big Data and analytics for new and experienced practitioners
• Learn best practice solution architecture for building modern data platforms
59. Thanks!
Jason Hunter
jasonhnz@amazon.com