© 2021 Snowflake Inc. All Rights Reserved
ACTIONABLE INSIGHTS WITH AI - FROM
EXPERIMENT TO VALUE CREATION
26 Jan. 2021
Harald Erb | harald.erb@snowflake.com
Sr. Solutions Engineer, Central Europe
ABOUT ME
Sr. Solutions Engineer
Central Europe
harald.erb@snowflake.com
linkedin.com/in/haralderb
Enthusiastic about Business Analytics &
Data Management for 20+ years
> Consulting: Delivered large-scale Data
Warehouse and BI projects as Developer,
Information Analyst, Solution Architect,
Project Lead at Oracle D/A/CH
> Presales: 2nd SE on the ground at Snowflake in
Central Europe with a focus on Data Management,
Business Analytics & Data Science
> Worked with clients on Big Data & IoT solutions
as Architect and Solutions Engineer at Oracle
EMEA, Pentaho and Hitachi Vantara
DIY?
Kubernetes Cluster with 5 Raspberry Pis
???
Fascinating technology, but unfortunately
there is not enough time for DIY...
AGENDA – PART 1
Source: github.com/szilard/ml-prod (Dr. Szilard Pafka)
ML LIFECYCLE
Tools & Ecosystem
• Notebooks (SQL, Python)
• Snowflake: Snowpark (Scala)
Data Development
• Required Platform capabilities
• SQL
• Snowflake: TimeTravel,
Zero-copy clone
Onboarding of new
Datasets
• Data Lake integration
• API integration
• Snowflake Data
Marketplace
Experiment
(Lab environment)
Value added
(ML in Production,
at scale)
SNOWFLAKE DATA ARCHITECTURE FOR
DATA SCIENCE
“DATA SCIENCE ONLY” ENVIRONMENT
DATA SCIENCE + REPORTING DATABASE
SNOWFLAKE DATA ARCHITECTURE
“Data Lake inside”
OPTIONS FOR ORGANIZING DATA ASSETS IN SNOWFLAKE
Data Sources
• Structured Data
• Semi-Structured Data
• Web APIs
• IoT Data
Data Consumers
• Data Visualization / Reporting
• Data Science
• Ad hoc Queries
Data Zones in Snowflake
• Work Area (Exploratory, AI / ML): persistent, user/team space, dedicated compute resources
• Landing Zone: transient, ELT processes, truncate/reload
• Raw: raw data, schemaless (JSON…), no transformations, matches source data
• Conformed: raw + de-duplicated, data type standardization (dates)
• Reference: master data, manual mappings, business hierarchies
• Modeled: integrated, cleansed, modeled data (3NF, Data Vault, Dimensional Model)
“Data Lake” | “Data Warehouse”
Snowflake’s architecture is based on elastic cloud storage, which makes it possible to organize very large amounts of raw data at an
affordable price. This enables Data Teams to perform unbounded data discovery and data understanding, while Analysts can
access business-friendly data models in self-service mode.
HANDLING OF SCHEMALESS DATA IN SNOWFLAKE
With Snowflake’s VARIANT data type, semi-structured data can be loaded easily into a relational DW
and is then available for immediate analysis
Structured data (e.g., CSV, TSV, …) + Semi-structured data (JSON, Avro, XML, Parquet, ORC)
• Storage optimization: transparent discovery and storage optimization of repeated elements
• Query optimization: full database optimization for queries on semi-structured data
> SELECT … FROM …
select v:lastName::string
as last_name
, ...
from json_doc_table;
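To make the path-plus-cast idea behind `v:lastName::string` more concrete, here is a minimal Python sketch that mimics it outside Snowflake. The sample documents and the helper `get_path_as_string` are hypothetical and only illustrate the behaviour; they are not part of the original deck or of any Snowflake API.

```python
import json

# Hypothetical raw documents, standing in for the VARIANT column of json_doc_table.
raw_rows = [
    '{"firstName": "Ada", "lastName": "Lovelace", "age": 36}',
    '{"firstName": "Alan", "lastName": "Turing"}',
    '{"firstName": "Grace"}',
]


def get_path_as_string(doc, key):
    """Return doc[key] converted to str, or None when the key is absent.

    This loosely mirrors a path lookup followed by a ::string cast:
    a missing attribute yields a null value instead of an error.
    """
    value = doc.get(key)
    return None if value is None else str(value)


# No schema is declared up front; each document is parsed and queried as-is.
last_names = [get_path_as_string(json.loads(row), "lastName") for row in raw_rows]
print(last_names)
```

The point of the sketch is that documents with different attribute sets can sit in the same column and still be queried with one expression.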
SNOWFLAKE DATA ARCHITECTURE
Storage Integration with external Data Lake
INGESTION & PROCESSING
OF NEW DATASETS
OPTIONS FOR DATA INGESTION
FILES
• COPY
• AUTO-INGEST SNOWPIPE
• SNOWPIPE REST API
• EXTERNAL TABLE AUTO REFRESH
Driverless, Notification-driven, Serverless, Async, Continuous, File Dedup, Error Handling
Bulk Loading
● COPY Command
● User-managed compute resource
Continuous Loading
● Snowpipe
● Snowflake-managed compute resource
SOLUTION SCENARIO
SCENARIO: INGESTING FUEL PRICE DATA FOR ANALYSIS
Source: tankerkoenig.de
SOLUTION
ARCHITECTURE
In Scope today
EXAMPLE: DATA INGESTION WITH SNOWPIPE
Key Steps
> Integrate with AWS S3 and connect Snowflake via External Stage
> Create a Pipe for Automatic Data Ingestion
> Run Snowpipe with new data
AWS S3: STORAGE INTEGRATION VIA EXTERNAL STAGE
SF Admin Task, typically
not done by developers!
AWS S3: DATA ACCESS DIRECTLY FROM SNOWFLAKE
List the content of an S3 bucket directly
from Snowflake, navigate the subfolder
structure.
Identify, inspect and select files to be
loaded using “*” wildcards, regular expressions, etc.
Compute statistics
on files to be loaded
into Snowflake
NOTIFICATION-DRIVEN DATA INGESTION WITH SNOWPIPE
Bulk load command
Target table to be updated
Source location, external stage (e.g. S3 Bucket)
EXAMPLE: FAST INTEGRATION OF API PAYLOADS
Key Steps
> Integrate AWS Lambda Function
> Automate API Calls + store Payloads (JSON)
> Implement Change Data Capture
> Automate JSON flattening + Data Loading
API INTEGRATION VIA EXTERNAL FUNCTION
SF Admin Task, typically
not done by developers
External APIs can now
be called via SQL!
EXTERNAL API CALL VIA SQL (EXAMPLE 1)
Table with column
SOURCE_DATA (VARIANT
data type) containing a
collection of raw datasets from
multiple open data sources
Insert Statement
calling external API
EXTERNAL API CALL VIA SQL (EXAMPLE 2)
Fuel price data of multiple
gas stations
Insert Statement
calling external API
AUTOMATED DELTA LOAD WITH STREAMS AND TASKS
Task will only start if the table stream
has new data records to process
→ saves compute resources!
Only CDC data
records of interest will
be processed and then
cleared from the stream
when committed
Lateral view and flatten table function
used to split price data by Gas Station
and store as separate records in the
target table REMOTE_FUEL_PRICES
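The delta-load pattern described here (run only when new records are pending, then flatten each payload into one record per gas station) can be sketched in plain Python. This is a toy model under stated assumptions: the payload layout, the class `PendingChanges` and the function names are hypothetical stand-ins for a table stream, a task and a lateral flatten; none of this is Snowflake code or the code shown on the slide.

```python
import json

# Hypothetical API payload: one JSON document holding prices for several stations.
payload = json.loads("""
{
  "ok": true,
  "prices": {
    "station-a": {"status": "open", "e5": 1.409, "e10": 1.349, "diesel": 1.229},
    "station-b": {"status": "closed"}
  }
}
""")


def flatten_prices(doc):
    """Yield one record per station, similar in spirit to a lateral flatten."""
    for station_id, price_info in doc["prices"].items():
        yield {"station_id": station_id, **price_info}


class PendingChanges:
    """Toy stand-in for a table stream: remembers rows not yet consumed."""

    def __init__(self):
        self._rows = []

    def append(self, row):
        self._rows.append(row)

    def has_data(self):
        return bool(self._rows)

    def consume(self):
        # Hand out the pending rows and clear them, like a committed read.
        rows, self._rows = self._rows, []
        return rows


def run_task(stream, target):
    """Process pending payloads only if there are any; return True if work was done."""
    if not stream.has_data():
        return False  # nothing to do, so no compute is spent
    for doc in stream.consume():
        target.extend(flatten_prices(doc))
    return True


stream = PendingChanges()
remote_fuel_prices = []  # hypothetical stand-in for the target table

first_run = run_task(stream, remote_fuel_prices)   # stream is empty, task is skipped
stream.append(payload)
second_run = run_task(stream, remote_fuel_prices)  # one payload becomes two records
```

After the second run the pending list is empty again, which corresponds to the stream being cleared once the consuming transaction commits.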
API DATA IS NOW PREPARED FOR FURTHER USE…
New fuel prices prepared
and stored in target table
REMOTE_FUEL_PRICES
(still in JSON format)
…LIKE IN A DASHBOARD COMBINING HISTORICAL + REAL-TIME DATA
Final BI Query:
Reading, formatting and joining
JSON price data directly with
master data – with fast query
performance
EXAMPLE: SNOWFLAKE DATA MARKETPLACE
Key Steps
> Identify & select dataset of interest in Snowflake Data Marketplace
> Request new (paid) dataset and agree on terms and conditions
> Once request is approved, access to the Shared Database will be granted by Data Provider
> Query and blend own and shared data for new insights
PUBLIC AND PERSONALIZED DATASETS…
Step 1: Select
dataset of interest
…AVAILABLE BY (PAID) REQUEST…
Step 2: Request
shared dataset
for access in own
Snowflake Account
…READY TO QUERY – DATA SCIENTISTS LOVE IT ALREADY!
Shared Database is
visible (read-only)
in own Snowflake
Account – no
additional ETL
required
Step 3: Query location
features to improve
predictive models
using Snowflake’s Geo
SQL Functions
MORE DATA ENGINEERING
WITH SNOWFLAKE
PLATFORM REQUIREMENT #1: DEDICATED COMPUTE
Prod DB: Structured & Semi-structured Data at Petabyte-Scale (all encrypted, compressed)
• Continuous Loading (4TB/day) from S3, <5min SLA: Compute Cluster “Medium”
• Batch Data Loads & Transformations: Compute Cluster “Large”
• Interactive Dashboard (50% < 1s, 85% < 2s, 95% < 5s): Compute Cluster Auto Scale – “X-Large” x 5
• Data Science / Data Projects: Dedicated Compute Cluster incl. allowance to resize, e.g. up to “2X-Large” etc.
Snowflake Shared Data, Multi-Cluster Architecture: All data available in a central repository,
all major workloads isolated, elastic compute provides performance on demand
PLATFORM REQUIREMENT #2: ELASTIC SCALING IN NO TIME
Vertical Scaling: Resize Compute Cluster instantly
• Pure Cloud User Experience
• Scale up/down in no time, no need to start/stop, or reboot
• Benefit: Faster data loads, more complex queries
(aggregations, new features)
Horizontal Scaling: Automatic Scale-out
• Snowflake detects/handles massive parallel workloads
and automatically scales back after the load drops
• Benefit: constantly good performance, even in peak times
PLATFORM REQUIREMENT #3: PAY FOR WHAT YOU USE
[Figure: workload over the day (Morning, Noon, Night) for four workloads: ETL and Processing, Reporting, Ad-hoc Analytics, Data Science]
MEETING USERS WHERE THEY ARE
DEVELOPER EXPERIENCE WITH SNOWFLAKE
• Coding: Data Pipelines, Debugging – e.g. with VS Code, Scala and Snowpark (in Development)
• RDBMS Development: Database models, use data dictionary, RBAC – e.g. with Snowsight, DBeaver, SqlDBM
• Data Science Notebook: Live code embedded in markup text + visualizations – e.g. with Jupyter and SF Python Connector
DON’T FORGET SQL FOR FEATURE ENGINEERING
Create Compute-intensive Features in Snowflake
→ Eliminates need for frameworks such as Featuretools for Python
Examples
• Calculate number of sales per product for last week/month/year
• Rank products within different product categories by price, revenue, etc.
• Calculate share of revenue or margin for a given product
• Find the lowest ever sales price for all products in each product category
Real code example, but simplified for increased readability
Adapted from Elkjøp’s presentation @ Snowflake Stockholm Meetup, 2021
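As a small illustration of SQL-based feature engineering, the following sketch computes a few of the example features with plain SQL aggregates. It runs against an in-memory SQLite database through Python's standard `sqlite3` module purely so that it is self-contained; the `sales` table, its columns and the sample rows are hypothetical, and this is not the Elkjøp code shown on the slide. On Snowflake the same query shapes would run on a virtual warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, category TEXT, price REAL, sold_on TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("tv-a", "tv", 500.0, "2021-01-04"),
        ("tv-a", "tv", 450.0, "2021-01-18"),
        ("tv-b", "tv", 300.0, "2021-01-19"),
        ("phone-a", "phone", 250.0, "2021-01-20"),
    ],
)

# Per-product features: total sales count, sales since a cut-off date,
# lowest price ever seen, and share of overall revenue.
features = conn.execute(
    """
    SELECT s.product,
           COUNT(*)                                           AS n_sales,
           SUM(CASE WHEN s.sold_on >= :since THEN 1 ELSE 0 END) AS n_sales_recent,
           MIN(s.price)                                       AS lowest_price,
           SUM(s.price) / (SELECT SUM(price) FROM sales)      AS revenue_share
    FROM sales AS s
    GROUP BY s.product
    ORDER BY s.product
    """,
    {"since": "2021-01-15"},
).fetchall()

# Lowest ever sales price within each product category.
lowest_by_category = conn.execute(
    "SELECT category, MIN(price) FROM sales GROUP BY category ORDER BY category"
).fetchall()

for row in features:
    print(row)
```

Keeping such aggregations in SQL means the heavy lifting happens where the data lives, and only the finished feature rows are pulled into the notebook.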
SNOWFLAKE FEATURES FOR REPRODUCIBILITY
TIME TRAVEL
Tables can be versioned for 0 to 90 days (retention beyond 1 day requires Enterprise Edition or higher)
• Extremely useful for tracking changes and debugging master data issues
→ Eliminates need to maintain separate history tables in many cases
• Intuitive SQL syntax for querying tables at a specific point in time
ZERO-COPY CLONING
Take “snapshots” of any table, schema or database at no extra storage cost until the cloned data is changed
• Useful for creating static datasets when developing ML models → eliminates need to create datasets in Azure Blob Storage / AWS S3
• Associate Git commit (code version) with cloned table (data version) → makes it easier to reproduce an experiment in the future
Real code example, but simplified for increased readability
Adapted from Elkjøp’s presentation @ Snowflake Stockholm Meetup, 2021
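One way to tie a data version to a code version is to derive the clone name from the Git commit. The sketch below only builds the SQL text and does not execute anything; the helper `clone_statements`, the table name and the commit hash are hypothetical. It assumes Snowflake's `CREATE TABLE … CLONE … AT (TIMESTAMP => …)` and `COMMENT ON TABLE` statements, and that the chosen timestamp lies within the table's Time Travel retention period.

```python
from datetime import datetime, timezone


def clone_statements(source_table, commit_sha, as_of):
    """Build SQL text that snapshots source_table as of a timestamp.

    The clone is named after the first 7 characters of the commit hash so
    that code version and data version can be matched later. Inputs are
    assumed to be trusted identifiers; no quoting or validation is done.
    """
    snapshot = f"{source_table}_{commit_sha[:7]}"
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S +0000")
    return [
        f"CREATE TABLE {snapshot} CLONE {source_table} "
        f"AT (TIMESTAMP => '{ts}'::timestamp_tz)",
        f"COMMENT ON TABLE {snapshot} IS 'git commit {commit_sha}'",
    ]


# Hypothetical usage: snapshot the training data as it looked at 09:00 UTC.
statements = clone_statements(
    "training_data",
    "a1b2c3d4e5f6",
    datetime(2021, 1, 26, 9, 0, tzinfo=timezone.utc),
)
for sql in statements:
    print(sql)
```

The generated statements could then be sent through any Snowflake client; re-running an experiment later only requires checking out the commit and reading the matching clone.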
REPORT OF A CUSTOMER’S DATA SCIENTIST TEAM
DATA DEVELOPMENT: PYTHON & SQL COMBINED
→ Notebook
DATA DEVELOPMENT: SCALA VIA SNOWPARK
SNOWPARK – A new developer experience that allows you to write Snowflake code in
your preferred way, and execute it directly in Snowflake (incl. all platform benefits)
IN PRIVATE-PREVIEW
DATA DEVELOPMENT: SCALA VIA SNOWPARK
A comparative example
SQL
• Monolithic, hard to debug
• Repetitive, inflexible operations
Snowpark
• Write, use functions
• Pipeline, easy to debug
• Flexible, reusable operations
IN PRIVATE-PREVIEW
DATA DEVELOPMENT: SCALA VIA SNOWPARK
Deep Dive and Demo of Preview Version
Webinar Recording: Building Open Source Data Science Models on Snowflake
IN PRIVATE-PREVIEW
AGENDA – PART 2
Source: github.com/szilard/ml-prod (Dr. Szilard Pafka)
ML LIFECYCLE
Data Science Platform (DataRobot)
• Auto ML
• ML Ops
• Model Deployment
Experiment
(Lab environment)
Value added
(ML in Production,
at scale)
DATA SCIENCE (AUTO ML) PLATFORMS TO
COMPLEMENT SNOWFLAKE CAPABILITIES
DATA SCIENCE PLATFORM – ARCHITECTURE (DATAROBOT)
DATA SCIENCE PLATFORM – COMPONENTS (DATAROBOT)
AUTO ML – LEADERBOARD OF A PROJECT
→ DataRobot
ML OPERATIONS
Any ML Ops solution capable and agile enough to handle complex situations has to be proficient in the following four
critical areas to safely deliver machine learning applications in production at scale
ML OPS – MONITORING (SERVICE HEALTH)
ML OPS – MONITORING (MODEL ACCURACY)
INTEGRATION OF PRODUCTION ML MODELS
CODING EXAMPLE: DATAROBOT & SNOWFLAKE
Notebook example: Snowflake + DataRobot Prediction API
Call of DataRobot
prediction API
Sample code: DataRobot GitHub Repo
Open Snowflake cursor
and store
prediction results in a
table for further use
MODEL DEPLOYMENT – SOURCE CODE EXPORT
SNOWFLAKE INTEGRATION EXAMPLE
Detailed Article: DataRobot Blog
Calling a production Machine Learning Model via Snowflake External Functions
SESSION TAKEAWAY
SNOWFLAKE DATA CLOUD
DATA SOURCES
• OLTP DATABASES
• ENTERPRISE APPLICATIONS
• THIRD-PARTY
• WEB/LOG DATA
• IoT
DATA CONSUMERS
• DATA MONETIZATION
• OPERATIONAL REPORTING
• AD HOC ANALYSIS
• REAL-TIME ANALYTICS
A COMPLETE & EASY-TO-USE DATA PLATFORM…
Snowflake supports Data Lake, Data Warehouse, and Data Engineering workloads
Layers: Data Sources, Stage, Data Lake, Warehouse, Aggregation, Semantic / Federated, Presentation / Consumers
Data Sources: Structured Data, Semi-Structured Data, Web APIs, IoT Data
Consumers: Visualization / Reporting, Data Science, Ad hoc Queries
Capabilities:
• JSON, AVRO (VARIANT)
• Hive Metastore Integration
• External Tables
• Parquet
• Load/Unload
• ANSI SQL
• Elastic Multi-Cluster Compute
• Data Vault, 3NF Modeling
• Dimensional Modeling
• ACID Transactional Consistency
• Data Masking / Row-level Sec. *)
• Sec. + Material. Views
• Zero Copy Cloning
• Time Travel
• Streams (CDC) & Tasks (Scheduler)
• Kafka-Connector / Snowpipe
• Stored Procs / UDFs
• External Functions
• Geospatial
• Information Schema
• Data Sharing / Marketplace
• SnowPark *) (Scala)
• Web UI / Snowsight
• SSO, LDAP, OAUTH, SCIM
• ODBC/JDBC
• Python/R/Spark Connector
End-to-End Security (RBAC, Encryption at Rest/in Motion)
*) Roadmap item / in Preview
…FOR ACCELERATING MACHINE LEARNING & AI
SNOWFLAKE PLATFORM | DATA ENGINEERING | INTEGRATIONS
● Integration with AutoML platforms and Notebook-based ML
● Read and writeback
● Spark Connector
● Python Connector
● Apache Arrow support
● Snowpipe
● Kafka
● Streams/Tasks
● Reproducibility – Versioning - Features, Models
● “Share” data & results
● Feature Repository – Re-Use
● Time Travel
● Zero-copy cloning
• Data Platform as a Service
• Instant Scalability & Elasticity
• Dedicated Virtual Warehouses
• Store and use all data regardless of structure
• Cross cloud replication
• Pay per use – per second pricing
• Data Marketplace & Exchange
Q & A
THANK YOU

Mais conteúdo relacionado

Mais procurados

Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptxchennakesava44
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overviewJames Serra
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta LakeDatabricks
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Amazon Web Services
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company PresentationAndrewJiang18
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudMichael Rainey
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 

Mais procurados (20)

Snowflake Architecture.pptx
Snowflake Architecture.pptxSnowflake Architecture.pptx
Snowflake Architecture.pptx
 
Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Elastic Data Warehousing
Elastic Data WarehousingElastic Data Warehousing
Elastic Data Warehousing
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company Presentation
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 

Semelhante a Actionable Insights with AI - Snowflake for Data Science

Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSenturus
 
Delivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauDelivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauHarald Erb
 
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services Torsten Steinbach
 
Cloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AICloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AITorsten Steinbach
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeKent Graziano
 
IBM THINK 2018 - IBM Cloud SQL Query Introduction
IBM THINK 2018 - IBM Cloud SQL Query IntroductionIBM THINK 2018 - IBM Cloud SQL Query Introduction
IBM THINK 2018 - IBM Cloud SQL Query IntroductionTorsten Steinbach
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateMichael Rainey
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Kent Graziano
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Carole Gunst
 
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataAltinity Ltd
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWKent Graziano
 
Laboratorio práctico: Data warehouse en la nube
Laboratorio práctico: Data warehouse en la nubeLaboratorio práctico: Data warehouse en la nube
Laboratorio práctico: Data warehouse en la nubeSoftware Guru
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeTorsten Steinbach
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfIlham31574
 
Peteris Arajs - Where is my data
Peteris Arajs - Where is my dataPeteris Arajs - Where is my data
Peteris Arajs - Where is my dataAndrejs Vorobjovs
 
Moving OBIEE to Oracle Analytics Cloud
Moving OBIEE to Oracle Analytics CloudMoving OBIEE to Oracle Analytics Cloud
Moving OBIEE to Oracle Analytics CloudEdelweiss Kammermann
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeDATAVERSITY
 
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformModernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformHortonworks
 
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM CloudIBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM CloudTorsten Steinbach
 

Semelhante a Actionable Insights with AI - Snowflake for Data Science (20)

Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
Delivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauDelivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and Tableau
 
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services
IBM THINK 2020 - Cloud Data Lake with IBM Cloud Data Services
 
Cloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AICloud-based Data Lake for Analytics and AI
Cloud-based Data Lake for Analytics and AI
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
 
IBM THINK 2018 - IBM Cloud SQL Query Introduction
IBM THINK 2018 - IBM Cloud SQL Query IntroductionIBM THINK 2018 - IBM Cloud SQL Query Introduction
IBM THINK 2018 - IBM Cloud SQL Query Introduction
 
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGateContinuous Data Replication into Cloud Storage with Oracle GoldenGate
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)
 
Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2Streaming Real-time Data to Azure Data Lake Storage Gen 2
Streaming Real-time Data to Azure Data Lake Storage Gen 2
 
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
 
Laboratorio práctico: Data warehouse en la nube
Laboratorio práctico: Data warehouse en la nubeLaboratorio práctico: Data warehouse en la nube
Laboratorio práctico: Data warehouse en la nube
 
From Data Warehouse to Lakehouse
From Data Warehouse to LakehouseFrom Data Warehouse to Lakehouse
From Data Warehouse to Lakehouse
 
IBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lakeIBM Cloud Day January 2021 - A well architected data lake
IBM Cloud Day January 2021 - A well architected data lake
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdf
 
Peteris Arajs - Where is my data
Peteris Arajs - Where is my dataPeteris Arajs - Where is my data
Peteris Arajs - Where is my data
 
Moving OBIEE to Oracle Analytics Cloud
Moving OBIEE to Oracle Analytics CloudMoving OBIEE to Oracle Analytics Cloud
Moving OBIEE to Oracle Analytics Cloud
 
Unlocking the Value of Your Data Lake
Unlocking the Value of Your Data LakeUnlocking the Value of Your Data Lake
Unlocking the Value of Your Data Lake
 
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data PlatformModernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
Modernize Your Existing EDW with IBM Big SQL & Hortonworks Data Platform
 
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM CloudIBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
IBM THINK 2019 - A Sharing Economy for Analytics: SQL Query in IBM Cloud
 

Actionable Insights with AI - Snowflake for Data Science

  • 1. © 2021 Snowflake Inc. All Rights Reserved. ACTIONABLE INSIGHTS WITH AI - FROM EXPERIMENT TO VALUE CREATION. 26 Jan 2021. Harald Erb | harald.erb@snowflake.com, Sr. Solutions Engineer, Central Europe
  • 2. ABOUT ME: Sr. Solutions Engineer, Central Europe. harald.erb@snowflake.com, linkedin.com/in/haralderb. Enthusiastic about Business Analytics & Data Management for 20+ years > Consulting: Delivered large-scale Data Warehouse and BI projects as Developer, Information Analyst, Solution Architect and Project Lead at Oracle D/A/CH > Presales: 2nd SE on the ground at Snowflake in Central Europe with a focus on Data Management, Business Analytics & Data Science > Worked with clients on Big Data & IoT solutions as Architect and Solutions Engineer at Oracle EMEA, Pentaho and Hitachi Vantara
  • 3. DIY? Kubernetes cluster with 5 Raspberry Pis. Fascinating technology, but unfortunately there is not enough time for DIY...
  • 4. AGENDA – PART 1. ML LIFECYCLE (Source: github.com/szilard/ml-prod, Dr. Szilard Pafka): from Experiment (lab environment) to Value added (ML in production, at scale). Tools & Ecosystem: Notebooks (SQL, Python); Snowflake: Snowpark (Scala). Data Development: Required platform capabilities; SQL; Snowflake: Time Travel, Zero-copy clone. Onboarding of new Datasets: Data Lake integration; API integration; Snowflake Data Marketplace.
  • 5. SNOWFLAKE DATA ARCHITECTURE FOR DATA SCIENCE
  • 6. “DATA SCIENCE ONLY” ENVIRONMENT
  • 7. DATA SCIENCE + REPORTING DATABASE
  • 8. SNOWFLAKE DATA ARCHITECTURE: “Data Lake inside”
  • 9. OPTIONS FOR ORGANIZING DATA ASSETS IN SNOWFLAKE. Data Sources: Structured Data, Semi-Structured Data, Web APIs, IoT Data. Data Consumers: Data Visualization / Reporting, Data Science, Ad hoc Queries. Data Zones in Snowflake: Landing Zone (transient, ELT processes, truncate/reload); Raw (raw data, schemaless (JSON…), no transformations, matches source data); Conformed (raw + de-duplicated, data type standardization, e.g. dates); Reference (master data, manual mappings, business hierarchies); Modeled (integrated, cleansed, modeled data: 3NF, Data Vault, Dimensional Model); Work Area (exploratory, AI / ML: persistent, user/team space, dedicated compute resources). Zone groupings labeled on the slide: “Data Lake” and “Data Warehouse”. Snowflake’s architecture is based on elastic cloud storage, which makes it possible to organize very large amounts of raw data at an affordable price. This enables Data Teams to perform unbounded data discovery and data understanding, while Analysts can access business-friendly data models in self-service mode.
  • 10. HANDLING OF SCHEMALESS DATA IN SNOWFLAKE. Structured data (e.g., CSV, TSV, …) + semi-structured data (JSON, Avro, XML, Parquet, ORC), both queried with plain SQL (> SELECT … FROM …). Storage optimization: transparent discovery and storage optimization of repeated elements. Query optimization: full database optimization for queries on semi-structured data. Example: select v:lastName::string as last_name, ... from json_doc_table; With Snowflake’s VARIANT data type, semi-structured data can be loaded easily into a relational DW and is then available for immediate analysis.
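
To make the path-and-cast expression on slide 10 (`v:lastName::string`) more concrete, here is a minimal sketch that mimics its behaviour in plain Python. It is an illustration only and does not use Snowflake: the helper `extract_path` and the sample documents are hypothetical, and the only behaviour carried over is that a missing element yields a null value rather than an error.

```python
import json

# Hypothetical raw rows, as they might sit in a VARIANT column named v.
json_doc_table = [
    '{"firstName": "Ada", "lastName": "Lovelace", "address": {"city": "London"}}',
    '{"firstName": "Alan", "address": {"city": "Wilmslow"}}',
]

def extract_path(doc, path, cast=str):
    """Walk a colon- or dot-separated path; return None when a key is missing."""
    value = json.loads(doc)
    for key in path.replace(":", ".").split("."):
        if not isinstance(value, dict) or key not in value:
            return None  # comparable to SQL NULL for a missing element
        value = value[key]
    return cast(value)

# Roughly: select v:lastName::string, v:address.city::string from json_doc_table
last_names = [extract_path(doc, "lastName") for doc in json_doc_table]
cities = [extract_path(doc, "address.city") for doc in json_doc_table]
print(last_names)
print(cities)
```

The second document has no `lastName`, so its entry comes back empty instead of failing the whole query, which is what makes schemaless loading practical.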
  • 11. SNOWFLAKE DATA ARCHITECTURE: Storage Integration with external Data Lake
  • 12. INGESTION & PROCESSING OF NEW DATASETS
  • 13. OPTIONS FOR DATA INGESTION. Bulk Loading: COPY command, user-managed compute resource. Continuous Loading: Snowpipe, Snowflake-managed compute resource. Diagram labels: Files; Auto-ingest Snowpipe; Snowpipe REST API; COPY; External table auto-refresh; driverless; notification-driven; serverless; async, continuous; file dedup; error handling.
  • 14. SOLUTION SCENARIO: INGESTING FUEL PRICE DATA FOR ANALYSIS. Source: tankerkoenig.de
  • 15. SOLUTION ARCHITECTURE (in scope today)
  • 16. EXAMPLE: DATA INGESTION WITH SNOWPIPE. Key Steps > Integrate with AWS S3 and connect Snowflake via External Stage > Create a Pipe for Automatic Data Ingestion > Run Snowpipe with new data
  • 17. AWS S3: STORAGE INTEGRATION VIA EXTERNAL STAGE. SF admin task, typically not done by developers!
  • 18. AWS S3: DATA ACCESS DIRECTLY FROM SNOWFLAKE. List the content of an S3 bucket directly from Snowflake and navigate the subfolder structure. Identify, inspect and select files to be loaded using “*”, regular expressions, etc. Compute statistics on files to be loaded into Snowflake.
  • 19. NOTIFICATION-DRIVEN DATA INGESTION WITH SNOWPIPE. Callouts: bulk load command; target table to be updated; source location, external stage (e.g. S3 bucket).
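
The three callouts on slide 19 (bulk load command, target table, source stage) correspond to the parts of a pipe definition. The sketch below only assembles such a statement as text so the pieces are visible; it does not connect to Snowflake. The object names `fuel_prices_pipe`, `raw_fuel_prices` and `s3_fuel_stage` are hypothetical, and the JSON file format is an assumption based on the fuel price scenario.

```python
def create_pipe_sql(pipe, target_table, stage, file_type="JSON"):
    """Assemble a CREATE PIPE statement for notification-driven loading."""
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"   # pipe reacts to bucket notifications
        f"  COPY INTO {target_table}\n"                  # bulk load command + target table
        f"  FROM @{stage}\n"                             # source location: external stage
        f"  FILE_FORMAT = (TYPE = '{file_type}')"
    )

# Hypothetical object names for the fuel price scenario.
sql = create_pipe_sql("fuel_prices_pipe", "raw_fuel_prices", "s3_fuel_stage")
print(sql)
```

In practice the statement would be run once by someone with the required privileges; after that, new files arriving in the stage location trigger the embedded COPY without a user-managed warehouse.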
  • 20. EXAMPLE: FAST INTEGRATION OF API PAYLOADS. Key Steps > Integrate AWS Lambda Function > Automate API Calls + store Payloads (JSON) > Implement Change Data Capture > Automate JSON flattening + Data Loading
  • 21. API INTEGRATION VIA EXTERNAL FUNCTION. SF admin task, typically not done by developers. External APIs can now be called via SQL!
  • 22. EXTERNAL API CALL VIA SQL (EXAMPLE 1). Table with column SOURCE_DATA (VARIANT data type) containing a collection of raw datasets from multiple open data sources. Insert statement calling the external API.
  • 23. EXTERNAL API CALL VIA SQL (EXAMPLE 2). Fuel price data of multiple gas stations. Insert statement calling the external API.
  • 24. AUTOMATED DELTA LOAD WITH STREAMS AND TASKS. The task only starts if the table stream has new data records to process → saves compute resources! Only CDC data records of interest are processed and then cleared from the stream when committed. Lateral view and the flatten table function are used to split price data by gas station and store it as separate records in the target table REMOTE_FUEL_PRICES.
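
The delta-load logic on slide 24 can be walked through with a small in-memory model. This is a toy sketch in plain Python, not Snowflake code: `TableStream`, `flatten_prices` and `run_task` are hypothetical stand-ins for a table stream, the flatten table function and a scheduled task, and the payload shape (a `prices` object keyed by station id) is an assumption. One simplification to note: the toy stream advances as soon as it is read, whereas the slide describes records being cleared when the consuming transaction commits.

```python
class TableStream:
    """Toy change-tracking stream: remembers an offset into an append-only table."""

    def __init__(self, table):
        self.table = table
        self.offset = 0

    def has_data(self):
        return self.offset < len(self.table)

    def consume(self):
        rows = self.table[self.offset:]
        self.offset = len(self.table)  # consumed records are cleared from the stream
        return rows


def flatten_prices(payload):
    """Split one API payload into one record per gas station."""
    return [{"station_id": sid, **prices} for sid, prices in payload["prices"].items()]


def run_task(stream, target):
    """Load new records only; skip all work when the stream is empty."""
    if not stream.has_data():
        return 0
    loaded = 0
    for payload in stream.consume():
        rows = flatten_prices(payload)
        target.extend(rows)
        loaded += len(rows)
    return loaded


# Hypothetical payload table and target table.
api_payloads = [{"prices": {"station-a": {"e5": 1.35, "diesel": 1.15},
                            "station-b": {"e5": 1.37, "diesel": 1.17}}}]
remote_fuel_prices = []
stream = TableStream(api_payloads)

first_run = run_task(stream, remote_fuel_prices)    # processes the new payload
second_run = run_task(stream, remote_fuel_prices)   # nothing new, so no work is done
print(first_run, second_run)
```

The second call returning without touching the target is the behaviour the slide credits with saving compute resources.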
  • 25. API DATA IS NOW PREPARED FOR FURTHER USE… New fuel prices prepared and stored in target table REMOTE_FUEL_PRICES (still in JSON format).
  • 26. …LIKE IN A DASHBOARD COMBINING HISTORICAL + REAL-TIME DATA. Final BI query: reading, formatting and joining JSON price data directly with master data, with fast query performance.
  • 27. EXAMPLE: SNOWFLAKE DATA MARKETPLACE. Key Steps > Identify & select dataset of interest in Snowflake Data Marketplace > Request new (paid) dataset and agree on terms and conditions > Once the request is approved, access to the Shared Database is granted by the Data Provider > Query and blend own and shared data for new insights
  • 28. PUBLIC AND PERSONALIZED DATASETS… Step 1: Select dataset of interest
  • 29. …AVAILABLE BY (PAID) REQUEST… Step 2: Request shared dataset for access in own Snowflake Account
  • 30. …READY TO QUERY – DATA SCIENTISTS LOVE IT ALREADY! Shared Database is visible (read-only) in own Snowflake Account, no additional ETL required. Step 3: Query location features to improve predictive models using Snowflake’s Geo SQL Functions
  • 31. MORE DATA ENGINEERING WITH SNOWFLAKE
  • 32. PLATFORM REQUIREMENT #1: DEDICATED COMPUTE. Snowflake Shared Data, Multi-Cluster Architecture: all data available in a central repository, all major workloads isolated, elastic compute provides performance on demand. Prod DB: structured & semi-structured data at petabyte scale (all encrypted, compressed). Workloads: Continuous Loading (4 TB/day from S3, <5 min SLA) on Compute Cluster “Medium”; Batch Data Loads & Transformations on Compute Cluster “Large”; Interactive Dashboard (50% < 1s, 85% < 2s, 95% < 5s) on Compute Cluster Auto Scale, “X-Large” x 5; Data Science / Data Projects on a Dedicated Compute Cluster incl. allowance to resize, e.g. up to “2X-Large”.
  • 33. PLATFORM REQUIREMENT #2: ELASTIC SCALING IN NO TIME. Vertical Scaling: Resize Compute Cluster instantly • Pure Cloud User Experience • Scale up/down in no time, no need to start/stop or reboot • Benefit: Faster data loads, more complex queries (aggregations, new features). Horizontal Scaling: Automatic Scale-out • Snowflake detects/handles massively parallel workloads and automatically scales back after the load drops • Benefit: consistently good performance, even at peak times
  • 34. PLATFORM REQUIREMENT #3: PAY FOR WHAT YOU USE. Four workload-over-time charts (axis: Morning, Noon, Night): ETL and Processing; Reporting; Ad-hoc Analytics; Data Science.
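
A short worked calculation may help make "pay for what you use" tangible. The sketch below assumes the commonly documented credit rates per warehouse size (doubling with each size step) and per-second billing with a 60-second minimum per start; these figures are not on the slide and should be checked against current Snowflake pricing. The function name `credits_used` is hypothetical.

```python
# Assumed credits per hour by warehouse size; each size step doubles the rate.
CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4,
    "Large": 8, "X-Large": 16, "2X-Large": 32,
}

MINIMUM_BILLED_SECONDS = 60  # assumed minimum charge each time a warehouse starts


def credits_used(size, run_seconds, clusters=1):
    """Credits for one run of a warehouse, billed per second."""
    billed = max(run_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * clusters * billed / 3600


# A "Medium" loader busy for 30 minutes, and a 5-cluster "X-Large" dashboard hour.
loader = credits_used("Medium", 30 * 60)
dashboard_peak = credits_used("X-Large", 3600, clusters=5)
print(loader, dashboard_peak)
```

Because each workload runs on its own warehouse, a cluster that is suspended for the rest of the day adds nothing to these totals.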
  • 35. MEETING USERS WHERE THEY ARE
  • 36. DEVELOPER EXPERIENCE WITH SNOWFLAKE. Coding: Data Pipelines, Debugging, e.g. with VS Code, Scala and Snowpark (in development). RDBMS Development: Database models, use of the data dictionary, RBAC, e.g. with Snowsight, DBeaver, SqlDBM. Data Science Notebook: Live code embedded in markup text + visualizations, e.g. with Jupyter and the SF Python Connector.
  • 37. DON’T FORGET SQL FOR FEATURE ENGINEERING. Create compute-intensive features in Snowflake → eliminates the need for frameworks such as Featuretools for Python. Examples • Calculate number of sales per product for last week/month/year • Rank products within different product categories by price, revenue, etc. • Calculate share of revenue or margin for a given product • Find the lowest ever sales price for all products in each product category. Real code example, but simplified for increased readability. Adapted from Elkjøp’s presentation @ Snowflake Stockholm Meetup, 2021
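
The feature examples on slide 37 map naturally onto SQL window functions. The sketch below is not the presenter's code and does not run on Snowflake: it uses Python's built-in `sqlite3` module as a local stand-in (SQLite 3.25 or newer is needed for window functions), with a hypothetical `sales` table and made-up rows, to show ranking within a category, revenue share, and lowest price per category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, category TEXT, price REAL, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("tv_a", "tv", 500.0, 3000.0),       # hypothetical sample rows
        ("tv_b", "tv", 800.0, 1000.0),
        ("phone_a", "phone", 300.0, 6000.0),
    ],
)

# One pass over the table yields three features per product.
features = conn.execute(
    """
    SELECT product,
           RANK() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank,
           revenue / SUM(revenue) OVER (PARTITION BY category)     AS revenue_share,
           MIN(price) OVER (PARTITION BY category)                 AS lowest_price_in_category
    FROM sales
    ORDER BY product
    """
).fetchall()

for row in features:
    print(row)
```

The same window-function pattern applies in Snowflake SQL, where the work is pushed to the warehouse instead of being computed in a Python feature framework.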
  • 38. SNOWFLAKE FEATURES FOR REPRODUCIBILITY. TIME TRAVEL: All tables can be versioned for 0 - 90 days • Extremely useful for tracking changes and debugging master data issues → eliminates the need to maintain separate history tables in many cases • Intuitive SQL syntax for querying tables at a specific point in time. ZERO-COPY CLONING: Take “snapshots” of any table, schema or database at no extra cost • Useful for creating static datasets when developing ML models → eliminates the need to create datasets in Azure Blob Storage / AWS S3 • Associate Git commit (code version) with cloned table (data version) → makes it easier to reproduce an experiment in the future. Real code example, but simplified for increased readability. Adapted from Elkjøp’s presentation @ Snowflake Stockholm Meetup, 2021
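
As a rough illustration of the two reproducibility features on slide 38, the sketch below assembles a Time Travel query and a zero-copy clone tied to a Git commit. It only builds SQL text and does not connect to Snowflake. The table name, timestamp and commit hash are hypothetical, and the naming convention (commit prefix in the clone name plus a table comment) is one possible choice, not the approach shown in the original presentation.

```python
def time_travel_query(table, timestamp):
    """Query a table as it looked at a given point in time."""
    return f"SELECT * FROM {table} AT(TIMESTAMP => '{timestamp}'::timestamp_ltz)"


def clone_for_experiment(table, git_commit):
    """Snapshot a table and record which code version it belongs to."""
    clone_name = f"{table}_exp_{git_commit[:7]}"  # short hash keeps the name readable
    return [
        f"CREATE TABLE {clone_name} CLONE {table}",
        f"COMMENT ON TABLE {clone_name} IS 'training snapshot for git commit {git_commit}'",
    ]


# Hypothetical table, timestamp and commit hash.
history_sql = time_travel_query("product_master", "2021-01-25 12:00:00")
clone_sql = clone_for_experiment("training_set", "3f2a9c1d7e5b")
print(history_sql)
for statement in clone_sql:
    print(statement)
```

Linking the clone to the commit in both its name and its comment means an experiment can later be rerun against exactly the data it was trained on.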
  • 39. REPORT OF A CUSTOMER’S DATA SCIENCE TEAM
  • 40. DATA DEVELOPMENT: PYTHON & SQL COMBINED → Notebook
  • 41. DATA DEVELOPMENT: SCALA VIA SNOWPARK (IN PRIVATE PREVIEW). SNOWPARK: a new developer experience that allows you to write Snowflake code in your preferred way and execute it directly in Snowflake (incl. all platform benefits)
  • 42. DATA DEVELOPMENT: SCALA VIA SNOWPARK (IN PRIVATE PREVIEW). A comparative example, SQL vs. Snowpark. SQL: monolithic, hard to debug; repetitive, inflexible operations. Snowpark: write and use functions; pipeline, easy to debug; flexible, reusable operations.
  • 43. DATA DEVELOPMENT: SCALA VIA SNOWPARK (IN PRIVATE PREVIEW). Deep Dive and Demo of Preview Version. Webinar Recording: Building Open Source Data Science Models on Snowflake
  • 44. AGENDA – PART 2. ML LIFECYCLE (Source: github.com/szilard/ml-prod, Dr. Szilard Pafka): from Experiment (lab environment) to Value added (ML in production, at scale). Data Science Platform (DataRobot): Auto ML; ML Ops; Model Deployment.
  • 45. DATA SCIENCE (AUTO ML) PLATFORMS TO COMPLEMENT SNOWFLAKE CAPABILITIES
  • 46. DATA SCIENCE PLATFORM – ARCHITECTURE (DATAROBOT)
  • 47. DATA SCIENCE PLATFORM – COMPONENTS (DATAROBOT)
  • 48. AUTO ML – LEADERBOARD OF A PROJECT → DataRobot
  • 49. ML OPERATIONS. Any ML Ops solution capable and agile enough to handle complex situations has to be proficient in the following four critical areas to safely deliver machine learning applications in production at scale
  • 50. ML OPS – MONITORING (SERVICE HEALTH)
  • 51. ML OPS – MONITORING (MODEL ACCURACY)
  • 52. INTEGRATION OF PRODUCTIVE ML MODELS
  • 53. CODING EXAMPLE: DATAROBOT & SNOWFLAKE. Notebook example: Snowflake + DataRobot Prediction API. Call the DataRobot prediction API, then open a Snowflake cursor and store the prediction results in a table for further use. Sample code: DataRobot GitHub repo
  • 54. MODEL DEPLOYMENT – SOURCE CODE EXPORT
  • 55. SNOWFLAKE INTEGRATION EXAMPLE. Calling a productive Machine Learning Model via Snowflake External Functions. Detailed Article: DataRobot Blog
  • 56. SESSION TAKEAWAY
  • 57. SNOWFLAKE DATA CLOUD. DATA SOURCES: OLTP DATABASES, ENTERPRISE APPLICATIONS, THIRD-PARTY, WEB/LOG DATA, IoT. DATA CONSUMERS: DATA MONETIZATION, OPERATIONAL REPORTING, AD HOC ANALYSIS, REAL-TIME ANALYTICS
  • 58. A COMPLETE & EASY-TO-USE DATA PLATFORM… Snowflake supports Data Lake, Data Warehouse, and Data Engineering workloads. Data Sources: Structured Data, Semi-Structured Data, Web APIs, IoT Data. Presentation / Consumers: Visualization / Reporting, Data Science, Ad hoc Queries. Layers: Stage; Data Lake; Warehouse; Aggregation; Semantic / Federated. Capabilities: JSON, AVRO (VARIANT); Hive Metastore Integration; External Tables; Parquet Load/Unload; ANSI SQL; Elastic Multi-Cluster Compute; Data Vault, 3NF Modeling; Dimensional Modeling; ACID Transactional Consistency; Data Masking / Row-level Security *); Secure + Materialized Views; Zero Copy Cloning; SSO, LDAP, OAUTH, SCIM; ODBC/JDBC; Python/R/Spark Connector; End-to-End Security (RBAC, Encryption at Rest/in Motion); Web UI / Snowsight; External Functions; Data Sharing / Marketplace; Streams (CDC) & Tasks (Scheduler); Time Travel; Kafka-Connector / Snowpipe; Stored Procs / UDFs; Geospatial; Information Schema; Snowpark *) (Scala). *) Roadmap item / in Preview
  • 59. …FOR ACCELERATING MACHINE LEARNING & AI. Column headings: SNOWFLAKE PLATFORM, DATA ENGINEERING, INTEGRATIONS ● Integration with AutoML platforms and Notebook-based ML ● Read and writeback ● Spark Connector ● Python Connector ● Apache Arrow support ● Snowpipe ● Kafka ● Streams/Tasks ● Reproducibility – Versioning - Features, Models ● “Share” data & results ● Feature Repository – Re-Use ● Time Travel ● Zero-copy cloning • Data Platform as a Service • Instant Scalability & Elasticity • Dedicated Virtual Warehouses • Store and use all data regardless of structure • Cross-cloud replication • Pay per use – per-second pricing • Data Marketplace & Exchange
  • 60. Q & A
  • 61. THANK YOU