Big Data Tundra
Building Flexible Cloud Data Ecosystems
Phil Goerdt, Mike Fuller
© 2017 Red Pill Analytics
Agenda
- Introductions
- Building a Flexible Data Ecosystem
- Snowflake Data Warehouse
- StreamSets Data Collector
- Looker
- Questions
RED PILL ANALYTICS: WHAT WE DO
CAPACITY ANALYTICS
- BI Development as a Service
- Agile Development
- Cloud Enabled
- Continuous Integration
- Support
CHECKMATE
- Check-in & Automate
- True Multi-User Development
- Full Source Control
- Continuous Integration
- Hosted or On-Premise
SINGLE DOSES
- Strategy & Roadmap
- Architecture
- Prototyping
- Infrastructure
- Training
Mike Fuller
- Consultant at Red Pill Analytics
- Primarily a Release Lead & Developer
- Full-Stack BI
- Have been working with data for 9 years
- Focused on Business Intelligence for 5 years
- Worn several IT and Business hats
- Business Analyst, Reporting Analyst, Data Analyst, BI Engineer, BI Developer, Data Architect, Warehouse Architect, Release Lead, Consultant, etc. … Just don’t call me late for dinner
Phil Goerdt
- “Full stack” BI and DI guy passionate about using
and making data better
- Have worked with data for the last 8+ years
- The last 5+ have been BI-focused
- Consultant at Red Pill Analytics
- Hipster Alert: Have played the part of data
scientist… before it was a buzzword!
- Find me on LinkedIn, Twitter, Medium and the
Red Pill Blog
What is a Flexible Data Ecosystem?
Client Engagement - Requirements
- Consolidate disparate data to promote
conformity and uniformity
- One central location for all data to power
data-driven decisions for the enterprise
- Create a data lake to collect and structure
business process data
- Use data from the data lake downstream to
power reporting and analytics
- “How can we prevent our data lake from turning
into a data swamp?”
- Cloud-based
- Limit up-front costs
- Follow the KISS principle: do not over-engineer
A Flexible Data Ecosystem Will...
- Allow developers to work using agile
methodology
- Allow data to persist in multiple states
- Allow data to be consumed in multiple use cases
- Be adaptable to future change
- Allocate resources (compute/storage) on demand
- Accommodate varying data types
- Do all of the above while maintaining sensible
governance
How Do We Turn This Into Something Real?
[Diagram labels: Logging Data, Application Data, Data Lake, Reports & Analytics]
A Cloud Ecosystem in the Wild
[Diagram labels: Logging Data, Application Data, Application Data, Application Data]
Why Choose Snowflake?
Snowflakes are Made in Clouds
- Snowflake is a truly “cloud engineered” data warehouse solution
- Separating storage from compute lifts hardware and virtualization constraints
- This allows for true scalability on both the storage and compute sides
Snowflake: The Database (Storage)
- AWS S3 holds all of the data inserted into
Snowflake
- Virtually no limit for data!
- This allows for querying of non-traditional data
types such as JSON or Avro
- Queries written in ANSI SQL allow developers to
use a language they already know
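As a rough illustration of the JSON bullet above, the sketch below builds a query that projects typed fields out of a semi-structured column. The table `raw_events`, the column `payload`, and the JSON paths are hypothetical names chosen for this example, not details from the talk. The `column:path::type` notation is Snowflake's extension to SQL for traversing semi-structured data; the rest of the statement is ordinary SQL.

```python
# Hypothetical names throughout: raw_events, payload, and the JSON paths
# are illustrative only and do not come from the talk.


def json_field(column: str, path: str, sql_type: str, alias: str) -> str:
    """Render one path expression, e.g. payload:user.id::string AS user_id."""
    return f"{column}:{path}::{sql_type} AS {alias}"


def build_json_query(table: str, column: str,
                     fields: list[tuple[str, str, str]]) -> str:
    """Build a SELECT that projects typed fields out of a VARIANT column."""
    select_list = ",\n       ".join(
        json_field(column, path, sql_type, alias)
        for path, sql_type, alias in fields
    )
    return f"SELECT {select_list}\nFROM {table}"


query = build_json_query(
    "raw_events",
    "payload",
    [("user.id", "string", "user_id"),
     ("event.ts", "timestamp_ntz", "event_ts")],
)
print(query)
```

The generated text could then be run through any Snowflake client; only the string construction is shown here.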
Snowflake: The Warehouse (Compute)
- Compute is run on AWS EC2 instance clusters
- This allows compute resources to scale horizontally (cluster count) and vertically (node count) cheaply and easily!
- This also allows differently sized clusters for different needs
- Example scenarios:
- Data scientists may need large compute for a short period of time
- Finance users may need smaller compute all the time
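The two example scenarios above might translate into warehouse definitions along the following lines. This is a minimal sketch: the warehouse names, sizes, and suspend timeouts are assumptions made for illustration, and cluster counts above 1 assume an edition of Snowflake that supports multi-cluster warehouses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseSpec:
    """Illustrative description of one Snowflake virtual warehouse."""
    name: str
    size: str                # vertical scale, e.g. 'XSMALL' or 'XLARGE'
    auto_suspend_s: int      # idle seconds before compute is suspended
    min_clusters: int = 1    # horizontal scale: lower bound
    max_clusters: int = 1    # horizontal scale: upper bound

    def ddl(self) -> str:
        """Render a CREATE WAREHOUSE statement for this spec."""
        return (
            f"CREATE WAREHOUSE IF NOT EXISTS {self.name} WITH "
            f"WAREHOUSE_SIZE = '{self.size}' "
            f"MIN_CLUSTER_COUNT = {self.min_clusters} "
            f"MAX_CLUSTER_COUNT = {self.max_clusters} "
            f"AUTO_SUSPEND = {self.auto_suspend_s} "
            f"AUTO_RESUME = TRUE"
        )


# Hypothetical warehouses: large and short-lived vs. small and steady.
data_science = WarehouseSpec("ds_wh", "XLARGE", auto_suspend_s=60)
finance = WarehouseSpec("finance_wh", "XSMALL", auto_suspend_s=600,
                        max_clusters=2)

for spec in (data_science, finance):
    print(spec.ddl())
```

Keeping one warehouse per workload means a heavy data science job does not compete with finance queries for the same compute.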
Maintenance?! Patching?! Deployment Lifecycle Management?!
Getting Answers, Not Fighting Fires
- Snowflake managing the platform = fewer headaches (for you and DBAs)
- 11 nines (99.999999999%) durability and 4 nines (99.99%) availability
- Compliant with security requirements such as PII/PHI/HIPAA
- OpEx model allows for small initial investment
compared to CapEx seen with traditional data
warehousing solutions
- “Oops” features like Time Travel and Undrop
allow for recovery in development
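To make the “Oops” features above concrete, here is a small sketch that renders a Time Travel query and an Undrop statement. The table name `orders` and the five-minute offset are hypothetical; how far back Time Travel can reach depends on the retention period configured for the account and object.

```python
def time_travel_select(table: str, seconds_ago: int) -> str:
    """Query a table as it looked `seconds_ago` seconds in the past."""
    if seconds_ago <= 0:
        raise ValueError("seconds_ago must be positive")
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago})"


def undrop_table(table: str) -> str:
    """Restore the most recently dropped version of a table."""
    return f"UNDROP TABLE {table}"


# Hypothetical recovery after a bad change to a table named orders.
print(time_travel_select("orders", 300))
print(undrop_table("orders"))
```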
Performance Anxiety? Not Here!
- One Snowflake instance can contain multiple
databases
- Users can query across databases, which is beneficial for “unsiloing” data
- Self-contained querying across databases facilitates the data lake/reservoir concept with better organization (no data swamp!)
- Zero Copy Clone: create additional environments at no additional cost
- Performs well on large data sets
- Data Warehouse = Optimized for analytic queries
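The cross-database querying and Zero Copy Clone bullets above can be sketched as follows. All database, schema, table, and column names are hypothetical; the point is only that objects are addressed by a three-part `database.schema.object` name and that a clone is a single statement.

```python
def clone_database(source: str, target: str) -> str:
    """Render a zero-copy clone of a whole database."""
    return f"CREATE DATABASE {target} CLONE {source}"


def qualified(database: str, schema: str, obj: str) -> str:
    """Three-part name used to reach objects in another database."""
    return f"{database}.{schema}.{obj}"


# Hypothetical layout: application data and log data in separate databases.
orders = qualified("app_db", "public", "orders")
events = qualified("logs_db", "public", "events")

cross_db_query = (
    f"SELECT o.order_id, e.event_type\n"
    f"FROM {orders} AS o\n"
    f"JOIN {events} AS e ON e.order_id = o.order_id"
)

print(clone_database("app_db", "app_db_dev"))
print(cross_db_query)
```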
Streaming: Getting the Data to the Cloud
StreamSets Data Collector Integration
- Low-latency ingest infrastructure tool
- Create continuous data ingest pipelines using a
drag and drop UI
- Open Source and runs on Linux or Mac OS
- Native integrations with AWS, HDFS, SFDC,
Mongo, etc.
- JDBC Connectivity
- Simple to stream into S3 and use bucketed
object stores in lieu of typical DFS solutions
- Natural candidate for deployment on EC2 to leverage the AWS platform (security, CLI, etc.)
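One way to picture the S3 bullet above is a date-partitioned key layout that a downstream load can address by prefix. The sketch below is an assumption about how such a layout could look, not the pipeline configuration used in the talk: the source name `logs`, the stage `ingest_stage`, and the target table `raw_events` are all hypothetical, and the load step assumes a Snowflake external stage already points at the bucket.

```python
from datetime import datetime, timezone


def s3_prefix(source: str, ts: datetime) -> str:
    """Hourly key prefix for objects landed in the bucket."""
    return f"{source}/{ts:%Y/%m/%d/%H}/"


def copy_into(table: str, stage: str, prefix: str) -> str:
    """Render a load of JSON files under one prefix into a table."""
    return (
        f"COPY INTO {table} FROM @{stage}/{prefix} "
        f"FILE_FORMAT = (TYPE = 'JSON')"
    )


# Hypothetical batch: log files written during one hour.
landed_at = datetime(2017, 6, 1, 13, tzinfo=timezone.utc)
prefix = s3_prefix("logs", landed_at)

print(prefix)
print(copy_into("raw_events", "ingest_stage", prefix))
```

Partitioning keys by time keeps each load small and makes it easy to reload a single hour if something goes wrong.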
Data Through the Looking Glass
Looking at the Data… with Looker
- Data analytics platform
- Uses a logical/semantic modeling layer, allowing for enterprise-friendly delivery
- Integrates with Git for version control
- Available in the cloud (SaaS) or on-premises
- Allows creation of reports and dashboards for content distribution among users
- Connects natively to the Snowflake database and supports wide table querying
Would We Build It Again?
Yeah, We Definitely Would
- StreamSets Data Collector is a simple-to-use, high-performing data streaming tool, and it is free
- Snowflake’s S3 storage allowed us to build the
warehouse as and when we saw fit
- Allows for a truly agile approach to development
of a data warehouse
- No need to worry about sizing requirements due
to Snowflake’s flexibility
- Could build the facts, dimensions and views in
Looker as requirements were received and
understood
Review: A Cloud Ecosystem in the Wild
[Diagram labels: Logging Data, Application Data, Application Data, Application Data]
Questions?
Capacity Analytics
Our approach to rapidly deliver analytics to you.
- Capacity-driven: Add capacity to your team by choosing small, medium, or large, with flexibility to increase or decrease as needed.
- Flexible: You receive an allowance of points each sprint to spend however you choose.
- Agile: We have a complete methodology for how to deliver your initiatives rapidly.
- Fluid: Hire a team, not a person. Tasks are assigned to the right person with the right skill.
Follow us:
- twitter.com/RedPillA
- linkedin.com/company/red-pill-analytics
- youtube.com/redpillanalytics
- youtube.com/realtimebi
- facebook.com/redpillanalytics
- redpillanalytics.com
- bit.ly/datavizdaily-playlist
