MIKE DRISCOLL CO-FOUNDER + CTO METAMARKETS @medriscoll making sense of data:  Lessons for start-ups
If it is unmanaged, you will be blind to weaknesses, deaf to new opportunities, and dumb to your customers. data IS SENSORY INPUT
Data is the sensory input that moves through it. your technology stack is your nervous system
Collecting customer data is a way to “get out of the building.” create feedback loops
customers
Complexity lies at the boundaries between systems make etl a priority
Real-Time Daily Weekly sync data latencies with decision loops
All data models are wrong. Some data models are useful. don’t agonize over data schemas
Hadoop is a processing layer You also need a query layer hadoop isn’t enough
Embrace a polyglot architecture of formats and data stores there is no ‘One True database’
A RESTful query layer will reduce pain of migration. separate query & storage layers
Reduce the barriers to accessing data across systems. make data easy
“Human-time” means that queries return in seconds. make data fast
Human activity is small in size. fully instrument your customers
Machine-generated data can quickly overwhelm. selectively instrument your machines
Work backwards from business questions. Don’t let data architecture drive business needs. architect around business questions
Someone who can munge, model, & visualize data hire a data scientist
Engineers with a thin grasp of statistics beat statisticians with a thin grasp of engineering. working code beats theoretical models
Isolated from production systems. Analytics are a different constituency with different needs create an analytics sandbox
Both internal & external obsess about dashboard design
Either by directly monetizing them or by enhancing customer experience. extract value from your data assets
YOUR TECHNOLOGY STACK IS YOUR NERVOUS SYSTEM. YOUR DATA IS YOUR SENSORY INPUT.
MIKE DRISCOLL CO-FOUNDER + CTO METAMARKETS @medriscoll making sense of data:  lessons for start-ups questions?
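
The "separate query & storage layers" slide can be illustrated with a minimal sketch. It is not Metamarkets' implementation: the `EventStore` contract, both backends, the `/metrics/count` path, and `handle_query` are hypothetical names invented for illustration, and a plain function stands in for a real HTTP server. The point it shows is that callers only see the REST-style URL, so the storage backend can be swapped without changing them.

```python
import sqlite3
from typing import Protocol
from urllib.parse import urlparse, parse_qs


class EventStore(Protocol):
    """Storage-side contract the query layer depends on (hypothetical)."""

    def count(self, event: str) -> int: ...


class ListStore:
    """Toy in-memory backend, standing in for the first storage layer."""

    def __init__(self, events: list[str]) -> None:
        self._events = events

    def count(self, event: str) -> int:
        return sum(1 for e in self._events if e == event)


class SqliteStore:
    """Same contract backed by SQLite, standing in for a later migration."""

    def __init__(self, events: list[str]) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE events (name TEXT)")
        self._db.executemany(
            "INSERT INTO events VALUES (?)", [(e,) for e in events]
        )

    def count(self, event: str) -> int:
        row = self._db.execute(
            "SELECT COUNT(*) FROM events WHERE name = ?", (event,)
        ).fetchone()
        return row[0]


def handle_query(url: str, store: EventStore) -> dict:
    """Answer a REST-style GET such as /metrics/count?event=click."""
    parsed = urlparse(url)
    if parsed.path != "/metrics/count":
        return {"status": 404}
    event = parse_qs(parsed.query).get("event", [""])[0]
    return {"status": 200, "event": event, "count": store.count(event)}


events = ["click", "view", "click", "signup"]
old_answer = handle_query("/metrics/count?event=click", ListStore(events))
new_answer = handle_query("/metrics/count?event=click", SqliteStore(events))
print(old_answer == new_answer, new_answer)
```

Because `handle_query` depends only on the `count` contract, migrating from one store to another changes a constructor call and nothing on the caller's side.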

Making Sense of Data

Editor’s Notes

  1. Feedback loops.
  2. Over the next set of slides, I’m going to discuss some lessons as data moves through a start-up’s organization...
  3. So this is how we frame our technology stack at my start-up, Metamarkets. It’s a four-tiered stack. I believe that many start-ups have similar stacks when they think about how data moves through them. But there’s something important missing here: your technology stack doesn’t exist in a vacuum.
  4. Over the next set of slides, I’m going to discuss some lessons as data moves through a start-up’s organization...
  5. To be successful, we’ve got to incorporate feedback, both from customers and from the larger world. Feedback is critical. Steve Blank and Eric Ries have talked about not iterating in a vacuum. The feedback you can achieve by managing your data can be incredibly important.
  6. Which begins at ingestion, and ends at the top with products.
  7. ETL often gets a bad rap. Nothing could be more important to your company than moving data between systems. That is what ETL does. It should be a first-class piece of your architecture, and you should put one of your top engineers at this layer of the stack. (At Metamarkets, we have a former VP of BlackRock working on ETL, and he’s been outstanding.) When our ETL breaks down, the data stops flowing, and our business stops moving.
  8. Don’t invest in real-time data if you’re making weekly decisions. Moving away from batch systems is hard work. Alternatively, some systems – such as those required for monitoring – may need sub-millisecond response times. But as a general rule, reducing latency in systems creates value in unexpected ways.
  9. Don’t get bogged down in discussions of the perfect data format for your company. There is no such thing. “All models are wrong, some models are useful.”
  10. Which begins at ingestion, and ends at the top with products.
  11. You will likely end up using a variety of data stores in your organization. So don’t agonize over your data store choices.
  12. As you scale and grow, you will have to change storage layers. We went through three different versions (first Postgres, then Greenplum, then HBase) before developing our own.
  13. Embrace standards. Use simple, flat formats wherever possible (XML is the clamshell packaging of data). We recently onboarded a client who gave us JSON data. It’s a beautiful thing. Everyone knows SQL: Cloudera found that Hadoop cluster use went up 10x when Hive was installed.
  14. But Hive isn’t going to cut it for getting quick insights into your data. No one wants to wait 15 minutes for answers. Put in ETL flows that summarize data, and keep a core set of key business metrics in a “hot” database, one that can be queried in real time.
  15. Feedback loops.
  16. Requirements for systems should be driven by their business needs.
  17. Which begins at ingestion, and ends at the top with products.
  18. but remember...
  19. 4sq Explore, PYMK, Kaggle winners: all written by individuals who were engineers first, statisticians second. When hiring folks to do your analytics, you want those who can roll up their sleeves and actually code the models themselves.
  20. Don’t make your analytics team compete for resources or jeopardize production systems. They will only get burned and then cut out. Set up systems where analytics folks can play with data safely. Analytics often falls into the class of problems that are important but not urgent. Don’t let this happen to your organization.
  21. Which begins at ingestion, and ends at the top with products.
  22. Data represents the totality of a start-up’s sensory experiences. Absent a well-developed digital nervous system to respond to these inputs, you are blind to your deficiencies, deaf to your customers, and dumb to your opportunities.
  23. Either externally, as Klout, Flightcaster, and BillGuard have done. 4SQ’s Explore and LinkedIn’s PYMK have both improved user experience. Having strong analytical talent in your organization is critical to success here.
  24. Feedback loops.
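
Note 14's advice (summarize data in ETL and keep key business metrics in a "hot" database) can be sketched roughly as below. This is an illustration under assumptions, not the pipeline described in the talk: the event rows, the `raw_events` and `daily_metrics` tables, and both function names are made up, and an in-memory SQLite database stands in for whatever store would hold the hot metrics.

```python
import sqlite3

# Hypothetical raw event log: (day, customer, revenue in cents).
RAW_EVENTS = [
    ("2011-06-01", "acme", 1200),
    ("2011-06-01", "acme", 800),
    ("2011-06-01", "globex", 500),
    ("2011-06-02", "acme", 300),
]


def build_hot_db(raw_events):
    """Load raw events, then summarize them into a small daily_metrics table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE raw_events (day TEXT, customer TEXT, cents INTEGER)"
    )
    db.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", raw_events)
    # The summarizing ETL step: one row per day instead of one row per event.
    db.execute(
        """
        CREATE TABLE daily_metrics AS
        SELECT day,
               COUNT(*) AS events,
               COUNT(DISTINCT customer) AS customers,
               SUM(cents) AS cents
        FROM raw_events
        GROUP BY day
        """
    )
    return db


def daily_metrics(db):
    """Dashboard-style read that touches only the summary table."""
    return db.execute(
        "SELECT day, events, customers, cents FROM daily_metrics ORDER BY day"
    ).fetchall()


hot_db = build_hot_db(RAW_EVENTS)
for row in daily_metrics(hot_db):
    print(row)
```

The read path never scans `raw_events`, which is what keeps it fast as the raw log grows; in a larger system the rollup would run as a scheduled batch job and write to a separate store.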