Making Big Data Work
Lewis Crawford
1.
Making Big Data work
Lewis Crawford, Principal Architect @ the DataShed
thedatashed.co.uk
Lewis@thedatashed.co.uk
© the DataShed Limited 2015
2.
intro
3.
Who am I?
• For the last 3 years, the DataShed has been providing consultancy services to a vast array of large clients. Our primary focus is ensuring that technology and analytical strategies are truly aligned, so that businesses can leverage the latest technology to model, mine and describe their data asset.
• We were working with Big Data technology before the term was coined. We have experience delivering analytical systems driven by petabyte data sets, and have designed, implemented and supported one of the largest real-time data integration and predictive analytics platforms in the aviation world.
• Our model is based on using a small number of exceptionally highly skilled individuals to deliver disruptive and innovative solutions in an agile and delivery-focused manner.
4.
So what is ‘Big Data’?
5.
6.
Why do Big Data projects fail?
Too many people think that Big Data is:
“The belief that the more data you have, the more insights and answers will rise automatically from the pool of ones and zeros.”
Gil Press, Forbes.com
7.
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything
8.
Real-time data
9.
10.
11.
12.
Continuous Integration Demo
13.
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything
14.
Little Big Data
15.
A problem closer to home…
• Every business needs to understand:
  • Their potential customers and market
  • Current customers
  • Their products and sales
  • How and when they engage prospects and customers
• Analytics and data are expensive
• Many of the mandatory elements are very similar for everyone
• The DataShed is Analytics as a Service and Single Customer View as a Service.
16.
The deduplication problem…
• An SME has 250,000 customers (two systems of record)
• Brute-force approach to identifying duplicates: 31,249,875,000 comparisons
• Building a system to process a minimum of 100 clients a day…
• About 3.1 trillion record pairs to compare per day, using more than 10 different algorithms
• A traditional scale-up approach would be expensive, and makes large assumptions around blocking and partitioning rules
• A small data problem but a big data solution?

Title  First Name  Surname  Address 1       Address 2    Address 3
Dr     R J         Smith    TwoOaks         112 Old St.  County Durham
Mrs    Robyn       Smith    112 Old Street  Durham       DH1 5YJ
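The figures on this slide can be checked with a short Python sketch. It counts unordered record pairs for a brute-force comparison and, for contrast, the pairs left when only records sharing a blocking key are compared. The function names and the sample blocking keys are hypothetical and not from the talk; blocking is shown only as the generic technique the slide mentions, not as the DataShed's implementation.

```python
from collections import Counter
from math import comb


def brute_force_pairs(n: int) -> int:
    """Number of unordered record pairs among n records: n * (n - 1) / 2."""
    return comb(n, 2)


def blocked_pairs(block_keys) -> int:
    """Pairs compared when only records sharing a blocking key are compared."""
    return sum(comb(size, 2) for size in Counter(block_keys).values())


customers = 250_000                      # customers per client, from the slide
per_client = brute_force_pairs(customers)
per_day = per_client * 100               # 100 clients a day, from the slide

# Hypothetical blocking keys for six records (assumption for illustration).
sample_keys = ["D", "D", "D", "L", "L", "S"]
sample_brute = brute_force_pairs(len(sample_keys))
sample_blocked = blocked_pairs(sample_keys)

print(f"{per_client:,} comparisons per client")
print(f"{per_day:,} comparisons per day")
print(f"sample: {sample_brute} brute-force pairs, {sample_blocked} after blocking")
```

Blocking cuts the number of comparisons, but it only finds duplicates that land in the same block. In the two Smith records above, a rule keyed on the Address 3 field would separate "County Durham" from "DH1 5YJ" and miss the match, which is the kind of assumption the slide warns about.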
17.
18.
The Shed demo
19.
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything
20.
How to make Big Data work?
1. Understand your problem
  • ‘Big Data’ challenges aren’t necessarily new; however, much of the technology is
  • Articulate and communicate: focus on distilling your problem down
  • Incremental improvement, not wholesale replacement
2. Apply appropriate tools
  • Understand the economics as well as the technology
  • New technologies need to be evaluated within the context of your problem scope
  • New technologies are enablers, not deliverables (#datalake)
  • ‘Big Data’ technology should be seen as complementary to existing technology
3. Automate everything
  • Continuous integration to include all testing
  • Containerise where possible
  • Measure everything
21.
If you really want to get involved…
22.
Get your hands dirty
If you’re interested in learning more, we’ll be hosting a hands-on labs event in the near future. Send your details to:
Email: hello@thedatashed.co.uk
Twitter: @thedatashed
23.
Any questions?
Lewis Crawford, Principal Architect @ the DataShed
thedatashed.co.uk
Lewis@thedatashed.co.uk