BIG DATA! 
Great! Now what? 
Ricard Clau 
SymfonyCon 2014
HELLO WORLD! 
• Ricard Clau, born and raised in Barcelona 
• Server engineer at Another Place Productions 
• Symfony2 lover and PHP believer (sometimes…) 
• Open-source contributor, sometimes I give talks 
• Twitter (@ricardclau) / Gmail ricard.clau@gmail.com
WE WILL TALK ABOUT… 
• Where / How to store / query our “BIG” DATA 
• SQL vs NoSQL: why did we end up here? 
• Strengths and weaknesses of both approaches 
• PHP / Symfony Status with these technologies 
• Some war stories and recommendations
QUICK DISCLAIMERS 
• Not your average PHP talk, not sure if you will 
be able to use this next week at work 
• I am still learning about all these technologies 
• 100M records is NOT BIG DATA
“Big data is like teenage sex; 
everyone talks about it, 
nobody really knows how to do it, 
everyone thinks everyone else is doing it, 
so everyone claims they are doing it”. 
Dan Ariely, Duke University
2 BIG PROBLEMS
PROBLEM 1: STORAGE
PROBLEM 2: QUERYING
A BIT OF HISTORY 
Maybe we have not learnt so much…
A (NOT SO) LONG TIME AGO 
• Programmers processed files directly 
• Lots of people were doing the same, so the first 
databases appeared, with different APIs, 
strengths and weaknesses 
• In the early 70s IBM came up with the 
SEQUEL (Structured English Query 
Language) idea, and the rest is history
WHY DOES NOSQL EXIST? 
• RDBMS are not brilliant at scaling horizontally 
• Google, Amazon, Facebook, etc… started building 
their own solutions to meet their unique needs 
• When your data does not fit in one box, you need to 
give up consistency or availability 
• Some problems need a different approach
THE CURRENT CHAOS
RDBMS SYSTEMS 
Old rockers never die
SQL 
• A “common” query language 
• We can normalise data and query it 
• Easy to do joins, filters, aggregations 
• We don’t need to know in advance how we access data 
• We rely on each database server’s query optimiser (and 
sometimes we need a DBA)
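A minimal sketch of the points above, using Python's built-in sqlite3 module. The `users` and `orders` tables, their columns, and the `revenue_by_country` helper are hypothetical, invented only for this illustration. One declarative query joins, filters and aggregates, and the database decides how to execute it.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users  VALUES (1, 'ES'), (2, 'ES'), (3, 'UK');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0), (4, 3, 7.5);
""")


def revenue_by_country(min_amount):
    """Join, filter and aggregate in a single query."""
    return conn.execute(
        """
        SELECT u.country, SUM(o.amount)
        FROM orders o
        JOIN users u ON u.id = o.user_id      -- join
        WHERE o.amount >= ?                   -- filter
        GROUP BY u.country                    -- aggregation
        ORDER BY u.country
        """,
        (min_amount,),
    ).fetchall()
```

Calling `revenue_by_country(0)` totals every order per country, while a higher threshold changes the filter without any change to how the data is stored.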
ACID PROPERTIES 
• Atomicity: transactions are all or nothing 
• Consistency: a transaction is subject to a set of rules 
• Isolation: transactions do not affect each other 
• Durability: written data will not get lost
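To make atomicity and consistency concrete, here is a small sketch with Python's built-in sqlite3 module. The `accounts` table, the `CHECK` rule and the `transfer` helper are assumptions made up for this example. The transfer credits one account and debits the other inside one transaction; if the debit breaks the rule, the credit is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (hypothetical): a balance can never go negative.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was undone too.
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

A transfer larger than the source balance returns `False` and leaves both balances untouched, which is the "all or nothing" behaviour.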
WE NEED ACID 
• Banking, logistics, finance, e-commerce,… 
• Systems we started building 30 years ago… and we 
still work on them generating millions of $ daily! 
• There are many applications that still fit the relational 
model and have structured data
USUAL PROBLEMS 
• You can painfully achieve sharding, but 
you need to give up some ACID goods 
• Tricky for unstructured data 
• Not great for a low read / write ratio (write-heavy workloads) 
• Some data structures are awkward to model
TRICKY SCENARIOS 
• Geospatial queries for augmented reality 
• Leaderboards for social activity, set operations 
• Columnar aggregations on big tables 
• Graph traversal to analyse your customers 
• Search engines over big chunks of text
NOSQL SYSTEMS 
Different problems, different solutions
BASE PROPERTIES 
• Basically Available: appears 
to work most of the time 
• Soft state: state of the 
system may change even 
without a query 
• Eventual consistency
CAP THEOREM 
• A shared-data system cannot guarantee 
simultaneously: 
• Consistency: All clients have the same view of the data 
• Availability: Each client can always read and write 
• Partition tolerance: The system works well even 
when there are network partitions
“During a network partition, a 
distributed system must choose 
between either Consistency or 
Availability”
CAP triangle: Availability, Consistency, Partition Tolerance 
• Single node, mostly RDBMS (MySQL, PostgreSQL, DB2, SQLite…) 
• All nodes same role (Cassandra, Riak, DynamoDB…) 
• Special nodes (Zookeeper, HBase, MongoDB, Redis…)
CONSISTENT HASHING
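The slide only names the technique, so here is one possible sketch of a consistent-hash ring in Python. The `HashRing` class, the use of MD5, the 100 virtual points per node and the node names are all assumptions for illustration, not what any particular database does. Keys and nodes are hashed onto the same ring, and each key belongs to the first node point found clockwise from it.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points per physical node
        self._keys = []            # sorted hash positions on the ring
        self._owners = {}          # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._owners[h] = node
            bisect.insort(self._keys, h)

    def remove(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            del self._owners[h]
            self._keys.remove(h)

    def get(self, key):
        # First virtual point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._keys, self._hash(key)) % len(self._keys)
        return self._owners[self._keys[idx]]
```

The useful property is that adding a node only moves keys onto that new node; everything else stays where it was, unlike `hash(key) % number_of_nodes`, which reshuffles most keys.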
I TOTALLY NEED ACID! 
Are you sure about that?
EVENTUAL CONSISTENCY 
If you are using master-slave replication, 
you already have eventual consistency in your reads
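A toy simulation of that statement, in Python. The `Primary` and `Replica` classes (master and slave in the slide's terms) and the explicit `catch_up` call are hypothetical stand-ins for asynchronous replication; no real database API is being modelled.

```python
class Primary:
    """Accepts writes and records them in an ordered replication log."""

    def __init__(self):
        self.data = {}
        self.log = []

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which may lag behind the primary."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0   # number of log entries already replayed

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        # Replay the log entries not seen yet. In a real system this
        # happens asynchronously, some time after the write.
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)
```

A read from the replica between `write` and `catch_up` returns stale data; after catching up, both copies agree. That window is the eventual consistency already present in many relational setups.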
ANALYTICS / STATS 
We can possibly afford to lose a small % of the data
TRANSACTIONS 
Bank transfers happen asynchronously as well!
WHAT ABOUT PHP & SYMFONY? 
Is there any hope for us?
PHP: BEST WEB PLATFORM? 
• PHP is still heavily used, despite its many quirks 
• Mature, actively maintained libraries for everything 
• Composer makes things much easier these days 
• Symfony bundles for almost everything 
• Some databases consider PHP a second class citizen
Key-value, Graph, Column, Document
KEY-VALUE STORES 
• Simple APIs, easy to install and use. You are 
already using them for caching, sessions, etc… 
• PHP Extensions: memcached, phpredis 
• Libraries: nrk/predis, basho/riak, aws/aws-sdk-php 
• Bundles: snc/redis-bundle, leaseweb/memcache-bundle, 
kbrw/riak-bundle
GRAPH DATABASES 
• Very verbose queries, access via REST APIs 
• Maybe not mature enough to be the source of truth 
• Libraries: everyman/neo4jphp 
• Bundles: klaussilveira/neo4j-ogm-bundle 
• IMHO, one of the next big things
CYPHER QUERY EXAMPLES 
• Top 5 Sushi restaurants in New York for Philip’s friends 
• 2nd degree co-actors who have never acted with Tom Hanks
COLUMN-BASED STORES 
• Possibly the most suitable for Big Data 
• Redshift supports SQL on a petabyte-scale 
database 
• Libraries: thobbs/phpcassa, pop/pop_hbase, 
PDO for Redshift (with some quirks) 
• IMHO, Cassandra will become THE database
DOCUMENT DATABASES 
• MongoDB and Couchbase look very shiny… but the 
Internet is FULL of scaling horror stories 
• PHP Extensions: mongodb, couchbase 
• Libraries: doctrine/mongodb 
• Bundles: doctrine/mongodb-odm-bundle
SEARCH ENGINES 
• Mostly Lucene based 
• PHP Extensions: solr, sphinx 
• Libraries: solarium/solarium, elasticsearch/elasticsearch 
• Bundles: nelmio/solarium-bundle, 
friendsofsymfony/elastica-bundle
DATA ANALYSIS 
All businesses need this!
QUERY VS PROCESSING 
• SQL is great because we can query by any field 
• There is no standard in NoSQL databases 
• NoSQL systems are more limited, only keys (some 
allow secondary indexes) or complex graph syntax 
• We sometimes need processing for complex queries
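As a rough illustration of "only keys (some allow secondary indexes)", here is a toy key-value store in Python. The `KeyValueStore` class and its method names are hypothetical and do not correspond to any specific product. Lookups work by primary key, plus one field that the store indexes by hand on every write.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy key-value store with one manually maintained secondary index."""

    def __init__(self, indexed_field):
        self.indexed_field = indexed_field
        self._records = {}               # primary key -> record (dict)
        self._index = defaultdict(set)   # field value -> set of primary keys

    def put(self, key, record):
        old = self._records.get(key)
        if old is not None:
            # Drop the stale index entry before overwriting the record.
            self._index[old.get(self.indexed_field)].discard(key)
        self._records[key] = record
        self._index[record.get(self.indexed_field)].add(key)

    def get(self, key):
        return self._records.get(key)

    def find_by_index(self, value):
        # Only the indexed field can be queried without a full scan.
        return sorted(self._index.get(value, ()))
```

Querying by any other field would mean scanning every record or running a separate processing job, which is the trade-off against SQL's "query by any field".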
MAP-REDUCE
HADOOP VS SPARK 
• Techniques to extract subsets of the data (MAP) and 
operate on them in parallel before aggregating (REDUCE) 
• Not real time; Hadoop is the most popular 
• Apache Spark opens a new paradigm for near real-time 
• You need other languages for these techniques
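The MAP and REDUCE steps can be sketched on a single machine with a word count in Python. This is not Hadoop or Spark code; the function names are made up for the example, and a thread pool stands in for a cluster of workers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    # MAP: emit a (word, 1) pair for every word in one chunk of the input.
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    # Group intermediate pairs by key so each key is reduced in one place.
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # REDUCE: aggregate all values seen for one key.
    return key, sum(values)


def word_count(chunks):
    with ThreadPoolExecutor() as pool:
        # Each chunk is mapped independently, so the work can be spread out.
        mapped = list(pool.map(map_phase, chunks))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because every chunk is mapped without looking at the others, the same shape scales from two strings to files spread over many machines; the shuffle step is where data has to move between them.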
FINAL THOUGHTS 
Now what?
ENGINEERING CHALLENGES 
• The Internet of Things will generate real BIG DATA 
• SQL / ACID technologies are not going anywhere 
• Be very careful when using NoSQL in production 
• Databases… and life… are full of tradeoffs 
• The next decade will be fascinating for the industry
READ THE DOCS CAREFULLY
CHOOSE THE RIGHT TOOL
QUESTIONS? 
• Twitter: @ricardclau 
• E-mail: ricard.clau@gmail.com 
• Github: https://github.com/ricardclau 
• Please rate the talk at https://joind.in/talk/view/12958

Big Data! Great! Now What? #SymfonyCon 2014

  • 1. BIG DATA! Great! Now what? Ricard Clau SymfonyCon 2014
  • 2. HELLO WORLD! • Ricard Clau, born and raised in Barcelona • Server engineer at Another Place Productions • Symfony2 lover and PHP believer (sometimes…) • Open-source contributor, sometimes I give talks • Twitter (@ricardclau) / Gmail ricard.clau@gmail.com
  • 3. WE WILL TALK ABOUT… • Where / How to store / query our “BIG” DATA • SQL vs NoSQL, why did we end up here? • Strengths and weaknesses of both approaches • PHP / Symfony Status with these technologies • Some war stories and recommendations
  • 4. QUICK DISCLAIMERS • Not your average PHP talk, not sure if you will be able to use this next week at work • Continuous learner about all these technologies • 100M records is NOT BIG DATA
  • 5. “Big data is like teenage sex; everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it”. Dan Ariely, Duke University
  • 9. A BIT OF HISTORY Maybe we have not learnt so much…
  • 10. A (NOT SO) LONG TIME AGO • Programmers processed files directly • Lots of people were doing the same, so the first databases appeared, with different APIs, strengths and weaknesses • In the early 70s IBM came up with the SEQUEL (Structured English Query Language) idea, and the rest is history
  • 12. WHY DOES NOSQL EXIST? • RDBMS are not brilliant at scaling horizontally • Google, Amazon, Facebook, etc… started building their own solutions to meet their unique needs • When your data does not fit in one box, you need to give up consistency or availability • Some problems need a different approach
  • 14. RDBMS SYSTEMS Old rockers never die
  • 15. SQL • A “common” query language • We can normalise data and query it • Easy to do joins, filters, aggregations • We don’t need to know in advance how we access data • We rely on each database server’s query optimiser (and sometimes we need a DBA)
  • 16. ACID PROPERTIES • Atomicity: Transactions are all or nothing • Consistency: A transaction is subject to a set of rules • Isolation: Transactions do not affect each other • Durability: Written data will not get lost
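
As an illustration of the "all or nothing" property, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the names `alice` and `bob`, and the helper functions are assumptions made up for this example, not part of the talk. The credit is applied first on purpose, so that a failing debit shows the earlier write being rolled back.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if the transaction committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK rule when funds are insufficient,
            # which makes the credit above disappear as well.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

Calling `transfer(conn, "alice", "bob", 500)` on a fresh database returns `False` and leaves both balances untouched, because the consistency rule (no negative balance) aborts the whole transaction.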
  • 17. WE NEED ACID • Banking, logistics, finance, e-commerce,… • Systems we started building 30 years ago… and we still work on them generating millions of $ daily! • There are many applications that still fit the relational model and have structured data
  • 18. USUAL PROBLEMS • You can painfully achieve sharding, but you need to give up some ACID goods • Tricky for unstructured data • Not great for a low read / write ratio • Some data structures are awkward to model
  • 19. TRICKY SCENARIOS • Geospatial queries for augmented reality • Leaderboards for social activity, set operations • Columnar aggregations on big tables • Graph traversal to analyse your customers • Search engines over big chunks of text
  • 20. NOSQL SYSTEMS Different problems, different solutions
  • 21. BASE PROPERTIES • Basically Available: appears to work most of the time • Soft state: state of the system may change even without a query • Eventual consistency
  • 22. CAP THEOREM • A shared-data system cannot guarantee simultaneously: • Consistency: All clients have the same view of the data • Availability: Each client can always read and write • Partition tolerance: The system works well even when there are network partitions
  • 23. “During a network partition, a distributed system must choose between either Consistency or Availability”
  • 24. • Consistency + Availability: single node, mostly RDBMS (MySQL, PostgreSQL, DB2, SQLite…) • Availability + Partition Tolerance: all nodes same role (Cassandra, Riak, DynamoDB…) • Consistency + Partition Tolerance: special nodes (Zookeeper, HBase, MongoDB, Redis…)
  • 26. I TOTALLY NEED ACID! Are you sure about that?
  • 27. EVENTUAL CONSISTENCY If you are using master-slave replication, you already have eventual consistency in your reads
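
The point about master-slave replication can be sketched with a toy in-memory model. `Primary`, `Replica` and `catch_up` are hypothetical names invented for this illustration; real replication ships a binary log over the network, but the effect on reads is the same: a replica answers with stale data until it has applied the pending writes.

```python
class Primary:
    """Accepts writes and records them in an ordered replication log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) writes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which may lag behind the primary."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        """Apply every log entry that has not been replicated yet."""
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)
```

A read issued on the replica between `write` and `catch_up` returns the old value (or `None`), which is exactly the window in which the reads are only eventually consistent.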
  • 28. ANALYTICS / STATS We can possibly afford losing a small % of the data
  • 29. TRANSACTIONS Bank transfers happen asynchronously as well!
  • 30. WHAT ABOUT PHP & SYMFONY? Is there any hope for us?
  • 31. PHP: BEST WEB PLATFORM? • PHP is still heavily used, despite its many quirks • Mature, actively maintained libraries for everything • Composer makes things much easier these days • Symfony bundles for almost everything • Some databases consider PHP a second class citizen
  • 33. KEY-VALUE STORES • Simple APIs, easy to install and use. You are already using them for caching, sessions, etc… • PHP Extensions: memcached, phpredis • Libraries: nrk/predis, basho/riak, aws/aws-sdk-php • Bundles: snc/redis-bundle, leaseweb/memcache-bundle, kbrw/riak-bundle
  • 34. GRAPH DATABASES • Very verbose queries, access via REST APIs • Maybe not mature enough to be the source of truth • Libraries: everyman/neo4jphp • Bundles: klaussilveira/neo4j-ogm-bundle • IMHO, one of the next big things
  • 35. CYPHER QUERY EXAMPLES • Top 5 Sushi restaurants in New York for Philip’s friends • 2nd degree co-actors who have never acted with Tom Hanks
  • 36. COLUMN-BASED STORAGES • Possibly the most suitable for Big Data • Redshift supports SQL in a petabyte-scale database • Libraries: thobbs/phpcassa, pop/pop_hbase, PDO for Redshift (with some quirks) • IMHO, Cassandra will become THE database
  • 37. DOCUMENT DATABASES • MongoDB and Couchbase look very shiny… but the Internet is FULL of scaling horror stories • PHP Extensions: mongodb, couchbase • Libraries: doctrine/mongodb • Bundles: doctrine/mongodb-odm-bundle
  • 38. SEARCH ENGINES • Mostly Lucene based • PHP Extensions: solr, sphinx • Libraries: solarium/solarium, elasticsearch/elasticsearch • Bundles: nelmio/solarium-bundle, friendsofsymfony/elastica-bundle
  • 39. DATA ANALYSIS All businesses need this!
  • 40. QUERY VS PROCESSING • SQL is great because we can query by any field • There is no standard in NoSQL databases • NoSQL systems are more limited, only keys (some allow secondary indexes) or complex graph syntax • We sometimes need processing for complex queries
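
To make the "only keys (some allow secondary indexes)" limitation concrete, here is a rough sketch of a key-value store in plain Python. `KeyValueStore`, the `country` field and both query methods are assumptions invented for this example and do not mirror any specific database's API. Querying by a non-key field means either scanning every row or maintaining an extra index on every write.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy store: fast lookups by key, everything else costs extra work."""

    def __init__(self):
        self._rows = {}
        self._by_country = defaultdict(set)  # hand-maintained secondary index

    def put(self, user_id, row):
        old = self._rows.get(user_id)
        if old is not None:
            # Keep the index in sync when a row is overwritten.
            self._by_country[old["country"]].discard(user_id)
        self._rows[user_id] = row
        self._by_country[row["country"]].add(user_id)

    def get(self, user_id):
        """Primary-key access: the only query a pure key-value store offers."""
        return self._rows.get(user_id)

    def scan_by_country(self, country):
        """Without an index: look at every row (the 'processing' route)."""
        return sorted(k for k, r in self._rows.items() if r["country"] == country)

    def index_by_country(self, country):
        """With a secondary index: direct lookup, paid for on every write."""
        return sorted(self._by_country.get(country, ()))
```

In SQL the same question is a single `WHERE country = ?` clause, and the query optimiser decides whether an index is used.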
  • 42. HADOOP VS SPARK • Techniques to extract subsets of the data (MAP) and operate on them in parallel before aggregating (REDUCE) • Not real time; Hadoop is the most popular • Apache Spark opens a new paradigm for near real-time • You need other languages for these techniques
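
The MAP and REDUCE steps can be sketched on a single machine with the classic word-count example. This is plain Python with a thread pool standing in for a cluster, not the Hadoop or Spark API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are assumptions chosen for this illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """MAP: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped):
    """Group the emitted values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """REDUCE: aggregate the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Map every chunk in parallel, then shuffle and reduce the results."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))
```

For instance, `word_count(["big data big", "data now what"])` counts `big` and `data` twice each. In a real cluster the chunks live on different machines and the shuffle moves data over the network, which is a large part of why batch jobs are not real time.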
  • 44. ENGINEERING CHALLENGES • The Internet of things will generate real BIG DATA • SQL / ACID technologies are not going anywhere • Be very careful when using NoSQL in production • Databases… and life… are full of tradeoffs • The next decade will be fascinating for the industry
  • 47. QUESTIONS? • Twitter: @ricardclau • E-mail: ricard.clau@gmail.com • Github: https://github.com/ricardclau • Please rate the talk at https://joind.in/talk/view/12958